Mary Astell (1666-1731)

The English writer Mary Astell is widely known today as an early feminist pioneer, but not so well known as a philosophical thinker. Her feminist reputation rests largely on her impassioned plea to establish an all-female college in England, an idea first put forward in her Serious Proposal to the Ladies (1694). She is also remembered for her harsh but witty indictment of early modern marriage in her Some Reflections upon Marriage (1700). Underlying Astell’s feminist ideas, however, are strong philosophical foundations in the form of Cartesian epistemological and metaphysical principles. These principles play an important strategic role in her writings: to raise an awareness in women of their inherent ability to bring themselves to moral and intellectual perfection—to “pull themselves up by their bootstraps,” so to speak—regardless of their external circumstances. Toward this end, Astell urges her fellow women to embrace René Descartes’ “clear and distinct ideas” as the hallmarks of truth and certainty. In accordance with Cartesian rationalism, she teaches her readers that all knowledge can be founded on reason rather than the senses, and she urges them to practice Cartesian rules for thinking in order to attain knowledge of both moral and metaphysical truths. As a dualist, she encourages women to regard their souls as thinking substances distinct from their bodies and as capable of attaining mastery over bodily sensations and passions. In all her major writings, these philosophical themes are so prevalent that Astell might be justly regarded as one of the earliest feminist philosophers of the modern age.

Astell is an unorthodox Cartesian, however, insofar as she breaks from a number of Descartes’ classic doctrines, such as his theory of innate ideas and his views about the essence of the soul. And while Astell is indebted to Descartes’ ethical theory of the passions, her moral-theological viewpoint also closely resembles the Augustinian outlook of her English contemporary John Norris and the French thinker Nicolas Malebranche. As with these men, the intensely religious aspects of her thought cannot be ignored. The same deep religiosity permeates her political writings, and is arguably the main driver behind her critiques of the Whig philosophy of John Locke.

This article covers six key areas of Astell’s philosophy: her theory of knowledge, her metaphysics of mind and body, her philosophy of religion, her moral views, her feminist ideas, and her political thought.

Table of Contents

  1. Life
  2. Theory of Knowledge
  3. Metaphysics of Mind and Body
  4. Philosophy of Religion
  5. Moral Theory
  6. Feminism
    a. Education
    b. Marriage
  7. Political Thought
  8. Legacy
  9. References and Further Reading
    a. Primary Sources
    b. Secondary Sources

1. Life

Astell was born in Newcastle-upon-Tyne, England, on November 12, 1666, and died in Chelsea, London, on May 9, 1731. She was the elder of two children born to Peter Astell and Mary Errington, both of whom belonged to respected Northumberland families with strong royalist leanings. The most important influence on Astell’s early intellectual development appears to have been her uncle Ralph Astell, a clergyman-poet who was educated at the University of Cambridge in the mid-seventeenth century. Under his tuition, it is likely that Astell gained a strong familiarity with Anglican theology. The works of a number of popular Anglican theologians can be found in the remains of Astell’s library, now held in the Northamptonshire Records Office. Through her uncle’s influence, she may have also become acquainted with the ideas of the Cambridge Platonist Henry More, an early adherent of Cartesian philosophy in England. Ralph Astell attended both St John’s and Emmanuel College in the 1650s, just as More’s career at Cambridge was taking off, and Astell later cites More’s writings in her works.

In 1678, Astell’s father died and her life trajectory took an unexpected turn. As a result of her father’s untimely death, Astell’s financial and social situation grew precarious: her mother had to borrow money to keep the family afloat, and it seems that they could never have afforded Astell’s dowry, even if she had wanted to marry. Though there were rumors that Astell had once been engaged to a clergyman, she remained unmarried and childless all her life, choosing instead to lead the life of a writer.

At some point, probably in the late 1680s, Astell made the bold decision to leave her childhood home and migrate to London, seemingly without any family support. Soon after her arrival in the city, she made the acquaintance of Archbishop William Sancroft; and then in 1689 she dedicated a book of manuscript poetry to him, out of gratitude for his counsel and assistance in her time of need. A few years after completing this manuscript, Astell turned her hand to philosophy. In 1693, she embarked on a correspondence with John Norris, the author of a series of popular religio-philosophical works called the Practical Discourses. Their letters discuss Norris’s appropriation of the moral and metaphysical ideas of Nicolas Malebranche, a French philosopher best known for his doctrine of occasionalism, the theory that God is the only true causal agent in the universe. Their correspondence continued for one year and was eventually published as Letters Concerning the Love of God (1695).

In the mid-1690s, Astell’s writing career began in earnest. In 1694, she published her first Proposal. A few years later, she followed up this original work with a second part offering a method for the improvement of women’s reason, heavily indebted to the ideas of Descartes and his followers Antoine Arnauld and Pierre Nicole. Together with the Letters, the first and second Proposals made Astell something of a minor celebrity in London. She was publicly celebrated for her wit and eloquence, and openly commended by the likes of John Evelyn and Daniel Defoe. At the height of her career, Astell also had the support of several female benefactors of high social standing, including Lady Catherine Jones, Lady Elizabeth Hastings, Lady Ann Coventry, and Elizabeth Hutcheson. As a result, Astell was able to sustain her career as a writer, at least for a decade or so.

In 1700, Astell published her most popular feminist work, Some Reflections upon Marriage, a response to the scandalous marriage of Hortense Mancini, the duchess of Mazarin. Following this, her bookseller Richard Wilkin seems to have commissioned her to write several Tory political pamphlets. In 1704, she published three short tracts: Moderation Truly Stated, An Impartial Inquiry, and A Fair Way with the Dissenters. Then in 1705, Astell published her longest and most sophisticated work of moral philosophy, The Christian Religion, as Profess’d by a Daughter of the Church of England, a work that builds on the same feminist themes as her earlier treatises. In her final publication, Bart’lemy Fair (1709), Astell targets the third earl of Shaftesbury’s defense of free speech in his Letter Concerning Enthusiasm (1708).

After 1709, Astell did not publish any new works. But there is evidence that until her death she kept writing and also diligently editing her previous publications, not only her Christian Religion and Bart’lemy Fair (published in second editions in 1717 and 1720 respectively), but also the second part of her Proposal. In her later years, in keeping with her life-long interest in female education, Astell also took on the practical task of running a charity school for poor girls in her beloved neighborhood of Chelsea.

2. Theory of Knowledge

Astell’s guidelines on how to attain knowledge can be found in the second part of her Proposal (1697). In this work, Astell’s epistemological approach is distinctly rationalist insofar as she regards knowledge as founded on reason alone, and denies that sensory experience can be trusted as a reliable guide to truth. Her strict definition of knowledge is “that clear Perception which is follow’d by a firm assent to Conclusions rightly drawn from Premises of which we have clear and distinct Ideas” (SPL II 149). Like Descartes in his Principles of Philosophy (1644), Astell regards a perception as “clear” when it is accessible to the mind’s eye and the mind’s attention is firmly fixed on it. A perception is “distinct” when it is not only clear but also “particular” and distinguished from all other things. If an idea is both clear and distinct, then in Astell’s opinion we cannot withhold our assent from it (we cannot but affirm that it is true), without offending against reason.

Astell claims that we can attain knowledge by affirming only those ideas that are clear and distinct. To do so, we must learn to regulate the will, the mind’s active faculty of affirming or denying the ideas of the understanding. The will is to blame when we fall into erroneous judgements. We only really go astray because the will foolishly assents to more than it perceives; instead of carefully attending to the ideas of the understanding, it hurries on and makes rash judgements, beyond the scope of its ideas. We cannot successfully regulate the will, according to Astell, until we have learnt to moderate our passions or emotions. Certain emotions, such as pride and vanity, can prevent us from properly engaging in the search for truth. When we are faced with a truth that contradicts our mistaken idea of self-interest, for example, we shut our eyes against it and unreasonably refuse to entertain it.

Accordingly, in Astell’s view, a healthy disengagement from worldly things is an important first step toward the attainment of clarity and distinctness. Toward this end, in both her Proposals, she argues for the necessity of an academic retreat for women, so that they might withdraw from the hurry and noise of the everyday world (temporarily, at least) and focus their attention on nobler subjects. Importantly, she is not so concerned that women acquire knowledge for its own sake, but rather as a means for them to attain enduring happiness in both this life and the next. In her view, reason is the natural light that God has set up in our minds so that we might conform ourselves to his will and come to join him.

To attain both truth and happiness, a woman must follow reliable rules for thinking. Astell’s six rules bear a notable resemblance to Descartes’ own set of rules in his Discourse on the Method (1637), as well as to those of his followers Arnauld and Nicole in their Logic, or the Art of Thinking (1662). She states that in any given inquiry, (i) we must acquire a distinct notion of our subject and a precise understanding of any key terms. Then (ii) we must avoid straying into any unnecessary or irrelevant subject matters, and conduct our thoughts in a natural, logical order. It follows that (iii) we must examine the simplest subjects first, before progressing to the study of more complex matters. Next, (iv) we must take care to examine our subject thoroughly, according to each of its parts, and be sure not to leave any part unexamined. And (v) we must keep our focus firmly fixed on the subject at hand. Finally, and most importantly, (vi) we must not judge any further than we perceive, and we must not affirm anything as true unless it is incontestably known to be so.

In her later work, The Christian Religion, Astell deviates from Descartes’ epistemology by suggesting that the perception of truth is a participation in the mind of God (§262). In this respect, Astell comes closer to her unorthodox Cartesian contemporaries Norris and Malebranche, both of whom deny Descartes’ view that our ideas are innate, born within us, in our minds. Instead, her view has more in common with Augustine’s illuminationist theory that the human mind is capable of understanding ideas only by means of the divine light.

3. Metaphysics of Mind and Body

Astell’s argument for the soul-body distinction can be found in section 228 of her Christian Religion, embedded within a larger argument against Locke’s doctrine of “thinking matter.” Astell begins her critique of Locke with an inquiry about the nature of “the thing in us that thinks”: is it immaterial? Or could it be material, as Locke appears to suggest in his Essay Concerning Human Understanding (1690)? In response, she points to the fact that the mind has entirely different properties and affections to the body, and that we can have a complete idea of mind as a thinking thing without considering it as dependent on, or related to, our idea of body as extended substance. But if we can have a complete idea of something in independence of a complete idea of another thing, she says, then those two things are really distinct. The mind and body are therefore distinct. Contra Locke, she says that we can affirm that the idea of thinking being excludes extension, and the idea of extended being excludes thought.

Like Descartes, Astell maintains that the human person is composed of two substances: the soul (or mind), which is a thinking thing, and the body, which is extended substance. However, she makes few explicit statements about how the soul moves the body (soul-body causation) or how the body causes sensations (body-soul causation). Some of her statements appear to suggest that she upholds an occasionalist theory of body-soul causation. According to a Malebranchean occasionalist, neither bodies nor souls have any genuine causal efficacy; only God has the causal power to bring about modifications in the human mind. In one passage of her Christian Religion, Astell suggests that God is the true efficient cause of all sensation, and she seemingly denies that material objects have any power to produce modifications in our souls (§378). These remarks, however, must be placed in the context of Astell’s response to Damaris Cudworth Masham, a Lockean philosopher who had vehemently attacked the Malebranchean moral and metaphysical ideals of the Astell-Norris Letters. In the passage in question, Astell’s main point is that even if we were to embrace those Malebranchean ideals without criticism, it is not clear that they are as harmful to morality as Masham would suggest.

Other statements indicate that Astell holds an orthodox Cartesian interactionist position on soul-body and body-soul causation. In the Letters, she raises two objections to Norris’s view that God’s will is the only true cause of our sensations and that bodies are incapable of exerting a causal influence on souls. First, she points to the fact that if sensible objects are redundant features of God’s creation, as Norris suggests, then this offends against our idea of God as a supremely wise and perfect creator. Second, she points out that the existence of genuine secondary causes is more befitting of God’s majesty, because if such causes do exist, then he need not continually interfere in his own creation. As an alternative to occasionalism, Astell supports the view that there is a natural power, a “sensible congruity,” in bodies that enables them to cause sensations in the soul. In the second Proposal, Astell also takes an orthodox Cartesian stance by suggesting that the body is disposed to make impressions on the soul and that the soul has an active power to effect changes in the body.

Astell’s philosophical concept of the self as a thinking thing informs her feminist thought. She advises her fellow women that they must learn the value of proper self-love and self-esteem: the love and esteem of their souls and not their bodies. They must cease to live like animals or Cartesian machines, those purely material beings devoid of rationality; they must pursue what is conducive to their perfection as thinking, immaterial beings.

4. Philosophy of Religion

The Christian conception of God plays a crucial role in Astell’s wider project to bring women to the knowledge of the true source of their happiness. We can be assured, she says, that God always does what is best and most becoming of his infinite perfection; and so, we can be assured that the world and everything in it is created according to the eternal and immutable standards of rectitude. It therefore becomes women to live their lives in accordance with the law of God and reason—this is the surest route to their happiness.

Astell presents at least three different types of argument for the existence of God. In her second Proposal, she develops an ontological proof, an argument for God’s existence based on premises that can be known independently of experience. In the same work, immediately following this proof, she formulates a cosmological argument for the existence of God based upon empirical observations about the created world. In The Christian Religion, she once again takes a blended approach by presenting an ontological argument followed by a cosmological proof. Then in her final work Bart’lemy Fair, she offers yet another causal argument, this time based on the principle that a cause must have either the same or higher qualities than its effect.

In her second Proposal, Astell echoes the English translation of Descartes’ Meditations (1680) when she begins her ontological argument with an idea of God as “a being infinitely perfect.” She then asks the question: does this infinitely perfect being exist? Her answer is that, according to our intuitions, the idea of God and the idea of existence are compatible, because existence is a perfection and the necessary foundation of all other perfections (since what doesn’t exist can’t have any perfections). Moreover, if any being is infinite in all perfections, then we cannot deny that that being exists; therefore, we cannot deny that God, an infinitely perfect being, exists. In sections 7–8 of The Christian Religion, Astell strengthens this argument by asserting that an infinitely perfect being would have the perfection of self-existence, rather than ordinary everyday existence. She asserts that God could not derive his being from anyone but himself; if God had derived his existence from someone or something else, then he would not be supremely perfect. So, God must have ontological independence or self-existence; he must exist by his own nature.

A similar appeal to God’s ontological independence lies at the heart of Astell’s cosmological arguments for God. In the second Proposal, her argument begins with the idea of created or contingent beings. In her view, this idea naturally suggests to us the idea of “the power of giving being” to something. How were these contingent beings created? They cannot have had the power of giving being to themselves, because this would imply a contradiction; it would imply, that is, that they could both exist and not exist at the same time. The thing that created these contingent beings would therefore have to be self-existent. It could not be another created, contingent being because this would lead to an infinite regress of such beings. Yet an infinite regress without a last resort offends our basic intuition that something cannot come from nothing (ex nihilo nihil fit). It follows that there must be a last resort or a first cause: there must be a self-existent being who created those contingent beings—and this being is God. Astell presents a similar causal argument in her Christian Religion (§10).

In Bart’lemy Fair, Astell takes a different tack in order to explain why we must regard this self-existent being as the traditional theistic God. Here she implicitly appeals to the principle that a cause must have qualities that are similar to, or higher in perfection than, those contained in its effect. Her proof begins with the empirical observation that there is gravitation or “mutual attraction” between physical bodies in the created world. She then asks, how do we explain this phenomenon? If gravity is not an essential property of matter, then we must say that gravity proceeds from the will and power of a superior cause. But this superior cause cannot be material in nature, for that would imply that matter is superior to matter in general (a contradiction); so, the cause must be immaterial. This superior immaterial cause, moreover, must have the will and power to sustain mutual attraction between bodies. In short, this cause must be the theistic God.

5. Moral Theory

In terms of her moral approach, Astell might best be described as a Christian deontologist; in her view, all human beings have a duty to live in accordance with the law of God. Nevertheless, she is also a virtue theorist to the extent that she thinks that we ought to develop a disposition to obey the divine law, and developing this disposition requires us to cultivate virtue. These moral views can be found in all her works, but especially in her Letters, the second Proposal, and The Christian Religion.

According to Astell’s strict definition, virtue consists in the soul gaining mastery over the bodily impressions and directing its passions toward the right objects, in the right “pitch” (or intensity), according to the dictates of reason (SPL II 214). She warns that the bodily passions of love, hate, fear, desire, and joy can have a disturbing and disquieting effect on the human mind. When we are in the thrall of such passions, we can get carried away and zealously pursue the wrong objects, often to our moral and spiritual destruction. The proper regulation of the passions thus plays an important role in the attainment of virtue.

Astell thinks that the passions need not be obstacles on the path to virtue, provided that they are “hallowed” or purified in some way. As a long-term strategy toward purification, we should meditate carefully on what is truly good and truly bad, and follow only those moral judgements that proceed from knowledge. Crucial to this endeavor, we must learn to focus our attention on the right objects, including our own nature as thinking things, the true nature of material beings, and the nature of an infinitely perfect being. Moral agents often go astray, according to Astell, because they have mistaken or erroneous judgements about the nature and value of these objects.

There are a number of virtues (excellences of character) that feature prominently in Astell’s moral theory; the most significant are benevolence, generosity, and friendship. Benevolence is a wishing well toward others purely for the sake of promoting their well-being, and not for selfish motives. In her writings, the love of benevolence is often contrasted with the love of desire, a selfish egoistic kind of love for others, in which we desire to possess them. On this topic, her views have much in common with the Augustinian outlook of her correspondent John Norris. Like Norris, she maintains that a virtuous agent has properly ordered love. In their Letters, they agree that human beings ought to cultivate an exclusive love of desire for God, an infinitely perfect being, because he is the only being who is truly capable of satisfying our desire. Toward our fellow human beings, we should feel only a love of benevolence; we should cultivate a disinterested goodwill rather than a selfish desire. Unlike Norris, Astell emphasizes that an exclusive desire for God can have the added benefit of helping us to regulate our passions and cultivate a non-possessive attitude toward others.

In Astell’s view, the virtue of generosity (or having “a generous soul” and “a generous temper”) also provides a remedy for our selfish desires. Like Descartes in his Passions of the Soul (1649), she regards the virtue of generosity as a species of self-esteem, a valuing ourselves on the basis of some noble or worthy characteristic. More than this, generosity involves recognizing that our moral worth lies in the exercise of our free will, together with a firm commitment always to do our best. Those who have the virtue of generosity eventually cease to desire the approbation of others, because they do not really care what the rest of the world thinks of their choices and actions. So long as they themselves always endeavor to do what is best in their own minds, they are impervious to censure and ridicule.

The difficulty for women, Astell says in her second Proposal, is that they have been culturally conditioned to value themselves on accidental properties such as their looks and their clothing. They have acquired a mistaken sense of self-esteem because they have not been encouraged to value themselves as rational, thinking beings with freedom of will. To cultivate justified self-esteem, according to Astell, women must be permitted to train their reason and to study philosophy and religion. She thinks that Christianity in particular facilitates the cultivation of generosity, because it teaches them that what is truly valuable does not depend on the transient things of this world.

Finally, the virtue of friendship (a species of the love of benevolence) plays an important role in Astell’s moral thought. In her view, one of the chief benefits of her female academy is that it will enable virtuous friendships to flourish among women. These friends will then watch over each other’s moral and intellectual advancement, with the aim of advising and encouraging each other toward perfection.

6. Feminism

a. Education

Astell’s first Proposal is essentially an exercise in consciousness-raising, for the purpose of bringing about the moral and intellectual reformation of early modern women. The “proposal” of Astell’s title is an all-female academic institute, where like-minded scholars of a similar age and social status might live and study together for a number of years. Although a wealthy gentlewoman expressed interest in funding Astell’s proposal, an academy never materialized in her lifetime—possibly due to the suspicion that it sounded like a Catholic nunnery.

Throughout her works, Astell appeals to different philosophical ideas to argue that women should receive a higher education, and to undermine the belief that women are naturally intellectually inferior to men. These ideas include an egalitarian conception of reason, the Cartesian concept of the thinking self, and certain teleological principles.

To challenge the idea that women are mentally inferior, Astell’s historical predecessors traditionally pointed to empirical evidence or famous instances of exemplary women. By contrast, Astell appeals only to an inward consciousness of thought. In her view, the fact that women are thinking things needs no proof or argument; a woman simply has to turn within herself and see that she is capable of exercising her mental faculties. Astell emphasizes that the search for knowledge does not require the mastery of languages, such as Greek and Latin, nor does it require an extensive library or an intimate acquaintance with ancient authorities and obscure terminology. It simply requires the capacity to discern the truth for oneself, and the freedom to affirm or deny the ideas of the mind. In terms of their capacity for rational judgement, Astell says, women are no different to men; they are on a par.

While Astell never articulates the cogito (Descartes’ famous insight that “I think therefore I am”), she does rely on a similar logic. She relies on the idea that if a woman is capable of entertaining a thought in her mind, then it is true that she thinks; it cannot be denied. To improve their reason, according to Astell, women need only familiarize themselves with their own internal “natural logic.” Can they reason about the everyday management of household affairs? Can they make informed judgements about the course of a romance or the design of a petticoat? If so, then this provides indisputable evidence of their ability to reason. If women exhibit any defect in reasoning, Astell says, this defect is acquired rather than natural, and can be corrected through proper training and meditation. They can improve their reasoning skills by following simple Cartesian rules for thinking (see the “Theory of Knowledge” section above).

It should be noted that Astell differs from Descartes in emphasizing that we can never have a distinct idea of the self as a thing whose essence consists solely in thinking (SPL II 173). She also differs from Descartes by appealing to God’s final causality in order to bolster her arguments for women’s education. In her writings, she repeatedly emphasizes that an infinitely perfect being does nothing in vain; there can be no feature of his intelligent design that is redundant or superfluous in nature. It follows that if God has bestowed rational minds upon women, then they ought to be permitted to use their minds toward the best ends. When a woman is taught that her duty is to serve a man, or to live a life devoted solely to bodily and material concerns, she is taught to disregard her sacred duty to God. A woman must therefore be educated to use her reason to raise herself toward perfection, just as her creator intended.

b. Marriage

In Some Reflections upon Marriage, Astell examines women’s disadvantages within the early modern marriage state. This work was ostensibly a response to Hortense Mancini’s much-publicized separation from her abusive and unstable husband, the duke of Meilleraye. Although Astell regards marriage as a sacred institution ordained by God, she complains that in her day it has greatly degenerated from its original blessed state. In the Reflections, her explicit purpose is to analyze why this degeneration has occurred and to see how it might be rectified. She traces the core problem to the moral failings of human beings, but to the failings of men in particular. She highlights the fact that most men do not marry from a love of benevolence toward women but rather from base and selfish motives, such as lust and greed. Marriage would be a happy state today, she insists, if only human beings were guided by their reason and not by brutish passions.

Astell warns her fellow women to be extremely wary of entering into marriage in the first place. She points to the fact that a wife is expected to offer blind submission to her husband, even when he does not deserve it. This expectation of submission might lead a woman to ignore the dictates of her reason, the law of God, and to act in terms of worldly self-interest instead. As a result, an unhappy marriage to a vicious man could lead to the destruction of a woman’s soul. As a remedy, Astell once again highlights the necessity of a good education for women, to fortify their reason and to cultivate their virtue. If Mancini had had the benefit of a higher education in philosophy and religion, Astell suggests, her husband’s abuse might not have led to her moral degradation.

Some scholars propose that Astell’s Reflections contains a hidden political sub-text. More specifically, they interpret the work in light of Astell’s conservative Anglican Tory political commitments. In their view, when Astell highlights female slavery within marriage, as when she asks her famous question, “if all Men are born free, how is it that all Women are born slaves?” (RM 18), she is really presenting an ironic challenge to Whig theorists of her time. They claim that she challenges her Whig opponents to extend the same authority to sovereigns in the state that they uncritically permit to husbands in the domestic sphere. If submission and obedience to authority are acceptable in the family home, she asks, then why not in the state? Whig theorists, such as Locke, ought to practice the same obedience to their political leaders that they exact from their domestic subjects; that is, they ought to practice passive obedience.

7. Political Thought

Astell has been widely interpreted as a critic of Locke’s political thought and as a vocal opponent of the Whig theories of liberty, toleration, and resistance. For some commentators, it is puzzling that Astell could be both a feminist and a High-Church Tory. At first glance, her support for women’s freedom of judgement seems to be incompatible with her support for a political party that opposes freedom of conscience, a tolerationist ethic, and other perceived threats to the Anglican church. To dispel these tensions, scholars have highlighted the fact that Astell’s feminism is founded on philosophical principles, not progressive political ideals, and this partly explains why Astell does not call for full political equality for women in her time.

In keeping with Anglican political theology, Astell maintains that all subjects are bound to observe the doctrine of passive obedience, the idea that subjects must actively obey political authority where they can, and quietly submit to the penalty for disobedience where they cannot (in those cases, for example, where the authority commands something sinful or irreligious). In her view, political subjects are never justified in engaging in active resistance to the crown, even if the crown wields a tyrannical, arbitrary power. These commitments lead Astell to criticize Locke’s views concerning the natural law of self-preservation and the right of resistance in his Two Treatises (1689).

In Locke’s view, every man has an equal right to freedom from arbitrary power. In the natural state, whenever another man threatens to enslave me, I have the right to resist him in order to preserve my life, liberty, and property. In civil society, a political authority is set up to ensure the preservation of my life, liberty, and property; but if that authority fails to act for the public good, and wields a tyrannical, arbitrary power instead, I can still exercise my right of resistance, as an extension of the natural law of self-preservation. I can depose that authority by force, if need be.

In response, in her Christian Religion (§274), Astell agrees with Locke that self-preservation is a fundamental right. But in her view, strictly speaking self-preservation consists in the preservation of the immaterial, immortal soul; so, according to the natural law, we are only ever permitted to act to secure our souls from damnation. From her Anglican viewpoint, soul-preservation entails passive obedience, not active resistance.

8. Legacy

In her lifetime, Astell’s writings were known to the philosophers John Locke, Gottfried Wilhelm Leibniz, and George Berkeley. But her ideas seem to have had the greatest impact on other eighteenth-century defenders of women, such as Mary Chudleigh, Elizabeth Thomas, the writer known as “Eugenia,” Mary Wortley Montagu, and Sarah Chapone. Her influence as a feminist can be discerned right up to the suffragist movement of the late nineteenth century, especially in the writings of English suffragette Harriett McIlquham. In recent history, there have been two revivals of academic interest in Astell as a feminist: the first from the 1890s to the early twentieth century; and the second from the mid-1980s to the present day, facilitated to a great extent by Ruth Perry’s authoritative biography, The Celebrated Mary Astell. Perry claims that Astell would be surprised at the history of her reception as a feminist pioneer, since Astell thought of herself more as a metaphysician and philosopher than a political reformer.

9. References and Further Reading

a. Primary Sources

  • Astell, Mary, Bart’lemy Fair: Or, An Enquiry after Wit; In which due Respect is had to a Letter Concerning Enthusiasm, To my LORD ***, London: Richard Wilkin, 1709.
    • Astell’s moral-theological critique of Whig political ideas in Shaftesbury’s Letter. No modern edition currently exists.
  • Astell, Mary, Astell: Political Writings, ed. Patricia Springborg, Cambridge Texts in the History of Political Thought, Cambridge: Cambridge University Press, 1996.
    • Contains the third edition of Reflections upon Marriage (1706), cited in the text as RM.
  • Astell, Mary, A Serious Proposal to the Ladies, Parts I and II, ed. Patricia Springborg, Peterborough, ON: Broadview Press, 2002.
    • Standard modern edition of Astell’s best-known work. Cited in the text as SPL part, page.
  • Astell, Mary, and John Norris, Letters Concerning the Love of God, ed. E. Derek Taylor and Melvyn New, Aldershot, UK: Ashgate, 2005.
    • Modern edition of Astell’s correspondence with the Malebranchean philosopher John Norris.
  • Astell, Mary, The Christian Religion, as Professed by a Daughter of the Church of England, ed. Jacqueline Broad, The Other Voice in Early Modern Europe: Toronto Series, Toronto, ON: Centre for Reformation and Renaissance Studies and Iter Publishing, 2013.
    • Modern edition of Astell’s most mature work of moral theology, based on 1717 second edition. Cited in the text by section number.

b. Secondary Sources

  • Boyle, Deborah, “Mary Astell and Cartesian ‘Scientia’,” in Judy Hayden, ed., The New Science and Women’s Literary Discourse: Prefiguring Frankenstein, New York: Palgrave Macmillan, 2011, 99–112.
    • Account of Astell’s theory of knowledge and her distinction between faith, science, and opinion.
  • Broad, Jacqueline, The Philosophy of Mary Astell: An Early Modern Theory of Virtue, Oxford: Oxford University Press, 2015.
    • First book-length examination of Astell’s wider philosophy. Presented from the point of view of her ethical theory.
  • Detlefsen, Karen, “Custom, Freedom and Equality: Mary Astell on Marriage and Women’s Education,” in Alice Sowaal and Penny A. Weiss, eds., Feminist Interpretations of Mary Astell, Re-reading the Canon, University Park, PA: Pennsylvania State University Press, 2016, 74–92.
    • Examines Astell’s Cartesian epistemology with a focus on dispelling tensions within her feminism.
  • Goldie, Mark, “Mary Astell and John Locke,” in William Kolbrener and Michal Michelson, eds., Mary Astell: Reason, Gender, Faith, Aldershot, UK: Ashgate, 2007, 65–85.
    • Insightful analysis of Astell’s critique of John Locke’s religious and philosophical ideas.
  • Kinnaird, Joan K., “Mary Astell and the Conservative Contribution to English Feminism,” The Journal of British Studies 19, no. 1 (1979), 53–75.
    • Analysis of connections between Astell’s feminism and her conservative religious and political commitments.
  • Lascano, Marcy P., “Mary Astell on the Existence and Nature of God,” in Alice Sowaal and Penny A. Weiss, eds., Feminist Interpretations of Mary Astell, Re-reading the Canon, University Park, PA: Pennsylvania State University Press, 2016, 168–87.
    • One of the first detailed discussions of Astell’s proofs for the existence of God.
  • Lister, Andrew, “Marriage and Misogyny: The Place of Mary Astell in the History of Political Thought,” History of Political Thought 25, no. 1 (2004), 44–72.
    • Interprets Reflections as a feminist work with the primary aim of urging women to remain single if possible.
  • Myers, Joanne E., “Enthusiastic Improvement: Mary Astell and Damaris Masham on Sociability,” Hypatia: A Journal of Feminist Philosophy 28, no. 3 (2013), 534–50.
    • Provides insight into the so-called debate between Astell and fellow feminist Masham.
  • O’Neill, Eileen, “Mary Astell on the Causation of Sensation,” in William Kolbrener and Michal Michelson, eds., Mary Astell: Reason, Gender, Faith, Aldershot, UK: Ashgate, 2007, 145–63.
    • Interprets Astell as holding a Cartesian interactionist position on mind-body causal relations.
  • Perry, Ruth, The Celebrated Mary Astell: An Early English Feminist, Chicago, IL: University of Chicago Press, 1986.
    • The most authoritative and engaging account of Astell’s life and works.
  • Sowaal, Alice, “Mary Astell’s Serious Proposal: Mind, Method, and Custom,” Philosophy Compass 2 (2007), 227–43.
    • Analysis of Astell’s educational strategy in relation to her theory of mind.
  • Springborg, Patricia, Mary Astell: Theorist of Freedom from Domination, Cambridge: Cambridge University Press, 2005.
    • Interprets Astell’s writings in light of her support for the Tory political party and her High-Church Anglicanism.
  • Squadrito, Kathleen M., “Mary Astell’s Critique of Locke’s View of Thinking Matter,” Journal of the History of Philosophy 25, no. 3 (1987), 433–9.
    • Early article on Astell’s critique of Locke’s claim that God could conceivably add the power of thinking to matter.
  • Taylor, E. Derek, “Mary Astell’s Ironic Assault on John Locke’s Theory of Matter,” Journal of the History of Ideas 62, no. 3 (2001), 505–22.
    • Examines Astell’s critique of Locke with reference to Astell’s own views about the mind-body relationship.
  • Taylor, E. Derek, “Mary Astell’s Work Towards a New Edition of a Serious Proposal to the Ladies, Part II,” Studies in Bibliography 57 (2005–6), 197–232.
    • Provides evidence that Astell may have had plans for a new edition of her second Proposal (1697).

 

Author Information

Jacqueline Broad
Email: jacqueline.broad@monash.edu
Monash University
Australia

The Port Royal Logic

Logic or the Art of Thinking, commonly known as The Port Royal Logic, was written by Antoine Arnauld and Pierre Nicole and first published in 1662. Although it was a textbook containing much worked-over material, the Logic was extremely influential, certainly the most important textbook in logic for the next two hundred years. Part of its influence was due to its accessibility: it was short for a logical treatise and the first logic textbook in a vernacular language. It was quickly translated, had numerous editions, and was popular throughout Europe and the U.S. well into the 19th century. Its technical logic, however, is unoriginal. From a modern perspective the Logic’s interest is twofold: it harmonizes Cartesian dualism with standard doctrines of late medieval logic, and for the first time it gives intentional content a central role in semantics. The two are related. Because dualism was inconsistent with the standard medieval theory of reference, it was necessary to forge a new foundation. To do so, the Logic’s authors relocated objective being, an early version of intentional content, to the center of logical theory. This article focuses on the Logic’s innovations in semantics, especially the role of intentional content, and on the place of its innovations in the history of logic, both where they came from and how they evolved.

As a result of its commitment to dualism, the Logic faced four tasks: (1) to explain anew how terms in mental language “signify” things in the world, (2) to reformulate truth-conditions in a way compatible with its new definition of signification, (3) to preserve the standard proof theory of late medieval logic, and (4) to explain how logical demonstration contributes to scientific knowledge in the context of Cartesian rationalism. These tasks correspond to the four “parts” of the Logic. Parts I to III correspond to the standard books of earlier logical treatises, which follow a division loosely modeled on Aristotle’s Organon: the logic of terms, the logic of propositions, and the logic of arguments. Part IV concerns method, a topic of special interest in the 17th century.

Table of Contents

  1. The Logic of Terms
    1. Summary
    2. Ontology
    3. Dualism, Ideas
    4. Mental Language
    5. Intentional Content
    6. Kinds of Ideas, Occasionalism
    7. Determinative and Explicative Restriction
    8. Indeterminate Restriction
    9. False Ideas
    10. Abstraction
    11. The Categories and Predicables
    12. Nominalism-Realism
    13. Comprehension as a Generalization of Essence
    14. Species and Difference
    15. Signification and Extension
    16. The Structure of Ideas
  2. The Logic of Propositions
    1. Summary
    2. Modality
    3. Distributive and Confused Supposition
    4. Truth-Conditions for Categorical Propositions
    5. The Correspondence Theory of Truth
  3. The Logic of Arguments
    1. Summary
    2. The Syllogistic
    3. Validity
  4. Method
    1. Summary
    2. Necessary and Contingent Truth
    3. Certainty, Clear and Distinct Ideas
    4. Demonstration
    5. Sensation and Knowledge of Contingent Truth
    6. Method: Analysis and Synthesis
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. The Logic of Terms

a. Summary

(The original edition of the Port Royal Logic is hereafter referred to as Logic; the edition cited is Arnauld 2003, hereafter KM, in which the Logic is vol. 5, 99–413. English translation: Arnauld 1996, hereafter B.) In Part I, the authors lay out the fundamental assumptions and concepts of their semantic theory. These include a substance-mode ontology with its dualistic division into matter and spirit (Logic, Part I, Chapter 2, hereafter I:2); a theory of mental language (I:1, 4); ideas and their causes, including abstraction and restriction (I:1, 5, 8); the traditional ten categories (I:3) and five predicables including genera and species (I:7); false ideas and error (I:9–11); and essential definitions (I:12–15). Most importantly they explain their theory of reference (I:6). The key concepts possess a definitional order. First, every term possesses by nature an intentional content. This content determines what the term signifies in the world. What it signifies in turn determines its inferior ideas. Inferior ideas then combine to form the term’s extension. Extension will then be the key concept in the definition of truth in Part II. At multiple points in both the introductory Discours and Part I the authors point out the intellectual and moral dangers lurking in equivocation and false ideas.

b. Ontology

In the introduction the authors decline to engage in the realism/nominalism debate, on whether, as they put it, universals exist a parte rei, because they judge the issue uninteresting and useless (Discours I, KM V 112–113, B 11–12). In Part I, nevertheless, they assume a basic substance-mode ontology that is roughly Aristotelian. They divide being into two kinds: substances, which can be conceived as existing independently, and modes (attributes, qualities), which can only be conceived of as existing instantiated in substances (I:2).

c. Dualism, Ideas

To this Aristotelian foundation the Logic adds Cartesian dualism. Substances and their modes divide into two kinds: spiritual and material. The essential property of material substances is extension and that of souls is thought. In the Logic the modes attributed to material substances are those described in Cartesian physics; for example, relative size, position, motion, and shape. Modes attributed to the soul include sensory qualities, ideas, and mental operations. These operations include the three traditionally listed in medieval logic: conception (concevoir), judgment (affirmation and denial, juger), and reason (logical deduction, raisonner), and a fourth, the methodological organization of knowledge (ordonner), which was considered important in 17th-century logic (I:Introduction, KM V 125, B 23). These four operations correspond to the four parts of the Logic. Although the authors sometimes use idea loosely to refer to any spiritual mode, in more precise contexts an idea is a mental mode that functions as a term in mental language, or what medieval logicians and Descartes call a concept.

d. Mental Language

The Logic discusses grammar piecemeal (I:1–6 and II). It does not provide an exhaustive breakdown of spoken language into basic parts of speech, nor does it attempt to formulate precise grammar rules for complex expressions like those of a modern generative grammar. As in medieval logic, the spoken language in which logic is conducted (and which the Logic discusses) turns out to be a rather stylized fragment of natural language (I:1, 4). Chomsky surprised the linguistic community in the 1960s by pointing out in Cartesian Linguistics that the Logic posits a mental language parallel to speech, and by suggesting that its distinction between the two anticipates his own between surface and deep structure (Chomsky 1966, 31 and following). It is more accurate to say that medieval logicians had been working out the theory of mental language for centuries, in which spoken words and phrases were conventional signs for a language of thought that was prior to speech and had its own grammar and semantics. The basic linguistic operations are conceptualization, judgment, and reasoning.

Conceptualization is the act of instantiating in the soul an idea that serves as a basic term in mental grammar. These ideas have semantics. An idea by its nature has an intentional content that the soul is aware of more or less clearly during the act of conceptualization. This intentional content determines what the idea signifies in the world. What it signifies in turn determines other ideas that are “inferior” to it. The set of its inferiors constitutes its “extension” in the special sense peculiar to the Logic. An idea that fails to signify anything real (that fails of reference) is called a false idea.

Judgment is the act in which the soul affirms or denies propositions, which are grammatical complexes in which ideas occur as terms. Reasoning is the act where the soul draws a conclusion from other propositions as premises. Part II explains the truth-conditions of propositions, and Part III explains which reasoning patterns are valid.

Substantives and adjectives are the two basic kinds of referring terms in mental propositions. The Logic has no single technical term for reference. Sometimes it is called expression, sometimes representation, but most frequently it is called signification, which was the standard term in earlier logic. Fundamental to the Logic’s semantics is the thesis that signification is explained by intentional content (I:2, 5–6).

e. Intentional Content

As mentioned above, one of the challenges faced by the Logic was how to reconstruct the medieval theory of reference. In earlier Aristotelian accounts, reference is explained by the transmission of a property from the world to the soul. By sensation and abstraction, the view held, an external property was causally transmitted via the sense organs to the brain and from there to the intellect. Once in the intellect it serves as a concept or term in mental language. This term was then said to signify those objects outside the mind that instantiate the transmitted property. Dualism, however, makes this mechanism impossible. If dualism is true, no property can be instantiated in both matter and the soul.

To explain reference, the Logic appeals to intentional content. Intentional content was far from a new idea. Versions had been used throughout the Middle Ages to explain various semantic phenomena (Pasnau 1997). Peter Aureol holds that what we see when we have an illusion, like the apparent movement of the trees from a passing boat, is not something that really exists outside the mind but rather a third entity that only exists “in the eye objectively” and “intentionally.” At some points in his career, Ockham calls “what we understand” when we grasp an abstract noun a “fictum” having esse objectivum and esse cognitum (William of Ockham 1978, §10). Scotus calls something’s nature an “intelligible being” distinct from the thing itself. By the 16th century, it was common for logicians to distinguish between the “formal” and “objective” being of a concept (Cronin 1966). A concept has formal being inasmuch as it is a mode of the soul and as such is part of its “form.” It exhibits objective being because it carries with it the understanding of an object: it “throws” the object “against” the mind. Suárez, for example, holds that an essential definition is true timelessly, even prior to creation, because it signifies objective being. Toletus explains “beings of reason” like a chimera, and non-referring terms like antichrist, which do not refer to existing things, as signifying objective being. By Descartes’ time, the distinction was commonplace in the logic books studied in schools and universities, including the schools attended by Descartes, Arnauld, and Nicole. It is prominent, for example, in treatises by Toletus, Raconis, Fonseca, and Eustache de Saint-Paul. (Toletus S.J. 1596, 3, 30; Raconis 1651, De principiis entis, a. 3, §1a, 827; Fonseca S.J. 1599, q. ii, §1; Eustachio-De-S.-Paulo 1648 Metaphysica, De natura entis, de conceptu formali et objectivo, 1; see also Cronin 1966.)
Descartes appeals to the objective being of the idea of God in his famous argument for God’s existence in Meditation III (§§ 21–22). The Logic prefers to speak about an idea’s content, but Arnauld uses the medieval terminology of objective reality or objective being in On True and False Ideas. (See Arnauld 1813, vol. I, hereafter VFI, Ch. 5, 6; KM I, 202, 205; English translation Arnauld 1990 [1683], hereafter G, 69, 71–127.) In the Logic objective being is used to explain not only signification, but also extension, abstraction, restriction, privative negation, essential definition, ambiguity, equivocation, clear and distinct ideas, and perception.

The explanatory role of intentional content (I:6–7) begins with substantives. Grammatically, substantives are ideas that serve as the subjects or predicates of categorical propositions. Semantically, a substantive is distinguished by its intentional content, which in the case of a substantive is called its comprehension. Comprehension is explained by appeal to substance-mode ontology. A substantive’s comprehension is a series of modes. In modern terms it may be thought of as a set of modes. These form the idea’s content and provide its identity criteria. Two substantives are identical if and only if they have the same comprehension. Signification is then defined in terms of comprehension. A substantive signifies all and only those entities that satisfy all the modes in its comprehension. The theory is not unlike Frege’s view that sense determines reference; indeed, it is a remote ancestor of it. A substantive that signifies many individuals is a common or abstract noun. One that signifies a single individual is a proper noun. Normally, a substantive signifies substances, but it can also signify modes, like whiteness. If a substantive signifies another idea, which is a mode of the soul, it is a term of second intention.
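The relation between comprehension and signification can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: modes are represented as plain strings, a comprehension as a set of modes, and the "world" as a hypothetical table pairing each entity with the modes true of it. None of the entity names or mode labels come from the Logic.

```python
# Toy model only. The entities and mode labels are hypothetical illustrations,
# not examples taken from the Logic.

# Each entity in the toy world is paired with the modes that are true of it.
WORLD = {
    "Bucephalus": frozenset({"animate", "four-legged", "equine"}),
    "Incitatus": frozenset({"animate", "four-legged", "equine"}),
    "Fido": frozenset({"animate", "four-legged", "canine"}),
    "Etna": frozenset({"mountainous"}),
}


def signification(comprehension, world=WORLD):
    """Entities that satisfy every mode in the comprehension."""
    return {name for name, modes in world.items() if comprehension <= modes}


def same_substantive(comp_a, comp_b):
    """Identity criterion: same comprehension, same substantive."""
    return comp_a == comp_b


ANIMAL = frozenset({"animate"})
HORSE = frozenset({"animate", "four-legged", "equine"})

print(sorted(signification(ANIMAL)))  # the three animate entities
print(sorted(signification(HORSE)))   # only the two horses
```

In this sketch the content of the idea fixes what it is true of, which is the direction of explanation the paragraph above describes: comprehension first, signification second.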

Adjectives too have intentional content, but the terminology is different. Grammatically, adjectives serve as the predicates of categorical propositions or as modifiers of substantives in longer noun phrases. Semantically, an adjective has as its intentional content a mode or sometimes multiple modes. In the case of an adjective these are called its secondary signification. This content determines the objects the adjective is true of or “signifies in the primary sense”: an adjective signifies primarily all and only the entities that instantiate all the modes in its secondary signification. Again, intentional content provides identity conditions: two adjectives are identical if and only if they have the same secondary signification. Following medieval usage, an adjective is called a connotative term (I:8; KM V 152; B 46). It directly signifies a mode and indirectly connotes the individuals in which it inheres. Substantives differ semantically from adjectives in that a substantive’s primary function is to signify an entity in abstraction from its modes. An adjective, however, draws attention to entities by first drawing attention to the mode or modes in its secondary signification. (The primary and secondary terminology derives from Aristotelian metaphysics, in which substances are ontologically prior to modes because a mode must exist in a substance.) Because a substantive signifies objects directly but an adjective signifies objects indirectly by first signifying a mode, a substantive is called absolute and an adjective relational.

It is clear that the Logic’s authors regarded intentional mode-sets as part of its explanation of conceptualization or of “what it is to understand an idea.” Details are fleshed out in Part IV in the discussion of clear and distinct ideas, and sensation. Like some nominalists who believed in objective being, the Logic’s authors make the point that objective being is not some kind of representative entity between, or in addition to, the soul and the external world. It is a fact of psychology, they hold, that when a perception is experienced during sensation or when an idea is clearly conceived in thought, the soul is aware of the modes that make up its content. No mode of matter experienced in the content of a perception or idea, however, can be true of the soul itself. They are true rather of the material substance outside the soul that is the object of sensation or that the idea signifies.

f. Kinds of Ideas, Occasionalism

Like Descartes (Meditations III.7), the authors hold that there are three kinds of ideas that differ by how they are caused. They are adventitious, innate, and factitious.

Adventitious ideas are those caused by God on the occasion of a bodily sensation. Sensation is more fully explained in Part IV. Because material modes cannot be instantiated in the soul, the Logic is forced to reject the usual Aristotelian account of sensation and concept formation. The material transfer of modes in sensation only goes as far as the brain. The properties of a material substance travel from the object being sensed to the perceiver’s sense organs, and from there to the brain, but they stop there. Material modes cannot then be transferred “intentionally” to the soul itself to become consciously perceived. The Logic’s alternative explanation is a form of occasionalism. (On occasionalism in the Logic see I:1, KM V 132–33, B 29–30; I:9, KM V 157–78, B 49–50; I:12, KM V 168–170, B 58–60; VFI 6, KM I 204, G 71–71; VFI 27, KM I 349–50, G 208. For broader accounts in Cartesianism generally, see Nadler 2011, Nadler 1989, and Garber 1993.)

On the occasion of bodily sensation in which a material object transfers its modes to the perceiver’s brain in the form of physical motion, God simultaneously causes to be instantiated in the soul a mental mode. This mode is adventitious and is called a perception in a narrow sense. A perception, moreover, has an intentional content of which the soul is aware with varying degrees of vividness, clarity, and distinctness. Some of these modes, like motion, relative position, and shape, are material and are true of the object outside the mind causing the sensation. Other modes in the perception’s content are sensory. They are true of the soul itself, like colors, tastes, smells, textures, sounds, and feelings of pleasure and pain.

Innate ideas are ideas directly instantiated in the soul by God apart from sensation. They include the idea of infinity and of God himself.

Factitious ideas are caused by the soul itself through one of two mental operations: restriction or abstraction. Both operations were standard topics in earlier logic. The Logic’s account is novel in that it explains their mechanisms in terms of intentional content.

g. Determinative and Explicative Restriction

Grammatically, restriction is a mental operation by which the soul forms a longer substantive phrase by modifying a substantive with an adjective or relative clause. Semantically, a relative clause functions like an adjective: it has a primary and secondary signification. In restriction a new idea is formed. Its comprehension is the union of the comprehensions of the two contributing ideas. Since the new comprehension contains more modes than either the substantive or its modifier, it will frequently be true of fewer things and will then be less general. If the restricted phrase signifies fewer individuals than the substantive alone, it is said to be determinative. On the other hand, if the restriction does not signify fewer things but simply adds extraneous information, it is called explicative.

Because an explicative restriction does not reduce the significance range of the modified substantive, the proposition expressed is equivalent to a conjunction of propositions. In one of these, the extraneous modifier is deleted, and in the other, the modifier is predicated of the original substantive. For example, in the Pope, who is the Vicar of Christ, resides in Rome, the relative clause who is the Vicar of Christ does not further restrict the significance range of the subject the Pope. The proposition is therefore equivalent to a conjunction of two propositions: The Pope resides in Rome and the Pope is the Vicar of Christ (I:8, KM V 151–52, B 44–45). This distinction between determinative and explicative relative clauses had been made frequently in earlier logic. (See, for example, Buridan 2001, 286; Parsons 2014, 5.6.) The Logic adds its explanation in terms of content. The distinction is also made in modern grammar using the terminology restrictive and non-restrictive relative clauses.
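The difference between the two kinds of restriction can be sketched in Python. This is a toy model under stated assumptions: the restricted phrase's comprehension is read as containing the modes of both contributing ideas (so that it has more modes than either), and the "world" is a hypothetical table of entities and their modes. The function names `restrict` and `kind_of_restriction` are illustrative inventions, not terminology from the Logic.

```python
# Toy model only: entity names and mode labels are hypothetical.
WORLD = {
    "the Pope": frozenset({"pope", "vicar of Christ"}),
    "Socrates": frozenset({"human", "wise"}),
    "Callias": frozenset({"human"}),
}


def signification(comprehension, world=WORLD):
    """Entities that satisfy every mode in the comprehension."""
    return {name for name, modes in world.items() if comprehension <= modes}


def restrict(substantive, modifier):
    """Comprehension of the restricted phrase: the modes of both ideas together."""
    return substantive | modifier


def kind_of_restriction(substantive, modifier, world=WORLD):
    """Determinative if the phrase signifies fewer things than the substantive alone."""
    before = signification(substantive, world)
    after = signification(restrict(substantive, modifier), world)
    return "determinative" if after < before else "explicative"


print(kind_of_restriction(frozenset({"human"}), frozenset({"wise"})))
print(kind_of_restriction(frozenset({"pope"}), frozenset({"vicar of Christ"})))
```

In this toy world "wise human" picks out fewer individuals than "human", so the first restriction comes out determinative, while "who is the Vicar of Christ" leaves the signification of "the Pope" unchanged and so comes out explicative.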

h. Indeterminate Restriction

The Logic also recognizes what some commentators treat as a special type of restriction, called indeterminate restriction (I:6, KM V 145, B 40; I:7, KM V 147–48, B 41–42; I:7, KM V 150, B 44; II:3, KM V 199, B 83; see Pariente 1985, 247–238, and Auroux 1993, 74). This is not really a second type of restriction but rather a way of referring to restriction in the metalanguage using an existential quantifier. Indeterminate restriction is important in Part II where it is used to state the truth-conditions of particular affirmative propositions. As explained there, a particular affirmative some S is P is true if there is some third term, call it Q, by which both terms S and P are restricted with the result that the restricted terms have the same extension. As in a similar analysis of Aristotle’s called ecthesis (see, for example, Prior Analytics 28a23–26, 30a9–14), the common subset shared by S and P is “exhibited” by the two restricted terms. In the precise statement of the truth-conditions, restriction occurs in its univocal sense but in such a way that there is an existential quantification in the metalanguage over the restricting term: some S is P is true if and only if there is some idea Q so that restrictions of S by Q and P by Q have the same extension.
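The existential quantification over the restricting idea Q can be made concrete with a short Python sketch. It rests on several simplifying assumptions that are not in the Logic itself: an idea's extension is modeled as the set of individuals it signifies (rather than as a set of inferior ideas), the candidate values of Q are all combinations of the modes found in a small hypothetical world, and the shared extension is required to be non-empty so that an unsatisfiable Q does not make every proposition trivially true.

```python
from itertools import combinations

# Toy model only: entity names and mode labels are hypothetical.
WORLD = {
    "Socrates": frozenset({"human", "wise"}),
    "Callias": frozenset({"human"}),
    "Bucephalus": frozenset({"horse"}),
}


def signification(comprehension, world=WORLD):
    """Entities that satisfy every mode in the comprehension."""
    return {name for name, modes in world.items() if comprehension <= modes}


def restrict(idea, q):
    """Restriction modeled as pooling the modes of the idea and of Q."""
    return idea | q


def candidate_ideas(world):
    """Assumption: Q ranges over every combination of modes found in the world."""
    modes = sorted(set().union(*world.values()))
    for size in range(len(modes) + 1):
        for combo in combinations(modes, size):
            yield frozenset(combo)


def some_s_is_p(s, p, world=WORLD):
    """True if some Q restricts S and P to the same non-empty extension."""
    for q in candidate_ideas(world):
        ext_s = signification(restrict(s, q), world)
        ext_p = signification(restrict(p, q), world)
        if ext_s and ext_s == ext_p:  # non-emptiness is an added assumption
            return True
    return False


print(some_s_is_p(frozenset({"human"}), frozenset({"wise"})))
print(some_s_is_p(frozenset({"human"}), frozenset({"horse"})))
```

For "some human is wise" the search succeeds with Q taken as the idea wise, since both restricted terms then pick out Socrates alone; for "some human is a horse" no choice of Q yields a shared non-empty extension.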

i. False Ideas

If the modes in an idea’s comprehension are not jointly true of any actual object, then the idea is said to be false.

If the objects represented by these ideas, whether they be substances or modes, are represented to us as they are in fact, one calls them true [véritables]. If they are not such, they can only be false [elles sont fausses en la maniere qu’elles les peuvent être], and this is what one calls in the School beings of reason, which usually consist of the combination that the soul makes from two ideas real in themselves, but which are not joined in truth to form a single idea. An example is the one that can be formed from a mountain of gold. It is a being of reason, because it is composed of two ideas, of mountain and of gold, which it represents as one even though they would not really be so. (I:2, KM V 136, B 32, author’s translation. See also Discours I, KM V 110, B 9–10; I:9, KM V 157–78, B 49–50; I:11, KM V, 168–170, B 58–60.)

Many of the examples of false ideas given in the Logic are not just false but impossible, either because their contents contain contrary modes or because the laws of nature prevent their joint satisfaction. In earlier logic it was common to call such a non-existing thing a being of reason. It was often said to have objective being and to have some status in reality distinct from the soul and real beings (esse reale, in re). (See, for example, William of Ockham 1978 §10, and Suárez 1995.) A standard example was an impossible being like a chimera, goat stag, or golden mountain, as well as a planned but incomplete possible being like a castle, house, or city. The authors of the Logic, however, reject the view that a being of reason possesses a reality independent of the soul, and regard objective being rather as a property of ideas. An idea has objective being to the extent that the soul is aware of the modes in the idea’s intentional content when the idea is instantiated in the soul. As egregious examples of false ideas the Logic cites those with comprehensions that combine spiritual and material modes. Examples include a red, blue and orange rainbow (of water drops); pain caused by fire; heaviness caused by gravitational attraction; happiness as caused by material wealth; courage as feats of valor; lack of physical pleasure as evil; and spatial solitude as misery. Some ideas, however, are only contingently false. The Logic remarks, for example, that Alexander, the son of Philip would be a false idea if Alexander had not been Philip’s son. The idea the bent stick in the water would be false if the stick were straight, but true if not. Peter, the denier of Christ happens to be a true idea, but since Peter was free, it might well have been a false idea.
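The notion of a false idea, and of an idea that is only contingently false, can be sketched in Python. The sketch is a toy model: both "worlds" below are hypothetical tables of entities and modes, with the second standing in for the counterfactual case in which Alexander is not Philip's son. It captures only failure of joint satisfaction in a given world; it does not model impossibility arising from contrary modes or from the laws of nature.

```python
# Toy model only: the mode labels and both worlds are hypothetical.
ACTUAL = {
    "Alexander": frozenset({"Alexander", "son of Philip"}),
    "Etna": frozenset({"mountain"}),
    "a ring": frozenset({"gold"}),
}

# A counterfactual world in which Alexander is not Philip's son.
COUNTERFACTUAL = {**ACTUAL, "Alexander": frozenset({"Alexander"})}


def is_false_idea(comprehension, world):
    """An idea is false in a world if no entity there satisfies all its modes."""
    return not any(comprehension <= modes for modes in world.values())


GOLDEN_MOUNTAIN = frozenset({"mountain", "gold"})
SON_OF_PHILIP = frozenset({"Alexander", "son of Philip"})

print(is_false_idea(GOLDEN_MOUNTAIN, ACTUAL))
print(is_false_idea(SON_OF_PHILIP, ACTUAL))
print(is_false_idea(SON_OF_PHILIP, COUNTERFACTUAL))
```

The mountain of gold is false because nothing in the toy world is both a mountain and gold, even though each contributing idea is satisfied separately. Alexander, the son of Philip is true in the first world and false in the second, which is the sense in which its truth is contingent.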

Following Descartes, the Logic places false ideas at the center of its explanation of error, especially the errors characteristic of Aristotelian psychology and various moral failings. Aristotelian accounts of perception err because a mode true of matter cannot travel via sensation and abstraction to become instantiated in the soul. Rather, the material world, which consists of Cartesian extension modified by geometric and mechanical modes, is entirely separate from the soul, which is modified by modes of sensations, feelings, and morals. Ideas that combine the two are false. In addition, many moral failings are grounded in false ideas. When young, we mistakenly believe that moral qualities, which are true of the soul, are caused by material circumstances. We err when we combine them into a single idea, for example, when we combine virtue and worldly wealth.

False ideas are important to logic because they have implications for the theory of truth. Semantically, false ideas are nonreferring terms: they fail of existential import. What are the truth-conditions of an affirmative categorical proposition with a false idea as subject term? Medieval logicians had divided on whether this failure makes the proposition false. The quotation above, and others in the Logic, strongly suggest that in Arnauld and Nicole’s view an affirmation with a false idea as subject is false. The issue recurs in Part IV’s account of necessary and contingent truth. (See Martin 2012.)

j. Abstraction

 The second way in which the soul causes new ideas is by abstraction. Various accounts of abstraction had been part of logic since Aristotle, but the Logic’s version had to be made consistent with dualism. (For a standard medieval account see Aquinas, Summa Theologica I.I, Q. 85.) To do so, the authors explain its mechanism as a manipulation of intentional content. (I:5, KM V 142–43, B 37–38; I:11, KM V 168–170, B 58–59; VFI 6, KM I 207–210, G 74–76; VFI 11, KM I 234–235, G 98–100.)

Abstraction is either from a sensation or a prior idea. On the occasion of sensation, God causes the soul to experience a vivid awareness of a modal content. Some of these modes, like extension, relative position, and motion, are true of the material object that is the external correlate of the experience and that has caused the corresponding movements in the perceiver’s brain. Other of these modes, like colors, tastes, weight, sounds, and associated feelings, are true of the soul. When attending to this broad content, the perceiver may form an idea from this content. The soul does so by selecting as the idea’s comprehension a subset of modes evident in the perceptual experience.

The second kind of abstraction is from a prior idea. While attending to the comprehension of a prior idea, the perceiver may form a new idea with a new comprehension by selecting a subset of modes from the prior idea’s comprehension. Because the new content contains fewer modes, it is generally true of more things and is therefore more general.
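The mechanism just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that a comprehension can be modelled as a set of mode names and that an individual falls under an idea when every mode in the comprehension is true of it. The function names and the sample figures are hypothetical and are not drawn from the Logic.

```python
# Hypothetical model: a comprehension is a frozenset of mode names.

def abstract(comprehension, keep):
    """Form a new idea by selecting a subset of the modes of a prior idea."""
    keep = frozenset(keep)
    if not keep <= comprehension:
        raise ValueError("abstraction can only select modes already present")
    return keep

def satisfies(individual_modes, comprehension):
    """An individual falls under an idea if it has every mode in the comprehension."""
    return comprehension <= individual_modes

# Illustrative individuals, each listed with the modes true of it.
individuals = {
    "equilateral_1": frozenset({"figure", "three-sided", "equal-sided"}),
    "scalene_1": frozenset({"figure", "three-sided"}),
    "square_1": frozenset({"figure", "four-sided", "equal-sided"}),
}

def falling_under(comprehension):
    """Names of the individuals that satisfy every mode in the comprehension."""
    return {name for name, modes in individuals.items()
            if satisfies(modes, comprehension)}

equilateral_triangle = frozenset({"figure", "three-sided", "equal-sided"})
triangle = abstract(equilateral_triangle, {"figure", "three-sided"})
figure = abstract(triangle, {"figure"})
```

Each step of abstraction drops modes, so the resulting idea is true of at least as many of the sample individuals as the idea it was abstracted from.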

The Logic describes the accumulation of abstract ideas as progressive, starting with abstractions from sensory experience and proceeding to increasingly more general ideas. In earlier logic, abstraction was usually described as progressing in the reverse order (for example, in Aquinas above), starting with abstraction to the most general ideas and then progressing by steps of restriction to more concrete ideas.

k. The Categories and Predicables

Following the logical tradition, the Logic endorses Aristotle’s categories (I:3) and Porphyry’s predicables (I:7). Its list of the ten categories is standard: substance, quantity, quality, relation, activity, passivity, place, time, position, and state. The book later scarcely mentions the distinctions among the various nonsubstance categories. Like some late medieval logicians, the authors evidently regarded distinctions among mode types as unimportant. In particular, the Logic rarely speaks of relations as such, although it recognizes that some modes are internal to a substance and others external. In the medieval fashion, external modes are another name for relations, and are so-called because they hold of a substance only by reference to another substance.

The authors also endorse the five predicables, a standard topic in logic since Porphyry: genus, species, difference, property, and accident. These classify mental predicates according to their degree of necessity. These distinctions, unlike those among types of modes, are important in the Logic.

Genera and species are common nouns in mental language (I:4–6). As such they have comprehensions and signify individuals. In the normal case they signify substances, but as terms of second intention, they may also signify modes. Differences, properties, and accidents too are terms in mental language, but they are adjectives. As such they signify secondarily a mode or modes, and signify primarily the individuals that satisfy these modes. Thus, viewed ontologically, genera and species differ from difference, property, and accident as substances differ from modes. Corresponding to genera and species are the individuals they signify, which in general are substances. Corresponding to differences, properties and accidents are the modes they signify directly and the objects in which the modes inhere indirectly.

The Logic uses the predicables to articulate its account of essential definition. The details follow earlier standard accounts. Every species has an essential or real definition, as distinct from a nominal definition (I:12–14). A nominal definition lays down a convention in which a spoken sound is paired with an idea. (This relation is also called signification, but in this sense it is distinct from signification in the sense of reference, a natural relation between an idea and what it stands for outside the mind. The dual senses were common in medieval logic.) A real or essential definition is a universal affirmative proposition in which a species is the subject term and its genus restricted by a distinguishing adjective is the predicate. The adjective is called the species’ difference (differentia). An essential definition is necessarily true and describes the species’ nature. Part IV assigns a major role in scientific knowledge to essential definitions.

It follows from the account of essential definition that species fall into a structure. Every genus except the highest is itself a species and has its own essential definition. The highest genus, which has no definition, is referred to as being or substance. A species that is not a genus is an infima species. At several places, the Logic mentions the traditional doctrine of differentiation by privation. This is the case in which a genus divides into two species which are such that the difference of the second is the privative negation of the difference of the first. (I:7, KM 148, B 42; II:15, KM 242, B 124.) The species animal, for example, is said to divide into the species human with the difference rational and the species brute with the privative difference irrational. Although the authors do not draw attention to the fact, the account entails genera and species’ conforming to a finite finitely branching tree-structure, which is traditionally called the tree of Porphyry. (See Structure of Ideas below.)

A property (proprium) is an adjective that is not the difference of any species but that nevertheless signifies secondarily a mode necessarily true of a species. As an example, the Logic cites a mode that is necessarily true of a circle but not part of its definition: all lines from the center to the circumference are equal. An accident is such that any species “can be conceived without it.” More precisely, an accident is an adjective that is not a difference or a property. An accident connotes a mode that is true but not necessarily true of what it signifies. It is either not true of every member of a species, or not always true of them.

l. Nominalism-Realism

Like some medieval logicians, the Logic explains genera and species nominalistically. Ontologically genera and species are not classified as some special kind of entity in addition to matter and its modes, or souls and their modes, as some realists had suggested. They are simply ideas, which are modes of the soul. On the other hand, the Logic treats differences, properties, and accidents realistically. Strictly speaking, these too are ideas in mental language (I:7). But they are also adjectives, and as such they signify modes secondarily. Frequently the Logic muddles use-mention and uses difference, property, and accident to refer to these modes themselves. For example, the difference of the species human sometimes means the adjective rational and sometimes the mode rationality. The context makes clear which is intended. The Logic’s overall metatheory is basically realistic because it assumes a fundamental substance-mode ontology in which modes have real existence. Thus, although the Logic’s ontology is basically realistic, which is evident in what it has to say about difference, property, and accident, it adopts the nominalistic move of some earlier logicians of avoiding positing special entities for genera and species by identifying them with modes of the soul.

m. Comprehension as a Generalization of Essence

In the Logic’s technical vocabulary, the collected differences of a species’ higher genera constitute the species’ comprehension, which is its intentional content. Indeed, it is not an exaggeration to say that the Logic’s entire theory of reference via intentional content—comprehension in the case of substantives and secondary signification in the case of adjectives—is a generalization of the Aristotelian theory of essential definition. Available in that theory was the generalization that a species comprises those individuals that satisfy its difference and those of its higher genera. In the Logic, it is these modes that determine what a species term signifies. In other words, a species signifies those individuals that satisfy the modes in its intentional content. It is a small step to attributing a content to all terms and explaining their signification similarly.
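The way a comprehension accumulates along the line of genera can be sketched as follows. The fragment of the tree of Porphyry used here, its labels, and the function name are illustrative assumptions; the Logic gives no such table.

```python
# Hypothetical fragment of a tree of Porphyry: species -> (genus, difference).
TREE = {
    "body": ("substance", "corporeal"),
    "living being": ("body", "animate"),
    "animal": ("living being", "sensitive"),
    "human": ("animal", "rational"),
    "brute": ("animal", "irrational"),
}

def comprehension(species):
    """Collect the difference of a species and those of all its higher genera."""
    modes = set()
    while species in TREE:          # the highest genus has no entry and no definition
        genus, difference = TREE[species]
        modes.add(difference)
        species = genus
    return frozenset(modes)
```

On this toy tree the comprehension of animal is a proper subset of the comprehension of human, and the highest genus has an empty comprehension because it has no defining difference.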

n. Species and Difference

The Logic holds an odd view about species. A species and its difference, it holds, are semantically equivalent. They both signify the same individuals and therefore have the same extension (I:7, KM V 147–48, B 42). The authors are here following Aristotle, who maintained that each species has a unique difference. No two species, in other words, have the same difference. (See, for example, Parts of Animals 3, 642b20–643a20.) Moreover, if a difference, which is an adjective, is read as a substantive, the species and the difference signify the same individuals. It is perfectly normal in Latin to read an adjective as a noun when it is not used to modify another noun. The Logic’s example is the word album (white) which may be understood as either an adjective or a noun. Likewise, rationalis is an adjective in animal rationalis, but a noun signifying the same individuals as homo when it occurs alone as a noun. Thus, the signification of the noun is the primary signification of the adjective. In the Logic’s terminology, the secondary signification of the difference when construed as an adjective is identical to its comprehension when construed as a noun. It is for this reason that the Logic at times says that a species and its difference are the same thing.

o. Signification and Extension

Extension is probably the most interesting concept in the Logic’s semantics. Truth is defined in terms of extension but extension is defined in terms of ideas. The result has the appearance of a kind of idealism in which truth is defined independently of the external world. The appearance, however, is misleading. Although the authors are dualists and revolutionaries of a sort who want to define truth using only mental categories, they are also conservative in the sense that they want to maintain a correspondence theory of truth. To do so, they defined extension so that it tracks what happens in the world outside the mind. A universal affirmative, to be sure, is true if the extension of its subject term is contained in the extension of the predicate. Moreover, extension here consists of ideas. But subordination among these ideas corresponds, it turns out, to subordination among things in the world.

The story is somewhat indirect due to the authors’ loose mathematical style. They fail to give what we would regard today as a clear definition of extension. The best they have to say is that a term’s extension consists of its “inferior subjects.” (I:6) They make clear that by “subjects” here they mean ideas. From these two remarks it is possible to piece together a definition: the extension of an idea consists of all its inferior ideas. The problem, however, is that they do not define “inferior.” They do give an example. Various types of triangles, they say, are inferior to the genus triangle. This reading, moreover, conforms with prior usage in medieval logic. The suggestion is that the extension of A is the set of all ideas B such that all the modes in the comprehension of A are included in the comprehension of B. Several things follow: First, the extension of B would be included in that of A if and only if all the ideas defined in terms of B are a subset of all the ideas defined in terms of A. Second, all the ideas defined in terms of B are a subset of all the ideas defined in terms of A if and only if the comprehension of A is a subset of the comprehension of B. It follows that whether a species is inferior to a genus would be a function of the essential definitions. The definition of inferiority would entail that every S is P is true if and only if the comprehension of P is a subset of that of S. A true universal affirmative would then be a matter of conceptual inclusion and, as such, necessary. A plausible example would be every animal is a living being. It would be true because the species animal is included in the genus of living being or, equivalently, animal is defined in terms of living being. Being an essential definition it is also necessary.

This reading, however, is much too narrow to fit other views within the Logic’s more general metatheory. It excludes, for example, the possibility of contingent truth. In particular, it entails the wrong truth-conditions for propositions that affirm accidental predicates. Accidents, of course, are not species. They have no “inferiors” in the proposed sense. On the other hand, contingent propositions like Peter denied Christ and every student in the classroom is asleep can be true, yet the intentional content of the predicate is not contained in that of the subject. Similarly, as a contingent matter a universal negative like no doctor is a thief can be true and no doctor is a poet false, yet in that case both can be made false as a function of ideas. The set of ideas defined in terms of doctor and poet is non-empty if the idea doctor-poet is formed by restriction. Likewise, the set of ideas defined in terms of doctor and thief is non-empty if it contains doctor-thief. (See Auroux 1993, 135, and Martin 2017.) The reading also poses problems for the Logic’s doctrine of false idea. As explained above, the Logic attributes errors in philosophy and morality to believing propositions that have false ideas as subjects. These are ideas that fail to signify any existing thing, like pain caused by fire or virtuous rich man. Affirmative propositions with false ideas as subjects (that is, affirmatives with subject terms that have no existential import) are supposed to be false. On the other hand, the intentional content of mountain is contained in that of golden mountain, and anything defined in terms of golden mountain would be defined in terms of mountain. It would seem, then, that a trivial but empty proposition like every golden mountain is a mountain would be true despite having a false idea as subject. The issue is important in Part IV (see Martin 2012).
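The narrow reading and the golden mountain difficulty can be made concrete in a short sketch. It assumes that comprehensions are sets of modes and that, on the narrow reading, the inferiors of an idea are exactly the ideas whose comprehension includes its own. The sample ideas and function names are hypothetical.

```python
# Hypothetical comprehensions, modelled as sets of modes.
IDEAS = {
    "living being": frozenset({"animate"}),
    "animal": frozenset({"animate", "sensitive"}),
    "mountain": frozenset({"mountain"}),
    "golden mountain": frozenset({"mountain", "golden"}),
}

def narrow_ext(a):
    """Narrow reading: the inferiors of a are the ideas whose comprehension includes a's."""
    return {b for b, comp in IDEAS.items() if IDEAS[a] <= comp}

def narrow_every(s, p):
    """'Every s is p' on the narrow reading: Ext(s) is contained in Ext(p)."""
    return narrow_ext(s) <= narrow_ext(p)
```

In this sketch every animal is a living being comes out true by conceptual inclusion, as intended, but so does every golden mountain is a mountain, although no individual satisfies the subject idea.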

The broader context makes clear the correct definition of inferiority. The key is to define inferiority in terms of signification: idea A is inferior to B if and only if every individual that A signifies B signifies. Equivalently, A is inferior to B if and only if all the individuals that satisfy the modes in the intentional content of A also satisfy all the modes in the intentional content of B. The extension of idea A, or Ext(A), is defined as the set of ideas B that signify only individuals that A signifies. It follows that the ideas in a term’s extension, which is the set of its inferiors, signify what the term signifies but do so in finer detail. Let the significance range of idea A, or Sig(A), be the set of all individuals that A signifies. In short, Ext(A) is the set of ideas B such that Sig(B)⊆Sig(A).

The mappings Ext and Sig stand in one-to-one correspondence, and as a result the definition of extension ensures that a term’s extension provides an indirect way of referring to individuals “outside the mind.” Ext(A) determines Sig(A) because Sig(A) is the set of all individuals signified by some idea in Ext(A). Conversely, Sig(A) determines Ext(A) because Ext(A) is the set of all ideas inferior to A that signify only individuals in Sig(A). Moreover, their inclusion relations mirror one another: Sig(A)⊆Sig(B) if and only if Ext(A)⊆Ext(B). A correspondence theory of truth follows. As Part II explains, the truth-conditions of the universal affirmative are stated in terms of extensional inclusion: every S is P is true if and only if Ext(S)⊆Ext(P). But this holds exactly when Sig(S)⊆Sig(P).
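A small sketch may help to check that the two inclusion relations mirror one another. The world of three individuals, the four ideas, and the function names are illustrative assumptions. The sketch also leaves aside ideas that signify nothing, whose treatment is a separate issue.

```python
# Hypothetical world: each individual is listed with the modes true of it.
WORLD = {
    "Peter": frozenset({"human", "apostle", "denier of Christ"}),
    "John": frozenset({"human", "apostle"}),
    "Bucephalus": frozenset({"horse"}),
}

# Hypothetical ideas, each given by its intentional content.
IDEAS = {
    "human": frozenset({"human"}),
    "apostle": frozenset({"human", "apostle"}),
    "denier": frozenset({"human", "denier of Christ"}),
    "horse": frozenset({"horse"}),
}

def sig(a):
    """Sig(A): the individuals that satisfy every mode in A's content."""
    return frozenset(x for x, modes in WORLD.items() if IDEAS[a] <= modes)

def ext(a):
    """Ext(A): the ideas B such that Sig(B) is a subset of Sig(A)."""
    return frozenset(b for b in IDEAS if sig(b) <= sig(a))

def every(s, p):
    """'Every S is P' stated as inclusion of extensions."""
    return ext(s) <= ext(p)
```

Here every denier is an apostle comes out true as a contingent matter of fact about the sample world, even though the content of apostle is not contained in the content of denier.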

The reader should be warned that the definition of extension in the Logic is rather different from the usual one in modern logic. Modern usage, which follows Leibniz and Frege, identifies the extension of A with Sig(A). That is, in modern usage the extension of A is a set of individuals, not ideas. Although the Logic’s usage has fallen into desuetude, it has historical priority.

p. The Structure of Ideas

The ordering relations on ideas and extensions have suggested to some commentators that the Logic anticipates 19th century Boolean algebra. (See Dominicy 1984, Auroux 1982, and Auroux 1993.) The suggestion is intriguing but overblown. It is true that intentional content seems to be a set of modes and sets are ordered by the subset relation. This ordering, moreover, induces a containment relation on ideas: idea A is contained in idea B (briefly A≤B) if and only if the intentional content of A is a subset of the intentional content of B. In addition, every idea determines a significance range. The mapping from ideas to significance ranges is, moreover, many-one because distinct ideas may signify the same individuals. For example, a species-difference and its proprium would have the same significance range, as do the two terms Peter and the man who denied Christ three times. The mapping, moreover, is antitonic, that is, it reverses the ordering: if A≤B, then Sig(B)⊆Sig(A). As pointed out above, there is also a one-one order preserving mapping from significance ranges to extensions. It follows that there is a many-one antitonic mapping from ideas to extensions: if A≤B, then Ext(B)⊆Ext(A). Thus, as Leibniz later observed, the order of extensions reverses the order of ideas. These are all genuine algebraic properties in the modern sense, and they are in some sense implicit in the Logic. On the other hand, these properties were not remarked upon by the Logic’s authors themselves.
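The order reversal can be set out as a short chain of steps. The display below is a sketch that assumes containment of ideas is read as inclusion of comprehensions, so that the more general idea is contained in the more specific one. The symbol Comp for comprehension is introduced here only for illustration, and the align environment assumes the amsmath package.

```latex
\begin{align*}
A \le B &\iff \mathrm{Comp}(A) \subseteq \mathrm{Comp}(B)
  && \text{(containment of ideas, as assumed here)} \\
A \le B &\implies \mathrm{Sig}(B) \subseteq \mathrm{Sig}(A)
  && \text{(whatever satisfies more modes satisfies fewer)} \\
\mathrm{Sig}(B) \subseteq \mathrm{Sig}(A) &\iff \mathrm{Ext}(B) \subseteq \mathrm{Ext}(A)
  && \text{(order-preserving correspondence)} \\
\text{hence}\quad A \le B &\implies \mathrm{Ext}(B) \subseteq \mathrm{Ext}(A)
  && \text{(antitonic mapping from ideas to extensions)}
\end{align*}
```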

In their own pre-algebraic language about containment, signification, and extension, the authors do remark on order and correspondence. It is an exaggeration, however, to say they noticed duality or the properties of a Boolean algebra (see Martin 2016c). They do not comment on the fact that the order of extensions reverses that of ideas, a necessary condition for duality. They do not point out that ≤ and ⊆ are reflexive, transitive, or antisymmetric. Much less do they claim that abstraction and restriction satisfy the conditions for meet and join operations. Abstraction, for example, is treated as a one-place operation, and there is no suggestion that the set of ideas is “closed” under either abstraction or restriction. There is no textual evidence that they envisaged a maximal idea, which would have as its intentional content the set of all modes. It is also unclear whether being, the highest genus, should be regarded as a minimal idea. Is being in the comprehension of golden mountain or square circle? The authors avoid such issues. The few times they refer to a negation as an operation it is as the medieval notion of privative negation rather than as a complementation operation in the modern sense. (I:7, KM 148, B 42; II:15, KM 242, B 124. See Martin 2016b.) They do not even say explicitly that genera and species exhibit the structure of the tree of Porphyry. (See Auroux 1992, Auroux 1993.) All in all, the discussion of structure in the Logic is pre-algebraic, like discussions of structure found in medieval logic, of which the Logic is a continuation.

2. The Logic of Propositions

a. Summary

Part II discusses the properties of expressions in spoken language. Among other expressions it discusses nouns, pronouns, and verbs (II:1), the four categorical propositions (II:3), gappings (II:4-5), false ideas (II:7), exclusives such as only (II:10:1) and exceptives such as except (II:10:2), the alethic modalities (II:8), comparative adjectives (II:10:3), various compound sentences (II:9), and definitions (II:16). Expressions in spoken language represent propositions in mental language, essentially the categorical propositions of the syllogistic and their truth-functions. Part II concludes with the truth-conditions for categorical propositions (II:17–20) and conversion. Much of the material is unoriginal or of slight logical interest. Remarks here are limited to modality and the truth-conditions for categorical propositions.

b. Modality

The alethic modalities—possible, contingent, impossible, and necessary—are discussed briefly and characterized syntactically as verbal modifiers. There is no attempt to provide semantic analysis or truth-conditions. A point of interest is that in a series of mnemonic names setting out four squares of oppositions, one for each of the four modalities, they conflate contingent with possible. That is, they identify contingency with so-called “single-sided” possibility: it is contingent that P means it is possible that P. They are probably following the commentary tradition. At some points in the De Interpretatione Aristotle explains contingency in the single-sided sense, a conflation that had been regularly remarked upon by later commentators. The Logic’s authors may in fact be copying a virtually identical discussion of the mnemonic names from the logic of Eustache de Saint-Paul in which he makes the same conflation using the same names and squares. Fonseca in his logic of roughly the same period is more revealing. He reports Aristotle’s conflations of contingency with single-sided possibility and remarks that in 17th century logical discourse contingency had evolved to its double-sided sense. He nevertheless goes on in his text to provide a list that follows Aristotle and identifies the contingent with the possible. Regardless of the mnemonics at II:8, where the authors themselves actually use the word contingent or contingency in the Logic to state their own views, they use contingent in the double-sided sense following the general usage of the period. For example, in Part IV they describe knowledge of historical and human events, which is based on sensation, as “contingent” with the understanding that the events might have been otherwise.
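The two senses of contingency at issue can be stated compactly in modern modal notation. This notation is not used in the Logic, and the subscripted names are introduced here only for illustration. The diamond symbol assumes the amssymb package.

```latex
\begin{align*}
\text{single-sided:}\quad & \mathrm{Contingent}_1(P) \iff \Diamond P \\
\text{double-sided:}\quad & \mathrm{Contingent}_2(P) \iff \Diamond P \land \Diamond \neg P
\end{align*}
```

On the single-sided reading a necessary truth counts as contingent, since whatever is necessary is possible. On the double-sided reading it does not, since its negation is not possible.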

c. Distributive and Confused Supposition

Part II concludes with sections laying out “axioms” for the truth-conditions for the categorical propositions of the syllogistic. These sections are some of the most interesting parts of the book. The account is not really axiomatic in the modern sense: it is rather a series of informal definitions. From the “axioms” and the explanatory remarks that accompany them, however, it is possible to abstract clear truth-conditions in the modern sense. What is interesting from a modern perspective is that truth is defined as a function of the semantic interpretations of a proposition’s parts, much as in a modern recursive definition. The particular way they do so is also of historical interest because it draws on ideas from medieval supposition theory.

In the theory of supposition, medieval logicians had distinguished various ways in which categorical terms refer. Depending on the type of propositions in which it occurs and its position, a term signifies either all the individuals in its scope, in which case it was said to have distributive supposition, or just some individuals, in which case it was said to have confused supposition. These species of supposition were explained in terms of characteristic entailments that hold between the proposition itself and specific conjunctions and disjunctions of multiple identity statements. (See Parsons 2014, Chapter 7.) There are four cases, one for each of the four types of categorical proposition. It is assumed that there are proper names for each of the individuals that a term signifies:

  1. A universal affirmative is equivalent to a long conjunction of disjunctions. For each individual in the subject’s scope there is a conjunct, and this conjunct consists of a disjunction that affirms of that individual that it is identical to one or the other of the individuals in the predicate’s scope.
  2. A particular affirmative is equivalent to a long disjunction of disjunctions. For each individual in the subject’s scope there is a disjunct, and this disjunct consists of a disjunction that affirms of that individual that it is identical to one or the other of the individuals in the predicate’s scope.
  3. A universal negative is equivalent to a long conjunction of conjunctions. For each individual in the subject’s scope there is a conjunct, and this conjunct consists of a conjunction that denies of that individual that it is identical to each of the individuals in the predicate’s scope.
  4. A particular negative is equivalent to a long disjunction of conjunctions. For each individual in the subject’s scope there is a disjunct, and this disjunct consists of a conjunction that denies of that individual that it is identical to each of the individuals in the predicate’s scope.

These equivalents can be stated briefly in the notation of sentential logic. Let us assume that the constants s1,…,sn name all the individuals in the scope of the subject S, and that p1,…,pn name all the individuals in the scope of the predicate P. The entailments for the four propositional forms are then:
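A reconstruction of the four equivalents, written directly from the four cases listed above, is given below. The indices i and j and the labels A, I, E, and O are notational assumptions, and the align environment assumes the amsmath package.

```latex
\begin{align*}
\text{A (every $S$ is $P$):}\quad
  & \bigwedge_{i=1}^{n} \bigvee_{j=1}^{n} (s_i = p_j) \\
\text{I (some $S$ is $P$):}\quad
  & \bigvee_{i=1}^{n} \bigvee_{j=1}^{n} (s_i = p_j) \\
\text{E (no $S$ is $P$):}\quad
  & \bigwedge_{i=1}^{n} \bigwedge_{j=1}^{n} (s_i \neq p_j) \\
\text{O (some $S$ is not $P$):}\quad
  & \bigvee_{i=1}^{n} \bigwedge_{j=1}^{n} (s_i \neq p_j)
\end{align*}
```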


By reference to these equivalents it is possible to give a semantic definition of distributive supposition. The definition depends on whether the term is a subject or predicate. A subject term is distributive if the proposition in which it occurs is equivalent to a conjunction of conjunctions or disjunctions and is non-distributive otherwise. A predicate is distributive if the proposition in which it occurs is equivalent to a conjunction or disjunction of conjunctions and is non-distributive otherwise:

Form                         Subject            Predicate
A (universal affirmative)    Distributive       Non-Distributive
E (universal negative)       Distributive       Distributive
I (particular affirmative)   Non-Distributive   Non-Distributive
O (particular negative)      Non-Distributive   Distributive

d. Truth-Conditions for Categorical Propositions

What makes the entailments relevant to truth-conditions is that they suggest a way to characterize truth according to distributional properties. For example, a universal affirmative is true if the subject is truly distributive and the predicate non-distributive. From the perspective of modern truth-theory, however, any definition of truth in terms of the medieval notions of distribution and non-distribution would be flawed because it would be circular. Truth cannot be defined in terms of distribution because distribution is defined in terms of entailment, and entailment in terms of truth. Medieval logicians were not troubled about circularity because the distinction between distributive and non-distributive supposition was part of a system of classifying the different ways terms refer. There was no intention of incorporating supposition into a definition of truth, recursive or otherwise.

Arnauld and Nicole, however, in effect noticed that it is possible to explain distribution and non-distribution directly without reference to the entailments of “descent and ascent.” It is then possible to use the distinction to state truth-conditions in a non-circular way. More precisely, it is possible to say in the metalanguage, without referring to object language identity statements, that in a true universal affirmative each referent in the extension of the subject is identical to some referent in the extension of the predicate, and so forth for the other propositional types. (See Martin 2013 and Martin 2016a. Compare Pariente 1985, who questions the influence of supposition theory.)

To explain the authors’ metalinguistic approach, it is useful to make use of the notation of restricted quantification. The notation attaches a subscript to the quantifier symbol naming its “extension.”

Here f(v) is a functor f applied to a variable v, and P(f(v)) is an open sentence saying something about f(v); [∀_A f(v)] P(f(v)) is read as for any f(v) in A, P(f(v)); and [∃_A f(v)] P(f(v)) is read as for some f(v) in A, P(f(v)).

The truth-conditions for the categorical propositions are easily stated in the metalanguage as facts about the identity or non-identity of all or some of the referents to individuals in the subject’s relevant extension to those in the predicate’s. In the notation below, the individual signified by term v, briefly Sig(v), is referred to as being in either the extension of the subject S, briefly Ext(S), or in the extension of the predicate P, briefly Ext(P), or in the intersection of the two extensions, briefly Ext(S) ∩ Ext(P):
In the formulas above let the right hand side be called the truth-conditions of the categorical proposition on the left. Let the outer (left-most) quantifier in the truth-conditions, which has the broader scope, be called the subject quantifier, and let the inner (right-most) quantifier, which has narrower scope, be called the predicate quantifier. It is then possible to define distributive term semantically. A term is used distributively in a proposition if the proposition is true and its quantifier in its truth-conditions is universal, and non-distributively if it is true and its quantifier is existential.
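The four truth-conditions, stated as quantified identities over relevant extensions, can be sketched as follows. The sketch simplifies by treating an extension directly as a set of referents, and it takes the predicate of a particular negative in its entire extension, on the strength of the remark that negative propositions separate the attribute according to its entire extension. The function names and sample sets are hypothetical, and the case of an empty subject extension is deliberately left aside.

```python
# Hypothetical model: an extension is a set of referents compared by identity.

def every(ext_s, ext_p):
    """Universal affirmative: each referent of S is identical to some referent of P."""
    rel_p = ext_s & ext_p                      # predicate restricted by the subject
    return all(any(v == w for w in rel_p) for v in ext_s)

def no(ext_s, ext_p):
    """Universal negative: each referent of S differs from every referent of P."""
    return all(all(v != w for w in ext_p) for v in ext_s)

def some(ext_s, ext_p):
    """Particular affirmative: some referent of S is identical to some referent of P."""
    rel = ext_s & ext_p                        # both terms restricted by each other
    return any(any(v == w for w in rel) for v in rel)

def some_not(ext_s, ext_p):
    """Particular negative: some referent of S differs from every referent of P."""
    return any(all(v != w for w in ext_p) for v in ext_s)

# Illustrative extensions.
doctors = {"a", "b"}
poets = {"b", "c"}
humans = {"a", "b", "c"}
```

In each function the outer quantifier ranges over the subject and the inner quantifier over the predicate, so a term counts as distributive exactly where the corresponding quantifier is `all`.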

The Logic’s authors make the same distinction using different terminology. They prefer to use universal for distributive and particular for non-distributive, a common variant in earlier logic. They also noticed that the relevant extension of a term varies by its position. A term’s relevant extension is its entire extension if the term is the subject of a universal affirmative, universal negative, or particular negative, or if it is the predicate of a universal negative. In all other cases, a term’s relevant extension is the intersection of the extensions of the proposition’s two terms. (In modern general quantification theory, similar distinctions are made in terms of whether the proposition’s quantifier is monotonic.) Recall that in the Logic, a term’s extension is a set of ideas. In the authors’ terminology, then, a universal term is one that asserts of each idea in that term’s relevant extension that it is identical to (“is put in”) or is not identical to (“is not put in”) the ideas in the relevant extension of its collateral term. A particular term is one that asserts this identity or non-identity only of some. The truth-conditions axiomatized by the authors (II:17–20) can then be easily stated, first in their own terminology (in italics) and then in a modern paraphrase.

every S is P is true iff The proposition is affirmative. The subject is universal. The attribute is particular. The extension of the attribute is restricted by that of the subject. The attribute is put in the subject according to the entire extension of the subject.
The relevant extension of the subject S is its entire extension; the relevant extension of the predicate P is the restriction of its extension by that of S. Every element of the relevant extension of S is identical to some element of the relevant extension of P.
no S is P is true iff The proposition is negative. The subject and the predicate are universal. The attribute of a negative proposition is always taken generally. Negative propositions separate the attribute from the subject according to the entire extension of the attribute. The attribute is denied of everything contained in the extension of the subject.
The relevant extension of S is its entire extension. The relevant extension of P is its entire extension. No element of the relevant extension of S is identical to any element of the relevant extension of P.
some S is P is true iff The proposition is affirmative. The subject and predicate are particular. The extension of the attribute is restricted by that of the subject. The attribute is conceived only in part of the extension of subject.
The relevant extension of the subject S is the restriction of its extension by that of the predicate P. The relevant extension of P is the restriction of its extension by that of S. Some element of the relevant extension of S is identical to some element in the relevant extension of P.
some S is not P is true iff The proposition is negative. The subject is particular. The attribute is universal. The attribute is denied of everything contained in the extension of the subject. Negative propositions separate the attribute from the subject according to the entire extension of the attribute. Negative propositions separate this attribute from the subject, particularly if it is particular.
The relevant extension of S is its entire extension. The relevant extension of P is its extension restricted by that of S. There is some element in the relevant extension of S that is not identical to any element in the extension of P.
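The modern paraphrases above can be checked mechanically on small finite examples. The following Python sketch is only illustrative: the function names, the sample ideas, and the dictionary sig (standing in for Sig) are assumptions, and the quantifier domains follow the last sentence of each paraphrase.

```python
def every(ext_s, ext_p, sig):
    """every S is P: each idea in Ext(S) matches some idea in Ext(S) & Ext(P)."""
    both = ext_s & ext_p
    return all(any(sig[v] == sig[w] for w in both) for v in ext_s)


def no(ext_s, ext_p, sig):
    """no S is P: no idea in Ext(S) matches any idea in Ext(P)."""
    return all(sig[v] != sig[w] for v in ext_s for w in ext_p)


def some(ext_s, ext_p, sig):
    """some S is P: some idea in the intersection matches some idea in it."""
    both = ext_s & ext_p
    return any(sig[v] == sig[w] for v in both for w in both)


def some_not(ext_s, ext_p, sig):
    """some S is not P: some idea in Ext(S) matches no idea in Ext(P)."""
    return any(all(sig[v] != sig[w] for w in ext_p) for v in ext_s)


# Hypothetical ideas and the individuals they signify.
sig = {"socrates": "Socrates", "plato": "Plato", "bucephalus": "Bucephalus"}
human = {"socrates", "plato"}
horse = {"bucephalus"}
animal = human | horse

print(every(human, animal, sig))     # True
print(every(animal, human, sig))     # False
print(no(human, horse, sig))         # True
print(some(animal, horse, sig))      # True
print(some_not(animal, human, sig))  # True
```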

 

e. The Correspondence Theory of Truth

Although these truth-conditions are formulated in terms of extensions, which are composed of ideas, the Logic’s broader intention is to capture a correspondence theory of truth. As explained in Part I, there is a one-one mapping between significance ranges and extensions. Accordingly, although the conditions above refer to the identity or non-identity of ideas in term-extensions, ideas in a term’s extension are proxies for individuals in the term’s significance range. It is true that the Logic’s authors say little about the expressive completeness of mental language. They make no claim that there is an idea in mental language for every individual that actually exists—that, in medieval terms, there is an individual concept for each existing thing. But to insure a genuine correspondence theory, all that the authors need assume is that the ideas in a term’s extension “cover” the individuals in the term’s significance range in the sense that for any individual signified by a term there is some idea in its extension that signifies it. In the trivial case in which an idea has no strict inferiors, the idea itself would “cover” its own significance range.

As they stand, the truth-conditions do not address the issue of existential import of the subjects of affirmative propositions. They do not provide for what happens when an affirmative proposition has a false idea as subject term. The discussion of false ideas in Part I requires that these propositions be false. The issue recurs in Part IV in the discussion of necessary and contingent truth.

3. The Logic of Arguments

a. Summary


b. The Syllogistic

Acceptable arguments, including the immediate inferences of the square of opposition and syllogisms, are described in terms of “rules.” An example is the rule for accidental conversion: universal affirmative propositions can be converted by adding a mark of particularity to the attribute which becomes the subject (II:1, KM V 250, B 132). All these rules are laid down without proof. A modern reader, on the other hand, would expect that the authors, having just stated the truth-theory for the categorical propositions in Part II, would have made some effort to argue for the validity of the rules in Part III. They seem to have thought, however, that the rules they cite are too obvious to require justification, and indeed most of the logic of Part III is trivial. They remark, for example, that “there is little value in knowing the rules of the syllogism” (IV:introduction, KM V 354, B 227). Of logical errors they say, “… it is almost impossible for a person of average intelligence who has some insight ever to fall into them” (IV:8, KM V 384-385, B 252). On the other hand, in the later sections of Part III, they attach much importance to the avoidance of fallacies, especially various kinds of equivocation.

The rules describing the square of opposition, conversion, and the valid moods are formulated using a series of technical terms: subject and predicate; affirmative and negative proposition; universal and particular term; universal and particular proposition; syllogism; major, middle, and minor term. (A singular proposition is classified as a special case of universal proposition.) These are all understood syntactically with the exception of universal and particular term, and affirmative and negative proposition, which in Part II also have semantic senses.

c. Validity

It is surprising that the authors do not attempt to prove the validity of their rules. It had been common to do so in logic since Aristotle. Nor do they attempt a syntactic account of what counts as a valid mood. In the Logic’s first edition they do discuss the traditional method of reducing the valid moods to Barbara and Celarent and describe a set of traditional mnemonic names for the reductions. (See B, xxxv and 156.) From a modern perspective, that procedure is not without interest because it is an early form of an axiom system even though the set being axiomatized (the valid syllogisms) is trivially finite. The authors, however, dismiss reductions as “useless” and omit the topic in later editions. (III:8, B 156. See B, xxxv and 156.) They may do so because they reject one of the traditional reduction rules, per contradictionem (if A, ~B ⊢ ~C, then A, C ⊢ B). The rejection is perhaps due to their more general doubts about indirect proof in Part IV. (See III:9, KM 276, B 157 and IV:2, KM V 367, B 238; IV:9, KM V 388, B 255.)

Although the authors do not prove the validity of their rules or attempt a syntactic characterization of the set of valid moods in general, they do provide what amounts to a syntactic decision procedure for the set of valid moods. They do so by laying down six syntactic rules (III:3), which in various forms have been repeated in textbooks ever since.  

Rule 1. The middle term cannot be taken particularly twice, but must be taken universally once.

Rule 2. The terms of the conclusion cannot be taken more universally in the conclusion than in the premises.

Rule 3. No conclusion can be drawn from two negative propositions.

Rule 4. A negative conclusion cannot be proved from two affirmative propositions.

Rule 5. The conclusion always follows the weaker part. That is, if one of the two propositions is negative, the conclusion must be negative; if one of them is particular, it must be particular.

Rule 6. Nothing follows from two particular propositions.

The set of six rules is not new. The four rules that do not mention universal and particular terms were common in medieval logic. Rules 1 and 2, which are today known as “process rules,” are formulated in terms of universal and particular term and are found in works contemporary with the Logic. The complete list of six rules is given verbatim in Eustache de Saint-Paul. (Summa philosophiae quadripartita, Logia III.2.I. 117, Eustachio-De-S.-Paulo 1648.) Leibniz later used the same rule set in his more formal version of the syllogistic, dividing Rule 5 into two. (See Lensen 1990.)

The rules are interesting if understood syntactically. The vocabulary in which they are framed is clearly syntactic except for perhaps universal and particular term, which in Part II had been defined semantically. But as any student knows who has used the rules, universal and particular term also have simple syntactic definitions. A term is universal (or distributive) if and only if it is the subject of a universal proposition or the predicate of a negative. In a number of places in Part III, the authors refer to them syntactically.

Viewed syntactically, the rule set provides a decision procedure for the set of valid moods. By reviewing syntactically each of the 256 syllogisms, it is easy to confirm of each syllogism that if it is not on the list of 24 valid moods, it violates at least one rule. Conversely, it is easy to check, again syntactically, that if a syllogism violates a rule, it is not on the list of 24 valid moods. However, the authors of the Logic are not interested in metatheory. They do not explicitly make the point that the valid moods are exactly those that do not violate a rule, much less prove it.
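The syntactic check just described can be sketched in Python. This is an illustration rather than the authors’ own procedure. Several things are assumptions supplied for the example: the encoding of a mood as three letters (major premise, minor premise, conclusion) drawn from A, E, I, O, the layout of the four figures, the function names, and the list of 24 traditionally valid moods.

```python
from itertools import product

FORMS = "AEIO"  # A: every, E: no, I: some, O: some ... not

# (subject, predicate) of the major and minor premise in each figure.
FIGURES = {
    1: (("M", "P"), ("S", "M")),
    2: (("P", "M"), ("S", "M")),
    3: (("M", "P"), ("M", "S")),
    4: (("P", "M"), ("M", "S")),
}


def negative(form):
    return form in "EO"


def particular(form):
    return form in "IO"


def universal_terms(form, subject, predicate):
    """Terms taken universally under the syntactic definition."""
    terms = set()
    if not particular(form):  # subject of a universal proposition
        terms.add(subject)
    if negative(form):        # predicate of a negative proposition
        terms.add(predicate)
    return terms


def violated_rules(figure, mood):
    """Return the numbers of the six rules that the syllogism breaks."""
    major, minor, conclusion = mood
    major_terms, minor_terms = FIGURES[figure]
    in_premises = (universal_terms(major, *major_terms)
                   | universal_terms(minor, *minor_terms))
    broken = set()
    if "M" not in in_premises:
        broken.add(1)
    if not universal_terms(conclusion, "S", "P") <= in_premises:
        broken.add(2)
    if negative(major) and negative(minor):
        broken.add(3)
    if negative(conclusion) and not (negative(major) or negative(minor)):
        broken.add(4)
    if (negative(major) or negative(minor)) and not negative(conclusion):
        broken.add(5)
    if (particular(major) or particular(minor)) and not particular(conclusion):
        broken.add(5)
    if particular(major) and particular(minor):
        broken.add(6)
    return broken


ALL = [(fig, "".join(m)) for fig in FIGURES for m in product(FORMS, repeat=3)]
PASSING = {s for s in ALL if not violated_rules(*s)}

# The 24 traditionally valid moods, listed by figure (assumed list).
TRADITIONAL = {
    1: "AAA EAE AII EIO AAI EAO",
    2: "EAE AEE EIO AOO EAO AEO",
    3: "AAI IAI AII EAO OAO EIO",
    4: "AAI AEE IAI EAO EIO AEO",
}
TRADITIONAL_SET = {(fig, mood)
                   for fig, moods in TRADITIONAL.items()
                   for mood in moods.split()}

print(len(ALL), len(PASSING), PASSING == TRADITIONAL_SET)  # 256 24 True
```

Under these assumptions the syllogisms that break no rule coincide with the traditional list, which is the point the authors leave unstated.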

4. Method

a. Summary

Part IV is about epistemology, both scientific knowledge, which is certain and based on clear and distinct ideas, and lesser sensory knowledge, which is contingent and concerns current events, history, and the future. The goal is to spell out logic’s role in scientific discovery and justification. The introductory sections (IV:1) distinguish genuine knowledge from philosophical and mathematical speculation, which is illustrated by puzzles arising from infinite divisibility. They continue (IV:2–3) with an account of scientific “method,” which consists of reasoning from causes to effects and conversely from effects to causes. The method they describe, which is divided into analysis and synthesis, makes implicit use of syllogistic techniques. Part IV’s central sections (IV:4–12) contain an extended discussion of scientific and sensory knowledge, including “demonstration,” which is another name for logic. The final sections (IV:13–16) warn about the epistemological and moral difficulties connected with faith and contingent beliefs. Remarks here focus on the central sections concerning epistemology, demonstration, and sensory knowledge. They conclude with an explanation of the role of syllogistic logic implicit in the authors’ notion of method.

b. Necessary and Contingent Truth

Certainty in science and the method for achieving it depend on the kind of truth being sought, and in particular, on whether the goal is necessary or contingent truth. The distinction between necessity and contingency had previously been made in Parts I and II. There, the necessity of essential definitions was contrasted with the contingency of accidental predications. The distinction also played a role in the discussion of false ideas. Affirmations with non-referring subjects are false. Some of these are impossible because the subject term has a comprehension that combines modes that are contrary, contradictory, or naturally incompatible (Part II:i). Part IV expands on the distinction between necessary and contingent truth, committing itself to the view that they differ in existential import:

The first reflection is that it is necessary to draw a sharp distinction between two sorts of truths. First are truths that concern merely the nature of things and their immutable essence, independently of their existence. The others concern existing things, especially human and contingent events, which may or may not come to exist when it is a question of the past. I am referring in this context to the proximate causes of things, in abstraction from their immutable order in God’s providence, because on the one hand, God’s providence does not preclude contingency, and on the other, since we know nothing about it [that is, contingent creation], it contributes nothing to our beliefs about things.

For the other kind of truth [viz. of essential natures], since everything [of this sort] is necessary, nothing is true that is not universally true. So we ought to conclude that something is false if it is false in a single case. (IV:13, KM V 398, B 263. See also II:13:iv)

The authors are committing themselves here to one side of a long debate. Earlier logicians were generally agreed that a contingent affirmation with a non-referring subject is false, but they were divided about the case of necessary propositions like essential definitions. Was “Humans are rational animals” true before creation, when there were no humans? Is “A chiliagon has 1000 sides” true today even though there are no actual chiliagons? Logicians like Aristotle and William of Ockham were clear that all propositions with non-referring subjects are false, even empty affirmations about a species’ nature. Others, like William of Sherwood, John Buridan, and Francisco Suárez, allowed that propositions that affirm of a species its nature (for example, essential definitions) have a timeless status. If true, they are necessarily true. Descartes too held that God can make some affirmations eternally true, like the species definition “Every triangle has three sides.” On the other hand, Descartes appears to be open to inconsistency because he seems to have been the inspiration of the Logic’s doctrine of false ideas. He held that affirmations with non-referring subjects like chimera are false and a major source of error (compare Meditations III.6, Martin 2011). In the passage above, the Logic’s authors commit themselves to the view that propositions that affirm of a species its nature make no existential claim and that if they are true, they are necessarily so. Other affirmations that do not predicate an essence of a species, including propositions concerning worldly matters (for example, reports of sensation or claims about people, history, or geography), are contingent and carry existential import. It follows that the truth-conditions for categorical affirmatives in Part II, which as stated there do not require existential import, must be amended. For affirmations other than those affirming a nature of a species, an additional condition is necessary: their truth-conditions should contain the requirement that the subject term signifies at least one existing thing. Doing so would also bring the Logic’s truth-theory into agreement with the prevailing opinion in medieval and then contemporary logic (see Ashworth 1973).

The distinction between necessity and contingency is important in epistemology. Though we can know or fail to know both what is necessary and contingent, the degree of certainty attached to each is different. The most important source of certainty is clear and distinct ideas.

c. Certainty, Clear and Distinct Ideas

In the first of two epistemological axioms, the Logic’s authors endorse Descartes’ doctrine that clear and distinct ideas are a source of knowledge:

First Axiom: Everything contained in the clear and distinct ideas of a thing can be truthfully affirmed of that thing (IV:6, KM V 378, B 250).

Examples of clear and distinct ideas include: ourselves as thinking beings, thinking, judging, reasoning, doubting, willing, desiring, sensing, imagining, shape, motion, rest, extended substance, existence, duration, order, number, and God.

Part I makes clear that substantives and adjectives, which are ideas, have intentional content. Sensory perceptions also have content. It consists of the many modes that flood awareness on the occasion of a sensation. This content, according to the Cartesians, may be clear or distinct. In the logical tradition, distinctness has usually been contrasted with generality. To say that an idea is distinct is to say it is a single idea. If it is distinct, it is unambiguous, and its content is internally consistent and possible (II:12, B 112). Clarity is probably to be understood as it is in Aquinas: it is a kind of intellectual light, a gift of God that allows the soul to be aware of an idea’s modal content. (Compare Aquinas, De veritate, q. 13 a. 2 arg. 4.) Thus, if the soul instantiates a clear and distinct idea, it is aware of a consistent and coherent modal content. Axiom I then says that if S is an idea that is conceived by the soul with clarity and distinctness and P is a mode in its content, the soul knows with certainty that every S is P is true. If the proposition is an essential truth, then it is necessary. Moreover, since S’s comprehension is coherent, the proposition S exists is possibly true, even if S has no actual instances. On the other hand, if the proposition is contingent and true, the proposition S exists is actually true.

Examples of clear and distinct ideas are not limited to species, nor is the knowledge they impart limited to essential definitions. An example is the cogito. Several times the Logic endorses Descartes’ argument that because the soul has a clear and distinct idea of itself as a thinking thing, it knows that it exists. The existence of a soul, however, is not necessary, nor is existence part of its nature. Most of the Logic’s examples of clear and distinct ideas, on the other hand, are from Cartesian science or metaphysics, and their contents illuminate essences. Part IV stresses that the bulk of scientific knowledge consists of the knowledge of essences imparted by clear and distinct ideas.

The necessity of essential truths is highlighted by Part IV’s second epistemological axiom, which is about possibility:

Second Axiom: At least possible existence is contained in the idea of everything we conceive clearly and distinctly (IV:6, KM V 378, B 250).

God insures that the soul never has a clear and distinct idea of an impossible being. The second axiom entails that if an essential affirmation every S is P is grounded in a clear and distinct idea, which is the preferred case in science, then possibly there is an S is true. (Arnauld makes clear elsewhere that he does not believe in the existence of possibilia as a category of being distinct from actual things. See “Arnauld to Leibniz,” May 13, 1686; KM VI, pp. 31–32; Stencil 2016). In the Logic’s first edition the authors go so far as to single out possibility as a marker of truth for essential affirmations:

possibility is a sure mark of the truth with respect to what is recognized as possible, whenever it is a question only of the essence of things (IV:13, B 263.)

The authors are making a point familiar from modal logic. An essential truth is either necessary or impossible. Thus, if it is possible, it is necessary. In the case of an essential definition, then, if it is known to be true scientifically by means of a clear and distinct idea, even if its subject term fails of actual reference, it is true and its subject is possible.
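The step from possibility to necessity can be set out as a short derivation. The notation is modern and is an assumption of this sketch, not the authors’ own: p stands for an essential affirmation, with the box and diamond read as “necessarily” and “possibly.”

```latex
\begin{align*}
&1.\quad \Box p \lor \Box\neg p && \text{(an essential truth is either necessary or impossible)}\\
&2.\quad \Diamond p && \text{(it is possible)}\\
&3.\quad \neg\Box\neg p && \text{(from 2, since } \Diamond p \equiv \neg\Box\neg p\text{)}\\
&4.\quad \Box p && \text{(from 1 and 3 by disjunctive syllogism)}
\end{align*}
```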

In the same text, the authors explain why a geometrical construction is also a source of certainty. Again, the reasoning turns on possibility. The mere construction of a figure with a property shows that the figure possibly possesses that property. But since the properties of a geometrical figure are either necessary or impossible, this possibility alone insures that the property holds of figures of that type necessarily.

While it is true that clear and distinct ideas have the premier role in scientific justification, they are not the only sources of knowledge. Less certain varieties of knowledge are based on demonstration and sensation.

d. Demonstration

Descartes seems to have explained demonstrations by appeal to clear and distinct ideas. He interpreted the propositions that make up the lines of a logical or mathematical proof as a series of independent epistemological insights, each justified by its own clear and distinct idea. Once an individual line is formulated and appreciated, the thinker is inspired to conjure up an additional clear and distinct idea, and this forms the justification for the next line of the proof, and so on for the proof’s subsequent lines. (See Gaukroger 1989.)

The Logic’s authors have a more modern idea of proof. A demonstration, as they understand it, is a series of lines, each of which is either a premise that is either previously proven or certain in itself, or a line that follows logically from earlier lines of the proof. They say:

A true demonstration requires two things: one, that the content includes only what is certain and indubitable; the other that there is nothing defective in the form of the argument. (IV:8, KM V 384, B 251)

Four types of premises are acceptable in a sound proof: propositions that affirm the content of a clear and distinct idea; nominal definitions, which are true by convention; properties of a geometric construction; and previously proven propositions (IV:8, KM V 384, B 251). All other lines in the demonstration must follow formally from earlier lines by rules of logic, presumably the rules of Part III. Perhaps oddly, the authors regard the application of logical rules as relatively trivial. Applying logical rules, they say, is “natural.” How to do so “does not need to be studied” (IV:7, KM V 397, B 252).

e. Sensation and Knowledge of Contingent Truth

A third source of knowledge is sensation. Although sensation is not certain, it is reliable. Its reliability is based on a demonstration. This is the (brief) argument that sensation is reliable because, if not, God would be a deceiver, which he is not. (VFI 28, KM I 355, G 213–214.) Sensory knowledge, moreover, is largely limited to contingent truths about past, present, or future individuals or events.

Although the account of sensation in Part IV is brief, Arnauld explains it more fully in “On True and False Ideas” (VFI 1; KM I, 190, 193, 195–196, 199; G 58, 62–63, 66). There, Arnauld debates Malebranche on whether perception is representational. Arnauld argues fiercely that the soul perceives the world directly, while Malebranche holds that the soul perceives only intermediate representations, which he identifies as ideas in God’s mind. Arnauld maintains that in sensory perception there are only two substances: the soul and the object sensed. As he puts it, the soul perceives the world “by the idea.” The process has two stages. First, the soul is aware that it is having a perception. Second, it is aware of the perception’s content, which consists of modes, some of which are true of the object in the world being perceived and some of which are true of the soul.

At one point he describes perception as a “relation” (VFI 5, KM I 198, G 66. Compare Raconis, De principiis entis a. 3, 827). According to the then-standard analysis of relations, a relational fact between two individuals breaks down into two non-relational substance-mode facts true respectively of the two relata. In other words, the fact that the relation holds breaks down into two nonrelational facts, in each of which the relatum possesses a mode characteristic of its role in the relation. Perception is such a relation. According to Arnauld, when a perception obtains, the soul instantiates a mode, namely an idea which possesses an intentional content that the soul is aware of. Simultaneously, the object sensed instantiates its modes, namely those modes that impact the body’s sense organs. That the soul and the sensed material object possess their respective modes constitutes the relational fact that the one is perceiving the other. It is God, who is not a deceiver, who insures that the material modes in the idea’s content match the modes of the object outside the mind.

Accordingly, veridical sensation consists of a vivid awareness of multiple modes all at once. Some of these are material modes, and as such they are true of the object outside the mind. These material modes consist of various geometrical and mechanical properties that hold of matter according to Cartesian physics. A perception, however, also contains modes of the soul. These are the sensory modes of color, taste, sound, etc. as well as psychological modes like feelings and states of mind. This rich group of material and spiritual modes constitutes the “content” of the perception. Despite the fact that a perception has a content, it is not an idea. Perceptions, for example, do not serve as terms in the propositions of mental language. A perception’s content would be automatically “false” of any subject because it includes a contrary mixture of material and spiritual modes. Rather, the role of perceptual experience is to provide a rich source of modes for abstraction. The modes that the soul is aware of at the time of a veridical sensation are in fact instantiated, some in matter and some in the soul. If on the occasion of a sensation the soul abstracts an idea with a purely material content, the idea is true of the objects in the world impacting the body’s sense organs; if it abstracts an idea with a purely spiritual content, it is true of the soul. Thus, although the Logic’s authors are “rationalist” Cartesians and attach premier importance to clear and distinct ideas, they also allow for empirical knowledge of the material world, albeit of a less certain sort.

f. Method: Analysis and Synthesis

Although medieval logicians had much to say about method, 16th century figures like Peter Ramus had initiated renewed interest. (See, for example, Edwards 1967.) Earlier in the Logic the authors had made brief methodological remarks on classification and its pitfalls not unlike those of Ramus (IV:2, KM V 243, B 125). Part IV begins with an extended discussion of method.

In the Logic’s account (IV:2, KM V 362-367, B 233–238), method divides into analysis and synthesis. Analysis reasons from effects to causes or from the specific to the general, and synthesis reasons inversely, from causes to effects or from the general to the specific. Both presuppose that science classifies its subject by ideas of increasing generality.

The paradigm the authors seem to have in mind is a chain of syllogisms in the mood Barbara. The chain starts with an affirmative premise that characterizes its subject in terms of a narrow species and finishes with a conclusion that predicates of it a more general idea. The same chain in reverse is a synthesis.

The authors provide an example of analysis. In it, the investigator “discovers” that a subject has St. Louis as a remote ancestor or “cause.” The pattern is a series of syllogisms in Barbara. Each syllogism in the series has two premises, one affirming of a subject that he is the descendant of his father, and a second affirming of his father that he is the descendant of his grandfather. The syllogism’s conclusion then affirms that the subject is the descendant of his grandfather. This pattern is repeated, one syllogism for each subsequent generation, until an ultimate conclusion affirms that the original subject is a descendant of St. Louis. Because increasingly earlier ancestors have increasingly more descendants, each succeeding predicate has a broader extension, and the final predicate, “is a descendant of St. Louis,” is the most general of all.

A is a descendant of B, every descendant of B is a descendant of C / ∴ A is a descendant of C

A is a descendant of C, every descendant of C is a descendant of D / ∴ A is a descendant of D

A is a descendant of D, every descendant of D is a descendant of E / ∴ A is a descendant of E

A is a descendant of E, every descendant of E is a descendant of St. Louis / ∴ A is a descendant of St. Louis

Synthesis is the series in reverse order.

Analysis is also called resolution and the method of discovery. It reasons from effect to cause, from a narrower to a broader predicate. Because an effect follows from its cause, analysis is said to be a posteriori.

Synthesis reasons from cause to effect, and is called the method of composition. Because causes are prior, synthesis is said to be a priori.

Although it is odd to a modern reader to regard a more general class as the cause of its subsets, it was normal in earlier philosophy. Aristotle regarded the genus as the formal cause of the species, and Neoplatonism considered higher nodes in the ontological tree as the more causally productive. The authors retain this paradigm in their understanding of the hierarchy of genera and species represented by the tree of Porphyry. When analysis is applied to the pursuit of the essential truths, it carries the investigator from knowledge of species lower in the tree to knowledge of a genus higher in the tree. As each conclusion is drawn, a new premise would be required that assigns a genus to the species mentioned in the preceding conclusion. An example of the method applied to genera and species is:

Socrates is a human, every human is an animal /∴ Socrates is an animal

Socrates is an animal, every animal is a living creature /∴ Socrates is a living creature

Socrates is a living creature, every living creature is a body /∴ Socrates is a body

Socrates is a body, every body is a substance /∴ Socrates is a substance

This chain starts by assigning to Socrates the narrow predicate human, which has the comprehension {rational, self-moving, living, corporeal, being}. It proceeds through species with comprehensions of increasingly fewer modes. It finishes assigning to Socrates the most general genus.
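The pattern of the chain can be sketched in a few lines of Python. The function names and the representation of the tree of Porphyry as a simple list ordered from narrowest to most general idea are assumptions made for illustration.

```python
def with_article(noun):
    """Prefix a noun with 'a' or 'an'."""
    return ("an " if noun[0].lower() in "aeiou" else "a ") + noun


def analysis(subject, ladder):
    """Chain syllogisms in Barbara from the narrowest idea to the most general."""
    steps = []
    for narrower, broader in zip(ladder, ladder[1:]):
        steps.append(
            f"{subject} is {with_article(narrower)}, "
            f"every {narrower} is {with_article(broader)} "
            f"/ ∴ {subject} is {with_article(broader)}"
        )
    return steps


def synthesis(subject, ladder):
    """The same series of syllogisms in reverse order."""
    return list(reversed(analysis(subject, ladder)))


LADDER = ["human", "animal", "living creature", "body", "substance"]

for step in analysis("Socrates", LADDER):
    print(step)
```

Each pass through the loop uses the previous conclusion as its first premise, which is what makes the series a chain rather than four unrelated syllogisms.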

The authors of the Logic were not alone among their contemporaries in having this understanding of cause, or of analysis and synthesis. Spinoza argues for his own version of quasi-Neoplatonic causation in which the order of cause to effect is the same in a sense as the order in logic of predicate to subject. Hobbes defends an account of analysis and synthesis that is almost identical to the Logic’s (Hobbes, De Corpore I.6.1, 66: Hobbes 1992). In various papers, Leibniz explores versions of analysis that are essentially more formal versions of the Logic’s. It was typical of Leibniz to symbolize the predicate of a universal affirmative as a series P1…Pk of concatenated terms. In his notation, the term letters are intended to stand for modes that are like those that make up species-comprehensions in the Logic. In a typical example, Leibniz lays down an initial premise S is P1…Pk. The “analysis,” then, is a deduction that proceeds by the application of a simplifying inference rule that deletes terms from the predicate, thus making the new line’s predicate more general. The deduction terminates in a line with the most general predicate of all. The inference rule would be: S is X1…Xn ⊢ S is X1…Xn-1. (See, for example, De arte combinatoria in Parkinson 1966, and Swoyer 1995.)

S is P1,P2,P3,P4

S is P1,P2,P3

S is P1,P2

S is P1
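The deduction above amounts to repeatedly deleting the last term of the predicate. A minimal Python sketch, in which the function name and the list representation of the predicate are assumptions:

```python
def leibniz_analysis(subject, modes):
    """List the lines obtained by deleting the last predicate term each time."""
    lines = []
    while modes:
        lines.append(f"{subject} is {','.join(modes)}")
        modes = modes[:-1]  # new, more general predicate; the input is not mutated
    return lines


for line in leibniz_analysis("S", ["P1", "P2", "P3", "P4"]):
    print(line)
```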

In sum, the Logic’s notion of cause and its associated methods were symptomatic of its time. What is of interest from the perspective of logic is that their details make implicit use of technical ideas from syllogistic logic. It should also be remarked, however, that it is hard to see how these methods actually would be of use in Cartesian physics, about which the Logic says very little.

5. References and Further Reading

a. Primary Sources

  • Arnauld, Antoine. 1813. Œuvres Philosophiques d’Antoine Arnauld. Paris: Adolphe Delahays. Abbreviated VFI.
  • Arnauld, Antoine. 1990. On True and False Ideas. Translated by Stephen Gaukroger. Manchester: Manchester University Press. Abbreviated G.
  • Arnauld, Antoine. 2003. Œuvres Philosophiques d’Arnauld. Edited by Elmar Kremer and Denis Moreau. Bristol: Thoemmes Press. Abbreviated KM.
  • Arnauld, Antoine and Pierre Nicole. 1996. Logic or the Art of Thinking. Translated by Jill Vance Buroker. Cambridge: Cambridge University Press. Abbreviated B.
  • Buridan, John. 2001. Summulae de Dialectica. New Haven: Yale University Press.
  • Eustachio-de-S.-Paulo. 1648. Summa philosophiae quadripartita, de rebus dialecticis, ethicis, physicis et metaphysicis. Cantabrigia [Cambridge]: Rogerus Danielis.
  • Fonseca S.J., Petrus. 1599. Commentarii in XII libros Metaphysicarum Aristotelis. Frankfurt.
  • Hobbes, Thomas. 1992. De Corpore. Edited by William Molesworth. London: Routledge-Thoemmes Press.
  • Raconis, C. F. d’Abra de. 1651. Tertia Pars Philosophiae seu Physicae, Quarta Pars Philosophiae seu Metaphysicae. Totius Philosophiae, hoc est Logicae, Moralis, Physicae et Metaphysicae, brevis et accurata, facilique et clara methodo disposita tractatio. Lugdunum [Lyon]: Irenaeus Barlet.
  • Suárez, Francisco. 1995. On Beings of Reason (De entibus rationis) Metaphysical Disputation 54. Milwaukee: Marquette University Press.
  • Toletus S.J., F. 1596. Commentaria una cum quaestionibus in universam Aristotelis logicam. Cologne: Agrippina.
  • William of Ockham. 1978. Expositio in librum Perihermenias Aristotelis. Edited by A. Gambatese and S. Brown. St Bonaventure, New York: Franciscan Institute.

b. Secondary Sources

  • Ashworth, E. J. 1973. “Existential Assumptions in Late Medieval Logic.” American Philosophical Quarterly, 10: 141–147.
  • Auroux, Sylvain. 1982. L’Illuminismo Francese e la Tradizione Logica di Port-Royal. Bologna: CLUEB.
  • Auroux, Sylvain. 1992. “Port-Royal et l’arbre de Porphyre.” Archives et documents de la Société d’histoire et d’épistémologie des sciences du langage, 6: 109–122.
  • Auroux, Sylvain. 1993. La Logique des Idées. Montréal: Bellarmin; Paris: Vrin.
  • Chomsky, Noam. 1966. Cartesian Linguistics. New York: Harper and Row.
  • Cronin, T. J. 1966. Objective Being in Descartes and Suárez. Rome: Gregorian University Press.
  • Dominicy, Marc. 1984. La Naissance de la Grammaire Moderne. Bruxelles: Pierre Mardaga.
  • Edwards, William F. 1967. “Randall on the Development of Scientific Method in the School of Padua—A Continuing Reappraisal.” In Naturalism and Historical Understanding, edited by John P. Anton, 53–69. State University of New York.
  • Garber, Daniel. 1993. “Descartes and Occasionalism.” In Causation in Early Modern Philosophy, edited by Steven M. Nadler, 9–26. University Park, Pennsylvania: Pennsylvania State University Press.
  • Gaukroger, Stephen. 1989. Cartesian Logic. Oxford: Oxford University Press.
  • Lenzen, Wolfgang. 1990. “On Leibniz’s Essay Mathesis Rationis.” Topoi, 9, 29–59.
  • Martin, John N. 2011. “Existential Import in Cartesian Semantics.” History and Philosophy of Logic, 32:2, 1–29.
  • Martin, John N. 2012. “Existential Commitment and the Cartesian Semantics of the Port Royal Logic.” In New Perspectives on the Square of Opposition, edited by Jean-Yves Beziau. Bern: Peter Lang.
  • Martin, John N. 2013. “Distributive Terms, Truth, and The Port Royal Logic.” History and Philosophy of Logic, 34:2, 133–154.
  • Martin, John N. 2016a. “A Note on ‘Distributive Terms, Truth, and The Port Royal Logic’.” History and Philosophy of Logic, 37:4, 391–392.
  • Martin, John N. 2016b. “Privative Negation in The Port Royal Logic.” Review of Symbolic Logic, 9, 23.
  • Martin, John N. 2016c. “The Structure of Ideas in The Port Royal Logic.” The Journal of Applied Logic, 19, 1–19.
  • Martin, John N. 2017. “Extension in the Port Royal Logic.” South American Journal of Logic, 3:1, 1–20.
  • Nadler, Steven. 2011. Occasionalism: Causation among the Cartesians. Oxford: Oxford University Press.
  • Nadler, Steven M. 1989. Arnauld and the Cartesian Philosophy of Ideas. Manchester: Manchester University Press.
  • Pariente, Jean-Claude. 1985. L’Analyse du Langage à Port-Royal. Paris: C.N.R.S. Éditions de Minuit.
  • Parkinson, G. H. R. 1966. Leibniz, Logical Papers. Oxford: Clarendon Press.
  • Parsons, Terence. 2014. Articulating Medieval Logic. Oxford: Oxford University Press.
  • Pasnau, Robert. 1997. Theories of Cognition in the Later Middle Ages. Cambridge: Cambridge University Press.
  • Stencil, Eric. 2016. “Essence and Possibility in the Leibniz-Arnauld Correspondence.” Pacific Philosophical Quarterly, 97, 2–26.
  • Swoyer, Chris. 1995. “Leibniz on Intension and Extension.” Noûs, 29, 96–114.

 

Author Information

John N. Martin
Email: john.martin@uc.edu
University of Cincinnati
U. S. A.

 

Novalis (Georg Philipp Friedrich von Hardenberg) (1772-1801)

“Novalis” was the pseudonym of Georg Philipp Friedrich Freiherr von Hardenberg, an early German Romantic philosopher, poet, and novelist. Born into a Pietistic family of minor, slightly cash-strapped, Saxon nobility in 1772, he died of tuberculosis in 1801 at the age of 28. Novalis is sometimes seen as the paradigmatic figure of German Romanticism: His early death, the illness and death of his young fiancée Sophie a few years earlier—which inspired one of his most famous works, Hymns to the Night—and the sometimes mystical style of his writing have contributed to his reputation as an otherworldly, even morbid poet. However, Novalis was also a trained philosopher working within the post-Kantian Idealist tradition, with a concern for the problems that occupied this tradition: the possibility of freedom and the nature of the human vocation, the basis of knowledge, the relationship between nature and science, the significance of religion, and the best way to promote a thriving and ethical community.

Novalis was a central figure in the Jena circle of early German Romantics, which was influenced by the work of Fichte, Herder, Goethe, and the Christian mystic Jakob Boehme, and which included Friedrich and August Wilhelm Schlegel, Ludwig Tieck, Caroline Schlegel, Dorothea Veit-Schlegel, and others. During his short life, Novalis wrote philosophical fragments (some of which were published in the Schlegel brothers’ journal Athenaeum), as well as poetry, novels (The Novices of Saïs and Henry of Ofterdingen), philosophical essays (including “Christendom or Europe” and “Faith and Love or The King and Queen”), and notes and short essays on science, medicine, religion, history, language, art, and nature, including many intended for an encyclopedia, which are available in translation as Notes for a Romantic Encyclopaedia. Most of these works were only published after Novalis’s death, with the collection of his writings by Ludwig Tieck and Friedrich Schlegel.

Table of Contents

  1. Life and Works
  2. Cosmology
  3. Novalis’s Account of History
  4. Subjectivity and the Vocation of Humankind
  5. Romanticization and Poetry
  6. The Artist as Genius
  7. Language and the Fragment
  8. The Mediator
  9. Relation to Christianity
  10. References and Further Reading
    1. Works by Novalis in German
    2. Works by Novalis in English Translation
    3. Works About Novalis and Early German Romanticism in English

1. Life and Works

Georg Philipp Friedrich Freiherr von Hardenberg, better known by his pen name Novalis, was born on May 2nd, 1772, at his family’s home at Schloss Oberwiederstedt in the Harz Mountains, about 80 kilometers from Leipzig. Friedrich was the oldest son of eleven children born to Heinrich Ulrich Erasmus Freiherr von Hardenberg (1738-1814) and Auguste Bernhardine von Hardenberg (née von Bölzig; 1749-1818). Hardenberg’s family belonged to the Saxon nobility, although of a relatively low rank, and financial worries feature in many of Hardenberg’s letters as a young man. Financial concerns also motivated the family to move from Schloss Oberwiederstedt to a smaller home in Weißenfels in 1784. Hardenberg’s father was a follower of the pietistic Herrnhuter (Moravian) church founded by Zinzendorf, and attempted to raise his family in strict adherence to his pietistic beliefs. This upbringing had a lasting effect on Hardenberg’s thought.

From 1783 to 1784, Hardenberg lived with his wealthy, aristocratic uncle, Friedrich Wilhelm Freiherr von Hardenberg, who exposed the young Hardenberg to his extensive library and interest in Enlightenment thought. Hardenberg subsequently moved with his family to their home in Weißenfels, situated between Leipzig and Jena, and until 1790 studied at the gymnasium in Eisleben, where the curriculum emphasized literature and rhetoric.

In 1790, Hardenberg began his studies in jurisprudence in Jena before moving to Leipzig and then Wittenberg. Despite beginning his time at university by devoting most of his attention to having fun and flirting with women, when he completed his studies in 1794 he obtained the highest possible grade. During this time, Hardenberg met and studied with or befriended many notable figures who were to have a profound influence on his thought, including Fichte, Schiller, Reinhold, Jean Paul, Schelling, and August and Friedrich Schlegel. Friedrich Schlegel, whom he met in Leipzig, became a particularly close friend and interlocutor, and was a central figure of the Jena circle of early German Romantics. Also during this time, in 1791, Hardenberg published his first piece, a poem dedicated to his friend and mentor Schiller, titled “Klagen eines Jünglings” (“A Youth’s Lament”), in Der Neue Teutsche Merkur.

In 1794, Hardenberg took a job as an assistant to the district official in Tennstedt, Coelestin Augustin Just, who became his friend and, after his death, his biographer. It was while working for Just that Hardenberg travelled to Grüningen, where he met the twelve-year-old Sophie von Kühn at the home of her parents. According to his own account, Hardenberg was immediately captivated by Sophie, and they became engaged the following spring, when Sophie was just thirteen. However, in 1795 Sophie became ill with consumption, from which, after several painful surgeries, she died in 1797 at the age of 15. Sophie’s death came just a few weeks before that of Hardenberg’s closest brother Erasmus, and Hardenberg’s diary entries following their deaths reveal a deep depression from which he gradually emerged over the following months. During this time, he had a vision on Sophie’s grave that he recorded in his diary; with only minor modifications this became the first hymn of his famous Hymns to the Night, which were published in 1800.

From 1795, Hardenberg was employed as a salt-mine inspector for his father. Hardenberg took this apparently mundane role very seriously, and some of his writings, including the novel Henry of Ofterdingen, give rocks and mines an important place as analogies for various aspects of the universe and the self. In 1795 and 1796 Hardenberg studied Fichte intensively, and his notes from this time are published as his Fichte Studies.

In 1797, Hardenberg entered the Mining Academy of Freiberg. He also immersed himself in the study of biology, history, medicine, and the philosophy of Schelling, Kant, Spinoza, Hemsterhuis, and others. In 1798, he published one of his influential pieces, a set of fragments called “Blüthenstaub,” or “Pollen,” in the Schlegel brothers’ journal Athenaeum. Friedrich Schlegel worked on the fragments to some extent, reflecting the early Romantic idea of “symphilosophy,” or performing philosophy together. In the same year, Hardenberg published “Faith and Love or The King and Queen” in Yearbooks of the Prussian Monarchy. In this piece, he praises the new King and Queen of Prussia, Friedrich Wilhelm III and Luise, while using them as metaphors to outline his ideas on the ideal state. The essay was not well understood or received at the time, with the monarchs as well as Friedrich Schlegel expressing strong disapprobation. Hardenberg first used the pen name “Novalis” for these texts. The name means “new land” and recalls the name “de Novali,” which was used by some of Hardenberg’s ancestors. Hardenberg’s notes for an encyclopedia project, available in German as Das Allgemeine Brouillon and in English translation as Notes for a Romantic Encyclopaedia, also date from this period.

In 1798, Hardenberg met Julie von Charpentier, the daughter of a mineralogy professor at Freiberg. They became engaged the following year. However, like his engagement to Sophie, Hardenberg’s betrothal to Julie was never fulfilled, as he fell ill later that year with the tuberculosis that was to kill him in 1801. The last years of Hardenberg’s life were extremely busy: He worked again as a salt-mine inspector and was promoted to director, and was also appointed a magistrate of Thuringia. In 1799 he met Ludwig Tieck, who immediately became a close friend and with whom he immersed himself in the study of Jakob Boehme. During this time, Hardenberg wrote his essay “Christendom or Europe” as well as the short novel The Novices of Saïs, the poems collected as Geistliche Lieder (Spiritual Songs), and The Hymns to the Night, which were published in 1800, and worked on Henry of Ofterdingen, a bildungsroman that he never completed. In late 1799 the Jena circle of early German Romantics, including Hardenberg, Tieck, the Schlegel brothers and their spouses Dorothea Veit-Schlegel and Caroline Schlegel, and others, met regularly.

Throughout the last part of 1800 and the early months of 1801, Hardenberg’s health worsened, and on March 25th, 1801, he died at Weißenfels. Friedrich Schlegel was by his bedside while Hardenberg’s brother Karl played the piano for him, and his death was described as very peaceful.

After Hardenberg’s death, Ludwig Tieck and Friedrich Schlegel edited and published the first edition of his collected works, which appeared in two volumes in 1802 (a third volume was added in 1846). Tieck and Schlegel promulgated the myth of Novalis as the otherworldly arch-Romantic, which was unfortunately taken up uncritically by commentators and has shaped his reputation, and that of Romanticism, ever since. Hegel, in particular, contributed to a popular conception of Romanticism in general and Novalis in particular as morbid, overly emotional, and pathologically introspective. This conception does not do justice to Novalis’s rigorous and sophisticated engagement with the philosophical, political, scientific, religious, and literary thought of his time.

2. Cosmology

Novalis’s cosmology is pantheistic; that is, it explains the world as a manifestation of the divine. Novalis presents the universe, including human beings, as the self-development of an originally infinite, undifferentiated, unconscious unity into finite individual entities, for the purpose of self-knowledge, or self-consciousness. While the starting point for this idea was the philosophy of Fichte, Novalis was concerned that Fichte’s emphasis on the development of the subject, or “I,” through positing the object—that is, the real, physical world, or the “not-I” —stripped the physical world of freedom and selfhood: Novalis wonders whether Fichte had “stuffed too much into the I.” Novalis, by contrast, views the world outside the subject as an active interlocutor—as, effectively, another subject. The difference is often summed up by saying that Novalis turned Fichte’s “not-I” into a “you.” Underlying this attribution of selfhood to the world at large is Novalis’s claim that the universe is divine: It is the comprehensible realization of an infinite God, unfolding in space and time.

For both Novalis and Fichte, the self-differentiation of an original absolute into individual beings allows it to perceive and reflect on itself, by creating the subject-object distinction that, like Fichte, Novalis asserts is essential for cognition, and even consciousness. On this model, the reflection on each other of finite elements within the universe is also the self-knowledge of the universe. However, the nature of the universe as originally infinite, whole, and undifferentiated can never be perfectly known. This is because all knowledge and consciousness depends on the subject-object distinction, and so is necessarily mediated by the particular finite entities that make up the world: Thus, Novalis claims, “We seek the unconditioned everywhere, and find only things” (“Pollen,” Schriften II, p.412 #1). Perfect knowledge of the universe is, therefore, a regulative ideal.

While Novalis acknowledges the reality of the world of everyday experience comprised of particular entities, he claims that underneath these divisions the universe is whole, unified, and divine. Thus, Novalis situates human beings in two realities that, he maintains, are in fact two aspects of one and the same world: the everyday universe of individuated entities (including individual human beings), and a spiritual universe of undifferentiated unity. Novalis’s analysis of this model is sophisticated, identifying the limitations of understanding the world based on particular, finite, material things while recognizing its reality, necessity, and value within experience. He acknowledges that the categories and divisions of our everyday perspective on the world have value, expressing gratitude for “scientists” and “scholars” who have measured and calculated the physical world, advancing the self-knowledge of the universe even while they obscure its essential nature as a single spiritual whole; but he also points out some damaging consequences of espousing this worldview, and insists on the importance of moving beyond it. His philosophy and poetry are largely attempts to demonstrate how (and to what extent) we can, first, have epistemological access to the underlying divine unity of the universe; second, articulate this access and communicate it to others; and third, make the relationship between these worlds closer, moving towards a regulative ideal of a perfect correspondence, or unity, in which the material realm manifests its divine inner nature.

As part of this project, Novalis attempts to find ways of overcoming the divisions between individual entities as well as several dichotomies that characterize the way we tend to experience and understand the universe. These include those between subject and object, the divine and the mundane, the rational or spiritual and the emotional or sensuous, the conscious and the unconscious, activity and passivity, and freedom and determinism. Novalis maintains that the segregation of existence into these dualities is a source of unhappiness and alienation. The terms of the dichotomies are usually understood to be mutually exclusive, leading to a fragmentation of both the world at large and, specifically, human identity, and often a rejection and/or devaluation of one or other of the terms. In particular, Novalis is concerned that this dualism often results in devaluation of the physical, emotional, sensuous, unconscious, and mundane aspects of the universe. According to Novalis, this fragmenting and alienating tendency also divides human beings from important parts of themselves, which on this model are construed as external to them—these parts include the natural world, other human beings, and God. In Novalis’s cosmology, while we currently experience these things as existing outside ourselves, at a more fundamental level they and we form a single whole.

For Novalis, the divisions and categories under which we usually perceive our environment obscure the unity of the cosmos and conceal its divine nature by presenting it as purely physical, rather than as a manifestation of spirit. The spiritual seems to be separate from the physical and their relation mysterious. This applies not only to God, but also to aspects of human existence thought to transcend physical processes, such as freedom, thought, and the will—the possibility of these aspects of existence manifesting themselves in a material realm becomes hard to explain. Many of Novalis’s contemporaries, including Kant and Schelling, struggled with this difficulty, and Novalis attempts to resolve this problem by overturning the dualism that lies at its root. Much of Novalis’s writing is concerned with revealing the inherent spirituality and rationality of the frequently devalued material elements of existence, as well as the superficiality of the divisions between human beings, nature, and God.

3. Novalis’s Account of History

Novalis claims that the world that modern human beings inhabit, in which the universe is a system of separate, finite entities and in which human beings are individual subjects, does not reflect the essential nature of the universe. Rather, this state of affairs is a development that began at the start of time, with an initial self-differentiation of an originally unitary cosmos that has become more pronounced through history. According to Novalis, the universe tends to move from an original state of undifferentiated, unconscious unity towards a community of conscious individuals who are aware of their nature as emanations of the divine whole. Novalis’s account of history aims to describe this development, which he presents as taking place through a repeated dialectical process that moves from a state of relatively undifferentiated, unconscious existence, through a state of individuated but fragmented conscious existence, to a state of more unified and harmonious “organic” consciousness, eventually approaching an ideal community of conscious individuals aware of their nature as parts of the same greater whole: “Before abstraction everything is one, but one like chaos; after abstraction everything is unified again, but this unification is a free interconnection of independent, self-determined beings. From a heap, a community has emerged” (“Pollen,” Schriften II p.455 #95). Although human beings epitomize this conscious awareness, Novalis indicates that plants, animals, and other aspects of the natural world also form parts of this ideal community.

Novalis often depicts earlier states of the manifestation of spirit in the world in order to convey this process and, through extrapolation, point both forwards in time to the ideal coming age of communion and backwards to the original position of absolute unity and non-self-awareness that preceded the origination of the world. Novalis has been criticized for creating sentimental idealizations of historical periods, particularly the medieval Europe described in “Christendom or Europe”; however, it is fairer to interpret him as presenting these images not as factual accounts, but as abstracted views of history meant to exemplify the progression from unconscious unity through conscious disunity to conscious unity. He depicts periods in which, prior to the emergence of the modern worldview of the universe as material, atomistic, and causally regulated and of the human being as an individual, conscious subject, human beings, God, and nature existed in closer relation to each other but also with less rational or discursive awareness. The Hymns to the Night describe an ancient time in which human beings lived in communion with nature and saw the spiritual essence of the world in mythical form in all things. “Christendom or Europe” presents a later period in a fictionalized medieval Europe, still before the advent of an Enlightenment worldview, in which education, trade, and communication flourished, but the people still lived in harmony, united under one spiritual goal (Catholicism). Although these examples are from different works, the period described in “Christendom or Europe” can be understood to represent a state of greater development of the self-knowledge of the universe than the pagan age described in the Hymns.

While both periods described above seem idyllic because, according to Novalis’s poetic descriptions, the entities that make up the world at these times exist in greater harmony than they do now, the lower intellectual development at these times means they do not manifest consciousness, rationality, and spirit to as high a degree as modernity. However, “Christendom or Europe” suggests that greater differentiation and intellectual development can still be harmonious and unified if these occur within a community working towards a common spiritual goal. Thus, in addition to describing a past stage of the development of the universe, this piece points to the possibility of a future higher synthesis of society into a spiritual community, calling its readers to overcome the relatively fragmented and spiritless situation in which Novalis believes they currently exist.

According to Novalis’s outline of history, the beginning of the Enlightenment marked the emergence of a highly developed cognition of particular entities, but also the loss of the original community that he describes, as well as of the ability to see spiritual significance in physical objects and mundane events: “The gods disappeared with their following—nature stood forlorn and lifeless. Arid count and strict measure bound her with iron chains” (“Hymns to the Night,” in Schriften I, p.145 s.5). The result is the world as it appears to a mentality that emphasizes an intellectual and categorizing approach to experience: a mechanistic, material universe, which permits a detailed understanding of physical processes, but lacks deeper meaning and is unimbued by spirit.

The categorizing and individuating activity of reason is, for Novalis, instrumental in achieving the self-reflective unity that he views as the divine purpose of the universe. Without the fragmentation engendered by division and categorization, spirit would be unable to reflect on itself and would remain in a state of blind self-identity. However, not just greater individuation, but also greater integration with the whole is necessary for knowledge of the universe as essentially unified and divine, rather than just according to its appearance as a set of particular entities and events. Thus, Novalis sees an overemphasis on discursive reason, with its divisive and alienating tendencies, as an antithesis to a preceding state of the world that was less rational and more unified, and as also preparing the ground for a subsequent synthesis into a more complex and self-conscious harmonious whole. At each level, the universe’s consciousness of its essential nature is, at least ideally, enhanced.

4. Subjectivity and the Vocation of Humankind

Novalis is a pantheist, maintaining that what we perceive as particular entities, including individual human beings, are not, in fact, most essentially distinct objects related externally and physically to one another, but are more fundamentally parts of a divine whole, connected internally through their shared spiritual nature. The individual human being is, therefore, a manifestation of God, who is present in all particular entities: “Only pantheistically does God appear wholly—and only in pantheism is God wholly everywhere, in every individual. Thus for the great I the ordinary I and the ordinary you are only supplements” (“Allgemeine Brouillon,” Schriften III, p.314 #398). This means that, on Novalis’s model, parts of the world that seem external to the individual subject—other human beings, animals, plants, objects, and even God—are in fact essential parts of the self, that is, of the “great I.” Overidentification with ourselves as individuals, in particular with ourselves as conscious, active individuals, makes us experience ourselves as fragmented in this way, set over and against a “you” that is, more fundamentally, another part of ourselves. For Novalis, the vocation of humankind is to realize our true nature as part of the divine whole, simultaneously developing closer connections with that “you” and fostering the self-understanding of the universe as a divine absolute: “We are not at all I—but we can and shall become I. We are seeds for becoming I. We shall all transform into a you—into a second I—only thereby do we raise ourselves to the Great I—which is one and all together” (“Allgemeine Brouillon,” Schriften III, p.314 #398).

Novalis’s model of the self reflects the post-Kantian Idealist separation of the everyday, empirical or individual I from the absolute I, often identified with God and sometimes described as “the Absolute” or “spirit,” terms whose relation to each other was at issue for Fichte and Schelling, among others. Novalis’s response to this problem is to claim that, with regard not just to the individual but also to the universe as a whole, we have access to both kinds of existence, although imperfectly, and can combine them, although never completely. The task of doing so is, on Novalis’s account, the human vocation. Taking up this task not only allows human beings to integrate into their selves aspects of their greater self (or God) from which they are currently alienated, but also facilitates the original purpose of the world as the gradual development from an absolute, undifferentiated, blind unity, to a community of individuated entities conscious of their true spiritual nature.

Because the universe, as divine, is both one and infinite, Novalis maintains that the task of uniting oneself with one’s greater self can never be completed while one exists as a finite, conscious individual, and the aim is therefore to draw increasingly close to this union without ever fully attaining it. The approach is characterized by spirit’s increasingly adequate self-expression and self-knowledge in and through the physical world, in particular through human beings and their understanding of and actions in the world. Novalis thus situates the human being within the world as both a part of it and at a special place where it becomes self-aware, and where its essential freedom, rationality, and spirituality are epitomized.

The development of the self-awareness of the universe through the activities of human beings occurs through a process in which the individual both comes to understand that the world is a reflection of him- or herself, indeed at a deeper level part of him- or herself, and shapes the world so as to more closely reflect the spiritual nature that lies within both world and self. Because of their shared spiritual nature, the self and the apparently external world are, on Novalis’s account, analogues of each other. The mind reflects the world in the form of representations, and the world correspondingly manifests the mind in a physical medium, in what Novalis calls “figures” or “hieroglyphs” or “ciphers” —the shapes of objects and events, which form a secret language that we can learn to read. The better we can read this language, the more closely our representations—that is, our minds—reflect the world, and the more we work to interpret the world in this way, the more we invest it with spirit. Thus, Novalis claims, “We are on a mission: our vocation is the cultivation of the earth” (“Pollen,” Schriften II p.427 #32). Novalis maintains that this double process of interpretation begins to mend the fragmentation between minds and bodies and between the spiritual and the physical, allowing a closer mirroring of these at first apparently incommensurate elements, in the process spiritualizing the physical.

It should be noted that the mediation of the spiritual to the world is for Novalis only possible because the world is fundamentally already divine. Thus, human beings do not accomplish a union of two originally or inherently different realms, but the realization of a pre-existing spiritual inner essence of the world. By revealing this spiritual nature, human beings take up their vocation, and the world becomes readable as a symbol and manifestation of the divine.

5. Romanticization and Poetry

Novalis claims that the “cultivation of the earth” that he describes as the vocation of humankind is to be achieved through a form of “poetic” or “Romantic” creativity. The activity of interpreting the world as embodying the divine partially overcomes the separations between the physical and the spiritual and between the self, as subject, and the rest of the world, as object. Novalis refers to this spiritualization of the world as “raising,” “raising to a higher power,” or “romanticizing”: “Insofar as I give the common a high sense, the usual a secret aspect, the known the worth of the unknown, the finite an infinite appearance, I romanticize it” (“Logological Fragments [II],” Schriften II, p.545 #105). The shaping of the physical world to reflect the spiritualized vision of it created by the poet, artist, or genius is the crux of Novalis’s concept of “magical idealism.”

According to Novalis, there are two ways of interpreting and relating to the world: one that perpetuates and even exacerbates the fragmentation of self, nature, and spirit that modern human beings experience in their everyday lives, and one that undermines this fragmentation. The first reflects the excessive rationality epitomized by science, and sees the representations formed by a categorizing and divisive form of discursive thought as more or less accurate reflections of an external world. The second is a “Romantic” attitude epitomized by poetry (although also potentially present in conversation, translation, art, art criticism, and many other practices), which recognizes that one’s representations of the objects of one’s knowledge are based on intuitive connections with these objects, and acknowledges the contingency, subjectivity, and partiality of any attempt to articulate these intuitions or conceptualize the universe. This approach therefore employs emotions and intuitions to inform attempts to understand the world, and those who adopt it are motivated to continually improve on these attempts, thereby fulfilling the human vocation of developing the increasing self-knowledge of the universe.

Although Novalis, like other Romantics, often seems to stress the intuitive aspect of this process, Romantic interpretation is not supposed to be of a raw emotional or intuitive nature, but is rather articulate and rational, being informed, shaped, and mediated by consciousness. Novalis views as relatively undeveloped the “raw, discursive thinker” who interprets the world as atomistic and mechanistic and the “raw, intuitive poet,” whose interpretations have no fixed form (“Logological Fragments [I],” Schriften II pp.524–55 #13); these tendencies are united in the Romantic poet, novelist, philosopher, or artist, who can switch back and forth between these modes and give form to her or his visions of the living, dynamic world of nature. The means by which Romanticization is achieved, therefore, is the synthesis of reason and emotion (or science and imagination, or philosophy and poetry). This synthesis is reflected in the literary and allegorical forms in which Novalis (and other early German Romantics) chose to write as well as in the content of his writings.

Novalis maintains that the world is in principle epistemologically accessible, although a complete understanding of the universe as a divine whole and of oneself as a part of that whole is a regulative ideal. Insofar as individuals encounter their own nature and the rest of the world through mental representations, they experience these things as individual and particular, and the essential nature of the universe as a divine unity eludes them. However, one can partially overcome these separations and glimpse the nature of reality through intuition, imagination, and creative interpretation. It is only when one’s unconscious physical and affective nature participates in constructing interpretations of one’s environment that one can really understand that environment. Thus, for Novalis, an interpretation of the world that raises it towards the divine begins by circumventing narrowly rational categories for acquiring knowledge and allowing one’s intuitions to reveal the way things are.

It is not enough, however, merely to have these intuitions; they must be articulated, that is, as best as possible reproduced in the medium of discursive thought, in order to bring them to consciousness. For Novalis, Romantic creativity does not entail abandoning conscious representation, but rather integrating it with intuitions. A poetic representation is not simply an intellectual model of reality, aiming to adequately describe particular events and objects; nor is it raw intuition of the spiritual wholeness of the universe. Rather, emotions and intuitions let the poet read the world as divine and use language in an imaginative and symbolic way to represent this divine nature. Thus, Romantic interpretation reveals the spiritual unity that underlies all seemingly particular things.

On Novalis’s account, by revealing the spiritual nature of the world in this way, Romantic interpretations actually allow us to inhabit a more spiritual world. According to Novalis, the spiritual essence of things is not given in phenomena but is imparted to phenomena through interpretation, as he explains using the example of music: “All tones that nature brings forth are raw—and spiritless—often only to the musical soul does the sound of the forest—the piping of the wind, the song of the nightingale, the plashing of the brook seem melodious and meaningful” (“Anecdotes,” Schriften II, pp.573–74 #226). By creating order and meaning for objects and events, or in other words by perceiving naturally occurring objects and events in a spiritualized, rational form, the poet or artist invests them with spirit, allowing their inner nature and significance to shine forth.

6. The Artist as Genius

Novalis’s theory of the genius reflects how he thinks interpreting physical objects and events as spiritual actually invests these objects and events with spirit, creating a real physical world that manifests the divine. According to Novalis, the activity of the artist or genius is an exemplification and intensification of what human beings always do. Human beings do not exist in a world that is simply given, but rather project a world on the basis of their understanding or interpretation of their experiences. This, Novalis claims, is the essence of genius: “When we speak of the external world, when we portray real objects, then we act like genius. Thus genius is the ability to act towards imaginary objects like real ones, and also to treat them like these” (“Miscellaneous Remarks,” Schriften II, pp.418–20 #22). The way people interpret and understand their experiences is, therefore, the way in which they create the world that they inhabit. The artist has this capacity to a much higher degree than most people: Novalis refers to the artist as “the genius of genius” (p.420 #22). In other words, artistic activity is a raised form of the everyday human way of being.

Although Novalis describes the world-creating activity of the genius as “spontaneous,” he envisions this activity not as a generation ex nihilo or an imposition of an interpretation on an inert world, but as the way the world expresses itself in a more conscious and articulate, and therefore more spiritual, form. The world the genius creates is, due to her or his intuition and greater connections with the rest of the world, a free expression of the spirit, unity, and life of the universe, including of the genius her- or himself as the place where these characteristics are epitomized and come to expression. The actions of the genius are shaped and informed by the world of which she or he is a part, so that the spontaneous expression of the genius’s spirit that occurs in artistic creation is also a response to what is given. Novalis takes as the archetype of this activity the novelist, who “from his given crowd of accidents and situations—makes a well-ordered, lawlike series” (“Anecdotes,” Schriften II, p.580 #242). The freedom and creativity of the author are restricted by the terms given to him or her, while he or she draws objects and events together into a coherent, meaningful whole. This process can be extended to the task of understanding and acting towards the events of one’s experiences generally: Novalis claims, “All the accidents of our lives are materials out of which we can make what we want” (“Pollen,” Schriften II, pp.437–39 #66).

In other words, the genius is engaged in creative dialogue with his or her surroundings. For Novalis, nature is a language, if one that modern human beings have forgotten how to read and respond to. The beings and events that make up the world, including but not restricted to human beings and their activity, are symbols of the divine. While these symbols are now hidden to most people, the genius can both read and respond to this language of nature, like a participant in a conversation, and in doing so bring this world to a higher, more spiritual expression.

7. Language and the Fragment

According to Novalis, language, the mind, the world, and the divine have analogous structures that allow them to reflect each other. Furthermore, some uses of language bring the world and the mind closer together by allowing them to reflect each other more closely. The kinds of language that do this are not those that give the most accurate descriptions of their objects, but those that stimulate listeners to use their imaginations to intuit something about the world that cannot be captured in discursive categories.

Novalis believes that language signifies, not on the basis of semantic rules for connecting terms to objects, but through association, imagination, and creative interpretation. Like the relationship between the human being and its world, and between these and the divine, the connection between linguistic utterances and the things they signify is one of analogy between realms that at a deeper level share a common structure or essence. One of the clearest sources for Novalis’s account of language is his short essay, “Monologue,” in which he states, “If one could only make it comprehensible to people that it is with language like with mathematical formulae—they constitute a world for themselves—they play only with themselves, embody nothing but their wonderful nature, and just for that reason they are so expressive—just for that reason they reflect in themselves the same play of relations as things” (Schriften II, p.672). In other words, language does not denote particular objects and events, but creates a new world that mirrors or has the same structure, and therefore the same meaning, as these objects and events. Ultimately, both the material world and language have as their object the divine which they, like the human being itself, reflect and embody, and can therefore reveal.

For Novalis, the nature of the relationship between sign and signified entails a degree of separation between them, meaning that the objects of language always escape full articulation. The search to conceptualize and convey the divine essence of things can therefore never be finished, and progress in this search depends on openness and readiness for revision. Language can serve this purpose when it is used in forms that prompt the audience to take an active role and rework what has been said. Novalis attempts to embody this practice in “Monologue,” undermining the claims of his essay to provide an accurate description of how language works: “If I thereby think to have indicated the essence and function of poetry in the clearest way, still I know that no one can understand it” (Schriften II, p.672). By pointing to the inadequacy of his speech, taken literally, Novalis invites his audience to reach beyond the words to grasp his meaning, and provide a better representation of it.

As a result, irony and poetic techniques like metaphor, suggestion, and association emerge as the best tools for understanding the world and oneself, revealing these to others, and constructing more spiritual versions of these things through creative dialogue. Novalis’s use of the fragment for many of his writings is supported by this concept: Because fragments are incomplete, their readers must use their imaginations to complete them, and are thereby called to participate in their vocation. Fragments thus function as “seeds” for developing insights into the true nature of the universe: “Fragments of this kind are literary seeds. There may admittedly be some deaf grains among them: however, if only some bear fruit!” (“Pollen,” Schriften II, p.463 #114). Novalis claims that not just language, but everything we encounter can play this role, intimating a spiritual meaning that we are invited to explore and complete: “Everything is seed” (“Logological Fragments [II],” Schriften II, p.563 #189).

The invitation to others to participate in constructing meaning is an important aspect of Novalis’s account of Romantic interpretation. Rather than a finished or complete system of philosophy, Novalis advocates a continuous activity of “philosophizing” (which he also sometimes called “Fichtesizing”) which gradually reveals the spiritual nature of the world. In part, Novalis’s refusal to find a final form for his philosophy is motivated by his rejection of the demands of Fichte and Karl Leonhard Reinhold for a first principle as a foundation for philosophy. Novalis emphasized the importance of the activity of seeking a ground for knowledge and experience over the ground itself. Furthermore, this activity is not, for Novalis, performed in isolation by a single subject, but is carried forward within a community—this is the “symphilosophy” advocated and to an extent embodied by the Jena Romantics. Fichte’s call to his reader in the revised version of his Wissenschaftslehre to “think the I” is thus altered, in Novalis’s work, to become a call not to an individual but to a community to think the I and its world together.

On Novalis’s account, an imaginative and intuitive use of language contributes to the human vocation of creating a raised or Romanticized world. Because it has been worked on by the human mind, especially where the human mind in question is informed by intuitions of the divine nature of the world, the new world established through language is a raised, spiritualized version of the physical world that it reflects. But in addition, this process can be repeated and refined by working on the constructions created by others. Following a first speaker’s utterance, an audience is called to create a yet more spiritualized version of the world by investing the objects and events described with their own thoughts and feelings. By retracing the meaning of the first speaker’s utterance, a second participant combines three elements in a higher synthesis: the objects and events described by the speaker; the speaker’s spirit as imparted to these objects and events in his or her words; and the second participant’s own spirit as expressed in his or her interpretation of this picture. The same process of joint, mutually reflective creation characterizes the poetic interpretation of nature, which Novalis understands as like a conversation, as well as the creation of art, art criticism, translation, and potentially many other endeavors.

8. The Mediator

Interactions with other human beings and objects and events in nature are important in Novalis’s account for realizing the inner spiritual unity of the world. In addition, particular figures stand out as especially important for this goal, acting as precursors for unification with the rest of existence and indeed as means to establishing this union. Novalis describes these figures as “mediators,” claiming that “Nothing is more essential to true religiosity than a mediator, who unites us with the Godhead. Unmediated, the human being can absolutely not be in relation to the latter” (“Pollen,” Schriften II, pp.443–45 #74). In particular human beings we see the highest manifestation of spirit in the world, and when we engage imaginatively with them as symbolic figures, we can see how the divine is embodied in the world and draw closer to that divinity.

Novalis’s work includes numerous examples of this kind of relationship. For instance, the teacher in The Novices of Saïs initiates the novices into the secrets of the universe, as an exemplar and tutor in the search for the meaning of nature’s language. In Henry of Ofterdingen, Zulima, who shows Henry how to construct a meaningful narrative out of chance events and gives him a musical instrument with which to begin his life as a poet, provides an axial moment in Henry’s development. Later, the sage Klingsohr and his daughter Mathilde initiate Henry into further pieces of wisdom required to become aware of his unity with existence, and Mathilde and Henry, who marry, share a union that Novalis describes as prefiguring the unification of all things. The same prefiguration occurs in the relationship between the narrator and the beloved in Hymns, in which the bond between the narrator and his dead beloved initiates the narrator’s release from the limits of space, time, and individuation.

Novalis maintains that any object can reveal one’s union with the rest of existence and mediate the divine. What is important is not the object through which one perceives the divinity of existence, but one’s relationship or attitude to that object. However, while the whole world can reveal the divine, other human beings do so more easily, because they more clearly manifest the spiritual within the physical than do other objects. Thus, Novalis claims that as human beings become more sophisticated they tend to choose a more limited range of objects to hold religious significance and to select other human beings as those mediating objects: “The more independent the human being becomes, the more the quantity of mediators shrinks, the quality is refined, and his relationships to these become more various and cultured: fetishes, stars, animals, heroes, idols, gods, one God-man” (“Pollen,” Schriften II, p.443 #74). This seems to imply that Christianity, with its God-man Christ, is a more rational, raised form of religion than earlier or other religions or systems of thought. However, Novalis also suggests that as time goes on individual human beings tend to choose a mediator who is of personal importance to them, making Jesus Christ just one potential mediator among many.

9. Relation to Christianity

Novalis was raised in the pietistic tradition, attending a Lutheran school and having a strictly religious father who attempted to raise his children in line with the precepts of the Herrnhuter church. This background influenced the vocabulary, imagery, and some of the content of Novalis’s philosophy, in particular: his claim that the mundane can be spiritualized by attention to the divine; his emphasis on the need to improve one’s community in order to transform the earth; and his rejection of radical sin. Novalis’s work also incorporates modified versions of central ideas of Christianity more generally, particularly the narrative of Fall and salvation as alienation from and reconciliation with the divine, the role of Christ as an exemplary mediator of the divine, the idea that the world embodies the divine and can be interpreted analogically to reveal this spiritual essence, and the idea of union through love after death. Some commentators, notably William Arctander O’Brien, have argued that Novalis’s work stretches Christian doctrine too far to be considered Christian: O’Brien points to Novalis’s pantheism and his rejection of Jesus Christ’s special status as the son of God as fundamental departures from Christian tradition. However, whether we see Novalis’s philosophy as remaining within a Christian paradigm or moving outside it, several of its central themes take their starting point from Christian models.

Novalis assimilates the Christian narrative of Fall and salvation, in which union with God is lost and sought, to the Fichtean account of the self-differentiation of the absolute into finite entities in space and time in order to achieve self-knowledge. Novalis’s model also reflects the ideas of the Christian mystic Jakob Boehme, whom he studied in detail from 1800. Like Novalis, Boehme claimed that the differentiation of God into particular entities in the world is necessary for the development of self-awareness and a higher form of harmonious existence.

Novalis avoids a puritanical interpretation of the Fall as due to the temptations of the flesh, and to an extent follows an opposite stream within Christian thought, in which consciousness, reason, and knowledge on the one hand, and individuation on the other, are responsible for the alienation of the human being from its true self, the rest of existence and God. For Novalis, an original communion with nature and God was lost through the development and enhancement of consciousness and individuality, which require division and separation. However, Novalis also grants individual existence and discursive reason a positive place in his narrative as essential for realizing the imperative of the universe to know itself. Without these, the universe would remain an unconscious, blind unity.

Novalis does not take from Christianity the moral notion that alienation from the divine is a result of sin, claiming that “To true religion nothing is sin” (“Fragments and Studies,” Schriften III, p.589 #228). However, this rejection or reduction in emphasis on radical sin is a characteristic of pietism, which may have influenced Novalis in this respect. Although Novalis exhorts his audience to take steps to overcome the fragmentation of their existence, he describes the consequences of the approach to the divine in utilitarian, rather than moral, terms, providing a vision of the benefits that attend a closer relationship with the divine, including a deeper connection with the rest of existence, a sense of meaning for one’s life, control over one’s destiny, and the elimination of the fear of death as one realizes that one is part of a greater whole, and therefore that one’s selfhood is, more essentially than existence as an individual, the selfhood of the absolute. These benefits are direct consequences of learning to view the universe in a new way. Novalis does not distinguish between the eventual fate of sinners and saved; all human beings and all of nature will return to unity with the divine when they die, and no one is fully integrated with the divine while a living individual, although some individuals may experience greater connections with the divine nature of existence while alive.

Novalis has sometimes been seen as life-denying and morbid, in part based on his writings on death, in which he famously uses erotic imagery of longing for union with dead loved ones and with the unconditioned. It is true that Novalis wrote some of these passages, at least the vision on the grave of the beloved found in the Hymns to the Night, while very depressed. However, Novalis attempts to give death a positive value as promising union with the divine and some form of eternal life. These concepts have resonance within a Christian context, and in particular his use of the imagery of marriage to prefigure union with the divine has pietistic parallels, but in Novalis’s case these concepts are also shaped by his philosophical commitments. For Novalis, individuated existence is an obstacle to realizing the divine unity of the world, and as a result, the unification with the divine that he calls us to work towards realizing can only be completed in death. Thus, he claims, “Life is the beginning of death. Life is for the sake of death” (“Pollen,” in Schriften II, p.417 #14). Novalis’s emphasis on the value of the process rather than the goal of the human vocation means that this should not devalue life, with its characteristic individuation and consciousness; these are required in order for the universe to become self-conscious. Furthermore, Novalis’s pantheism means that human beings, like all other particular entities, are manifestations of the divine, with the result that death is not annihilation, but the final transformation of the individual person, or the “ordinary I,” into the “great I” of the divine absolute. Thus, Novalis claims that we will awaken after death into a new state, for which we may be prepared by our Romantic vocation.

Novalis’s attitude towards Jesus Christ provides one of the clearest places where his account modifies Christianity. Christ is a relatively important figure for Novalis, mentioned explicitly several times as an ideal mediator of the divine and spiritual to the world, and in the Hymns to the Night his teachings are described as spreading the word of the overcoming of death in mystical union. However, Novalis does not present Christ as different in kind from other human beings or the rest of the world. Although Christ exemplifies the integration of divine and mundane that Novalis claims it is the human vocation to bring about, and although as a result Christ is an ideal mediator of this spiritualized world to others, Novalis maintains that all entities can potentially play this role—what matters in this respect is the individual’s attitude to these entities, rather than who or what they are.

Novalis’s idea that things in the world can be interpreted as having a divine meaning also has parallels in Christianity. Medieval Christian scriptural exegesis could be applied not only to the Bible itself, but also to physical objects and events, in order to discover doctrinal, moral, and metaphysical and eschatological meanings. Novalis’s work reflects an interest in the fourth or “anagogical” form of interpretation, which was supposed to give knowledge of the heavenly or the spiritual and to initiate the interpreter into hidden knowledge of metaphysics and the afterlife. For Novalis, the beings and happenings of the world form “figures” or “hieroglyphs” that signify the divine and allow those who can read them to function as “prophets.”

10. References and Further Reading

a. Works by Novalis in German

  • Schriften. Zweite, nach den Handschriften ergänzte, erweiterte und verbesserte Auflage in vier Bänden, edited by Paul Kluckhohn and Richard Samuel. Stuttgart: Kohlhammer, 1960.
    • The authoritative edition of Novalis’s collected works, including notes, diary entries, and letters.

b. Works by Novalis in English Translation

  • Fichte Studies, edited by Jane Kneller. Cambridge: Cambridge University Press, 2003.
    • Novalis’s critical reception of Fichte. Includes an informative introduction by Kneller.
  • Henry of Ofterdingen, translated by Palmer Hilty. Long Grove, Illinois: Waveland Press, Inc., 1992. This translation first published in 1964.
    • An unfinished bildungsroman in which Henry, with the aid of various mediating figures, develops towards his vocation as a poet.
  • Hymns to the Night, translated by Dick Higgins. Many editions. This translation first published in 1978.
    • A bilingual edition. The Hymns use Christian, mystical, and Romantic imagery to describe longing for union with loved ones after death.
  • Notes for a Romantic Encyclopaedia: Das Allgemeine Brouillon, translated, edited, and with an Introduction by David W. Wood. Albany, NY: State University of New York Press, 2007.
    • Novalis’s writings on science, religion, art, and nature, intended for an encyclopedia.
  • The Novices of Saïs, translated by Ralph Manheim. Brooklyn, NY: Archipelago Books, 2005.
    • Describes the novices’ mystical search for an understanding of nature, under the guidance of their teacher, who leads them to discover the hidden connections between all things.
  • Philosophical Writings, translated and edited by Margaret Mahony Stoljar. Albany, NY: State University of New York Press, 1997.
    • An abridged introduction to many of Novalis’s most influential pieces, including “Pollen,” “Monologue,” “Christendom or Europe,” and “Faith and Love or The King and Queen.”

c. Works About Novalis and Early German Romanticism in English

  • Behler, Ernst. German Romantic Literary Theory. Cambridge: Cambridge University Press, 1993.
    • An influential account of the literary theory of the early German Romantics, situating Novalis’s work in the context of his study of Fichte and the work of close contemporaries such as the Schlegel brothers and Tieck.
  • Haywood, Bruce. Novalis, The Veil of Imagery: A Study of the Poetic Works of Friedrich von Hardenberg, 1772–1801. Gravenhage: Mouton, and Cambridge, MA: Harvard University Press, 1959.
    • An introduction to Novalis’s use of imagery.
  • Von Molnár, Géza. Romantic Vision, Ethical Context: Novalis and Artistic Autonomy. Minneapolis: University of Minnesota Press, 1987.
    • An influential study emphasizing a central theme of Novalis’s work: the vocation of the individual to work towards the realization of the unity of the universe.
  • O’Brien, William Arctander. Signs of Revolution. Durham: Duke University Press, 1995.
    • Investigates Novalis’s work on language and symbols in relation to his contemporary political, ethical, religious, and scientific context.
  • Seyhan, Azade. Representation and its Discontents: The Critical Legacy of German Romanticism. Berkeley: University of California Press, 1992.
    • Presents the work of Romantic writers, including Novalis, as explorations of new ways of thinking in the light of political and scientific change, and as important precursors to modern critical theory.
  • Strand, Mary. I/You: Paradoxical Constructions of Self and Other in Early German Romanticism. New York: Peter Lang, 1998.
    • On the work of Romantics, including Novalis, on otherness, particularly women and the Orient.

 

Author Information

Anna Ezekiel
Email: info@annaezekiel.com
McGill University
Canada

Norman Malcolm (1911–1990)

Norman Malcolm was instrumental in elaborating and defending Wittgenstein’s philosophy, which he saw as akin to a kind of “ordinary language” philosophy, in America. He also defended a novel interpretation of Moore’s “common sense philosophy” as a version of ordinary language philosophy, although Moore himself disagreed. Malcolm criticized Descartes’ account of mind by elaborating Wittgenstein’s criticisms of a private language. He produced a controversial new modal version of the Ontological Argument for the existence of God. He produced two very different kinds of arguments against the mechanistic view of human beings; the first argues that the mechanist is committed to a “pragmatic paradox,” and the second argues that such accounts may seem empirical but contain a disguised unintelligible metaphysics. He produced two very different kinds of accounts of memory, the earlier more “analytical,” and the later “more historical, systematic, and destructive.”

Malcolm was instrumental in building Cornell into one of the leading philosophy departments in America. He was President of the Eastern Division of the American Philosophical Association from 1972-73. Malcolm authored ten books and a plethora of influential articles and reviews.

Norman Malcolm studied philosophy with O. K. Bouwsma at the University of Nebraska before enrolling as a graduate student at Harvard in 1933. He received his Ph.D. from Harvard in 1940 but spent 1938-39 at Cambridge University in England, where he met G. E. Moore and Ludwig Wittgenstein, which proved decisive in his development. He was briefly an instructor at Princeton before joining the US Navy in 1941. He returned to Cambridge to study again with Moore and Wittgenstein from 1946-47. In 1947, he joined the Sage School of Philosophy at Cornell University, where he remained until his retirement in 1978.

Table of Contents

  1. Biography
  2. Wittgenstein: A Memoir
  3. Dreaming
  4. Malcolm’s Modal Version of the Ontological Argument
  5. Criticism of Descartes
  6. The Conceivability of Mechanism
  7. Philosophy of Mind
  8. Memory
  9. Nothing is Hidden
  10. Wittgenstein: From a Religious Point of View
  11. References and Further Reading
    1. Books
    2. Articles
    3. Reviews
    4. Secondary Sources

1. Biography

Norman Malcolm was born in the tiny town of Selden in northwest Kansas (pop. 250) on June 11, 1911. In his early schooling, his exceptional intellect was soon recognized, and he was sent to Omaha, Nebraska, for high school. He later attended the University of Nebraska, where he studied philosophy with O. K. Bouwsma. He began his graduate studies at Harvard in 1933 and received his Ph.D. in 1940. He spent 1938-39 at Cambridge University in England, where he met G. E. Moore and Ludwig Wittgenstein, which proved decisive in his development. He was briefly an instructor at Princeton before joining the US Navy in 1941. After the war, he returned to Cambridge from 1946-47 to study again with Moore and Wittgenstein. In 1947, he joined the Sage School of Philosophy at Cornell University, where he remained until his retirement in 1978. He was President of the Eastern Division of the American Philosophical Association from 1972-73. Wittgenstein visited Malcolm at Cornell during the summer of 1949, and their discussions during this visit inspired Wittgenstein’s last philosophical work, On Certainty, and Malcolm’s book, Knowledge and Belief. Malcolm was married twice. He had two children, a son and a daughter, by his first wife, Lee. A few years after his divorce from Lee, he met Ruth Riesenberg, an accomplished psycho-analyst and author, in Hampstead, London. Ruth was originally from Santiago, Chile. Ruth and he moved permanently to London soon after marrying.

Malcolm enjoyed athletics in his youth, an interest that remained with him for life. He swam regularly before classes at Cornell. During his years at Cornell, he enjoyed sailing on Lake Cayuga and took his role as captain of the ship very seriously. A passenger might be forgiven for conjuring images of Captain Bligh. Malcolm was of a robust constitution (Serafini, 1993, 310-11). One of his close friends on the Cornell faculty relates that when in England in his 60s, Malcolm had a back problem, perhaps sciatica, and was getting little or no relief. A friend in Hampstead told him to try the Queen’s horse doctor, who had a reputation for solving her horses’ problems. Malcolm duly went. The horse doctor showed Malcolm a large wooden mallet and how he used it on the horse. He had Malcolm lie stomach down on a table and gave him a massive whack on his back. Malcolm claimed that it cured his problem.

Malcolm’s famous review of Wittgenstein’s Philosophical Investigations in 1954 initiated decades of fruitful controversy about Wittgenstein’s views, which Malcolm understood as akin to an “ordinary language philosophy” (Parker-Ryan, § 2). Malcolm’s aim was to expose the confusions underlying much philosophy and psychology by showing how the relevant philosophical words are actually used in ordinary life. Although Malcolm’s chief philosophical influence was clearly Wittgenstein, he was also much influenced by Moore’s “common sense philosophy.” Malcolm saw Moore as being the first to refute paradoxical philosophical claims by showing that they “go against ordinary language.” Malcolm held that Moore’s common sense philosophy was essentially the same as ordinary language philosophy, although Moore himself rejected this interpretation (Carney, 1962). It is also worth pointing out that though Malcolm emphasized attention to the uses of words in ordinary language, he held that this is not sufficient to resolve philosophical problems (Serafini, 1993, 315; Uschanov, 2002). Finally, although Malcolm was powerfully influenced by Wittgenstein, it would be wrong to think that he slavishly followed his lead (Serafini, 1993, 315-317). For example, whereas Wittgenstein eschewed rational theology, holding that religion is more a matter of faith or passion, Malcolm produced and defended a novel modal version of Anselm’s ontological argument for the existence of God.

Malcolm admitted that it was hard not to pick up some of Wittgenstein’s mannerisms and practices (1970, 26). One story that circulated at Cornell was that a new graduate student turned up late at a seminar that Wittgenstein, during his year at Cornell, was giving in Malcolm’s class and whispered to the graduate student beside him, “Who is this guy trying to imitate Malcolm?” Further, since Wittgenstein detested academic life, he often attempted to talk students out of pursuing philosophy as a career and into doing something useful with their lives, such as becoming a manual worker on a farm and being kind to people (1970, 30). Since Malcolm shared Wittgenstein’s distaste for professional philosophy (Serafini, 1993, 310), he often did the same. Malcolm calls an enthusiastic graduate student into his office. His face is grave. The student can only fear the worst and wonders if it could be the end. Malcolm, speaking with great severity, says, “Are you sure you want to pursue a philosophy career?” The student, with the zeal of Socrates, professes absolute devotion to philosophy. He seems prepared to face the hemlock. Malcolm, unmoved, tries again. He says, “Are you sure you do not want to do something useful with your life instead, perhaps medical school?” (Serafini, 1993, 311) The student reaffirms that there is nothing else he could possibly do. Malcolm, looking grim and disappointed, shrugs and turns away to rifle through his bookshelves as he says, “Well, I guess there’s nothing to be done about it then!” Despite his misgivings about academic philosophy, however, Malcolm was fascinated by philosophical issues, which he approached with great passion and intensity. He continued, like Wittgenstein, working on philosophy to the end.

Malcolm’s lectures were not typical philosophy lectures. A student sitting through a course of Malcolm’s lectures might have had the feeling that she was not learning much. Sellars has a theory of knowledge, Chisholm has a theory of knowledge, but where is Malcolm’s theory of knowledge? However, by the end of the semester, students often found that they looked at things quite differently from the way they had at the beginning of the course. This is because, following Wittgenstein, Malcolm did not aspire to teach his students philosophical theories, but to impart methods that can be used over and over again on countless different kinds of problems—as Wittgenstein said, “not a single problem” (Philosophical Investigations, § 133). “Each class was a bit like a journey and one either accompanied Malcolm on the journey or not” (Serafini, 1993, 310).

Malcolm employed several methods borrowed from Wittgenstein, including describing the circumstances in which the relevant philosophical words, “knowledge,” “consciousness,” “certainty,” and so forth, are actually used in everyday life, comparing actual uses of words with imaginary language games, imagining a fictitious natural history for the use of words, and attempting to diagnose the motivations for the temptation to use certain words in a misleading way (Richter, § 4). By these means, Malcolm attempted to show that philosophers typically fall into error because they forget, when doing philosophy, how such words are actually used in ordinary life (Serafini, 1993, 321). When confronted by some typical philosophical thesis in class (of the sort that most philosophers take uncritically as grist for the logical mill), Malcolm would appear genuinely puzzled why anyone would say such a peculiar thing while he ran his hands over the top of his head as if searching his brain for the possible meaning of this dark saying. Although Malcolm was always prepared for his classes, he preferred to let the discussion develop organically, often in response to student questions, rather than imposing his own preferred grid on the discussion. One cannot, however, take this ban on philosophical theories too far. When Malcolm taught courses involving the views of some philosophers (Descartes, Leibniz, and so forth), he sympathetically articulated and defended their theories. Thus, a student normally would learn philosophical theories in Malcolm’s courses. Malcolm’s response to these theories was not to oppose them with an alternative theory but to subject them to his understanding of Wittgenstein’s and, perhaps, Moore’s methods.

Malcolm could seem a bit gruff and bearish sometimes. A student in class, labouring to articulate his position, finally manages, with evident relief, to articulate his view. Malcolm’s voice booms out, “Completely wrong!” Serafini (1993, 309) recalls that after receiving an F on his first paper, but then finishing strongly with a series of As and a B+, he asked Malcolm for a recommendation to graduate school. Reviewing his record, Malcolm recited his grades, “A-, A, B+,” leaving that F for last, which he recited in stentorian tones, apparently with some relish (Serafini, 1993, 311). However, there was always a good dose of humour behind his gruffness. As Kretzmann, Shoemaker and Miller put it, “He could seem gruff and bearish, but those who began by fearing him soon found that he was very warm and kind. He lived his life and conducted his intellectual projects with full, guileless, and fearless commitment, earning the respect of all who knew him.” It is no exaggeration to say that many of his former students and colleagues came to love him.

In the course of his long and productive career, Malcolm exerted an enormous influence over the development of the Cornell Philosophy Department and was instrumental in building it into one of the most highly regarded philosophy departments in America. He had a fierce philosophical integrity and refused to be swayed by the metaphysical and scientistic fashions of the day. Malcolm belongs to a bygone age that has been largely forgotten in the push for more complicated, technical, and abstract philosophical theories (Serafini, 1993, 317). Malcolm spent the last thirteen years of his life living in London where he gave much admired weekly graduate seminars at King’s College, London until the year of his death. A committed Anglican, he died on August 4, 1990, and is buried in the cemetery of the Anglican Church in Hampstead near his London home.

2. Wittgenstein: A Memoir

Malcolm’s famous Memoir of Wittgenstein attempts to paint a picture of the person behind the great philosopher. It is a picture of a person who is intense, brilliant, austere, and eccentric and who suffered greatly throughout his life but who also could be playful, humorous, and compassionate. Despite the fact that he “abhorred” academic life and professional philosophy (1970, 30), Wittgenstein was fierce about attendance at his classes, saying, “My lectures are not for tourists” (1970, 28). Wittgenstein once tried to lecture from notes, but the thoughts that came out were “stale,” and the “words looked like corpses” (2001, 24). Wittgenstein could be “a frightening person” in his classes (2001, 26-27).

The Memoir also sometimes sheds light on Wittgenstein’s philosophy. For example, Malcolm reports that Wittgenstein dismissed attempts to provide a rational foundation or proof for God’s existence, believing instead in a Kierkegaardian type of view that religion is a matter of passion (1970, 59, 82). Wittgenstein referred to Kierkegaard “with something like awe in his expression” (1970, 60). Malcolm also recounts being especially struck by one remark Wittgenstein made during one of their walks that bears on his “use-conception” of meaning: “An expression has meaning only in the stream of life” (1970, 73-75).

Malcolm’s lively portrait of Wittgenstein the person should be of interest both to the philosopher and the historian alike, not only for its portrait of Wittgenstein but also for what it reveals about Malcolm. Although there are great differences between Wittgenstein and Malcolm as human beings, Malcolm’s Memoir emphasizes certain of Wittgenstein’s traits that Malcolm himself shares, such as his distaste for academic life, his impatience with anything less than a full commitment to the philosophical task, and his desire to let philosophical discussions develop naturally rather than to impose his own blueprint on them.

3. Dreaming

Malcolm argues in his paper “Dreaming and Skepticism” (1956) and in his book Dreaming (1959) that the notion of dreams, in the sense of conscious experiences that occur at a definite time and have definite duration during sleep, is “unintelligible” (1959, 52). This contradicts the views of philosophers and psychologists like Descartes, Kant, Moore, Freud, and Russell, who, he holds, assume that human beings have conscious thoughts and experiences during sleep (1959, 1-4). Descartes claimed that he had been deceived during sleep (1959, 101).

Malcolm’s first point is that ordinary language contrasts consciousness and sleep. The claim that one is conscious while one is sleepwalking is “stretching the use of the term” (1959, 27, 84). Malcolm rejects the alleged counterexamples based on sleepwalking or sleep-talking. For example, dreaming that one is climbing stairs while one is actually doing so is not a counterexample because in such cases the individual is not sound asleep after all (Springett, § 3.b.1). “If a person is in any state of consciousness it logically follows that he is not sound asleep” (1956, 21). Our concept of dreaming is based on our descriptions of dreams after we have awakened in “telling a dream” (1959, 55ff, 76, 87ff). Thus, to have dreamt that one has a thought during sleep is not to have a thought any more than to have dreamt that one has climbed a mountain is to have climbed a mountain (1959, 51-53, 57). Since one cannot have experiences during sleep, one cannot have mistaken experiences during sleep (1956), thereby undermining the sort of philosophical scepticism based on the idea that our experiences might be wrong because we might be dreaming.

Malcolm’s second point is that reports of conscious states during sleep are unverifiable (1959, 83ff; Springett, § 3.b.i). If Ginet claims that he and Shoemaker saw a bigfoot in charge of the reserve desk at Olin library, one can verify that this took place by talking to Shoemaker and gathering forensic evidence from the library. However, there is no way to verify Ginet’s claim that he dreamed that he and Shoemaker saw a bigfoot working at Olin library (1959, 38-40). Ginet’s only basis for his claim that he dreamt this is that he says so after he wakes up. How does one distinguish the case where Ginet dreamed that he saw a bigfoot working at Olin Library and the case in which he dreamed that he saw a person in a bigfoot suit working at the library but, after awakening, misremembered that person in a bigfoot suit as a bigfoot proper? If Ginet should admit that he had earlier misreported his dream and that he had actually dreamed he saw a person in a bigfoot suit at Olin library, there is no more independent verification for this new claim than there was for the original one. Thus, there is, for Malcolm, no sense to the idea of misremembering one’s dreams (Windt, 2015, 18ff). Malcolm here applies one of Wittgenstein’s ideas from his “private language argument”: “One would like to say: whatever is going to seem right to me is right. And that only means that here we can’t talk about ‘right’” (Philosophical Investigations, § 258).

For similar reasons, Malcolm challenges the idea that one can assign definite durations or times of occurrence to dreams (1959, 70-82). If Ginet claims that he ran the mile in 3.4 minutes, one could verify this in the usual ways. If, however, Ginet says he dreamt that he ran the mile in 3.4 minutes, how is one to measure the duration of his dreamt run? If he says he was wearing a stopwatch in the dream and clocked his run at 3.4 minutes, how can one know that the dreamt stopwatch is not running at half speed (so that he really dreamt that he ran the mile in 6.8 minutes)? One might say that dream reports do not carry such implications, but Malcolm would say that just admits the point. The ordinary criteria we use for determining temporal duration do not apply to dreamt events. The general problem in both these cases (dreaming one saw a bigfoot working at Olin library and dreaming that one ran the mile in 3.4 minutes) is that there is no way to verify the truth of these dreamt events—no direct way to access that dreamt inner experience, that mysterious glow of consciousness inside the mind of the person lying comatose on the couch, in order to determine the facts of the matter. This is because, for Malcolm, there are no facts of the matter apart from the dreamer’s reports of the dream upon awakening. Referring to psychological studies of his time, Malcolm claims that the empirical evidence does not enable one to decide between the view that dream experiences occur during sleep and the view that they are generated upon the moment of waking up (1956, 29). Dennett agrees with Malcolm that nothing supports the received view that dreams involve conscious experiences while one is asleep but holds that such issues might be settled empirically (Springett, § 3.d).

Malcolm also argues against the attempt to provide a physiological mark of the duration of a dream, for example, the view that the dream lasted as long as the rapid eye movements (REM). Malcolm replies that “there can only be as much precision in that common concept of dreaming as is provided by the common criterion of dreaming” (1959, 75). These scientific researchers are misled by the assumption that the provision for the duration of dreams “is already there, only somewhat obscured and in need of being made more precise” (1959, 79). However, Malcolm claims, it is not already there (in the ordinary concept of dreaming). These scientific views are making “radical conceptual changes” in the concept of dreaming, not further explaining our ordinary concept of dreaming (1959, 81). Malcolm admits that it might be natural to adopt such scientific views about REM sleep as a convention (1959, 76-77). He points out, however, that if REM sleep is adopted as a criterion for the occurrence of a dream, then “people would have to be informed upon waking up that they had dreamed or not” (1959, 80).

Malcolm does not mean to deny that people have dreams in favour of the view that they only have waking dream-behaviour (Pears, 1961, 145). “Of course it is no misuse of language to speak of ‘remembering a dream’” (1959, 57-58). His point is that since our shared concept of dreaming is so closely tied to our concept of waking reports of dreams, one cannot form a coherent concept of this alleged inner (private) something that occurs with a definite duration during sleep. Malcolm rejects a certain philosophical conception of dreaming, not the ordinary concept of dreaming, which, he holds, is neither a hidden private something nor mere outward behaviour.

Malcolm’s account of dreaming has come in for considerable criticism. Chihara and Fodor (1965) argue that Malcolm’s claim that occurrences in dreams cannot be verified by others does not require the strict criteria that Malcolm proposes but can be justified by “appeal to the simplicity, plausibility, and predictive adequacy of an explanatory system as a whole.” Dunlop (1974) argues that Malcolm’s account of the sentence “I am awake” is inconsistent. Windt (2015) offers a comprehensive program in considerable detail for an empirical scientific investigation of dreaming of the sort that Malcolm rejects. Canfield (1961), Siegler (1967), and Schröder (1997) propose various counterexamples and counterarguments against Malcolm’s account of dreaming.

4. Malcolm’s Modal Version of the Ontological Argument

In his 1960 paper “Anselm’s Ontological Arguments,” Malcolm states that Anselm gave two different ontological proofs for God’s existence. Anselm’s key premise in the first argument in Proslogion 2 is that a thing is more perfect if it exists than if it does not exist. As Kant points out, that argument is fallacious because existence is not a property of things (Himma, § 2.d). Anselm’s second argument, which Malcolm revises and defends, is a modal argument in Proslogion 3 that is similar to arguments advanced by Hartshorne and Plantinga. The key idea here is that though existence is not a perfection, the logical impossibility of nonexistence, that is, necessary existence, is a perfection (and, therefore, a property). Lacewing (2015, 190-193) summarizes Malcolm’s modal argument for God’s existence as follows:

  1. Either God exists or does not exist.
  2. God can neither come into existence nor go out of existence.
  3. If God exists, then He cannot cease to exist.
  4. Therefore, if God exists, He exists necessarily.
  5. If God does not exist, then He cannot come into existence.
  6. Therefore, if God does not exist, His existence is impossible.
  7. Therefore, God’s existence is either necessary or impossible.
  8. However, God’s existence is only impossible if the concept of God is self-contradictory.
  9. The concept of God is not self-contradictory.
  10. Therefore, God’s existence is not impossible.
  11. Therefore, from 7 and 10, God’s existence is necessary.
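
The logical skeleton of this summary can be sketched in modal notation. The rendering below is only one possible reading, not Malcolm’s or Lacewing’s own formalization: the letter $g$ (for “God exists”), the letter $C$ (for “the concept of God is self-contradictory”), and the use of $\Box$ for logical necessity are assumptions introduced here for illustration.

```latex
\begin{align*}
& g \lor \neg g                    && \text{(1)} \\
& g \rightarrow \Box g             && \text{(4), from 2 and 3} \\
& \neg g \rightarrow \Box \neg g   && \text{(6), from 2 and 5} \\
& \Box g \lor \Box \neg g          && \text{(7), from 1, 4, and 6} \\
& \Box \neg g \rightarrow C        && \text{(8)} \\
& \neg C                           && \text{(9)} \\
& \neg \Box \neg g                 && \text{(10), from 8 and 9} \\
& \Box g                           && \text{(11), from 7 and 10}
\end{align*}
```

On this reading, the move from 7 and 10 to 11 is a simple disjunctive syllogism, so the weight of the argument falls on steps 4 and 6 and on premise 9.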

One objection is that though it has been argued that the concept of God is self-contradictory (Trakakis, § 1.c; Beebe, § 1-3), Malcolm simply assumes that premise 9 is true (Himma, § 4). Another problem is that even if one grants that necessary existence is a property, Malcolm’s argument only shows that if God exists, then God exists necessarily. Finally, is it true that necessary existence is a perfection? If “x necessarily exists” means “x exists in all possible worlds,” why should God’s necessary existence across all possible worlds make God greater in the actual world (Himma, § 4)? For in this actual world, a necessarily existing God is no greater than a God that contingently exists in this world.

5. Criticism of Descartes

Malcolm’s core criticism of Descartes is in his 1975 paper “Descartes’ Proof that He is Essentially a Non-Material Thing.” He attributes the following argument to Descartes: “I think I am breathing entails I exist. I think I am breathing does not entail I have a body. Therefore, I exist does not entail I have a body.” Malcolm rejects the second premise on the grounds that it is conceptually impossible for minds to exist without ever having been united with a body or for minds to exist without there ever having been bodies because the primary use of “he thinks he is breathing” presupposes bodily behavioural criteria for its truth. Malcolm admits there are secondary uses of mental terms that refer to disembodied spirits, but these are parasitic on the primary uses. The paper expresses Malcolm’s most basic understanding of Wittgenstein’s objection to such dualistic views, namely that all such views treat a parasitic use of language as if it makes sense when severed from the primary use of mental terms that are essentially tied to bodily behaviour (Philosophical Investigations, § 571, 579-580). If the criterion for ascribing mental properties essentially involves an appeal to bodily behaviour, then Descartes’ argument for mind-body dualism collapses.

6. The Conceivability of Mechanism

In his 1968 paper “The Conceivability of Mechanism,” Malcolm argues that a completely mechanistic explanation of human behaviour is incompatible with the intentional explanation of that behaviour. He argues against the two main attempts to justify such completely mechanistic views. The first is the view that intentional concepts can be defined in terms of non-intentional dispositions to behave in a certain way. The second is the view that intentional states or events are contingently identical with neural states or events. Malcolm argues that if all human behaviour had sufficient mechanistic causes, then human beings would not have intentions or desires. This leads to a “pragmatic paradox” (Chan, 2010). A person S’s assertion that all human behaviour is mechanistically explainable is a pragmatic paradox because S’s utterance can count as meaningful only if S has certain intentions about it (Ginet, 2006, 234). However, in that case, S’s meaningful endorsement of the mechanistic view is itself a counterexample to the asserted mechanistic view. For if the mechanistic view is true, then S’s endorsement of it cannot be meaningful. Although Malcolm’s argument generated a considerable amount of useful discussion at the time, it is not generally seen as obvious that there is a paradox in the assertion that intentions and thoughts can be realized in the state of a machine. In his 1977 Memory and Mind, Malcolm uses entirely different sorts of arguments against a mechanistic account of human mental phenomena.

7. Philosophy of Mind

Malcolm’s positive philosophy of mind is based on two fundamental principles, both inherited from Wittgenstein. The first deals with ascription of mental properties to others. The second deals with ascription of mental properties to oneself. The first principle is that we justifiably ascribe mental properties (like being in pain) to others on the basis of observable behavioural criteria that are conceptually (non-contingently) connected to those mental properties. Thus, it is part of the concepts of mental properties that there are behavioural criteria that justify ascribing those mental properties to other persons. The second principle is that it is not on the basis of any observable behavioural criteria that we ascribe mental properties to ourselves. One does not ascribe the mental property of being in pain to oneself by observing that one is screaming. Malcolm holds that such self-ascriptions are, rather, analogous to natural expressions of mental states. A child does not need to be taught to cry when it is in pain. Rather, the child cries naturally when it is in pain and later learns to replace the natural crying with linguistic utterances like “I am in pain.”

The asymmetry between first and third person ascriptions does not, however, mean they are completely unrelated. “First person utterances and their second and third person counterparts are linked in meaning by virtue of being tied, in different ways, to the same behavioural criteria” (1971, 91). Indeed, one can only know how to apply mental terms to oneself if one can apply them to others (Thornton, § 5). The behavioural expression of my (first person) being in pain is similar to the behavioural expressions of others that justify me in ascribing that same mental state to them. Introspectionism (exemplified by Descartes) violates the first principle. Behaviourism violates the second principle. Unlike the behaviourist, Malcolm does not identify the mental state with its behavioural expressions. He only holds that the concept of a mental state is non-contingently connected with the natural and/or learned behavioural expression of that mental state. A key part of Malcolm’s attempt to find a third alternative to the extremes of introspectionism and behaviourism is that the mental state does not reduce to behaviour because behaviour is only an expression of a mental state.

In his 1964 paper “Scientific Materialism and the Identity Theory,” Malcolm argues against Smart’s claim that a sudden thought is contingently identical with a brain process on the grounds that brain states do have specific bodily locations but that we attach no meaning to the bodily location of a thought. Thus, if x is identical with y only if x and y occur at the same place and time and if the identity is contingent, then there is no way to establish that the same location condition is satisfied.

In his 1984 book Consciousness and Causality (David Armstrong also contributes a lengthy section to this book), Malcolm makes an analogous argument that mental states that lack genuine duration (dispositions, beliefs, intentions) cannot be identical with brain states that do have genuine duration. Appealing to the principle of identity cited in the preceding paragraph, if a brain state has a genuine duration (say, 8.1 seconds), but a disposition or intention does not possess genuine duration, then there is no way to establish that such mental and brain states are identical. It is important to acknowledge that some dispositions and intentions can be assigned a precise duration. One might not normally be able to say precisely when one lost the ability to count from 10 to 1 in Yanomami backwards, but in some cases one can do so. Question: “When did you lose the ability to count from 10 to 1 in Yanomami backwards?” Answer: “It was when my wife hit me in the head with the microwave.” However, apart from such exceptional cases, one cannot, for some kinds of mental states, normally assign them a precise temporal duration.

The problem with Malcolm’s arguments in these cases is that even though there are many kinds of mental states for which it is ordinarily impossible to establish a precise spatial location or temporal duration, one can, it seems, envisage advances in the sciences that might make it plausible to do so. For example, advanced studies of brain processes might discover precise correlations between acquiring certain brain states and acquiring certain mental dispositions, abilities, or intentions. These identities would be viewed as scientific discoveries. Malcolm would reply that this would involve considerable gerrymandering of our ordinary concepts of dispositions, intentions, and abilities. A critic of Malcolm would reply that this kind of gerrymandering of ordinary concepts is normal in the advancement of science and is not specific to changes in the concepts of mental entities. For example, human beings were traditionally divided into males and females, but more detailed scientific knowledge suggests that this traditional division fails to capture the complexity of the human gender reality. That is, one cannot rule out such discoveries simply by appealing to the fact that the concepts in ordinary language conflict on some level with the new concepts developed on the basis of greater scientific knowledge (Serafini, 1993, 321).

Another of Malcolm’s noteworthy contributions to the philosophy of mind comes out in his 1972 presidential address to the Eastern Division of the American Philosophical Association titled “Thoughtless Brutes.” Malcolm objects to Descartes’ view that since propositional representations do not occur in the lower animals, they do not have real sensations. Malcolm does not argue that lower animals do have propositional representations but that Descartes “exaggerated” the role of propositional representations in human beings (Ginet, 2006, 235-236). Since propositional representations play less of a role than most philosophers think, there is no principled reason why one cannot ascribe non-propositional thoughts to some of the higher animals. One correctly says that the dog barking up the tree, where it has just chased the squirrel, believes the squirrel is up the tree. Malcolm issues an important qualification. Though it is wrong to identify thoughts with their linguistic expression, it is also wrong to hold that creatures without language can have thoughts. We can meaningfully say of a person that they have thoughts to which they never give expression only because they participate in a language in which there is an institution of testifying to previously unexpressed thoughts (1972, 55). Since dogs do not speak a human language, how, then, can one assign such thoughts to them? Malcolm holds that some higher animals participate in human language to a sufficient degree that one can attribute some thoughts to them by analogy. There is a squirrel and a rabbit in the field. Rover is told to get the rabbit, whereupon Rover chases the rabbit and ignores the squirrel. Rover must display regular patterns of such linguistically sensitive behaviour. Dogs are not full-blown members of our linguistic community, but they participate in our linguistic practices sufficiently to justify ascriptions of thoughts, beliefs, and desires to them by analogy. They behave much as we do in response to some relatively simple human linguistic behaviour.

Davidson (2001, 97) objects that one cannot say what precisely the dog is supposed to believe. Suppose the tree in question is an oak tree. Does the dog believe the squirrel went up the oak tree? However, there is an important sense in which Davidson misrepresents Malcolm’s position. Davidson (2001, 97-98) thinks that if one allows that the dog thinks the squirrel went up the tree, then “while dropping the feature of semantic opacity, there is a question whether we are using those words [‘thinks,’ ‘believes,’ and so forth] to attribute propositional attitudes.” For it has long been recognized that semantic opacity distinguishes talk about propositional attitudes from talk of other things. However, Malcolm does not hold that it is correct to say that the dog believes the proposition that the squirrel is up the tree (let alone that the dog believes that the proposition that the squirrel is up the tree is true). Recall that Malcolm holds that Descartes overestimates the role of propositional representations in human life. Malcolm distinguishes between, “The dog believes the squirrel is up the tree” and, “The dog believes that the squirrel is up the tree” (where the presence of the “that” in the latter formulation indicates that the alleged believer possesses a great deal of “logical machinery” not required by the former). Malcolm holds that many human beliefs described by logicians as beliefs-that (that is, propositional beliefs) are really non-propositional. When a dog believes the squirrel is up the tree, its belief resembles human non-propositional beliefs (which are more common than many philosophers think). Philosophers and psychologists have, alas, tended to over-intellectualize not just animal mind and behaviour, but also human mind and behaviour. Malcolm and Davidson also both address the moral issues involved in regarding animals as “thoughtless brutes” or mere machines.

8. Memory

Malcolm’s two main works on memory are his 1963b “Three Lectures on Memory” and his 1977a book Memory and Mind. In the first 1963b lecture, “Memory and the Past,” he argues that Russell’s hypothesis that the world began five minutes ago complete with misleading records, delusory memories, and the like is logically untenable. Malcolm’s main argument is that a linguistic community can be said to have mastered past tense statements and have past tense beliefs only if not all of their past tense statements are false. Further, if our apparent memories mostly agree with each other and with the records, then they would be verified as true, and “if the apparent memories were verified, it would not be intelligible to hold that, nevertheless, the past they describe may not have existed” (1963a, 199).

In the second lecture, “Three Forms of Memory,” Malcolm distinguishes factual memory (remembering that p), personal memory (remembering something one has oneself previously experienced), and perceptual memory (personally remembering something by forming a mental image of it). While a personal or perceptual memory always entails some factual memory, there can be factual memories that do not entail any perceptual or personal memory. There could be a people who lacked perceptual memory altogether but had normal factual memories, but there could not be a creature that we would recognize as human who completely lacked factual memory. Malcolm’s point is that memory involving mental images is not nearly as basic as many philosophers and psychologists have thought.

Malcolm’s main aim in the third lecture is to show that our concept of factual memory “obviously” does not commit one to hold that there must be “a specific brain-state or neural process [mechanism] persisting between the previous and the present knowledge that p” (1963a, 237-8). He adds in the same passage “that our strong desire for a mechanism of memory arises from an abhorrence of the notion of action at a distance-in-time.” He acknowledges that there are causal elements in factual memory but argues that this does not require either the assumption of a temporally continuous chain of causation or the existence of causal laws. The view, found in accounts of the memory mechanism, that there must be a representation that plays a causal role in remembering is unjustified.

Malcolm begins his 1977 Memory and Mind by contrasting his earlier (1963) views on memory with those in this book. Whereas his former views were more “analytical,” his new views, influenced by his discussions with Bruce Goldberg, to whom he dedicates the book, are “more historical, systematic, and destructive” (1977, 9). Part I is about the “mental mechanisms” of memory. Part II is about the “physical mechanisms” of memory.

Malcolm begins Part I by arguing against the common view tracing to Aristotle that memory is always of the past (1977, 15). He undermines this view with a series of examples (for example, “I remember this man”). Most philosophers will admit that there are a lot of odd things one says about memory that do not fit Aristotle’s model but hold that there is a “fundamental” type of memory that does. For example, Broad says that there are many things called “memory” in ordinary language that “do not really deserve the name” (1977, 63). That is, the common view among philosophers is that the concept of memory has “a unity which can be disclosed by analysis” that weeds out the deviant cases. Malcolm now sees this as wrong and counts himself, in his earlier “Three Lectures on Memory,” among those misguided philosophers who have accepted that picture, but he has now “freed” himself from it (1977, 16 and note 9).

The core feature of this misguided view is that memory is a causal process, specifically that there is an input to the organism, that this input creates (causes) an enduring internal state of the organism (in its mind or brain), and that the proper stimulation activates this enduring internal state and causes the appropriate “output,” either a conscious state or a “behavioural memory performance” (1977, 28). The description of this process from input, to the enduring internal state of the organism, to the output elicited by the appropriate stimulus, is the description of the “memory mechanism.” The presence of the memory mechanism, of one form or another, constitutes the unitary essence common to all the genuine cases of memory. This memory causal process, in both its mental and physical forms, is analogous to the functioning of a computer. One types the initial input into the computer at time t1, for example, “The first President of the US was George Washington.” This input creates an internal state of the computer, which may lie dormant for years. However, when the appropriate stimulus occurs later at t2, for example, one types the question into the computer, “Who was the first President of the US?” the dormant internal state is activated and produces the response, in this example, the appearance of the words “George Washington was the first President of the US,” on the computer display. The computer has “remembered” the data it earlier received as input. Although the computer model is a physical model, something analogous occurs in the account of the mental memory mechanism. In the mental mechanism, each of these physical items is replaced by a corresponding mental item. Typing of data into the computer is replaced by something like a perception. The alterations in the internal physical state of the computer are replaced by alterations in the mental state of the organism. 
The physical output, the words on the computer display, is replaced by some kind of mental state (like thinking of the relevant fact). Although this picture, illustrated by the computer model, seems straightforward, Malcolm argues that in both its mental and its physical forms, it involves certain disguised and unintelligible metaphysical ideas (1977, 52).

Although Malcolm holds that there is a nest of interrelated unintelligible metaphysical ideas in these accounts of the mental and physical memory mechanisms, the most central is that a “genuine memory occurrence” must represent what is remembered (1977, 120, 132). In order for the representation to do its job, it must be intrinsically and unambiguously connected with what it represents (1977, 56, 124, 138-140). The account of this intrinsic connection appeals to the view that the structure of the memory must stand in a one-to-one correspondence with the structure of what is remembered (1977, 120, 125-126, 164). In the case of the mental memory mechanism, this condition is often satisfied by the view that the memory is some kind of image of what is remembered (1977, 120-121, 126-128). Since an image resembles what it represents, one can, in principle, introspect the connection between the memory-image and what is remembered. For example, since Jones’ image of the killer resembles the actual killer, it enabled Jones to pick the killer out of a line-up.

Whereas the mental representations often appeal to these conscious features of the representation, the physical memory mechanism is designed to explain how memory responses are caused (1977, 167). Even so, there is a considerable similarity between the accounts of the mental and the physical memory mechanisms. Whereas the central component of the mental memory mechanism is the memory image or picture, the central component of the physical memory mechanism is the memory “trace” (in the brain). This “trace” must also be intrinsically connected with what is remembered. The same idea found in the account of the mental memory mechanism reappears in a new form in the account of the physical memory mechanism. The physical trace must have the same structure as what is remembered (1977, 168). Malcolm traces this idea of the “physical basis of memory” to Plato’s view that the brain is like a “wax tablet” on which experience stamps impressions (1977, 169-170). Crito perceives Socrates’ snub nose at t1. This leaves an impression (trace) on Crito’s brain. Years later, someone asks Crito what Socrates looks like and he is, by virtue of this trace in his brain, causally enabled to describe Socrates’ snub nose. If the trace in Crito’s brain has degraded a bit over time, he can correctly say that Socrates has a snub nose but might describe it as a bit flatter than it actually is. If Crito’s trace has degraded a great deal, he cannot remember it at all. The fact that brain traces, like impressions in a wax tablet, degrade over time explains why some memories are more accurate than others. The underlying idea in the theories of both the mental and the physical memory mechanisms is the same. Both hold that the memory must be isomorphic with what is remembered. Malcolm also holds that the schema for such accounts is laid out in the picture theory in Wittgenstein’s Tractatus (1977, Chapters V and 10).
Malcolm’s claim is not that the Tractatus provides an account of memory or of the memory mechanism. It does not. What it does do is provide the logical schema of a kind of account of language (representation, picturing), which is presupposed in the mental and the physical accounts of the memory mechanism.

Malcolm argues that, as Wittgenstein shows in his later works, the Tractatus account of this logical schema is wrong. The accounts of how the mental memory image (or copy) and the physical memory trace in the brain are intrinsically and unambiguously connected with what they represent require that the structures of the memory and of what is remembered stand in a one-to-one correspondence with each other. However, this can work only if one can appeal to the absolute structure of the relevant items, and the idea of the absolute structure of something makes no sense (1977, 161-162, 242-244). In order to speak of a correlation between the structure of Xs and Ys, one requires a key of interpretation that identifies the elements of Xs and Ys. The question whether Beethoven’s Quartet Opus 132 is isomorphic with Dostoevsky’s The Brothers Karamazov is meaningless unless one has a key of interpretation identifying the relevant parts of each and a principle for mapping the parts of the one onto those of the other (1977, 230-232). The fundamental question then is whether it is possible to construct a key correlating neural elements (whatever they are) with elements of experience (memories, perceptions, and so forth). Malcolm argues that it is a conceptual point that no such satisfactory key can possibly be produced (1977, 232-234). Malcolm focuses on the question whether it makes conceptual sense to identify the elements of a simple experience like wanting to catch the bus. Malcolm proceeds, following Wittgenstein’s method of dissolving essences by producing concrete examples (Philosophical Investigations, § 3, 23, 35), to show that there is no one thing common to all cases of wanting to catch the bus.
There are “countless” things that can count as Fred’s wanting to catch the bus: his looking up the time the bus is to arrive and leisurely finishing his breakfast, his running hysterically out of the house after the bus after seeing a broken alarm clock, his shouting to his wife to run out and stop the bus for him, his calling the bus company and asking them to delay the bus, his praying to God that the bus will be late today, and so on. There is no essence to wanting to catch the bus that then might be divided into elements by some key in order to be correlated with the relevant neural items.

Why, then, do we think there is such an essence? “We predicate of a thing what lies in the method of representing it” (Philosophical Investigations, § 104). The expression “wanting to catch the bus” has a neat definiteness and is divided into discrete elements (words). One sees no difficulty correlating neural states with those elements. Why, therefore, would there be any difficulty correlating neural elements with what is meant by those words? However, the complete range of activities that could constitute wanting to catch the bus cannot be specified (1977, 237-239). Since there is no possibility of isolating the essence of that experience, there is no possibility of identifying the elements of that essence that are suitable for correlation with neural states. The key condition for providing an account of the memory mechanism is unintelligible. It is, therefore, a conceptual truth that there is no possible key for establishing such correlations.

9. Nothing is Hidden

Malcolm’s first sustained attempt to contrast the key views of the Tractatus with those of Wittgenstein’s later philosophy is presented in his 1986 book Nothing is Hidden. Malcolm identifies 15 key “interlocking” theses in the Tractatus. They are:

  1. The world has a fixed unchanging form that is independent of any facts,
  2. The fixed form of the world is constituted by absolutely simple objects,
  3. These simple objects are the substance of the world,
  4. Thoughts, composed of psychical constituents, underlie the sentences of language,
  5. A thought is intrinsically a picture of a particular state of affairs,
  6. A proposition or thought cannot have a vague sense,
  7. Whether a proposition has sense cannot depend on whether another proposition is true,
  8. To understand the sense of a proposition, it is sufficient to understand the meaning of its constituent parts (the principle of compositionality),
  9. The sense of a proposition cannot be explained but only shown,
  10. There is a general form of all propositions,
  11. Each proposition is a picture of one and only one state of affairs,
  12. When a sentence is combined with a method of projection, the resulting proposition is necessarily unambiguous,
  13. What one means by a proposition is determined by an inner process of logical analysis,
  14. The pictorial nature of our ordinary propositions is hidden, and
  15. Every sentence with a sense expresses a thought that can be compared with reality (1986, viii).

The first eight chapters of the book expound these Tractatus theses and explain Wittgenstein’s “sharp disagreement with them in his later thought” (1986, viii-ix). The ninth chapter deals with Kripke’s account of rule-following in Wittgenstein’s Philosophical Investigations. The tenth chapter considers the ideas of a psychophysical parallelism and mind-brain identity. Chapter eleven discusses Wittgenstein’s last writings on the concepts of certainty and knowledge eventually published as On Certainty.

Malcolm identifies the core thesis of the Tractatus in Chapter 1 as the view that the world has a fixed unalterable form determined by the set of indestructible simple objects. The first three chapters critique these theses with arguments familiar from Memory and Mind. It makes no sense to speak of the absolute unalterable form or essence of the world because ascriptions of structure and of simplicity presuppose a key of interpretation that determines what is to count as a form or structure or simplicity—making them relative to a key.

In Chapter 2, Malcolm argues against Winch’s view that the Tractatus is primarily a theory of language and for his own view that the Tractatus is founded on a metaphysical view of a language-independent form (essence) of the world. Whereas Winch sees the Tractatus primarily as a work in linguistic analysis, Malcolm sees its metaphysics as primary.

In Chapter 4, Malcolm goes against much conventional wisdom and argues that Tractatus thoughts are not just abstract entities but are psychical. His five main theses are:

  1. Thoughts are composed of mental elements,
  2. A thought is, by virtue of its intrinsic nature, a picture of a possible situation,
  3. A physical sentence is not intrinsically a picture but can be made into one; thus, the sense of a physical sentence is bestowed on it by a thought,
  4. A sense is bestowed on a physical sentence by establishing correlations between the elements of the propositional sign and the elements of the thought,
  5. In this way, a thought becomes “perceptible to the senses” (Tractatus, 3.1).

Malcolm concludes the chapter by identifying a Tractatus-like view of thoughts as intrinsically meaningful in John Searle’s Intentionality.

In Chapter 5, Malcolm discusses the Tractatus’ obscure view that “a proposition shows its sense” (4.022). He again goes against the conventional wisdom that what shows the sense of a proposition is its syntactical features or its use and argues instead that what primarily shows its sense are psychical thoughts. Unlike physical signs, which always admit of alternative interpretations, psychical thoughts have the unique ability to show what they mean without interpretation. A psychical thought is, in Goldberg’s (1968) terms, a “meaning terminus.”

In Chapter 6, Malcolm (1986, 103) takes his point of departure from the seemingly incompatible assertions in the Tractatus that “language disguises thought” (4.002) and that “all the sentences of our everyday language, just as they stand, are in perfect logical order” (5.5563). To reconcile these conflicting assertions, Malcolm distinguishes between the processes of analysis everyday people use, which take place mostly unconsciously when they understand a sentence, and the processes of analysis that philosophers employ when they attempt to represent perspicuously the real logical structure of a proposition (1986, 106). When Ann says that the South Sea Islands are enchanting, Ann, the ordinary person, understands immediately. However, Ann is also a philosopher, and, in that capacity, might work a lifetime without success to provide a complete perspicuous representation of the analysed sense of that one proposition. Thus, language disguises thought from the philosopher but not from the everyday person. Ordinary language is in perfect logical order for Ann, the everyday person, but as soon as she puts on her philosopher’s hat, she becomes perplexed.

In Chapter 7, Malcolm contrasts Wittgenstein’s later conception of language with Wittgenstein’s earlier view in the Tractatus. Whereas the Tractatus has a representational view of language, where the core notion of representation (logical picturing) is bound up with a whole series of “interlocking” metaphysical views about simple objects, substance, and absolute structure, Wittgenstein’s later works understand language as built on expressive behaviour (1986, 133). As Malcolm puts it, Wittgenstein eventually realized that language “does not emerge from reasoning but from natural forms of life” (1986, 153).

In Chapter 9, Malcolm argues against Kripke’s interpretation that the Philosophical Investigations presents “the most radical and original sceptical problem philosophy has seen to date” (1986, 154). Kripke bases his interpretation on Wittgenstein’s remark at §201 of the Investigations, saying, “This was our paradox: no course of action could be determined by a rule because every course of action can be made out to accord with the rule.” Malcolm points out that Kripke fails to notice that in the very next sentence, Wittgenstein states that this paradox “is a misunderstanding” because “there is a way of grasping a rule which is not an interpretation” (1986, 154-155), namely, in action. A 1,500-pound grizzly bear explodes from the bushes and heads straight for a group of elderly tourists. The tour guide yells, “Run!” Do the elderly tourists think, “I interpret her to mean that my legs should move rapidly in such and such a fashion”? No! They just run. They have grasped the intended meaning in action, not by “interpreting” it by means of another rule or sign, which, then, stands in need of interpretation by another rule or sign, and so on (1986, 180-181).

In Chapter 10, Malcolm argues against the common view that the mind is, or is realized in, the brain: roughly, the idea that thoughts are “in the head.” Malcolm finds this common view to be “extraordinary” (1986, 191). The source of the confusion is that in ordinary life, we often say that our inner thoughts are hidden from everybody else. However, this is a metaphorical use of “inner.” Contemporary philosophers of mind have interpreted this metaphorical usage, which “reflects the different logical level you and I stand with regard to what I think and feel,” to mean quite literally that “thoughts and feelings are actually in the head” (1986, 191). Ironically, this literal interpretation of the view that the mental is inner actually “abolishes this logical difference.” Malcolm sees this as “a splendid illustration of how in philosophy it is possible to saw off the branch on which one is sitting” (1986, 191). The chapter includes an illuminating discussion of Wittgenstein’s criticism of the notion of a psychophysical parallelism in Zettel (§ 606-614).

In Chapter 11, Malcolm considers Wittgenstein’s final notebooks, which consist in rough unrevised notes “with no anticipation of publication” (1986, 201). Although many students find these notes “bewildering,” they “reward hard study” and contain “individual remarks of great beauty.” They also initiate lines of thought entirely new to Wittgenstein (1986, 201). Although this chapter is probably the sketchiest in the book, due to the sketchy nature of these notebooks, the best brief way to summarise the results of the chapter is to focus on the contrast between Descartes’ and Wittgenstein’s ways of conceiving of certainty. Whereas Descartes thinks that certainty is restricted to one’s own ideas, to certain highly abstract propositions, and to what can be deduced from these, Wittgenstein holds that one can have certainty about humdrum contingent propositions of everyday life, such as “My name is Ludwig Wittgenstein” (1986, 235). Further, whereas Descartes believes that a single human being can arrive at many certainties by themselves, Wittgenstein holds that anyone’s certainty about anything presupposes an enormous amount of knowledge and beliefs inherited from others and taken on trust (1986, 235). Once again, Descartes over-intellectualizes the phenomenon of certainty, and his solipsistic method of radical doubt is an illusion. Despite this, Malcolm admits that Wittgenstein is a sceptic in a certain sense. He stresses that though Wittgenstein holds that one can know or be certain about certain things, Wittgenstein always adds the qualifier “in so far as one can know such a thing” (1986, 234). Wittgenstein’s scepticism is “not to be confused with the familiar tradition of Philosophical Scepticism” but is rather philosophical “in the sense of being a set of general observations about the framework and boundaries of the concepts of knowledge and certainty, as these figure in the real life of human beings” (1986, 235).

10. Wittgenstein: From a Religious Point of View

Since Malcolm passed away while writing his final book, Wittgenstein: From a Religious Point of View, the final draft was edited into the published form by Peter Winch, who also contributed a lengthy critical essay to the book. The book takes its point of departure from Wittgenstein’s remark to his friend Drury that “I am not a religious man but I cannot help seeing every problem from a religious point of view” (1995a, 1). Malcolm admits, with Drury, that this remark makes him wonder whether there are dimensions to Wittgenstein’s thought that he and others have not understood (1995a, 1). The book is Malcolm’s attempt to fathom this elusive dimension of Wittgenstein’s thinking.

Malcolm identifies four respects in which there are analogies between “the grammar of a language” and “what is paramount in religious life”:

First, in both, there is an end to explanation; second, in both, there is an inclination to be amazed at the existence of something; third, into both there enters the notion of an illness; fourth, in both doing, acting, takes precedence over intellectual understanding and reasoning. (1995a, 92)

First, in philosophy, as in religion, explanations come to an end somewhere. For example, Malcolm (1995a, 56-57) argues that, whereas Chomsky holds that one requires a mechanistic explanation of linguistic behaviour, his alleged scientific theory is really metaphysical in nature and does not provide the explanation of language that he claims. Second, Chomsky’s view also illustrates the tendency of philosophers to be amazed at something. Upon observing the paucity of linguistic data available to a child, Chomsky is amazed that the child can somehow learn a full-blown natural language (Malcolm, 1995a, 56-57). Just as a theologian’s amazement at the magnificence of the cosmos leads them to posit a creator to explain its existence, Chomsky’s amazement at the child’s ability to learn a natural language from such meagre data leads him to posit hidden mechanisms to explain this amazing fact. Third, Malcolm (1995a, 89-90) holds that Wittgenstein sees both philosophy and religion as having a tendency to see certain kinds of views and ways of living not as just mistakes but as akin to an illness. The philosopher has not just misapplied some logical rule; rather, error occurs because the philosopher’s thinking is in a diseased state. For example, Chomsky is led to posit a kind of explanation that cannot be given and, therefore, fails to appreciate the phenomenon of language that is right before his eyes. Fourth, Malcolm holds that in both philosophy and religion, doing and acting take precedence over intellectual understanding and reasoning (1995a, 92). For example, to a genuinely religious person, what is important is not that one intellectually believes in God but that one lives accordingly.

Malcolm (1995a, 92) concludes with an admission that his suggestions “may be wide of the mark.” Winch (1995, 132) makes several criticisms of Malcolm’s reading but admits that his views are “less clear cut” than Malcolm’s and adds, pessimistically, that we should not expect a very clear-cut account of what Wittgenstein meant in that remark to Drury. Winch (1995, vii) stresses that though Malcolm was still making improvements to the book at the time of his death, he regarded it as fundamentally complete. However, it seems clear that both Malcolm and Winch are still struggling with the meaning of Wittgenstein’s remark to Drury.

11. References and Further Reading

a. Books

  • Malcolm, Norman (1958) Ludwig Wittgenstein: A Memoir (with a biographical sketch of Wittgenstein by G. H. von Wright), London: Oxford University Press.
  • Malcolm, Norman (1959) Dreaming, London: Routledge and Kegan Paul.
    • A classic work in the philosophy of mind on the topic of dreaming.
  • Malcolm, Norman (1963) Knowledge and Certainty, Englewood Cliffs, New Jersey: Prentice-Hall.
    • A collection of Malcolm’s essays published between 1958 and 1962, sometimes with slight corrections.
  • Malcolm, Norman (1971) Problems of Mind, New York: Harper and Row.
    • An excellent introduction to problems in the philosophy of mind.
  • Malcolm, Norman (1977a) Memory and Mind, Ithaca, New York: Cornell University Press.
    • Arguably Malcolm’s best book.
  • Malcolm, Norman (1977b) Thought and Knowledge, Ithaca, New York: Cornell University Press.
    • A collection of Malcolm’s essays published elsewhere.
  • Malcolm, Norman (1984) Consciousness and Causality: A Debate on the Nature of Mind with D. M. Armstrong, Oxford: Blackwell Publishers.
    • An illuminating back and forth argument between Malcolm and David Armstrong, a prominent materialist in the philosophy of mind.
  • Malcolm, Norman (1986) Wittgenstein: Nothing is Hidden, Oxford: Blackwell Publishers.
    • Malcolm’s sustained attempt to understand the actual relationship between Wittgenstein’s early Tractatus and his later philosophy beginning with the Philosophical Investigations.
  • Malcolm, Norman (1995a) Wittgenstein: A Religious Point of View, Peter Winch (ed.) Ithaca, New York: Cornell University Press.
    • Malcolm’s attempt to understand Wittgenstein’s remark to Drury that he sees problems from a religious point of view. Contains a critical essay on Malcolm’s views by Peter Winch.
  • Malcolm, Norman (1995b) Wittgensteinian Themes: Essays 1978-1989, G. Henrik von Wright (ed.) Ithaca, New York: Cornell University Press.
    • Contains 14 of Malcolm’s essays written during the last 12 years of his life on such topics as thinking, whether “I” is a referring expression, sensations of heat, the standard meter bar, language and instinctive behaviour, idealism, the intentionality of sense impressions, subjectivity, turning to stone (as one thinks), language rules, language games, the mystery of thought, and Moore’s paradox.

b. Articles

  • Malcolm, Norman (1940) “Are Necessary Propositions Really Verbal?” Mind 49 (194): 189-203.
  • Malcolm, Norman (1940) “The Nature of Entailment,” Mind 49 (195): 333-347.
    • This essay discusses only the nature of entailment between contingent propositions.
  • Malcolm, Norman (1942) “Certainty and Empirical Statements,” Mind 51: 18-46.
  • Malcolm, Norman (1942) “Moore and Ordinary Language,” The Philosophy of G. E. Moore, Paul Arthur Schilpp (ed.) Chicago: Northwestern University Press. Reprinted in (1970) The Linguistic Turn, Richard Rorty (ed.) Chicago: University of Chicago Press.
    • Malcolm’s controversial argument that Moore holds that any philosophical proposition that violates ordinary language is false.
  • Malcolm, Norman (1949) “Defending Common Sense,” Philosophical Review 58: 201-21.
    • Discusses Wittgenstein’s view that philosophy can deliver only a series of truisms in connection with Moore’s “Proof of an External World.”
  • Malcolm, Norman (1950) “The Verification Argument” in Philosophical Analysis, M. Black (ed.) Ithaca, New York: Cornell University Press. Reprinted with revisions and additional footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1950) “Russell’s Human Knowledge,” The Philosophical Review 59 (1): 94-106.
    • Discusses Russell’s view that the data for all human knowledge are private sensations.
  • Malcolm, Norman (1951) “Philosophy for Philosophers,” Philosophical Review 60: 329-40.
    • Malcolm had originally intended the title to be “Philosophy and Ordinary Language.”
  • Malcolm, Norman (1952) “Knowledge and Belief,” Mind 61 (242): 178-189.
    • Reprinted with certain revisions and additional footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1953) “Direct Perception,” Philosophical Quarterly 3 (13): 301-316.
    • Reprinted with revisions and additional footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1953) “Moore’s Use of ‘Know,’” Mind 62 (246): 241-247.
  • Malcolm, Norman (1954) “On Knowledge and Belief,” Analysis 14: 94-97.
  • Malcolm, Norman (1956) “Dreaming and Skepticism,” The Philosophical Review 65: 14-37.
  • Malcolm, Norman (1957) “Dreaming and Skepticism: A Rejoinder,” Australasian Journal of Philosophy 35: 201-211.
  • Malcolm, Norman (1958) “Knowledge of Other Minds,” The Journal of Philosophy 55 (23): 969-78.
    • Reprinted in Knowledge and Certainty.
  • Malcolm, Norman (1959) “Stern’s Dreaming,” Analysis 20 (74): 47.
  • Malcolm, Norman (1960) “Anselm’s Ontological Arguments,” The Philosophical Review 69: 41-60.
    • Reprinted with new footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1961) “Professor Ayer on Dreaming,” The Journal of Philosophy 58 (11): 294-97.
  • Malcolm, Norman (1962) “Three Lectures on Memory” (“Memory and the Past,” “Three Forms of Memory,” and “A Definition of Factual Memory”), The Monist 45 (1962): 247-66.
    • Reprinted in Knowledge and Certainty.
  • Malcolm, Norman (1962) “George Edward Moore,” Ajatus.
    • Finnish translation of a paper first published in English in Knowledge and Certainty.
  • Malcolm, Norman (1962) “Memory and the Past,” The Monist 42 (2): 247-266.
    • Reprinted as one of the “Three Lectures on Memory” in 1963 in Knowledge and Certainty.
  • Malcolm, Norman (1963) “Three Lectures on Memory” (“Memory and the Past,” “Three Forms of Memory,” “A Definition of Factual Memory”) in Knowledge and Certainty.
  • Malcolm, Norman (1964) “Is it a Religious Belief that ‘God Exists,’” John Hick (ed.) Faith and the Philosophers New York: St. Martin’s Press.
  • Malcolm, Norman (1964) “Scientific Materialism and the Identity Theory,” Dialogue 3: 115-25.
    • A classic paper on the identity theory of mind and body.
  • Malcolm, Norman (1965) “Descartes’ Proof that His Essence is Thinking,” Philosophical Review 74: 315-38.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1965) “Rejoinder to Mr. Sosa’s ‘Professor Malcolm on Scientific Materialism and the Identity Theory,’” Dialogue 3: 424-25.
  • Malcolm, Norman (1967) “Explaining Behaviour,” The Philosophical Review 76 (1): 97-104.
  • Malcolm, Norman (1967) “The Privacy of Experience,” Avrum Stroll (ed.) Epistemology: New Essays in the Theory of Knowledge New York: Harper and Row.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1967) “Wittgenstein, Ludwig Joseph Johann,” Paul Edwards (ed.) The Encyclopedia of Philosophy, v. 5 New York: Macmillan and the Free Press: 327-340.
  • Malcolm, Norman (1968) “The Conceivability of Mechanism,” The Philosophical Review 77: 45-72.
    • Classic but controversial statement of Malcolm’s early arguments against the mechanistic view of human beings.
  • Malcolm, Norman (1970) “Memory and Representation,” Nous 4 (1): 59-71.
    • This paper begins to display the influence of Goldberg’s ideas on Malcolm’s account of memory.
  • Malcolm, Norman (1971) “The Myth of Cognitive Processes and Structures,” T. Mischel (ed.) Cognitive Development and Epistemology New York: The Free Press.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1972) “Ludwig Wittgenstein: Purity and Passion,” B. Mazlish (ed.) The Horizon Book of Makers of Modern Thought New York: American Heritage.
  • Malcolm, Norman (1973) “Thoughtless Brutes,” Presidential Address, Proceedings of the American Philosophical Association 46: 5-20.
    • Argues against Descartes that some of the higher animals can be said to have thoughts and beliefs.
  • Malcolm, Norman (1974) “Behaviourism as a Philosophy of Psychology,” T.W. Wann (ed.) Behaviourism and Phenomenology: Contrasting Bases for Modern Psychology Chicago: University of Chicago Press.
  • Malcolm, Norman (1975) “Author’s Response,” part of an author-reviewer symposium on Problems of Mind: Descartes to Wittgenstein. Philosophical Forum 14: 289-306.
  • Malcolm, Norman (1975) “The Groundlessness of Belief,” Stuart Brown (ed.) Reason and Religion Ithaca: Cornell University Press.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1976) “Memory as Direct Awareness of the Past,” Godfrey Vesey (ed.) Impressions of Empiricism, Royal Institute of Philosophy Lecture 1974-75 London: St Martin’s Press.
  • Malcolm, Norman (1976) “Wittgenstein and Moore on the Sense of ‘I Know,’” Jaakko Hintikka (ed.) Essays on Wittgenstein in Honour of G. H. von Wright, Acta Philosophica Fennica 28 (1-3): 216-240.
    • Reprinted with revisions in Thought and Knowledge.
  • Malcolm, Norman (1977) “Descartes’ Proof that He is Essentially a Non-Material Thing,” Philosophy Forum 14.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1978) “Wittgenstein’s Conception of First Person Psychological Sentences as ‘Expressions,’” Philosophical Exchange 2 (1978): 59-72.
  • Malcolm, Norman (1980) “Functionalism in Philosophy of Psychology,” Proceedings of the Aristotelian Society, New Series 80: 211-29.
  • Malcolm, Norman (1980) “Kripke on Heat and Sensation of Heat,” Philosophical Investigations 3 (1): 12-20.
  • Malcolm, Norman (1981) “Kripke and the Standard Meter,” Philosophical Investigations 4 (1): 19-24.
  • Malcolm, Norman (1981) “Misunderstanding Wittgenstein,” Philosophical Investigations 4 (2): 67-71.
  • Malcolm, Norman (1981) “The Relation of Language to Instinctive Behaviour,” J. R. Jones Memorial Lecture, University College of Swansea.
    • Malcolm remarks here that the editor’s chosen title for Wittgenstein’s notes, Culture and Value, would make Wittgenstein “turn in his grave.”
  • Malcolm, Norman (1982) “Wittgenstein and Idealism,” Godfrey Vesey (ed.) Idealism Past and Present Royal Institute of Philosophy Series: 13, Supplement to Philosophy Cambridge: Cambridge University Press.
  • Malcolm, Norman (1987) Reply to Stephen’s Review Behaviorism 15 (2): 155-156.
  • Malcolm, Norman (2015) Notes of a Discussion between Wittgenstein and Moore on Certainty Mind 124 (493): 73-84.

c. Reviews

  • Malcolm, Norman (1954) Review of “Wittgenstein’s Philosophical Investigations,” The Philosophical Review 63 (4): 530-59.
    • Reprinted with corrections and additional notes in Knowledge and Certainty.
  • Malcolm, Norman (1967) Review of Wittgenstein’s Philosophische Bemerkungen, The Philosophical Review 76 (2): 220-229.
  • Malcolm, Norman (1981) “Wittgenstein’s Bag of Raisins” (review of Ludwig Wittgenstein’s Culture and Value), London Review of Books 3 (3): 7-8.

d. Secondary Sources

  • Alanen, Lilli (1996) “Reconsidering Descartes’ Notion of the Mind-Body Union,” Synthese 106 (1): 3-20.
  • Allen, R. E. (1961) “The Ontological Argument,” The Philosophical Review 70 (1): 56-66.
  • Arrington, Robert (1979) Review of Thought and Knowledge: Essays by Norman Malcolm. Philosophical Inquiry 1 (1): 164-166.
  • Averill, Edward (1978) Review of Norman Malcolm’s Memory and Mind in Philosophy and Phenomenological Research 39 (1): 1.
  • Baker, G. P. (1990) “Malcolm on Language and Rules,” Philosophy 65 (252): 167-179.
  • Baxi, Madhusudan (1977) “Norman Malcolm’s Analysis of Dreaming,” Indian Philosophical Quarterly 4 (4): 515-526.
  • Baylis, Charles (1951) Review of Norman Malcolm’s “The Verification Argument,” Journal of Symbolic Logic 16 (4): 300-330.
  • Beebe, James. “Logical Problem of Evil,” Internet Encyclopedia of Philosophy.
  • Bedford, Errol (1961) Review of Norman Malcolm’s Dreaming in Philosophy 36: 377.
  • Bernecker, Sven (2007) “Remembering Without Knowing,” Australasian Journal of Philosophy 85 (1): 137-156.
  • Bestor, Thomas (1976) “Dualism and Bodily Movements,” Inquiry 19 (1-4): 1-26.
  • Bouwsma, O. K. (1986) Wittgenstein: Conversations 1949-1951 Indianapolis: Hackett.
  • Britton, Karl (1959) Review of Ludwig Wittgenstein—A Memoir by Norman Malcolm Philosophy 34 (130): 277.
  • Bronstein, Daniel (1940) Review of Norman Malcolm’s “Are Necessary Propositions Really Verbal?” Journal of Symbolic Logic 5 (3): 121-122.
  • Brown, T. Patterson (1961) Professor Malcolm on “Anselm’s Ontological Arguments,” Analysis 22 (1): 12-14.
  • Bursen, Howard (1978) Dismantling the Memory Machine: A Philosophical Investigation of Machine Theories of Memory Springer.
    • Excellent application of Malcolm’s and Goldberg’s insights on memory.
  • Carney, James (1960) Review of Norman Malcolm’s Dreaming in Philosophy of Science 27 (4): 414.
  • Carney, James (1962) “Malcolm and Moore’s Rebuttals,” Mind 71 (283): 353-363.
  • Canfield, J. (1961) “Judgements in Sleep,” The Philosophical Review 70 (2): 224-230.
  • Canfield, John (1981) Review of Wittgenstein’s Lectures on the Foundations of Mathematics from the notes of R. G. Bosanquet, Norman Malcolm, Rush Rhees, and Yorick Smythies Canadian Journal of Philosophy 11 (2): 333.
  • Caldwell, Robert (1965) “Malcolm and the Criterion of Sleep,” Australasian Journal of Philosophy (December): 339-353.
  • Carruthers, P. (1987) Review of Norman Malcolm’s Nothing is Hidden in Philosophical Quarterly 37 (48): 99-100.
  • Carter, Walter (1964) Review of Norman Malcolm’s Knowledge and Certainty: Essays and Lectures in Dialogue 3 (1): 99-100.
  • Castaneda, Hector Neri (1965) “Knowledge and Certainty,” The Review of Metaphysics 18 (3): 508-547.
    • Castaneda argues that in this collection of Malcolm’s chronologically ordered essays, one can detect a drift away from Wittgensteinian “prejudices” and toward a more Chisholm-like method.
  • Cerf, Walter (1962) “Studies in Philosophical Psychology,” Philosophy and Phenomenological Research 22 (4): 537-558.
  • Chan, Timothy (2010) “Moore’s paradox is not just another pragmatic paradox,” Synthese 173: 211-229.
  • Chappell, V. C. (1963) “The Concept of Dreaming,” Philosophical Quarterly 13 (July): 193-213.
  • Chappell, V. C. (1961) “Malcolm on Moore,” Mind 70 (279): 417-425.
  • Chihara, C. S. and Fodor, J. (1965) “Operationalism and Ordinary Language: A Critique of Wittgenstein,” American Philosophical Quarterly 2: 281-295.
  • Collingwood, Francis (1987) Review of Consciousness and Causality: A Debate on the Nature of Mind by Norman Malcolm and D. M. Armstrong, Modern Schoolman 64 (3): 199-201.
  • Cook, John (1981) “Malcolm’s Misunderstandings,” Philosophical Investigations 4 (2): 72-90.
  • Cornman, James (1965) “Malcolm’s Mistaken Memory,” Analysis 25: 161-167.
  • Davidson, Donald (1982) “Rational Animals,” Dialectica 36 (4): 317-327.
  • Davies, Alex (2012) “How to Use (Ordinary) Language Offensively,” Nordic Wittgenstein Review 1 (1): 55-80.
  • Deangelis, William James (1997) “Ludwig Wittgenstein—A Religious Point of View? Thoughts on Norman Malcolm’s Last Philosophical Project,” Dialogue 36 (4): 819.
  • Dennett, Daniel (1976) “Are Dreams Experiences?” The Philosophical Review 85 (2): 151-171.
    • Dennett here argues that dreams might not, after all, be experiences that occur during sleep.
  • Dennett, Daniel (1979) “The Onus Re Experiences: A Reply to Emmett,” Philosophical Studies 35 (April): 315-318.
  • Descartes, Rene (1969) Meditations on First Philosophy in The Philosophical Works of Descartes, vol. 1. Elizabeth S. Haldane and G. R. T. Ross (trans.) Cambridge: Cambridge University Press: 131-200.
  • Deshpande, D. (1976) “Professor Malcolm on Dreaming,” Indian Philosophical Quarterly 3 (3): 259-272.
  • Dilman, Ilham (1966) “Professor Malcolm on Dreams,” Analysis 26 (March): 129-134.
  • Doppelt, Gerald (1979) “The Austin-Malcolm Argument for the Incorrigibility of Perceptual Reports,” Dialectica 32 (2): 59-75.
  • Dunlop, Charles (1974) “Performatives and Dream Skepticism,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 25 (4): 295-297.
  • Dunlop, C. E. M. (ed.) (1977) Philosophical Essays on Dreaming Ithaca and London: Cornell University Press.
  • Engelmann, Mauro (2013) “Wittgenstein’s ‘Most Fruitful Ideas’ and Sraffa,” Philosophical Investigations 36 (2): 155-178.
  • Fitch, Frederic (1940) Review of Norman Malcolm’s “The Nature of Entailment,” Journal of Symbolic Logic 5 (4): 160-161.
  • Garver, Newton (1994) This Complicated Form of Life Chicago: Open Court.
  • Garver, Newton (2006) Wittgenstein and Approaches to Clarity Amherst: Humanity Books.
  • Hacker, Peter (1987) “Critical notice: Norman Malcolm – Nothing is Hidden,” Philosophical Investigations 10: 142-50.
  • Hacker, Peter (co-authored with G.P. Baker) (1990) “Malcolm on Language and Rules,” Philosophy 65: 167-79.
  • Hacker, Peter (1992) “Malcolm and Searle on ‘Intentional Mental States,’” Philosophical Investigations 15: 245-75.
  • Hacker, Peter (2004) “Malcolm, Norman Adrian (1911–1990)”, Oxford: Oxford University Press.
  • Hamlyn, D. W. (1965) Review of Norman Malcolm’s Knowledge and Certainty, Philosophy 40 (152): 169.
  • Hanfling, Oswald (2003) Review of Norman Malcolm’s Nothing is Hidden, Philosophy 62: 529.
  • Hanfling, Oswald (2003) Wittgenstein and the Human Form of Life London: Routledge.
  • Hartshorne, C. (1965) Anselm’s Discovery: A Re-Examination of the Ontological Proof for God’s Existence, La Salle, Illinois: Open Court.
  • Himma, Kenneth “Anselm: Ontological Argument for God’s Existence,” Internet Encyclopedia of Philosophy
  • Hoffman, Robert (1967) “Malcolm and Smart on Brain-Mind Identity,” Philosophy 42 (160): 128-136.
  • Hyslop, Alec (1973) “Criteria and Other Minds,” Australasian Journal of Philosophy 51 (August): 105-114.
  • Ginet, Carl, and Shoemaker, Sydney (1983) Knowledge and Mind: Philosophical Essays Oxford: Oxford University Press.
    • This excellent collection, presented to Norman Malcolm in honour of his seventy-second birthday, contains articles by G. E. M. Anscombe, John Canfield, John Cook, Keith Donnellan, Peter Geach, Carl Ginet, Bruce Goldberg, Hide Ishiguro, Thomas Nagel, David Sanford, Sydney Shoemaker, and G. H. von Wright.
  • Ginet, Carl (2006) “Norman Malcolm (1911-1990),” A Companion to Analytical Philosophy A. P. Martinich and David Sosa (ed’s) Oxford: Blackwell.
  • Goldberg, Bruce (1968) “The Correspondence Hypothesis,” The Philosophical Review 77 (4): 438.
  • Goldberg, Bruce (1983) “Mechanism and Meaning,” Knowledge and Mind Sydney Shoemaker and Carl Ginet (ed’s) Oxford: Oxford University Press: 191-210.
  • Hacker, P. M. S. (1987) Review of Norman Malcolm’s Nothing is Hidden in Philosophical Investigations 10 (2): 142-150.
  • Heil, John (1982) “Speechless Brutes,” Philosophy and Phenomenological Research 42 (March): 400-406.
  • Ichikawa, Jonathan (2009) “Dreaming and Imagination,” Mind and Language 24 (1): 103-121.
  • Iseminger, Gary (1969) “Malcolm on Explanations and Causes,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 20 (5): 73-77.
  • Kalish, Donald (1961) “Dreaming,” Journal of Philosophy 58 (16): 437.
  • Kattsoff, Louis (1965) Review of Norman Malcolm’s Knowledge and Certainty, Philosophy and Phenomenological Research 26 (2): 263-267.
  • Kramer, Martin (1962) “Malcolm on Dreaming,” Mind 71 (January): 81-86.
  • Kretzmann, Norman; Shoemaker, Sydney; Miller, Richard (1990) “Norman Malcolm June 11, 1911-August 4, 1990,” Cornell University Faculty Memorial Statement.
  • La Croix, Richard (1972) “Malcolm’s Proslogion III Argument,” Sophia 11 (1): 13-19.
  • Lacewing, Michael (2014) “Malcolm’s Ontological Argument,” Philosophy for AS. London: Routledge.
  • Linsky, Leonard (1965) “Malcolm and the Use of Words,” Analysis 26 (2): 59-61.
  • Locke, Don (1978) Review of Norman Malcolm’s Memory and Mind in Mind 87: 631.
  • Long, Douglas (1987) Review of David Armstrong and Norman Malcolm’s Consciousness and Causality, Teaching Philosophy 10 (1): 83-86.
  • Lurz, Robert (2011) “Belief Attribution in Animals: On How to Move Forward Conceptually and Empirically,” Review of Philosophy and Psychology 1 (1): 19-59.
  • Mannison, Donald (1975) “Dreaming an Impossible Dream,” Canadian Journal of Philosophy 4 (June): 663-75.
  • Martin, Michael (1973) “Are Cognitive Processes and Structures a Myth?” Analysis 33 (3): 83-88.
  • Martin, Michael (1971) “On the Conceivability of Mechanism,” Philosophy of Science 38 (1): 79-86.
  • Matthews, Gareth (1961) “On Conceivability in Anselm and Malcolm,” The Philosophical Review 70 (1): 110-111.
  • Mayberry, Thomas (1975) Review of Norman Malcolm’s Problems of Mind: Descartes to Wittgenstein in World Futures 14 (3): 289-295.
  • McDonough, Richard (1986) The Argument of the ‘Tractatus’ Albany: SUNY Press.
  • McDonough, Richard (1989) “Towards a Non-Mechanistic Theory of Meaning,” Mind XCVIII (389): 1-21.
  • McDonough, Richard (1993) “The Philosophical Psychologism of the Tractatus,” The Southern Journal of Philosophy XXXI (4): 425-447.
  • McDonough, Richard (1994) “Wittgenstein’s Reversal on the Language of Thought Doctrine,” Philosophical Quarterly 44 (177): 482-494.
  • McDonough, Richard (1994) “Wittgenstein’s Clarification of Hertzian Mechanistic Cognitive Science,” History of Philosophy Quarterly 11 (2): 219-235.
  • McDonough, Richard (2015) “Wittgenstein’s Augustinian Cosmology in Zettel 608,” Philosophy and Literature 39 (1): 87-106.
  • McDonough, Richard (2016) “Wittgenstein – From a Religious Point of View?” Journal for the Study of Religions and Ideologies, vol. 15 (43): 3-2.
  • McFee, Gr. (1983) Philosophical Inquiry 5 (4): 159-167.
  • Maxwell, Grover; Feigl, Herbert (1961) “Why Ordinary Language Needs Reforming,” The Journal of Philosophy 58 (18): 488-498.
  • Miller, Richard (1978) “Absolute Certainty,” Mind New Series 87 (345): 46-65.
  • Monk, Ray (1990) Ludwig Wittgenstein: The Duty of Genius New York: Penguin.
  • Moon, Andrew (2013) “Remembering Entails Knowing,” Synthese 190 (14): 2717-2729.
  • Moore, G. E. (1903) “The Refutation of Idealism,” Mind 12: 433-53.
  • Moore, G. E. (1969) “A Defence of Common Sense,” Readings in 20th Century Philosophy William Alston and George Nakhnikian (ed’s) London: Macmillan.
  • Moore, G. E. (1992) “A Reply to My Critics,” The Philosophy of G. E. Moore Paul Arthur Schilpp (ed.) LaSalle: Open Court.
  • Morton, Adam (1985) Review of David Armstrong’s and Norman Malcolm’s Consciousness and Causality in British Journal for the Philosophy of Science 36 (3).
  • Mulhall, S. (1987) Review of Norman Malcolm’s Nothing is Hidden: Wittgenstein’s Critique of his Early Philosophy in Mind 96: 113.
  • Oakes, Robert (1974) “God, Electrons, and Professor Plantinga,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 25 (2): 143-147.
  • Odegard, Douglas (1978) Review of Norman Malcolm’s Thought and Knowledge and Malcolm’s Memory and Mind in Dialogue 17 (3): 566-570.
  • Odell, S. Jack (1971) “Malcolm on Remembering That,” Mind 80 (October): 593.
  • Palmieri, L. E. (1962) “To Sleep, Perchance to Dream,” Philosophy and Phenomenological Research 22 (4): 583-586.
  • Pears, David (1961) Review of Norman Malcolm’s Dreaming in Mind 70 (April): 145-163.
  • Pears, David (1989) Review of Norman Malcolm’s Nothing is Hidden: Wittgenstein’s Criticism of His Early Thought in Philosophical Review 98 (3): 379.
  • Pintado-Casas, Pablo (1997) Review of Norman Malcolm’s Wittgenstein on Mind and Language de David Stern and of Norman Malcolm’s Wittgensteinian Themes: Essays 1978-1989, Teorema: International Journal of Philosophy 16 (3): 126-129.
  • Plant, Bob (2011) “Religion, Relativism, and Wittgenstein’s Naturalism,” International Journal of Philosophical Studies 19 (2): 177-209.
  • Plantinga, Alvin (1967) God and Other Minds Ithaca: Cornell University Press.
  • Plantinga, Alvin (1974) The Nature of Necessity Oxford: Oxford University Press.
  • Preston, Aaron “George Edward Moore (1873-1958),” Internet Encyclopedia of Philosophy.
  • Preston, Aaron “Analytic Philosophy,” Internet Encyclopedia of Philosophy.
  • Putnam, Hilary (1962) “Dreaming and ‘Depth Grammar,’” Ronald Butler (ed.) Analytical Philosophy: First Series Oxford: Oxford University Press.
  • Richter, Duncan “Ludwig Wittgenstein (1889-1951),” Internet Encyclopedia of Philosophy.
  • Riesenberg-Malcolm, Ruth (1999) On Bearing Unbearable States of Mind London: Routledge.
  • Rowe, William (1971) “Neurophysiological Laws and Purposive Principles,” The Philosophical Review 80 (4): 502-508.
  • Ryan, Sally Parker (2010) “Reconsidering Ordinary Language Philosophy: Malcolm’s (Moore’s) Ordinary Language Argument,” Essays in Philosophy 11 (2): 123-149.
  • Ryan, Sally Parker “Ordinary Language Philosophy,” Internet Encyclopedia of Philosophy.
  • Sayward, Charles (2004) “Malcolm on Criteria,” Behaviour and Philosophy 32: 349-358.
  • Schaffer, Jerome (1984) “Dreaming,” American Philosophical Quarterly 21 (2): 135-146.
  • Schröder, Severin (1997) “The Concept of Dreaming: On Three Theses by Malcolm,” Philosophical Investigations 20 (1): 15-38.
  • Serafini, Anthony (1993) “Norman Malcolm: A Memoir,” Philosophy 68 (265): 309-324.
  • Scott, Frederick (1965) “Scotus, Malcolm, and Anselm,” The Monist 49 (4): 634-638.
  • Shoemaker, Sydney; Swinburne, Richard (1985) Review of Norman Malcolm’s and David Armstrong’s Consciousness and Causality in Mind 94 (374): 302-306.
  • Shope, Robert (1973) “Remembering, Knowledge and Memory Traces,” Philosophy and Phenomenological Research 33 (3): 303-322.
  • Siegler, F. A. (1967) “Remembering Dreams,” The Philosophical Quarterly, 17: 14-24.
  • Soames, Scott (2003) Philosophical Analysis in the Twentieth Century, Volume II: The Age of Meaning. Princeton: Princeton University Press.
  • Soames, Scott (2004) “Malcolm’s Paradigm Case Argument,” Philosophical Analysis in the Twentieth Century. Princeton: Princeton University Press: 157-170.
  • Springett, Ben “The Philosophy of Dreaming,” Internet Encyclopedia of Philosophy.
  • Stern, K. (1959) “Malcolm’s Dreaming,” Analysis 19 (December): 44-46.
  • Stern, David (1991) “Models of Memory: Wittgenstein and Cognitive Science,” Philosophical Psychology 4 (2): 203-218.
  • Sturgeon, Nicholas; Brown, Stuart (1991) “Norman Malcolm 1911-1990,” Proceedings and Addresses of the American Philosophical Association 64 (5): 70.
  • Swiggers, P. (1987) Review of Norman Malcolm’s Nothing is Hidden: Wittgenstein’s Criticism of his Early Thought in Tijdschrift Voor Filosofie 49: 120.
  • Tang, Hao (2015) “A Meeting of the Conceptual and the Natural: Wittgenstein on Learning a Sensation-Language,” Philosophy and Phenomenological Research 91 (1): 103-135.
  • Thornton, Stephen “Solipsism and the Problem of Other Minds,” Internet Encyclopedia of Philosophy.
  • Tomberlin, James (1972) “Malcolm on the Ontological Argument,” Religious Studies 8 (1): 65-70.
  • Trakakis, Nick “The Evidential Problem of Evil,” The Internet Encyclopedia of Philosophy.
  • Uschanov, T. P. (2002) “Ernest Gellner’s Criticisms of Wittgenstein and Ordinary Language Philosophy,” Gavin Kitching and Nigel Pleasants (ed’s) Marx and Wittgenstein: Knowledge, Morality and Politics. London: Routledge.
    • A variant of this paper is titled “The Strange Death of Ordinary Language Philosophy.”
  • Winch, Peter (1995) “Discussion of Malcolm’s Essay” in Norman Malcolm’s Wittgenstein: A Religious Point of View? Peter Winch (ed.) Ithaca: Cornell University Press.
  • Windt, Jennifer (2015) A Conceptual Framework for Philosophy of Mind and Empirical Research. Cambridge: MIT.
  • Wittgenstein, Ludwig (1958) Philosophical Investigations, Elizabeth Anscombe (trans.). Oxford: Blackwell.
  • Wittgenstein, Ludwig (1961) Tractatus Logico-Philosophicus, David Pears and B. F. McGuinness (trans.) London: Routledge and Kegan Paul.
  • Wittgenstein, Ludwig (1970) Zettel, G. E. M. Anscombe (trans.) Berkeley and Los Angeles: University of California Press.
  • Wittgenstein, Ludwig; Moore, G. E.; Malcolm, Norman; Citron, Gabriel (2015) “A Discussion between Wittgenstein and Moore on Certainty: From the Notes of Norman Malcolm,” Mind 124 (494): 73-84.
  • Wolf, Fred Allen (1995) The Dreaming Universe: A Mind Expanding Journey into the Realm in which Psyche and Physics Meet New York: Simon and Schuster Inc.
  • Wright, G. H. (1992) “In Memory of Malcolm, Norman 1911-1990,” Philosophical Investigations 15 (3): 224-226.
  • Yost, Jr., R. M. (1959) “Professor Malcolm on Dreaming and Scepticism—I,” Philosophical Quarterly 9 (April): 142-151.
  • Yost, Jr., R. M. (1959) “Professor Malcolm on Dreaming and Scepticism—II,” Philosophical Quarterly 9 (36): 231-243.

Author Information

Richard McDonough
Email: rmm249@cornell.edu
Arium School of Arts and Sciences
Singapore

Aesthetic Taste

Taste is the most common trope for the intellectual judgment of an object’s aesthetic merit. Its popularity rose to an unprecedented degree in the eighteenth century, the main focus of this article, when taste became a major concept in aesthetics. This prominence was so pronounced that taste as an aesthetic idea might seem to have developed from nothing during this time. However, the roots of theories of taste stretch back, as many things do, to Plato and Aristotle. In discussing the human soul, for example, Aristotle emphasized the role the senses play in obtaining knowledge and making judgments. As a condition for sentient beings, touch is the main component of taste, since the tongue must touch what it tastes. So the idea that taste can be used to make judgments was present early on, as the embryo of the more robust theories of taste that followed.

Though it is no secret that theories of taste thrived in the seventeenth and eighteenth centuries, their success might still be surprising given the intellectual focus of the period. Science and the higher faculties of reason received greater emphasis, while Alexander Baumgarten began using the word aesthetics to refer to the lower faculties of judgment. That these lower faculties became so popular is unexpected in the wake of the scientific developments and ideas of the day. But these philosophers realized that there was something in the common experience of beauty that they did not understand. Perhaps people began to believe that humans really are the measure, since they were the ones making these new intellectual advancements. The ability to judge beauty would then become more important, as they believed their judgments were more accurate or substantial. However, they still did not agree about the specifics of these judgments. For David Hume, taste is a subjective feeling with a standard found within the beholders. For Alexander Gerard, taste is an act of the imagination. For Immanuel Kant, taste is subjective, but beautiful objects present themselves as having universal appeal. And this is just a smattering of the different ideas.

Despite this strong beginning, taste had lost its importance in most theories of aesthetics by the twentieth century. Yet on a popular level, people continue to refer to good and bad taste in what are meaningful exchanges. Many subsequent philosophers have tried to develop a more involved theory of gustatory taste as a branch of aesthetics. Though this might have its own value, taste in the more traditional sense has not completely faded away, even though people no longer devote as much time to theories of taste.

Table of Contents

  1. Early Foundations for Taste: Ancient to Medieval Philosophers
    1. Plato and Aristotle
    2. Plotinus
    3. Augustine and Aquinas
  2. Why Taste Became the Metaphor for Aesthetic Judgment
  3. Eighteenth Century Philosophers: The Century of Taste
    1. Joseph Addison
    2. Anthony Cooper, Third Earl of Shaftesbury
    3. Francis Hutcheson
    4. Moses Mendelssohn
    5. Johann Gottfried Herder
    6. Alexander Gerard and Archibald Alison
    7. David Hume
    8. Edmund Burke
    9. Immanuel Kant
  4. Nineteenth and Twentieth Century Philosophers: The Step Away from Taste
  5. Contemporary Philosophy and Beyond
    1. Pierre Bourdieu
    2. Gustatory Taste
    3. Some Developments in Analytic Philosophy
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Early Foundations for Taste: Ancient to Medieval Philosophers

Theories of taste did not explicitly come to the forefront until the eighteenth century; however, most of the foundational ideas were in place many years prior. The focus was more on beauty and truth than on what the beholder felt about a given work. These ideas came to influence the theories of later thinkers as they revitalized, revised, and responded to the writings of these early Greek and medieval philosophers. Only a cursory overview of these preliminary thoughts is given here.

a. Plato and Aristotle

Plato, Aristotle, and the other ancient Greeks did not have any specific notion of taste as a means of aesthetic judgment. However, many of their ideas inspired the later developments of theories of taste. Plato’s metaphysical beliefs, especially his view of the perfect forms, had an acute influence on the later Neoplatonists, even on those who did not specifically believe in a realm of the Forms. The traditional understanding of Plato holds that there is a heavenly realm where the perfect Forms of reality exist. Whether or not Plato believed in a literal realm of Forms is open for discussion, but it seems clear that he believed in perfect versions of everything we experience on earth. These Forms are like the templates of reality, and reality is therefore less perfect than these Forms. For instance, the Form of Beauty is the standard by which all other beautiful things are measured. Necessarily, all of the particular things are less beautiful than this perfect Beauty. To reach this higher Beauty, one must rise up to it through a dialectical method, as Socrates learned from Diotima in Symposium. Almost like ascending stairs, one uses the lower beauties of the world to climb up to the higher realms. To explain, we start with physical beauty, move to intellectual beauty, and then arrive at spiritual (or perfect) beauty. More could be said about Plato’s overall view of aesthetics and beauty, but it is important to note here simply that the apprehension of the beautiful is connected with knowledge. As one obtains knowledge, one continues to learn more of beauty. So, knowledge is the key component to developing a better appreciation of beauty and, for Plato, arriving at Beauty itself.

Aristotle, like Plato, did not have a concept of taste per se. The quest was to uncover the principles of beauty. Rather than believing real beauty existed in another world, Aristotle wrote that beauty was a property of objects; it was related to their size and proportion. Though Aristotle did not develop a system of aesthetics as such, he is the first on record to have developed an extended treatment about one of the arts, namely poetry. Since this is the main art that Plato criticized in the Republic, one might wonder whether this was Aristotle’s attempt to further distinguish his own system of philosophy.

While Plato’s view is that Beauty has the same nature but with different degrees in different objects, Aristotle seems to hold the idea that beauty’s nature varies with the different objects (or types of art) in which it is found. Therefore, the beauty of an object might relate to that object’s purpose, though he never directly says so. More important to Aristotle’s view are the concepts of form and unity. As form and unity are necessary for knowledge in the strict sense, so they also provide a kind of knowledge in art as an object imitates something else. While Plato believed that art diminishes the knowledge of something to an almost unrecognizable degree, Aristotle holds that the imitation helps the idea become simpler and therefore more easily understood. The imitation can actually be a useful, sometimes necessary, step in obtaining knowledge. And the imitation, though not always complete, is correct. Rather than rising to some higher forms of beauty beyond this physical world, Aristotle seems to have a more experiential approach to discovering and judging beauty. Each kind of thing has its own form and therefore its own beauty. To develop what we might call aesthetic judgment, though Aristotle does not use that expression, one would have to observe enough samples of different objects of the same kind to discover the order and arrangement proper to those things. Both Aristotle and Plato have beauty located outside of human experience, so taste would have been the search for beauty in things.

b. Plotinus

Plotinus, the most recognized Neoplatonist, developed his metaphysical philosophy around three principles: The One, Intellect, and Soul. Following earlier philosophers’ attempts to derive the more complex things from the simple, Plotinus posits the One as the simplest first principle of everything else, and everything is derived (or emanates) from this first principle. The One is even more foundational for reality than Plato’s Forms, without which the Forms would have no unifying principle. The second principle, Intellect, is where the other Forms reside, so to speak. These Forms, like in Plato’s view, are what give everything else their respective properties. The third principle, Soul, is the principle of desire for those things that are external to the individual. Plotinus, like Plato, posits degrees of beauty, with the lowest being physical beauty, leading up to the Beauty present in the Intellect.

Likely influenced by the ideas in Diotima’s speech in Plato’s Symposium, Plotinus holds that we begin our ascent from physical beauties and climb up to the highest Beauty. Where he might differ from Plato is in the hierarchy of beautiful things. Presumably, Plato thought natural things were more beautiful than artifacts because they were closer to the Forms of those things. However, Plotinus claims that the manipulation of a natural object through art made it more beautiful. To demonstrate, Plotinus uses an example of two stones: one is naturally occurring and the other has been wrought by an artist into the image of a god. Which one is more beautiful? Plotinus thinks it is clear that the one that has been imbued with the soul of a human artist has achieved a higher degree of beauty. The soul of the spectator can enjoy this object more than the natural stone because he recognizes the work of a like soul. As one ascends toward Beauty, the goal is to rely less on the senses, though they are the necessary condition for beginning the ascent. It seems for Plotinus that taste is not necessarily developed or reasoned about. Rather, it is almost like a reaction in the soul, based on the knowledge that the soul possesses.

c. Augustine and Aquinas

Medieval philosophers were concerned with metaphysical properties like beauty more than any notion of individual preference or taste. This was partly because beauty, for most of the philosophers of this time, was an objective property. There wasn’t any room for disagreement about whether something was actually beautiful, though they presumably debated about whether one’s particular knowledge about the beauty of an object was correct. It was widely believed that the true, the good, and the beautiful were linked to each other. Talking about any one of these concepts involved overlapping discussions of the other two. For instance, theories of beauty consisted of some discussion about its relation to the true and good. For present purposes, the main two representatives of the early and later middle ages will be discussed: Augustine and Aquinas (See also the article on Medieval Theories of Aesthetics).

Continuing the basic ideas of Plato, Augustine thought perfect beauty existed in God rather than the impersonal realm of the Forms. In fact, God is the highest beauty, and everything participates in beauty because everything is created by God. In physical objects, Augustine believed two primary attributes made those things beautiful: equality and unity. Unity is found in everything that exists, but equality (proportion or symmetry) is not necessarily found in everything, especially those things made by people. Augustine provides an example that shows we at least aim for equality in our work: If you want to put two windows on the side of a house, you do not want one to be gigantic and the other tiny. You want them to be the same size, assuming the wall is an even rectangle. For Augustine, the judgment of beauty is founded upon a person’s apprehension of the unity and equality of an object. And this involves reason, which isn’t that different from previous thinkers who thought knowledge was a necessary aspect of grasping beauty. The standard of beauty is in God’s mind, so the beholder must come to understand this standard through some divine illumination. Without God’s help, a person might see vaguely the beauty of an object, but it is God alone who can help the beholder grasp the fullness of beauty. Though Augustine does not have a theory of taste, we might say that one’s taste is perfected the closer one is aligned with God.

Unlike Augustine, Aquinas adheres more carefully to the overall philosophical views of Aristotle rather than Plato, though Plato’s influence is not absent. Finding beauty present in physical objects, Aquinas famously asserts that beauty is that which pleases when seen. It might appear that Aquinas’s definition asserts a subjective understanding of beauty; namely, whatever pleases the onlooker becomes beautiful. However, the word seen implies contemplation of the object. Once again, knowledge comes into the apprehension of the beautiful. Aquinas’ view of beauty differs from the platonic view in that beauty is really present in the object, though similar to Augustine’s view, beauty is still God, who is the ultimate cause of all beauty.

Recall that Augustine offered two main traits of beautiful things: equality and unity. Similarly, Aquinas presents us with three conditions of beauty: proportion, wholeness, and radiance. Proportion involves symmetry but is not limited to this one aspect. It involves whether there is an overall balance achieved in the object. Wholeness (or integrity) is the condition that involves the degree to which something attains its proper form. For example, a dancer sitting down is less beautiful—as a dancer—than when he or she is actually dancing. The last condition, radiance, is the most evasive. It might just be, for Aquinas, the most important condition because objects might have proportion and be whole yet still not be radiant. Generally speaking, it is that quality of an object that makes us want to perceive it again. It involves the way an object “shines” before the beholder. For Aquinas, perceiving the beauty in an object is not passive; it is an activity of the intellect. Like judging the truth of a proposition, a judgment of beauty begins with cognition. Then, the beholder makes a judgment based on these three conditions. Taste, if Aquinas had a theory, might be the ability to recognize these three universal conditions in their specific instantiations.

2. Why Taste Became the Metaphor for Aesthetic Judgment

We have mentioned briefly the basic ideas that created the foundation for theories of taste, but we still need an explanation as to why taste became the metaphor for aesthetic judgment. The sense of taste, in ancient times, was connected with the appetite, not with rational judgments. Seeing and hearing provide the most information and thus were considered the best senses for gaining knowledge. The other three senses simply help round out that knowledge. Therefore, it would have been natural to assume that seeing and hearing are also the best senses for pronouncing a judgment of beauty or sublimity. After all, these two senses were thought to be necessary for making intelligent judgments. But it was taste that became the main faculty for making aesthetic judgments, especially for the 18th century philosophers. Of course, it was not literal tasting, but metaphorical, that was at work here. Because of this, some posited a sixth, internal sense that they referred to as taste. Still, one might wonder why taste suddenly emerged as the metaphor for making judgments about beauty. Though there is no exact reason why taste rose to this prominent role, there are a couple of ideas about it worth mentioning here.

In the Aristotelian tradition, taste is connected strongly with the sense of touch. Though he maintained five senses as we do today, Aristotle considered whether there might only be four. It is necessary for the tongue to touch food, for example, for the food to be tasted. In the middle ages, this became more significant as different tastes were believed to have healing and nourishing effects on the body. It was believed that different flavors held different properties for the body, and a mixture of flavors was necessary in order to maintain healthy balance. Flavor was not accidental in different foods. Thus, good taste, in the sense of diet, was necessary for one’s physical well-being.

In the later middle ages, taste was occasionally related to the term honest by referring to objects as, for example, an honest painting. This description might seem unusual since honesty is often connected with truth. And we consider honesty describing only a being with a will to choose, because one has to decide to be honest in a given situation. An inanimate object cannot make such a choice. Calling an object honest, however, was a reflection on the viewer or, more specifically, ideal type of viewer. Basically, an honest object is one that an honorable person would consider to be beautiful. It would be an object that is well suited for its purpose. This idea is connected with the belief that the good and beautiful are related, so of course, the good person is better suited to apprehend the beauty of an object. Taste was recognized as the sense associated with the ability to discriminate, namely flavors. But taste also became the metaphor for discriminating or judging the beauty of an object.

3. Eighteenth Century Philosophers: The Century of Taste

Theories of taste sprang up in the eighteenth century, which is why George Dickie refers to it as “the century of taste,” a phrase that is also the title of his book. Everyone had to contribute something to the discussion, and then it seemed to die down as quickly as it had arisen. Prior to this century, most of the discussion centered on theories about beauty, which was deemed objective, but now philosophers began to look more toward themselves to understand their reactions and preferences to such things in both art and nature. This shift began their theorizing about taste, which turned the discussion toward the subjective. And then, a century later, this discussion transformed into theories about the aesthetic attitude.

a. Joseph Addison

Joseph Addison (1672-1719) did not present a systematic treatment of aesthetics, but he did promise and deliver original ideas spread throughout his essays for the Spectator in 1712. Specifically, Addison set out to investigate the pleasures of the imagination. The first essay in this series of eleven is devoted to taste. He writes that most languages employ the metaphor of taste to indicate the faculty of the mind that distinguishes between faults and perfections in writing. This faculty of mental taste (which involves the perception of beauty), like that of sensitive (or physical) taste, has degrees of refinement. So Addison was trying to help those in the middle-class utilize their brief moments of leisure for these kinds of pleasures of imagination. The pleasures of cognition, which involve intellectual thought, might not be possible for some people who have lesser intelligence or lack access to education. But the pleasures of the imagination—eyesight furnishes the ideas here—are just as good and are more easily obtained. After all, every image, says Addison, enters our minds through sight.

Addison asserts that taste is a person’s psychological response to literature. Though his remarks are mostly framed in the context of literature, Addison’s basic ideas became the foundation for people’s thoughts about other kinds of art and nature. Even though the faculty of taste is present in people at birth, it must still be cultivated to be brought to its fullest ability to judge. This should not be surprising, because the same is true for the sensitive taste. Putting something in one’s mouth enables the sensation of taste to work quite automatically. But it can take years of experience and practice to develop a sensitive taste refined enough to detect the subtle differences between two glasses of whisky.

The pleasures of the imagination are found in two types. Following Locke, Addison maintains that no images enter one’s mind without going through the sense of sight. The primary pleasures come directly from the visual objects, which are present to the observer. The secondary pleasures arrive from those objects that are remembered or fictitious, being only in the mind (at least at the present moment). A person, through imagination, can manipulate or alter the images that are in the mind. The aesthetic pleasure arises solely through the contemplation of these ideas or those images in the mind. These pleasures of the imagination are greater than sensual pleasure and are as great as the cognitive pleasures. They have at least one advantage over the cognitive pleasures—the pleasures of the imagination are much easier to obtain because one has to simply open one’s eyes.

b. Anthony Cooper, Third Earl of Shaftesbury

Anthony Cooper (1671-1713), the Third Earl of Shaftesbury (usually just called Shaftesbury), started his thoughts on aesthetics from Neoplatonic metaphysics. Shaftesbury developed his belief that taste was inborn in human beings, an idea perhaps similar to recollection in Plato. For anyone reading Shaftesbury, it becomes clear early on that he is not interested in developing a “system” of aesthetics. His thoughts cycle through his narrative, especially in his work “The Moralists.” Woven throughout these works are many important ideas that Shaftesbury does not always fully develop but that were still highly influential on those writing after him.

Shaftesbury maintains that people grasp beauty and goodness in exactly the same way, which involves the moral sense. The beautiful is closely related to virtue in his thinking; hence, moral theory permeates most aspects of Shaftesbury’s understanding of aesthetics. One’s sensibility in the realm of morality is intertwined with one’s apprehension of beauty. Not only does he borrow from Neoplatonism, but Shaftesbury also emphasizes experience, showing elements of empiricism in the development of his ideas. He holds that forms of the beautiful and good are embedded in people’s minds, but each person has an internal (moral) sense to which he or she can appeal. These two attitudes provide tension among the characters—Theocles and Philocles—in his prose as they seek to sort out opinions concerning taste. It was likely this interplay of contrary ideas that led Shaftesbury to utilize the prose style where he told a story of a group of people having discussions about taste and beauty. This style invites the reader into the discussion, similar to Plato’s use of dialogue.

The key aesthetic property is harmony, which is found in nature as created by God. Seeing God as the ultimate artist, Shaftesbury extends aesthetic appreciation to the natural world as the ultimate aesthetic object. An important part of Shaftesbury’s belief is that the moral sense allows a person to comprehend an object’s beauty immediately, without the need to use reason. Intuition is at work here more than sensation. Obviously, the object is initially perceived by the senses, but then it is immediately judged by the internal (or moral) sense.

Though it seems possible for variation among different people’s internal senses, Shaftesbury did not think that aesthetic judgments were relative. He believed in a universal standard of judgment for beauty. Theocles claims that an inward eye immediately differentiates the fair and admirable from the deformed and foul. This ability must be natural, since it differentiates as soon as objects are perceived by the senses. If the ability to discriminate (through one’s taste) between beauty and ugliness is immediate, then taste cannot have its ultimate grounding in the process of reason, which takes time. Experience affirms the immediacy of one’s judgments concerning objects of perception. For example, being captivated by a sunset usually does not require more than a glance to draw the viewer in. Therefore Shaftesbury, through Theocles, maintains that taste cannot have its ultimate source in discursive reasoning.

Some things can block people or cloud their minds from being able to make sound judgments. Even though taste resides innately in human beings, passions and ignorance prevent one’s internal sense from successfully comprehending beauty in sensible things. Shaftesbury acknowledges that one cannot escape these obstacles; however, one can learn to control them in order to avoid being tossed around by one’s whimsical feelings. The internal sense connects goodness and beauty for Shaftesbury; therefore, one can allow beauty to affect oneself more fully by cultivating a virtuous or harmonious life. Someone deprived of virtue will be less able to perceive beauty than one who lives a virtuous life.

Theocles declares several times that beauty and good are the same thing, which the inward eye enables people to immediately perceive. Then, the beauty and goodness of these objects are compared to the innate concept of harmony. It seems that the closer they are to this notion of harmony, the more beauty they are judged to possess, remembering that this happens without reasoning about the object. The same would also apply to one’s judgment about the actions of others, whether they are noble or evil. When one builds good foundations of “order, peace, and concord,” then one is able to immediately connect with beauty. The reverse is also true: if one is unable to experience the beautiful, then it is indicative that one’s life is disharmonious. Philocles raises an interesting objection to Theocles’ schema. He wonders why there are so many different people believed to be virtuous, yet their actions are often conflicting. Theocles agrees that seemingly virtuous people differ in their opinions about heroes and whether gardens or paintings are better. These differences create tension when seeking which opinions should have authority. Shaftesbury is vague on this point: It seems he is trying to claim that happiness is the measure of a successful life. And virtue, which leads to success and happiness, is the prerequisite for developing taste to comprehend beauty. In the end, it seems that the individual is responsible for his or her own happiness and has to make decisions accordingly. If there are different, even contrary, examples of a virtuous life, then it seems difficult to know whether one is actually living the virtuous life and able to apprehend beauty to a fuller degree. But he still maintains that a developed sensibility allowing one’s innate taste to have full play is the result of guiding oneself toward a moral life with happiness as the standard of measure.

c. Francis Hutcheson

Francis Hutcheson (1694-1746) started from a Lockean view of sensation, which is divided into simple and complex ideas and primary and secondary qualities. With that foundation and his belief in a moral sense, Hutcheson also posits an innate and internal sense that was necessary for perceiving beauty. One reason for this view is exemplified by the fact that some people’s external senses are fully functioning, yet they find no enjoyment in the arts. If their five senses are working properly, then the hindrance must come from another sense. Moreover, there are things like mathematical or logical theorems that are deemed beautiful, but they are perceived by the mind and not the five senses. Finally, as further proof, Hutcheson notes that beauty is perceived immediately and does not require any knowledge; people make aesthetic judgments quite instantly. So, for Hutcheson the ability to grasp beauty must be another, internal “sense.”

The internal sense—Hutcheson does not clearly define it—is a mental faculty that functions much like one of the five senses. However, it recognizes beauty in both sensuous and mental experiences, which makes it sufficiently distinct. Hutcheson holds a complementary place to Shaftesbury in the development of the idea of innate taste. Hutcheson also blends his aesthetic theories with his moral theories, and both contexts allow for innate elements in human beings. Like an external sense, this internal sense is natural and is not governed by one’s will. Hutcheson points out that the will does not determine whether an object causes pain or pleasure. It is a natural instinct to pull one’s hand away upon touching something hot. Experiencing some objects causes pleasure, while other objects inevitably cause pain. As an analogy, Hutcheson demonstrates that pleasure in artistic objects—architecture, painting, musical composition, and so on—is also innate and necessary. Though he finds the faculty of taste to be an internal sense, Hutcheson explains that the pleasure arises out of the harmony, order, and design of the object. But he does not think that simple ideas, like color, sound, or mode of extension, can provide the same pleasure.

Concerning taste, Hutcheson believed that beauty represents the idea, while the sense of beauty represents our ability to grasp this idea. The combination—one’s ability to perceive beauty internally—is what he refers to as taste. When perceiving beauty, we should note that Hutcheson proposes a distinction between absolute (or original) beauty and comparative (or relative) beauty. Objects have absolute beauty when they are beautiful in themselves without a comparison with any other object. Comparative beauty, on the other hand, is grounded in the comparison between the object of the perception and the object that it imitates.

Beauty, for Hutcheson, is mostly comparative, which means it would not exist without relating to the mind of a perceiver. Objects play their part by exciting in people feelings of beauty when there is “uniformity amidst variety,” which is the primary property of beauty. When the uniformity is equal, the beauty increases with the variety. For example, an equilateral triangle has less beauty than a square, while a perfect hexagon has more beauty than both of them. Conversely, when the variety is equal, more uniformity enhances the beauty. For example, a square is more beautiful than a rhombus. This uniformity with variety triggers the internal (and innate) sense of taste in human beings, causing them to apprehend the beauty of the object. External things only contribute by relating to this internal sense, causing it to activate feelings of pleasure. This activation of pleasure notifies observers that they are experiencing something that is beautiful.

d. Moses Mendelssohn

As a committed rationalist, Moses Mendelssohn (1729-1786) did not want to rely on emotional responses for aesthetic experiences. He was dedicated to the principles of Leibnizian metaphysics. Mendelssohn’s goal of understanding the world could only come from rational principles applied to reality. The rationalists advanced the notion that clear and distinct ideas are present when one understands the interconnectedness of things. Taste also falls under the rationalist scheme and is something acquired and developed rather than an internal sense that is natural. Since clear and distinct ideas are not easily realized, Leibniz suggests that most of our knowledge consists of clear and confused ideas. Clear ideas arise from an object that is distinguishable in a sense perception, but they can be confused (that is, not distinct) because their contents are not distinguishable. Clear and confused ideas usually result when one knows the whole and not the parts, that is, the interconnectedness of the parts is not known.

In “On Sentiments,” Mendelssohn presents a series of letters, written by Theocles, as a reaction to Shaftesbury, who had a character with the same name. Mendelssohn believed that views like Shaftesbury’s, though freethinking, lacked the rigor necessary for precision. Mendelssohn’s Theocles admits that when someone does not have the requisite experience of beauty, it is likely from lack of preparation. Theocles claims that he prepares himself to experience beauty, and this preparation is necessary for the experience. It might be similar to a runner stretching before running a race. People ready themselves in many different contexts, so it should not seem odd to prepare for an aesthetic experience. Mendelssohn’s Theocles explains that he actually prepares to experience something pleasurable by initially striving to perceive it distinctly. Making a transfer from parts to whole, the distinct ideas fade out into the background and become confused. Since it is necessary for the whole to be present to the senses at once, the universe can only be a beautiful object for the mind of God. Hence, the finitude of mankind prevents objects too massive or too minuscule from being perceived as beautiful.

Mendelssohn describes some criteria for explaining why an object is effective at presenting a perfection or an imperfection, which aids in apprehending beauty. He describes three proportions that act on our impulses: (1) the proportion to the magnitude of the good, (2) the proportion to the magnitude of our insight, and (3) the proportion to the time required to consider this good. The first proportion relates to perfection, implying that things which possess a higher degree of perfection are more pleasing to the mind. The second one relates to knowledge: the more distinct one’s knowledge is of something, the more impact that thing has on the individual. The last proportion requires more explanation. It relates to the speed of the perception. The less time it takes to perceive a perfection, the more pleasant is the knowledge of that object. Something that can be perceived quickly might produce greater desire in the perceiver than something that is more perfect. By learning to see things clear and confused, that is, the whole but not the parts, one can learn to perceive more quickly. One learns to train the soul through habit and practice; the goal is to become so trained that an action no longer requires thought (or at least requires less thought). Practice and intuitive knowledge are the two main ways to increase the speed of one’s thoughts. Practice involves constantly reviewing things, such as inferences in practical philosophy, until they become ingrained in one’s mind. Intuitive knowledge entails continually learning to apply the practiced inferences to concrete situations. In terms of aesthetic experience, one learns through reason things that are supremely beautiful by being often exposed to beauty. Eventually, one practices and applies taste through the instrument of reason until it becomes embedded, and it will eventually function without thought.

Mixed sentiments—those combining pleasure and displeasure—are another indicator of Mendelssohn’s belief that taste is acquired. Sympathy is the primary example Mendelssohn employs to illustrate the notion of mixed sentiments. Sympathy expresses love for an object, while also being discontent at the object’s or person’s misfortune. He demonstrates this idea using examples from drama. When a tragedy is about to occur, the audience can appreciate the ability of the actors, directors, and writers to make them feel terror; however, the audience is not afraid for themselves but for the characters who are about to suffer. The interesting thing about mixed sentiments is that they penetrate more deeply and vividly into one’s mind than any type of pure pleasure. Like learning to recognize the three proportions, habit is also required to develop an understanding of the mixed sentiments. One must practice utilizing mixed sentiments to discover and experience beauty and sublimity.

Mixed sentiments lead Mendelssohn into thoughts on perceptions and one’s reaction. An extremely large object that we could think about as a whole but could not comprehend in person causes a mixed sentiment of gratification and trembling if we continue to think about it. As examples, he suggests the depths of the ocean, a desert stretching out to the horizon, or the seemingly endless stream of stars in the sky. One feels euphoric, a pleasing nausea. Pure pleasure will eventually breed boredom induced by monotony, but mixed sentiments will overpower one’s senses, making one want to perceive the object again and again. Mixed sentiments and training the mind are two important facets of Mendelssohn’s understanding of how people develop or acquire taste.

e. Johann Gottfried Herder

Johann Gottfried Herder (1744-1803) shared the notion of reasoned or developed taste with Mendelssohn. He deviated from Mendelssohn by grounding everything in nature, while Mendelssohn was a staunch advocate of Leibnizian metaphysics, grounding everything in reason. It might seem that a belief in the supremacy of nature would lead one to the view of innate taste, like the view held by Shaftesbury. However, Herder does not begin with innate ideas like those in the Platonist school; he places more emphasis on discovering and developing an ability to perceive beauty. Herder adds a step to Mendelssohn’s view, rather than opting for innate taste. Mendelssohn basically believed that reason develops taste, while Herder believed that nature leads to reason, which then leads to taste. In commenting on the natural aspect of taste, Herder explicitly claims that truth and beauty are disclosed through the use of reasons. When one is induced by reasons, then one will naturally expect everyone to accept the same reasons as evidence of truth or beauty. He was well aware that not everyone would actually agree with the same type of reasoning concerning the beauty of a given object. He merely asserts that it is natural to expect (or want) others to be in agreement.

People tend to have differing views about what counts as beautiful or ugly, and it is important for Herder’s view that this occurrence be explained. Dealing with people’s differences of opinion is one of Herder’s distinguishing characteristics. He was very interested in the way that diverse people develop and come to think and act in distinct ways from other people, and he points out the fact that taste changes throughout time and from place to place. He links this change, as well as others, with culture and upbringing. Everyone, according to Herder, possesses an aesthetic nature, which is one’s capacity to apprehend beauty through the senses. This aesthetic nature is the starting point for each person, but it develops in different ways depending on one’s culture, background, and experience. For example, if someone immerses oneself completely in the art of music, then one will be exceptionally trained to hear the melody of music. At the same time, this person might be ill-equipped to perceive visual beauty because one’s eyes might not be as well trained as one’s ears. Nature has equipped everyone with similar capacities to perceive beauty, but each person is responsible for developing these capacities. On the other hand, people are restricted by how much their society and environment have contributed to developing their tastes as a whole. Beauty is not always obvious in every culture, but Herder claims that it is always present, at least in a foundational way. Utilizing one’s reason and overcoming one’s background are necessary for developing good or refined taste.

f. Alexander Gerard and Archibald Alison

Much like Hutcheson, Alexander Gerard (1728-1795) and Archibald Alison (1757-1839) built their theories of taste upon a foundation of Locke’s notion of ideas. They each developed from this foundation views of taste called associationism—a view that the mind (or imagination) relates ideas that are similar to each other or conjoined by custom or experience. Even though their theories differed in degree, there is enough overlap to list them together.

Gerard believed that taste was a kind of internal sense similar to the external senses. Like those five senses, experiences for this one were also simple and immediate. As soon as something comes into your field of vision, the sense of sight perceives it immediately. Likewise, as soon as beauty—or another aesthetic property—enters into your perception, you can immediately experience its beauty. Gerard divided up his study into seven principles of the internal sense (or powers of the imagination), not only a sense of beauty like Hutcheson. The seven principles are novelty, sublimity, beauty, imitation, harmony, oddity (humorousness), and virtue. This may seem like a curious list for contemporary theories of aesthetic taste, but Gerard’s association theory makes sense of these principles.

Taste, for Gerard, is a kind of critical perception, which he calls relishing. It went beyond simply perceiving an object. Anyone who ingests food can taste it in the most primal sense of the term. But to discern differences and subtleties requires a whole other set of abilities. The pleasure is derived from the seven categories because they require moderate difficulty to formulate or comprehend this new idea. Basically, the new object is associated to previous ideas in the mind of the perceiver, and this is an act of the imagination. Rather than being a mere feeling, the imagination follows rules to make these associations. Strong passions conjure up these associations, in a sense, but then the mind continues the process of associating these feelings with the appropriate concepts.

People improve their taste when judgment and imagination are combined through the following factors: sensibility, refinement, correctness, and proportion (or comparative adjustment). All but the last refer to a single property among various objects. Sensibility is basically a person’s range of feeling pleasure and pain, which, Gerard notes, differs from person to person. Refinement involves making comparisons, especially between lower and higher degrees of a particular quality. Correctness, for Gerard, means alleviating the confusion between what are merits and what are blemishes. Proportion, on the other hand, compares whole objects with each other, rather than mere properties. One’s taste improves as one develops a refined ability to utilize these four factors to unite the seven principles when apprehending an object of beauty.

Alison provides an overly detailed association theory of taste, but here only the basic ideas of his view will be presented. To begin with, beauty is found in the mind of the perceiver; he does not consider it a property of the object. He maintains this opinion because, when describing an experience of beauty, one always resorts to talking about how it made him or her feel. Imagine someone claiming that a given object is extremely beautiful, and yet it is an object of indifference. That seems impossible, which is why Alison believed feeling is necessary for beauty. And this feeling of beauty arises through what he calls a train of taste. This is similar to someone having a train of thought, where one thought is associated with or leads to another thought and so on. A train of taste begins with a simple emotion—such as cheerfulness—that arises when perceiving an object. This simple emotion becomes the starting point for a train that associates the ideas of emotions. While this is the necessary starting point for an aesthetic experience, this train must also produce emotions.

Alison’s association view claims that an affective quality of an object becomes associated with ideas of emotions as a train of taste. Through constant conjunction, the material quality and the abstract or emotional quality become correlated in experience. To illustrate, thunder might produce fear in a child because the child associates the noise and lightning with the emotion of fear; on the other hand, a farmer might feel joy upon hearing thunder if this season has been particularly dry. Unlike these examples, differences in people’s tastes result from an absence of the right associations. People, for different reasons, may fail to produce the requisite trains of taste that lead to the right emotion. This can be caused by different concerns interfering with one’s ability to allow the trains of taste to develop. Worrying about paying next month’s rent, for example, could hinder one’s ability to follow the train of taste where it will naturally lead. Thus Alison, like many others, posits a notion of disinterestedness as a necessary condition: one must not be distracted by cares in order to allow one’s taste to apprehend and appreciate beauty.

g. David Hume

Even though David Hume (1711-1776) wrote little on aesthetics, his condensed essay “Of the Standard of Taste” was highly regarded by those who came after him. One cannot successfully treat the subject of taste thoroughly without some reference to this essay. Hume is generally labeled an empiricist, but in terms of taste, we could classify him as an ideal observer theorist who allows for some individual and cultural preferences. Empiricism, however, seems an apt label when considering certain elements of his essay on taste, namely that its foundation is experience. Art as a social practice is contained, for Hume, under the general theory of human action that he presents elsewhere but does not develop explicitly for his aesthetic views.

Hume draws a distinction between sentiments and determinations. Sentiments are always right because they do not reference anything beyond themselves. However, determinations are not all correct because they make reference to something beyond themselves, something that could be verified or falsified. Beauty is not a quality of objects; therefore, judgments of beauty and taste are sentiments, not determinations. If beauty were a quality of objects, then we would have a standard of beauty contained within those beautiful objects. Despite this result, Hume still wants to allow for certain kinds of opinions that seem correct from experience. While there are some objects that might be close in beauty to each other, there are others that clearly seem to be more beautiful than other objects.

As a prime example, Hume claims quite famously that no one with a right mind would think that there is no difference in excellence between Ogilby and Milton. But this difference is not something in the object itself, for beauty is not a property of objects but is in the mind only. So the objects that affect the higher sentiments of the person are the ones that we deem more beautiful. These effects are the result of cultural convention and therefore are subject to change. But within a culture, there is a standard of taste that isn’t explicit, like the law, but is based on experience (comprising practice and comparison), especially experience of the right kind of person. Hume appeals to a true judge who would be able to perfectly assess the beauty of an object because this person would possess “strong sense, united to delicate sentiment, improved by practice, perfected by comparison, and cleared of all prejudice.” The combined opinion of these very rare individuals would compose the standard of taste. The standard of taste lives within these true judges. By recognizing the better judgments certain people have displayed, the standard of taste they represent becomes public.

It is important, however, not to confuse Hume’s true judge with someone like a contemporary art critic. The judge is not applying a standard of taste to the different objects of perception. If so, then beauty would be found in the objects or in some other realm. To better understand what Hume means, we can explain it this way: Many people have experienced looking at something and not understanding what they are seeing. And then someone else comes along and shows them what to look for (or how to see it properly). All of a sudden, they are able to perceive the object properly. For another example, take seeing someone in the distance. You might think the person is your friend. But as the distance becomes smaller and the perception clearer, it is now obvious that the person is a stranger. These analogies are what Hume has in mind. The true judge does not apply a standard, but the true judge has more perfect perception. And ideal perception is the key to having good taste. It follows that becoming better at perceiving objects will make one’s taste better.

h. Edmund Burke

Like Hume and others, Edmund Burke (1729-1797) recognized that nothing seems more indeterminate than taste. Burke tried to show that, since we believe there are expert opinions on matters of taste, taste cannot be simply a personal whim. He even asserted it is likely that the standard of reason and taste is the same in human beings. The explanation, Burke claims, for thinking that reason and taste seem so different is that more people cultivate reason to a higher level. An error in reasoning could have far more negative consequences than an error in taste. For example, a heart surgeon’s reasoning about which kind of operation is necessary will have greater direct consequences than someone’s reasoning about whether Pablo Picasso is a better painter than Marc Chagall. The urge to cultivate taste is not present, so most people do not devote much time to it.

Though Burke recognized the ambiguities surrounding taste, he set his goal to try to uncover principles of taste. One of his starting points was the uniformity of people’s organs of perception. Many people emphasize the differences between people’s perception of the same event, which leads to the belief that people perceive things differently. Burke, however, maintains that if people’s sense organs functioned completely differently, then every kind of reasoning would be impossible. If two people were looking at a tree, for example, then there would be nothing on which to ground their separate claims that it is a tree. They might choose to describe what they see in contradictory terms, but as Burke claims, their sense organs must actually perceive the same object. Part of his method was to catalogue the different kinds of objects and how they affect the senses and which senses they affect. Specifically, Burke chose to categorize objects according to their giving pleasure or pain. Through this catalog, Burke believed he demonstrated that people have the same physical responses of pain and pleasure to various objects. This catalog further gives foundation for a more precise theory of taste by showing the similar responses people have toward different sense stimuli.

For Burke, pleasure and pain compose the two main aesthetic starting points for a judgment of taste, first going through the senses and the imagination. Since one rarely moves from pain to pleasure or the reverse, Burke introduces indifference as the neutral starting point for experience. In other words, one moves from a state of indifference to either pleasure or pain. If one is in an indifferent (or neutral) state, then music, for example, compels one to move to a state of pleasure. The power of the imagination utilizes the pleasure or pain to recognize the property of the object that led to that particular feeling. Depending on which feeling it is, the object is judged to be beautiful or ugly in accordance with the degree of pleasure or pain. So, Burke’s notion of taste consists of three things: primary pleasures of sense, secondary pleasures of the imagination, and conclusions of the reasoning faculty.

i. Immanuel Kant

Owing to his third major critique about aesthetic judgment, Immanuel Kant (1724-1804) remains an overwhelming influence in aesthetics. Much has been written about different aspects of Kant’s aesthetic theory, so this section will focus solely on his ideas surrounding taste. Though Kant fully believed that taste is subjective, he nevertheless referred to judgments of taste rather than something like feelings of taste. This choice was not a denial that feelings are relevant, since taste has to do with pleasure, but he wanted to uncover whether there were any a priori principles for taste.

Given Kant’s fondness for theoretical systems, it is no surprise that he divides judgments of taste into moments. There are four moments that correspond with the four judgments (quality, quantity, relation, and modality) found in the Critique of Pure Reason. The first moment, disinterested pleasure, corresponds with quality. It means that in order for a judgment to be one of taste, it must not involve any interest beyond itself. Disinterested is not the same as uninterested. Disinterested is closer to a kind of detachment. The object has nothing to give other than the pleasure of itself; there is not an interest beyond itself. If one found an expensive object, one might declare that it is beautiful. However, this would not, strictly speaking, be a judgment of taste if one were also thinking about the amount of money to be gained from its sale.

The second moment, universal pleasure, corresponds with quantity. It preserves the common belief (or feeling) that judgments about beauty are not completely subjective. We often expect others to share this belief. For example, we would find it highly unusual, if not disturbing, that someone literally believed that a sunset did not possess at least some beauty. Since Kant does not assert a specific standard of beauty, he doesn’t claim everyone will actually agree about which objects are beautiful. Judgments of beauty are singular; they are about one object at a time, and each judgment presents itself as having universal appeal.

The third moment, the form of purposiveness, corresponds with relation. Specifically, he is focusing on the relation of an end or purpose, a final cause. The purpose for which an object is made governs the way it is made. A hammer has a purpose as it was made to put nails into wood; so, the idea of its purpose existed before the actual hammer. However, judgments of taste (or beauty) do not depend on concepts, so it seems that they could not have purpose. But Kant believes that a judgment of beauty cannot be solely a feeling: it must be based on formal properties. To overcome this problem, Kant employs the expression “purposiveness without a purpose.” This is subjective; we must imagine that the object has a purpose even though, for an aesthetic judgment, it does not.

The fourth and last moment, necessary pleasure, corresponds with modality. Unsurprisingly, Kant does not think people find something beautiful because they must necessarily find it so. Kant explains that this necessity implies that the beautiful object is exemplary. When we see a beautiful work of art, we want to imitate it as if there were rules to follow to produce an equally beautiful object. Artists employ techniques that can be learned, but Kant believes that it is not possible to teach someone how to make a beautiful work of art even if that person masters all the techniques of a given art. Taken together, these four moments compose the basic aspects involved in making a judgment of taste.

4. Nineteenth and Twentieth Century Philosophers: The Step Away from Taste

Theories of taste rose up in the 18th century and diminished almost as quickly. As demonstrated by sheer numbers, 19th century philosophers were less concerned with taste than 18th century thinkers. They didn’t abandon aesthetic taste; rather, they moved from talk of aesthetic taste to talk of an aesthetic attitude. On some level, this change might seem like a mere semantic difference, but though it overlaps with taste, talk of an aesthetic attitude offers certain differences. (See also the article on Aesthetic Attitude for a fuller treatment.)

Taste is very outward looking, especially as it relates to aesthetic judgment. The object possesses concrete properties that the perceiver ought to judge as beautiful or not. Failure to make the correct judgment was considered a deficiency in the beholder. For some previous philosophers, it could be a flaw in a person’s virtue that hinders the ability to perceive the beauty of the object. For others, it might be more connected to a lack of knowledge or at least the right kind of knowledge. The key idea for most traditional theories of taste was that the object has properties that the beholder must discover, though the views of people like Hume start to show a shift.

In contrast, aesthetic attitude brings the individual onlooker more to the forefront. The beholder’s state of mind becomes more important as his or her attitude helps or hinders the possibility of aesthetic experience. Whether or not the original aesthetic attitude theorists believed so, these theories allow for a wider range of objects to be considered aesthetic objects. Just by adopting an aesthetic attitude, it seems like any object could be viewed as an aesthetic object. With the taste theorists, the object, apart from the spectator, must be worthy of the aesthetic appreciation it receives. Another difference lies in the fact that the aesthetic attitude can seemingly be turned on and off. Someone could adopt the aesthetic attitude in a given instance, but ignore it in a very similar situation the next day. There seems to be some truth to this because you could walk into an art museum wanting and expecting to experience wonderful things, but you could also enter with a refusal to see anything in an aesthetic light. Taste, according to the respective theories, is not something that is turned on and off. A person either has a developed or attuned sense of taste or not. In other words, the aesthetic attitude is a point of view one adopts, while aesthetic taste seems to be more connected with one’s development and nature.

Two main versions of aesthetic attitude theories occur in the writings of Arthur Schopenhauer and Edward Bullough. Schopenhauer’s (1788-1860) thoughts on aesthetics, we might say, mark the transition from theories of aesthetic taste to aesthetic attitude. Schopenhauer often uses the term aesthetic contemplation rather than attitude. But it seems clear that the later use of attitude can be applied retroactively to his use of contemplation. In order to have an aesthetic experience, the perceiver must have a different kind of perception about the object. No longer focused on the particulars, the perceiver experiences the ideas that are embedded in the object. We might postulate that this shift from particulars to ideas occurs when the perceiver has adopted the aesthetic attitude, though Schopenhauer never clearly spells this out. This attitude and experience are only temporary; it’s an impermanent rest from the suffering of life. The attitude is very important for Schopenhauer. Most things, when viewed with the right aesthetic attitude, will become beautiful in the mind (or perception) of that specific person.

Edward Bullough (1880-1934) is not a common name in the larger history of philosophy, but he made a small but significant contribution in the field of aesthetics. Working as a psychologist, he developed a notion of psychical distance (a continuation of disinterestedness) that was to ground his idea of aesthetic attitude. He often uses the expression aesthetic consciousness instead of aesthetic attitude.

Bullough wanted to develop a notion of the experience of art without appealing to any single characteristic found in all art, since he did not believe there was such a characteristic. This belief helps to illustrate the shift that had taken place since the 18th century, when many still believed beauty was the main characteristic of art. Bullough was more concerned with focusing on the experience that the work of art causes for the beholder. Two people looking at the same object, for instance, might have very different experiences. His solution to this problem is what he calls psychical distance. Bullough believed that the beholder must have the correct amount of distance between herself and the work of art. Too much or too little distance will prevent the complete aesthetic experience. It might be similar to having a conversation: Imagine trying to talk to someone in a normal conversation, and he moves his face one inch away from yours. It would be much too distracting to continue. Likewise, if someone were standing one hundred feet away from you, it would not be possible to have the intimacy that a good conversation requires. While there isn’t an exact distance one must have when experiencing art (or having a conversation), there is a range of distances, and the beholder must be in that range for the aesthetic experience to be possible. For Bullough, this distance is what directly affects one’s taste in works of art. It is important for the beholder to learn to gauge the right distance, which can vary from person to person. People create the distance by removing practical interest from the object.

5. Contemporary Philosophy and Beyond

Theories of taste reached their peak in the 18th century. They diminished and then changed in the 19th century. They were left without much significance in the 20th century. Now, in the 21st century, few people really speak about a theory of taste. Are these theories merely relics of the past that we should find interesting only as historical artifacts? How can we account for the fact that people speak commonly and meaningfully about aesthetic taste while it seems to have diminished in academic discourse? It is not obvious how we should answer such questions. However, even though taste is no longer a prominent idea, there have been some notable contributions in the contemporary world.

a. Pierre Bourdieu

A French sociologist, Pierre Bourdieu (1930-2002) attempted to apply the methods of the social sciences to an understanding of aesthetics. In this way, he is unique because he did not work in the traditional philosophical framework that surrounds questions of beauty, taste, and aesthetic experience. He studied how people come to develop their tastes in various areas, but especially music. While money and time are important for developing cultural knowledge, Bourdieu claims that a crucial component comes from how someone is raised in the home and other institutions, like school. He uses the term cultural capital to refer to someone’s social assets, such as education. While money may help someone gain some social assets, the salient idea is that cultural capital helps one achieve a higher class beyond what purely financial assets would allow.

To the embedded responses a particular individual has to cultural objects, Bourdieu gives the name habitus. People belong to different aesthetic spheres, and their preferences are very similar within the sphere. He concludes that there is no value that guides one’s aesthetic taste; it is developed within a person’s class. This differs from views in traditional philosophy that tend to favor notions of beauty and taste from beyond one’s vantage point in the realm of ideas, or even God, without reference to a person’s class or context. Since people approach things from a particular situation, Bourdieu maintains that people’s social context contributes significantly to their approach to aesthetic taste. In order to demonstrate this idea, Bourdieu surveyed many people belonging to different social classes. He discovered, for instance, that people from the working class believe that objects should serve a function, even aesthetic objects. However, those from the upper classes believe an object could be valuable for its own sake. One class, thought Bourdieu, would almost be disgusted by the dominant art in another class. Thus, for Bourdieu, taste is developed within one’s social context, but one could move to a different class by acquiring cultural capital.

b. Gustatory Taste

Aesthetic taste began as a convenient metaphor for the judgment of the beautiful. Some recent philosophers have begun to examine whether taste must be considered only a metaphor disconnected from its natural setting. In other words, might real gustatory taste have a substantial connection with the traditional and more metaphorical notion of taste? It is a contentious topic with very few middle ground positions.

Gustatory taste can be altered—positively and negatively—with experience or education. People have different methods of preparing foods all over the world that produce different flavors. Knowing how to blend flavors and how to properly consume certain items will refine one’s taste and enjoyment of certain foods and drinks. Scotch, for example, is a complex drink that can contain sweet, smoky, spicy, citrus, and other flavors. Knowing how to drink scotch to taste all of these flavors is not automatic. While there might not be an absolutely correct way to drink it, there are ways to drink it so that you taste all it has to offer. Similarly, in the context of art, one could learn how to appreciate certain kinds of art by learning how to appropriately perceive and experience them. This has nothing to do with whether the person will actually like certain art. The point is merely that one can alter or improve one’s taste by learning more about the object or type of object. This education and refinement will usually increase the pleasure received in both contexts.

Whether gustatory taste is on par with traditional aesthetic taste seems to hinge on the status of food as art. This is where the larger questions loom for connecting the two kinds of taste. There are some more generally agreed upon characteristics of art that could help negotiate this question. Art is generally considered a kind of expression of emotions or ideas. While someone cooking might have positive emotions about the food or those who will consume it, the food itself does not seem to express emotion. Now there might be a situation where one person claims that the cook must love her because the cook prepared her favorite meal. There is some communication here, but the question is whether the communication was through the food as art, because something similar could be communicated with store-bought chocolate or even a bottle of wine. These things might not carry meaning (or the same meaning) for anyone else. Insofar as meaning or expression is necessary for art, gustatory taste seems to fall short of the traditional (and metaphorical) theories of taste, as suggested by Elizabeth Telfer, though she believes food to be a minor art.

c. Some Developments in Analytic Philosophy

Even though the zenith for theories of taste has passed, taste has found some interest among contemporary analytic philosophers. Talk of aesthetic judgment and interpretation is more prevalent, but there are some important themes that have received attention in recent discussion. With the rise in connecting gustatory taste with aesthetic taste, some philosophers have given more weight to the personal interaction one has with an aesthetic object. Carolyn Korsmeyer and others have pointed out that taste in both the literal and metaphorical senses requires a personal experience with the object. It would be suspect for people to claim that they dislike bananas, for example, if they had never eaten one or even seen one. Similarly, people cannot reasonably claim to dislike an opera or painting that they have never observed or experienced. This lack of personal familiarity becomes even more acute were they to try making a specific claim, such as stating that the colors of a painting are not well balanced throughout the composition. This claim seems impossible without actually seeing the painting. While we might trust our friend’s negative review and decide not to see a work of art, we cannot reasonably make the stronger claim that something is wrong with the work without actually experiencing it. Furthermore, there is a difference between claims of taste and other kinds of factual claims. From second-hand testimony, we could learn that a sculpture is made of bronze, but we could not learn how beautiful it is unless we see it for ourselves. It seems reasonable that some kind of personal experience of an object itself or similar object (including audio or visual representations) is important for an evaluative judgment of taste. Even if it were possible to make a judgment of taste without direct experience, it would at least be necessary for someone to have a little knowledge of the kind of object under discussion.

Some questions arise about which objects are appropriate for one’s judgment of taste and which people’s opinions matter. Imagine a person who claimed that a toaster was the most beautiful object he had ever seen. While it seems likely that most people would not agree, is this person wrong? Frank Sibley claims that anyone can notice the non-aesthetic qualities of an object, but only some people notice its aesthetic qualities. These qualities help an observer recognize an object that is admirable. But they are not equally recognizable to everyone, because observers differ in the experiences and training they possess. An issue with this view is that there can be a wide variety of legitimate opinions. One person claims an object is mildly beautiful, while another claims the same object is supremely beautiful. Both views are based on their perceptions of the same object’s aesthetic qualities. Some might say that one person’s taste is more refined. However, there are two residual questions. Which person’s taste is refined? Plus, Jerrold Levinson raises the question about what might motivate someone to cultivate her taste to be able to perceive the finer aspects of an aesthetic object. Questions about the right observer and the right object never seem to lead to a concrete answer, which creates problems for theories of taste.

In the 18th century, many connected taste with a robust account of moral goodness. With that connection dismissed by many over the last century, theories of taste, along with theories of beauty and sublimity, suffered as well. The early 21st century, however, has brought a renewed interest in several related areas: ideas of beauty with people like Roger Scruton and Nick Zangwill, the sublime with Emily Brady, and aesthetic experience with Richard Shusterman. This reappearance suggests that these traditional aesthetic concepts were perhaps ignored for too long. Thus, taste might also have the possibility of new life in the 21st century.

6. References and Further Reading

a. Primary Sources

  • Bullough, Edward. “‘Psychical Distance’ as a Factor in Art and an Aesthetic Principle.” The British Journal of Psychology, vol. 5, no. 2, 1912, pp. 87–118.
    • This article presents his most famous idea: psychical distance.
  • Burke, Edmund. A Philosophical Enquiry into the Origin of Our Ideas of the Sublime and Beautiful. London, 1757.
    • The second edition (1759) includes his introductory essay “On Taste,” which presents his main ideas concerning taste.
  • Cooper, Anthony, Third Earl of Shaftesbury. Characteristics of Men, Manners, Opinions, Times. London, 1711.
    • The section called “The Moralists” is where Shaftesbury spells out much of his view of taste.
  • Herder, Johann Gottfried. Selected Writings on Aesthetics. Edited and translated by Gregory Moore, Princeton University Press, 2006.
    • This is a compilation of Herder’s works on aesthetics, and a main discussion of taste is found in the chapters called “Critical Forests: Fourth Grove” and “The Causes of Sunken Taste.”
  • Hume, David. “Of the Standard of Taste.” In Four Dissertations, London, 1757.
    • Hume introduces his notion of the ideal judge in this essay.
  • Hutcheson, Francis. An Inquiry into the Original of Our Ideas of Beauty and Virtue. London, 1725.
    • Section VI develops his belief that people have a universal sense of beauty.
  • Kant, Immanuel. Critique of Judgment. Berlin, 1790.
    • His discussion of the four moments is of particular importance to this topic.
  • Mendelssohn, Moses. Philosophical Writings. Berlin, 1761.
    • In the section “On Sentiments,” Mendelssohn (or his Theocles) talks about how he prepares himself to experience art and beauty.
  • Plotinus. The Enneads.
    • In the first Ennead, tractate 6, section 1, Plotinus discusses beauty, especially his belief that symmetry cannot be the only requirement of beauty.
  • Schopenhauer, Arthur. The World as Will and Idea. Leipzig, 1819.
    • His major work dealing with the major branches of philosophy, but Book 3 (Volume 1) is where he focuses on aesthetics.

b. Secondary Sources

  • Beardsley, Monroe. Aesthetics from Classical Greece to the Present: A Short History. The University of Alabama Press, 1966.
    • A very accessible history of the development of aesthetic ideas.
  • Cahn, Steven M. and Aaron Meskin. Aesthetics: A Comprehensive Anthology. Blackwell Publishing, 2008.
    • This is one of the best anthologies for the history of aesthetics, incorporating selections from most of the main philosophers throughout history.
  • Carruthers, Mary. The Experience of Beauty in the Middle Ages. Oxford University Press, 2013.
    • Chapter 4 offers an insightful analysis of how taste rose to prominence during the medieval period.
  • Dickie, George. The Century of Taste: The Philosophical Odyssey of Taste in the Eighteenth Century. Oxford University Press, 1996.
    • An excellent resource on five of the major philosophers on taste: Hutcheson, Gerard, Alison, Hume, and Kant.
  • Gaut, Berys and Dominic McIver Lopes, editors. The Routledge Companion to Aesthetics. 3rd ed., Routledge, 2013.
    • This is a great resource for an introduction to a wide array of issues in aesthetics, but Carolyn Korsmeyer’s entry on “Taste” is most relevant for this article.
  • Neill, Alex and Aaron Ridley, editors. Arguing about Art: Contemporary Philosophical Debates. 2nd ed., Routledge, 2002.
    • This book features some competing arguments on a variety of issues, but offers a helpful exchange about whether food is art in Part 1.
  • Wenzel, Christian Helmut. An Introduction to Kant’s Aesthetics: Core Concepts and Problems. Blackwell Publishing, 2005.
    • A very accessible explanation of the main ideas of Kant’s aesthetic theory.

 

Author Information

Michael R. Spicher
Email: mrspicher@massart.edu
Massachusetts College of Art and Design
U. S. A.

History of Love

What is love? We all wish to have the answer to one of the most universal, mysterious, and all-permeating phenomena on this planet. And even if we perhaps have a special feeling and intuitive insight that love “is related to everything else, but near things are more related than distant things,” as Waldo Tobler said, we still have not found and offered a full or finite definition of this multifaceted, dynamic, creative and all-encompassing phenomenon that is love. Another view, held by Spinoza, is that love elevates us to an expansive love of all nature. For him, an act of love is an ontological event that ruptures existing being and creates new being.

However, since love is an ontological event, creation of new being also coincides with different concepts throughout history, since each period brings a new way of being and living. Thus, each period in history offers a prevailing concept of love: in ancient, pre-Socratic times, we have Empedocles’ Love (Philotes) and Strife (Neikos); in Socratic times, Plato’s Eros and Aristotle’s Philia; in the Christian period, St. Paul’s Agape and St. Augustine’s Caritas; in the Enlightenment, Rousseau’s notion of a modern romantic pair, Emile and Sophie; in modern times, Freud’s love as transference; and finally, in postmodern times, the notion of duties to children. These concepts of love are not always independent of one another, as later philosophers often incorporate earlier conceptions into their own interpretations.

Table of Contents

  1. Presocratic Period
    1. Empedocles
  2. The Classical (Socratic) Period
    1. Plato
    2. Aristotle
  3. Christian Period
    1. St. Paul
    2. St. Augustine of Hippo
  4. The Enlightenment Period
    1. Rousseau
  5. The Modern and Postmodern Periods
    1. Sigmund Freud
    2. Duties to Children
  6. References and Further Reading

1. Presocratic Period

a. Empedocles

Empedocles was a Sicilian, a high-born citizen of Acragas, and a pre-Socratic philosopher, like Heraclitus and Parmenides. Empedocles is the last Greek philosopher who wrote in verse, which suggests that he knew the work of Parmenides, who also wrote in verse. Empedocles’ work should be understood in relation not only to Parmenides’ but also to Pythagoras’ and that of the Sensualists, who emphasized the importance of our senses. On the other hand, Empedocles’ notion of Love and Strife as the fundamental cosmic forces on which his cosmology and ethics rest is an original thesis that no later philosopher continued (in some ways Freud was the only one who used Empedocles’ notions of Love and Strife, in his writings on Eros and Thanatos).

In Empedocles’ cosmology, Love stands as a cosmic, consistent principle due to which the world exists through mixing of the elements (earth, air, fire, and water), or as he says:

From these [Elements] come all things that were and are and will be; and trees spring up, and men and women, and beasts and birds and water-nurtured fish, and even the long-lived gods who are highest in honour. For these [Elements] alone exist, but by running through one another they become different; to such a degree does mixing change them. (Fragment 21)

For Empedocles, elements are like letters in an alphabet: they form different types of matter in the same way that a limited number of letters can be combined into different words, or basic colours can be used to create different hues and patterns. The causes of this mixture and of these combinations are the cosmic forces of Love (Philotes), the force of attraction and combination, and Strife (Neikos), the force of repulsion and fragmentation. These two forces are engaged in an eternal dialectic, and each prevails in turn in an endless cosmic cycle:

I shall tell thee a twofold tale. At one time, it grew to be one only out of many; at another, it divided up to be many instead of one. There is a double becoming of perishable things and a double passing away. The coming together of all things brings one generation into being and destroys it; the other grows up and is scattered as things become divided. And these things never cease continually changing places, at one time all uniting in one through Love, at another each borne in different directions by the repulsion of Strife. (Fragment 17)

This cycle of Love and Strife consists of four phases in a Sphere: two full phases, one governed by Love and another by Strife, and two transitional phases: a phase from Strife to Love, and a phase from Love to Strife. In the beginning, the Sphere was filled with Love and the four elements were so close together that we could not discern them. After some time, however, Strife came into the Sphere and Love started to flow out of it. When Strife gained enough concentration in the Sphere, it resulted in the movement and fragmentation of the four elements into separate forms.

But it seems that Empedocles needed “evolution” (development) in his cosmology and the ensuing dynamic movement of the cosmos, so he introduced movement through two transitional phases: phases from Love to Strife and from Strife to Love. In this way, he got a third phase in which, as a consequence of the previous phases, Love regains power through coming into the centre of the Sphere, while Strife moves to its margin. And then, in the fourth and last phase of the cycle, Strife returns to the centre, and Love moves to the margin. This process then repeats over and over again. The idea of Love and Strife moving in and out of the Sphere may be an echo of Empedocles’ medical knowledge (he was a well-known physician), especially of the function of the heart. Thus, according to Empedocles, the world exists in continuous movement through different phases of a cycle, in which a certain type of stability exists in eternal elements. And it is precisely this continuous movement of the elements which produces a continuous state of organic evolution and from which all beings originate.

2. The Classical (Socratic) Period

a. Plato

Plato, born into an aristocratic family, was not only a philosopher but also a mathematician, a student of Socrates, and later, a teacher of Aristotle. He is widely credited with laying the foundations of Western philosophy and science. He also founded the Academy, which can be considered the first institution of higher education in the Western world.

Plato’s most important views on love are presented in the Symposium (although he later changed his abstract outlook on love as universal Ideas (of Truth, Beauty and Goodness) in the Phaedrus to include the erotic and “subjective” aspects of ideal Love). In the Symposium, meaning a banquet or drinking party, he presents seven speeches about love, passing from speaker to speaker as they sit at the table. The speakers represent five types of love known at that time, with Socrates offering a unique and new philosophical concept of love that he learned from Diotima, and Alcibiades, the final speaker, presenting his own experience of love with Socrates.

Phaedrus, who is the “father” of the idea of talking about love, claims that Love is a God, and is one of the most ancient Gods. According to Hesiod, Love was born to Chaos and Earth. Love gives us the greatest goods and guidance. Phaedrus prefers love between an older man (erastes) and a young boy (eromenos) because it encourages a sense of honour and dishonour (shame), two necessary virtues of citizenship, for love will convert the coward into an inspired hero who will, for instance, die for his beloved.

Pausanias, who was sitting beside him, speaks next. He says that Phaedrus should have distinguished heavenly and earthly loves. The first has a noble purpose, delights only in the spiritual nature of man, and does not act on lust. The second one is the love of the body, and is of women and boys as well as men. And when we are in the domain of earthly love, which operates on lust, we can see the powerful influence that pursuing sexual pleasure has on a person’s actions and life: we become slaves to our passions and subservient to others, a distinct threat to freedom and thus to a happy life.

Aristophanes comes next, but he has the hiccups and requests that Eryximachus the physician either cure him or speak in his turn. Eryximachus does both, and after prescribing for the hiccups, speaks as follows: he agrees with Pausanias that there are two kinds of love; furthermore, he concludes that this double love extends over all things: animals and plants, as well as humans. In the human body lie both good and bad love, and medicine is the art of showing the body how to distinguish the two.

Aristophanes is the next speaker. He argues that “original” humans used to be beings with two faces and four arms and legs, but we were cut into two by Zeus due to our arrogance and disobedience of the Gods. Since then people go around the world seeking their missing half. Eros, the God of love, is here to assist us in finding this missing half, who is our spiritual kin. Aristophanes also claims there were three genders of the original human beings: male (two males), female (two females), and androgynous (male-female). Males were descended from the sun, females from the earth, and androgynes from the moon. Thus, Eros’ task is to make our race happy again through our completion and return to the original state. However, making us complete again is not as easy a task as we would expect. When Zeus cut people in half, they were at first cut in such a way that the halves could not sexually merge; they were able to just kiss and hug and were kept in this unsatisfied situation until they died. For this reason, Zeus gave them sexual organs. Sexual organs enabled the halves to merge in coitus and, at least for a little while, release the halves from their tension of desire for each other. Martha Nussbaum, however, has observed that this option pushes people to live within a domain of repetitive needs and desires which distract them from other business in life. It is very difficult to meet such halves, and an even bigger puzzle is how we would recognize them (what are the signs of meeting the right half?) (Nussbaum, 2001).

Socrates, being aware of this problem with Aristophanes’ Eros, offers a response to Aristophanes and claims that a) “love is neither love for the half nor the whole, if one or the other has not some good, beauty and truth” (Plato, 1960, p. 94); and b) love, or Eros, is primarily a relationship between a knowledge-lover (philosopher) and ultimate knowledge/wisdom (Love which is Goodness/Beauty/Truth and part of the Heaven/Angelic domain). Thus, our love is based on the notion that the aim of love is not a person but something immaterial (the ultimate Heavenly Ideas of Goodness/Beauty/Truth), which provides an anchor within ourselves. And how can we achieve this? The next four steps up the ladder from the material towards the immaterial will show us. But before we introduce the four steps upwards into the angelic domain, we must say that the originator of the theory of Eros is not Socrates, but the Greek priestess Diotima. Socrates says that he merely repeats what he was told by her, and that is:

(a) the general description of Eros or love is a desire for something that we do not have—we desire what we lack. And what do we lack? We desire beauty, goodness, and truth. But if we desire something that we do not have—does that mean Eros is ugly, bad, and foul? Eros is neither beautiful nor ugly, neither good nor bad, neither wise nor stupid, neither God nor mortal: Eros is something in between: Eros is an intermediary power, transferring prayers from men to gods and commands from gods to men. We must also distinguish Eros from a beloved one, because Eros is the loving one. And such a notion of Eros resembles the position of a philosopher: “Sophia (wisdom) is one of the most beautiful things in the world. Sophia is the love of wisdom; therefore Eros must be a philosopher, that is a lover of wisdom who stands in between the fair and the foul, the good and the bad, the ugly and the beautiful.” (Plato, 1960, p. 96).

(b) If Eros desires the beautiful, then the question arises: What does Eros desire of the beautiful? He desires possession of the beautiful which, if we substitute it with the good, means desire to possess happiness. And when something makes us happy, we wish to have the everlasting possession of the good. And how do we achieve that? By reproducing it. This is the reason men and women at a certain age desire to produce offspring, as with birth comes beauty and mortal men and women reach immortality.

(c) Eros, as desire of the good and beauty, brings forth a desire for immortality; this principle extends not only to men but also to animals. This is also why parents love their children, for the sake of their own immortality, and why men love the immortality of fame. Intellectuals and artists do not create children; instead, they conceive concepts of wisdom, virtue, and legislation.

(d) Thus, men who are concerned more with the physical level take care of children and love a woman, and those who are concerned about the spiritual level take an interest in justice, virtue, and philosophy (world of ideas of Goodness/Beauty/Truth per se), and love Man (as mankind). And how do we get to this Goodness/Beauty/Truth? Love starts with loving beautiful forms, and proceeds to beautiful minds. From minds, one can learn to love laws and institutions, then sciences; he sees that there is a single science uniting all of nature’s beauty. In knowing this, he can perceive beauty with the mind’s eye, not the body’s eye, and will know true wisdom and the friendship of God.

The last speech is by Alcibiades. We learn that Alcibiades (who is stunningly beautiful, an acclaimed war and strategic leader, winner of many prestigious awards, and praised and adored by many Athenians) is in love with Socrates. He fell in love because as he said: “I have heard Pericles and other great orators, and I thought that they spoke well, but I never had any similar feeling…. He is the great speaker and enchanter who ravishes the souls of men; the convincer of hearts, too” (Plato, 1960, p.104). So Alcibiades was surprised that beneath an ugly and neglected appearance there was great treasure, and he explains his love for Socrates by first comparing him to the busts of Silenus, and secondly, to Marsyas the flute-player. “For Socrates produces the same effect with the voice which Marsyas did with the flute—he uses the commonest words as the outward mask of the divinest truths with which he touches the soul” (Plato, 1960, p. 105). Then he proceeds: “Socrates is exactly like the busts of Silenus, which are set up in the statuaries and shops, holding pipes and flutes in their mouths; and they are made to open in the middle, and have images of gods inside them … and we will learn that his words hold the light of truth, and even more, that they are divine.” (Plato, 1960, p. 106).

This uniqueness of Socrates is his main attraction. According to Lacan, however, we should consider a bust as an agalma—a source (or rather an object) of a lover’s desire or desire of (his) love.

A particular agalma someone sees in the other is that something he desires in this and not in the other person. Desire as such points towards a peculiar object (of desire) because it emphasizes and chooses exactly this and not any other object and makes it incomparable and incommensurable with the others. (Lacan, 1994, p. 16)

And that desire aims strictly for a subjective and particular choice (or projection), maybe not reflecting something real in the person at all, as Socrates reveals with his “mysterious” reply to Alcibiades: “But look again sweet friend, and see whether you are not deceived in me. The mind begins to grow critical when the bodily eyes fail and it will be a long time before you get old” (Plato, 1960, p. 107). So, Socrates wanted to show Alcibiades that what he has sought and loved in him is actually in himself as well. Discovering your true self gives you the greatest self-satisfaction and, at the same time, knowledge of how to become a better person; and this treasure can be shared with others, too, becoming good, beautiful, and truthful—something Socrates did by calling his endeavour a midwifery, that is, helping others bring forth into the light what was already in themselves.

In Plato’s second work on love, the Phaedrus, he discusses another notion of love. He begins this work by denying the good of any love because he connects it with irrational behaviour conditioned by lust and desire. Sometimes a lover acts against the good of the beloved because of his desire, jealousy, possessiveness, and envy, and sometimes he acts even against himself when, as a rejected lover in the worst-case scenario, he takes his own life. For these reasons, Socrates favours a friend over a lover. Socrates thinks that if a lover behaves against his own or his beloved’s good, then Eros cannot be a God. After all, a God should do Men good and should uplift lovers into the realms of Heavenly bliss. Socrates, however, a little later on, changes his mind and says that he was wrong to state that Eros is not a God. In fact, Eros is connected with the true love(r). “The ‘true lover’ has a mania for the good, and this kind of mania, coming from the divine, is superior to human self-control of irrational passions … and is an expression of the desire of the immortal soul, which has experienced the supreme good/beauty of the divine and wants to reclaim it” (A. H. Kissel).

The soul, however, has rational, harmonious, good elements and disharmonious, aggressive, bad ones, which are like the “good horse” (metaphorically presented as a white horse) and the “bad horse” (metaphorically presented as a black horse) that must be driven in concord; when these elements are disordered, the soul loses its wings and takes on a mortal body (Plato, 1963). “The goal of the incarnated soul is to learn how to manage the ‘bad horse’ through habitual reining-in, in order that its wings grow again; the soul must regain self-control and true knowledge” (A. H. Kissel). But many souls mistake “their own opinions for true knowledge” (Plato, 1963, 248b). Souls which have better and deeper knowledge and understanding of our heavenly origins and are in better accord with their heavenly nature are incarnated as better beings. According to this, the true lover of wisdom and the good, that is, the philosopher, is at the top of all men. The same holds for an artist (the true lover of beauty). Others follow in this order: the just king, the statesman, the doctor, the prophet and priest, the representational artist (poet), the manual labourer, the sophist, and last, the tyrant. The just are reincarnated to a higher level, and the unjust to a lower level, until the wings grow back and heaven is regained. True and divine love occurs when a lover meets his beloved on the same level (as lovers are like mirrors to each other), which is why Socrates states that people who attract one another do so because they are followers of the same deity and help each other to ascend. (That is the reason why, for instance, people who love wisdom and justice follow Zeus, the ones who love royal traits follow Apollo, the ones who like to fight follow Ares, and so on.)
But most importantly, a “true love is a divine one as far as it is connected with virtue, justice, modesty, inspiration, enthusiasm and self-control, and it only occurs when lovers bring of each other their best godlike qualities” (Plato, 1963, 253b).

In the last part of the Phaedrus, Socrates states that those who know divine love also know how to discern a good speech that conveys truth, goodness, and beauty from a false one, by drawing on the analogy between irrational and true love stated above. “Writing speeches is not in itself a shameful thing. It’s not speaking or writing well that’s shameful; what’s really shameful is to engage in either of them shamefully or badly” (ibid., 258d).

b. Aristotle

Upon Plato’s death, Aristotle left for Assos in Mysia (in present-day Turkey), where he and Xenocrates (c. 396 B.C.E.-c. 314 B.C.E.) joined a small circle of Platonists who had already settled there under Hermias, the ruler of Atarneus. Later, under the protection of Antipater, Alexander’s representative in Athens, Aristotle established a philosophical school of his own, the Lyceum, also known as the Peripatetic School after its colonnaded walk.

Aristotle speaks about love mostly in the Nicomachean Ethics, books VIII and IX. He speaks about Philia (friendship-like love) as the highest form of spiritual love, having the highest spiritual value. This kind of friendship is a friendship between those who are alike and is not based on any external benefits. It is led by reciprocal sympathy, support and encouragement of virtues, emotions, intellectual aspirations, and spirit. “For all friendship is for the sake of good or of pleasure-good … and is based on a certain resemblance; and to a friendship of good men all the qualities we have named belong in virtue of the nature of the friends themselves….” (VIII:3, 1156b, trans. Ross). We cannot have many such friends, however, because our time is limited.

But when Aristotle says that a person needs to abandon his Philia for a friend if he changes or becomes vicious, this does not mean that he terminates friendship due to his own interest. He means that it happens because one of the friends realizes that he can’t do anything to contribute to the goodness of the other. He describes an example when we cannot talk of a true honest friendship any longer—when friendship is based only on pleasure and benefit. In the case of friendship based on benefits, friends are used only as a means to achieve a certain purpose (some goods, whether symbolic or material) and those who are together with others only for pleasure do not love the friend for his own sake but for their own pleasure. Such friendships cannot last long because when the reasons for friendship vanish, the friendship itself disappears. Friendships formed on the basis of pleasure or benefit can be formed between two bad people or between good and bad people, but true friendship can be formed only between two good people. Good people are friends because they themselves are good. Bad people do not feel any pleasant feelings towards a friend unless he offers some kind of benefit. According to Aristotle, friendship does not show only the values and preferences of the society and the country, but also, more importantly, the moral character of a person.

Friends who love each other love in them what they themselves believe to be of value:

We love in friends that which represents a value for us: a friend is a representation of a certain value. Thus, when a good person becomes our friend he himself is of value to us. Friends receive and give the same amount of good wishes and time, and feel the same joy or happiness in each other. True friendship is equality in all aspects, as a true friend is another self. (VIII:3, 1166a–1172)

And what does Aristotle say on the relationship between man and woman, as seen in Book VIII? Friendship between men and women, in his eyes, seems to exist by nature, and humans tend to form couples more than they form cities, as the household came earlier and is just as necessary as the city. Other animals unite only for the purpose of reproduction, but human beings live together also for other purposes of life. However, Aristotle still reasoned largely within the biological domain, meaning that for him

… from the start the functions are divided, and those of man and woman are different; so they help each other by throwing their peculiar gifts into the common stock. It is for these reasons that both utility and pleasure seem to be found in this kind of friendship. But this friendship may be based also on virtue, if the parties are good; for each has its own virtue and they will delight in the fact. (VIII:12, 1162a)

And children seem to be a bond of union; for “children are a good common to both, and what is common holds them together” (VIII:12, 1162a, 14–31). Parents love their children as they love themselves, and children love their parents because their being comes from them. Siblings love each other because of their common parentage. The friendship between siblings and kinsmen is like that between comrades. The friendship between parents and children is much more pleasurable than other friendships due to the long sharing of lives. However, friendship between parents and children is not equal, as they have contributed different things to the relationship and the parents hold a superior position. The same, Aristotle thinks, holds for man (husband) being superior to woman (wife). By contrast, the Stoics a little later on thought of man and woman, husband and wife, as equal, since we are all endowed with a divine mind/spirit. Being loved is desirable in itself, preferable even to being honoured.

3. Christian Period

a. St. Paul

St. Paul is the most important of the Apostles who taught the Gospel of Christ in the first century. Fourteen epistles in the New Testament have been credited to Paul. Seven are considered to be genuine (Romans, First Corinthians, Second Corinthians, Galatians, Philippians, First Thessalonians, and Philemon), three are doubtful, and the final four are believed not to have been written by him. Paul’s works contain the first written account of what it means to be a Christian and thus the first account of Christian spirituality.

St. Paul is best known for his letters to the Romans and the Corinthians. In the Letter to the Romans he says: “For with the heart, one believes resulting in righteousness; and with the mouth confession is made resulting in salvation” (Romans 10:10, World English Bible). One who speaks about faith in God makes others happy, offers consolation, and invites other people onto the path of Jesus Christ; and secondly, one who talks about God and His revelation, recognition, prophecy, and teaching is building a church of God. Through annunciation of the holy wisdom he addresses those ready to be redeemed and consecrated into eternal life through love, hope, and faith and by leaving behind their carnal body. According to St. Paul there exist two bodies: the carnal (lustful) and the heavenly (pure) within a unity called God’s temple or the Holy Spirit.

But what is spiritual and heavenly cannot be seen with the eyes nor heard with the ears. “However, we acquire a spiritual body only through the death of the carnal, sensual body. We have a carnal body which needs to die in order to allow a spiritual body to be born through Jesus Christ, crucified God” (Nygren, 1953, p. 203). But this raises a paradoxical question: how did we come to this transient world if there is no other God; are things flowing into the world from two different sources? We should approach the God who is (in) this world and more than this world differently from our perspective of death, law, desire, knowledge, and power. Instead, Paul talks of grace, faith, love, and hope. Jewish religion and tradition, for instance, maintain that God is a transcendence which cannot be attained by men; however, in Christianity man can reach God through becoming like Christ on the Cross. The resurrection of Christ is an event which broke the law of death and enabled a new life with God and in God through the grace of God.

And essential for this new life is unconditional love (Agape), which people were given as a gift by Jesus Christ. Christ sacrificed himself for all people; all we have to do is to open up to his love. And what is Agape? St. Paul in his First Letter to the Corinthians says:

Love is patient and is kind; love doesn’t envy. Love doesn’t brag, is not proud, does not behave itself inappropriately … does not rejoice in unrighteousness, but rejoices with the truth; bears all things, believes all things, hopes all things, endures all things. Love never fails. (1 Corinthians 13:4–8)

Christ is the only source of love in the world that combines words (thoughts) and actions and gifts. If we did not experience the unconditional love that was found through the crucified Christ, we would not know God’s love in the Christian sense of the word. Paul sees in Christ on the Cross an event of sacrifice, in fact God’s own sacrifice. God’s love is not one that desires but one that gives. With this Paul emphasizes the spontaneous and altruistic nature of God’s unconditional love (Agape), which was manifested in Christ’s death for the poor, weak, ill, foreigners, enemies, and atheists.

Agape, as a self-sacrificing love, is reflected in the commandments:

“You shall not commit adultery,” “You shall not murder,” “You shall not steal,” “You shall not covet,” and whatever other commandments there are, are all summed up in this saying, namely “You shall love your neighbour as yourself.” Love doesn’t harm a neighbour. Love therefore is the fulfilment of the law. (Romans 13:9–10)

Paul thus defined this law of God’s universal love, which is mapped onto the love of your neighbour as love of yourself, as an undivided and undefined faith with the fewest laws and prohibitions possible.

Concrete implications of God’s unconditional love can be seen also in the relationship between man and woman. According to Paul, women are mysterious, dark, and penetrable, while men are open, light, and penetrating, but in the face of God all people and beings are equal: men, women, Jews, Greeks, Christians. “Let the husband give his wife the affection owed her, and likewise also the wife her husband. The wife doesn’t have authority over her own body, but the husband. Likewise, also the husband doesn’t have authority over his own body, but the wife” (1 Corinthians 7:3–7:5).

God in general prefers asceticism and celibacy. However, good Christians need to give these up if they wish to marry and have children. Thus, God allows sexual intercourse but only for having children, because reproduction serves to continue the human species and does not encourage sin and desire for pleasure of the flesh. On the other hand, Christianity introduced a hierarchy between men and women by stating that man is better than and above woman: “But I would have you know that the head of every man is Christ, and the head of the woman is man, and the head of Christ is God” (1 Corinthians 11:3). It is obvious that in this view woman and man are not equal as stated, and this led to a long road of female subjugation, injustice, and suffering.

b. St. Augustine

St. Augustine was an early Christian theologian whose writings were very influential in the development of Western Christianity and Western philosophy. On the one hand he was Plato’s follower, and his critic in the light of Neoplatonism, and on the other hand he was an interpreter of Christian teachings, especially those of St. Paul and other apostles. He was the first to create and establish a concept of love that included Eros and Agape in the form of Caritas.

Greatly influenced by Neoplatonist versions of the Symposium and his studies of Agape, St. Augustine in his early period described a positive paradigm of Christian life, in the sense of Agape through different stages, in works such as De Quantitate Animae and De Genesi contra Manichaeos. In these works, he fights against the teachings of the Manicheans, who were inspired by Mani (third century C.E., Babylonia). Later on, however, he refutes this kind of Platonic ascension and develops his own combination of Christian Agape and Platonic Eros, which is neither Eros nor Agape, but Caritas. What is the reason for Augustine’s combination of Eros and Agape? Where does he see a flaw in Eros that must be repaired by Agape? The answer lies in pride (superbia), which is connected with Eros.

He writes in Confessions: “When the soul ascends higher and higher into the spiritual realm, a person starts getting a feeling of pride and self-sufficiency which makes that person stay within himself instead of reaching beyond the self towards the heavenly” (Augustine, 1960, p. 39). This is because man cannot reach heaven by himself. Although Platonic Eros presents love built on human will, power, and knowledge (which will bring us to the heavenly domain of the Ideas), to Augustine this is false, and only God himself can free and redeem us, as Augustine states in his famous work City of God: “In order to heal human pride, God’s son descended to show the way to become humble” (Augustine, 1994a, p. 273: VIII:7) and continues: “… pride is the beginning of the sin … Therefore, humbleness is highly advised in the city of God” (XIV:13). This is the reason why the Christian spirit emphasizes humbleness (humilitas), which is Jesus Christ. Augustine saw the remedy for Eros’s pride and self-sufficiency, which prevent Eros from reaching its goal, as God’s love or Caritas.

And what is God?

All people see God as the highest, most beautiful, the brightest, eternal, wise, good, true and truthful entity who ever existed at all. No one on the Earth possesses the features God has. He is life itself, pure love and the origin of everything that is: God … gives preference to that which lives before to that which is dead and he is the highest Good (Summum Bonum). (Augustine, 1994a, p. 524, note 1).

Even more, death is the biggest enemy of the heavenly kingdom, therefore Augustine concludes that: “… life will be truly happy when it is going to be eternal” (Augustine, 1994a, p. 25). Hannah Arendt correctly observes that such a concept of love was defined in two steps: “First, that which is good is an object of yearning, i.e. something useful which can be found in this world and we hope to get into everlasting possession. In the second, good is defined through fear of death and destruction” (Arendt, 1996, p. 12).

Augustine’s introduction of human (soul) yearning for the highest good (Summum Bonum) and eternal life reveals an additional difference between Man and God. Namely, people are, contrary to God, made creatures—and live solely through him. A man-made creature does not possess his own bonum but he needs to find it—which is achieved through love as a yearning to acquire good. Happiness is thus having this good and keeping it in our life. Desire and yearning is thus a sign of a created creature, whereas God himself is without desire and lives according to himself and through himself. Such a God is self-sufficient and autarkical. The fundamental difference between God’s made creatures and their Creator is in the metaphysical difference between eternity and time. Creatures belong to the world of transitions: created beings never fully exist (the past is gone, the future is yet to be), and they exist only in now which soon turns into the past—what truly exists is only now which is not in time, but in eternity, which is God.

However, this is not the whole story of love, because Augustine divides love into that which is good/proper/right and that which is bad/false, according to the object desired—the choice of the object is very important because we become what we love. Therefore, if a loving one chooses created and transient objects of this world, we have love called Cupiditas; if he chooses an eternal and non-created object (God), we have Caritas.

4. The Enlightenment Period

a. Rousseau

Jean Jacques Rousseau was a philosopher, pedagogue, composer, writer, and one of the first autobiographers in the world. His political ideas were highly influential for the French Revolution and later for socialism and even nationalism. In his early writings, Rousseau claimed again and again that human nature was corrupted by the habits and manners of society in the big cities, which made people shift from natural (moral, political, spiritual) values to artificial and immoral values, based only on looks, superficial talk, material goods, and civil and cultural conventions. Rousseau notices this corruption on social and personal levels in the relationships between men and women, thus he suggested a new way to form loving relationships.

In Julie, or the New Heloise, we follow a romantic and tragic love story between Saint-Preux and Julie. According to Rousseau, a man and a woman seal their love in marriage when they feel that they cannot change what they feel for each other: “We share the same picture of the world…. we have the same outlook on the world and why would I not believe that what we share in our hearts we also share in the level of our beliefs and judgements” (1984, book 1, p. 65). Another important component of true love is benevolence: “Man can resist almost anything but benevolence, and in order to get benevolence you give it” (ibid, p. 190). And there exists yet another feature of love: enthusiasm, which not only provides lovers and partners with enormous energy, but also drives them beyond themselves and towards the ideal of perfection and highest moral virtue. For Rousseau, love is goodness that works for and has its origin in a balanced nature of a person. Love originates in a good-natured person from a balanced combination of our instincts, heart, mind, and soul: what the heart feels, the mind confirms. Reason is also important for love, so that lovers know how to lead and handle their needs and desires properly.

However, what has not been said so far is that Saint-Preux was at first Julie’s teacher and, to his surprise and despite all they felt and discovered, she later married the older, wealthy, and educated de Wolmar, and they all lived on a property called Clarens. Even more interesting is that Rousseau wrote a love story in which, even after Julie gave birth to two children, she remained in love with Saint-Preux and later admitted her affair to de Wolmar, who was saddened upon learning this fact but continued to love her nonetheless. But why did Rousseau put an obstacle in the way of Saint-Preux’s and Julie’s love, and why did Julie agree to marry the older and wealthy de Wolmar? Jean Starobinski in his book Transparency and Obstruction provides a plausible insight:

By introducing a marriage with older, de Wolmar, and having children with him, Rousseau simply tried to include “all” into a new kind of society he envisioned, in which no one would be left out: Julie would fulfil her parent’s wishes and comply with the moral order of that time, de Wolmar would get the girl he wanted, Julie continues her pedigree and Saint-Preux and Julie remain in love: what we find again in a higher level is a new love and new society which coincide. Erotic demand and demand for order are eventually in peace with each other…. In the refreshed society benevolence and gentle sympathy rule, and this is the result of a total transparency of consciousness of the people living at Clarens. (1988, p.104)

All this sounds ideal, and we would expect that we reached the final level of true love and community. However, we are faced with yet another surprise—Julie’s death at the end. Why would Rousseau want Julie to die? Julie dies because she had fulfilled the duty of moral-social order but not her personal wish for a happy life together with the one she truly loved. The last words of Julie to Saint Preux clearly reveal this: “No, I am not leaving you, I go to wait for you. The virtue that set us apart on earth will bring us back together in the eternal home” (ibid., p. 409).

But if Rousseau showed us the tragic-passionate love in Julie, he clearly set up a description of a marriage in his famous work Emile: Or On Education where he, for the first time in Western society, describes a basis for a free romantic love, sealed in marriage without the pressure of social moral order or duty.

Rousseau in the first half of Emile presents the whole physical, emotional, rational, and spiritual upbringing of a child (Emile), according to which pedagogy as a field came into existence. This article won’t go into that, but will briefly present the fifth book of Emile and Rousseau’s opinion of the love between the pubescent Sophie and Emile. At this age they are both mature enough to meet and know each other and to seal their love in marriage. It is clear from the start that Rousseau does not promote equality of men and women, but sees them as complements to one another in the eyes of nature. And from the nature argument he infers that a man is (or should be) superior and a woman inferior, as they both serve the same end, their union and reproduction, but in different ways; each with their own means, capabilities, and contributions. And it is based on this inference that Rousseau proposes the first moral difference between genders: a man is active, bright, strong, a leader, proud, and a penetrator, and a woman is passive, dark, penetrable, weak, a follower, modest, and full of grace; a man needs to have power and will (and needs to develop musculature), and a woman needs to not offer too much resistance but instead possess grace and charm with which to seduce. A man, Rousseau says, is more of the head (reason, intelligence, knowledge) and spirit, while a woman is more in tune with the heart, body, and intuition. A man is made for ruling and the public sphere, and a woman for obeying and the domestic sphere: she needs to learn how to bring up children and please her husband as this is her task and the reason for her origin (design). Her domain is the house, children, husband, and garden, as Rousseau claims, and the husband is immersed in intellectual, creative, and spiritual matters and matters of controlling, manipulating, and maintaining his “garden.” A man also needs to learn how to please his wife, however, so as not to make her bitter and angry, because a bitter and angry wife does not fulfil her marital duties and is not a good mother.

Rousseau knew that he assigned an unequal status to men and women, yet he stated that this was due to a higher unity called family, and that the new society is built on diversity and difference as seen in nature (which to a degree resembles Aristotle’s view). In this way we can read Emile: Or On Education as some sort of guide to marriage, which was highly influential in the 18th century. But it is still unclear why Rousseau, who was so liberal and open-minded in other areas, was so conservative in gender matters.

5. The Modern and Postmodern Periods

a. Sigmund Freud

Sigmund Freud was trained in medicine (neurophysiology) and later became the founding father of psychoanalysis. Freud set up a practice in neuropsychiatry with the help of Joseph Breuer. That is how he came to know Anna O., who was Joseph Breuer’s patient from 1880 through 1882. Eleven years later, Breuer and Freud wrote a book on hysteria in which they claimed that when a client becomes aware of the meanings of his or her symptoms (as can occur through hypnosis), unexpressed emotions find release and no longer exhibit themselves as symptoms. Breuer called this catharsis, from the Greek word for cleansing, and through catharsis, Anna lost many symptoms of her hysteria. Freud also noted that Breuer and Anna were falling in love with one another. (This later served as the basis for his idea of transference love.)

One of Freud’s most amazing achievements, however, was the discovery of the processes of the unconscious mind. Freud found out from his practice that the unconscious mind signals coded messages in the form of dreams and symptoms, which must be deciphered by the analyst. Freud’s way of provoking the unconscious mind was by using rememoration or associative language, which means speaking freely until the answer to the problem surfaces. At some point, however, associative language could not provide any more answers and the language was interrupted by what Freud called resistance and silence resulted. Freud found out that this silence serves as a birthplace not only for love, but also for our drives (Freud, 1995). Love is that which starts showing itself through language and moves to that which is beyond language—into drives.

And what is a drive that is not an animal instinct? In his famous work Three Essays on the Theory of Sexuality (1905), Freud tells us that drive presents itself without words, mostly through crying and meaningless shouts, as some sort of stream of energy where there are no borders between subjects and objects. These shouts reach their limit with the use of swear words. Just after swear words we come to the border, and when it is crossed language appears and the drive disappears. Subjectivity, reflection, and distance appear and the drive is transformed. The border can be crossed from the other side: When words are without power and the subject disappears, it makes a space for an uncontrolled stream of energy, which flashes away the distance and intermediary and enables a state that is solid and liquid at the same time.

Where does drive originate? Freud sees drives as a borderline between our body and psyche, composed of four components: on one side, we have the pair of tension and pressure and, on the other side, the pair of aim and goal. The first two have physical bases and the other two psychological bases. The overall source of drive, however, lies in our body, which is a combination of sexual organs, genes, and hormones that all form some sort of energetic tension inside the body, which can be released with heterosexual intercourse. But Belgian psychoanalyst Paul Verhaeghe in his work Love in a Time of Loneliness (1999) is against this notion of drives because, in his opinion, it ignores one of two important aspects of drives: each drive is always partial and autoerotic. Consequently, he thinks that a drive is neither heterosexual nor homosexual. When he says that a drive is partial, he means that something in particular attracts us to the other person (not necessarily of the opposite sex) and vice versa; this attraction includes different parts of the body and other activities as well, either passive or active, and does not necessarily lead to intercourse with the aim of procreation. Interestingly enough, a drive does not need the whole body, but only parts of the body, hence the different drives: oral, anal, voyeuristic, exhibitionistic, and the like. Also, all these body parts represent our contact with the external world: mouth, eyes, ears, nose, breasts, feet, genitals, and anus, which accompany activities such as smelling, watching, listening, touching, sucking, and penetration.

In the pleasure we get from our drive’s tendency to release tension, by tearing down the barriers of our ego (via sobbing, shouts, swears) and then putting them up again (via language), Freud recognizes drive’s connection to death and life. Freud named these two tendencies of each drive Thanatos and Eros, and claimed that they are intrinsically connected into a whole. The definitions of Eros and Thanatos are taken from Empedocles’s definitions of Philotes and Neikos as fundamental ontological principles. Eros carries the power of uniting different elements into a bigger unity: Eros is the union of different elements so division does not exist anymore. Thanatos is, on the contrary, a process of fragmentation, an explosion, a big bang which releases tension. According to Freud, drives aim at the pleasure of reaching the original, zero-tension, or unity of mind-spirit-body, which Lacan later calls jouissance, the energy of the highest pleasure.

Freud and, later, Lacan thought that love and successful relationships (partnership or marriage) depend on a solution of the internal conflict between drive and desire—this duality Freud saw in the division between pleasure of sexual drive and a desire for love. Other divisions are consciousness and unconscious, ego, id, and superego, and sensual, sexual, and emotional levels of our being.

Freud identifies the beginning of duality of drive and love in the mother/child relationship, with the first activity of pleasure being a child’s sucking to drink milk. Consequently, the birth of desire, love, and yearning bear witness to these lost original first years of the child’s relationship with his mother, which serves as a matrix for all subsequent relationships, in which people try either to replicate it or deny it and replace it with another better one. This kind of love that we as grownups try to repeat Freud calls, as mentioned earlier, transference love. Freud came to know this through sessions with his patients who fell in love with him, although he recognized that they were not actually in love with him but had transferred their original attachment to their father to him.

According to Freud, this first relationship with our parents (especially the mother) shows the following traits: totality and exclusivity (the unity of mother and child), loss (the aforementioned totality is lost after birth, especially with the introduction of language), and power (the mother and child relationship changes and starts to include giving, receiving, rejection, forgiveness, and reparation, which are constitutive of their relationship).

In addition, in Totem and Taboo: Resemblances between the Psychic Lives of Savages and Neurotics (1913), Freud uses the story of King Oedipus to create and illustrate the so-called Oedipus Complex, in which the superego (the universal law, the law of the father) uses guilt to prevent the continuation of incestuously oriented relationships between mother and child. “In Western patriarchal societies, the boy learns that a solution to the manqué of the mother lies in replacing her with the father/man and his genital organ and by promising himself that someday he, likewise, will be a big and a powerful man” (p. 48).

b. Duties to Children

At one time, it was thought that children had only duties and no rights: children were believed to owe their parents love, obedience, and care in old age. But times change, and philosophers, sociologists, anthropologists, social workers, and others began debating the rights of children and whether parents have duties toward their children, such as the duty to love them. For example, philosophers such as Liao, Boylan, and Feinberg present several positions regarding duties to children as correlates of claim rights, and one of the most important of these duties is to love them. But why do they hold that duties must correlate with claim rights, and why do they emphasize that parents need to love their children?

It is obvious that children are the most vulnerable people on the planet and are likely to fall to poverty, illness, and death due to illnesses and violence. “Children are also very susceptible to violence and exploitation through child labor, land mines, war, sex trafficking, and other sorts of exploitation…. And … many children face dropout in the secondary school and even less of them go to college and university.” (Boylan, 2011, p. 2). All the facts listed show that children are a vulnerable group that need special care, love, understanding, and protection. Before we can take a justified position regarding the duties parents may have towards their children, however, we need to understand and define what love is in this regard. Matthew Liao, in his article “The right of children to be loved” (2006b), argues that children, as human beings, have the right to the essential goods, possibilities, and conditions necessary for human beings to pursue the good life, their own and others.

Rights are powerful tools of protection and therefore having rights to the essential conditions for a good life is of primary importance to human beings. Whatever else they may want, most human beings would want to have a good life. Children being loved is one of the most essential conditions for a good life. (pp. 424–425)

Mere provision of the structural goods necessary for as many options as possible is not the best of all possible worlds. Love and doing well for the child are also necessary.

There is something odd, however, about declaring it a duty of parents to love their children. This is because love is often considered to be under the genus of emotions. Emotions are often taken to be out of one’s direct control, and “love out of inclination cannot be commanded” (Kant, 2003, p. 161). Is this completely true, and how can we reasonably argue for parents’ duty to love their children? Again Liao, in his article “The right of children to be loved” (2006b), presents a reasonable and favourable argument as to why parental love is a necessary component of parenting. One strong reason is that many children, despite “being well fed, have died or have suffered serious physical, social and cognitive harms as a result of lack of love. So, even granting that being fed is more urgent than being loved, we still should give the right of children to be loved a very high priority” (p. 25). Liao thus claims that a strong sense of warmth and affection is a crucial part of the emotional aspects of parental care and love. In this way, the claim that children need to be loved is an empirical claim.

It is also argued that children need this emotional aspect of love in order to develop certain capacities necessary to pursue a good life:

Human beings need certain basic goods, such as food, water and air in order to sustain themselves corporeally. In order to be able to pursue the good life, they also need certain basic capacities such as the capacity to think, to feel, to be motivated by facts, to know, to choose and act freely (liberty), to appreciate the worth of something, to develop interpersonal relationships and to have control of the direction of their life (autonomy). Finally, in order to exercise these capacities, they need to have some opportunities for jobs, social interaction, acquiring further knowledge, evaluating and appreciating things and determining the direction of their lives. (ibid., pp. 10–11)

6. References and Further Reading

  • Arendt, Hannah (1996). Love and St. Augustine. Chicago, IL: The University of Chicago Press.
  • Augustine (1955). Treatises on marriage and other subjects. Roy J. Deferrari (Ed.). Washington, DC: Catholic University of America Press.
  • Augustine (1960). The confessions of Saint Augustine (John K. Ryan, Trans.). New York, NY: Image Books.
  • Augustine (1994a). The city of God (Marcus Dods, trans.). Peabody, MA: Hendrickson Publishers.
  • Augustine (1994b). On Christian doctrine. In Philip Schaff (Ed.), A select library of the Nicene and post-Nicene fathers. Peabody, MA: Hendrickson Publishers.
  • Boylan, Michael (2011). Duties to children. In Michael Boylan (Ed.), The morality and global justice reader (385–405). Boulder, CO: Westview.
  • Cranston, Maurice (1991). Jean-Jacques: The early life and work of Jean-Jacques Rousseau, 1712–1754. Chicago, IL: The University of Chicago Press.
  • Feinberg, Joel (1980). The child’s right to an open future. In W. Aiken & H. LaFollette (Eds.), Whose child? Children’s rights, parental authority, and state power (124–153). New Jersey, NJ: Littlefield, Adams, & Co.
  • Freud, Sigmund (1913). Totem und tabu: Einige übereinstimmungen im seelenleben der wilden und der neurotiker [Totem and Taboo: Resemblances between the Psychic Lives of Savages and Neurotics]. Leipzig, Germany: Hugo Heller.
  • Freud, Sigmund (1968). Moses and monotheism. Hertfordshire, United Kingdom: The Garden City Press.
  • Freud, Sigmund (1989). Totem and taboo. New York, NY: W. W. Norton & Company, Inc.
  • Freud, Sigmund (1995). Opombe o transferni ljubezni [Comments on transference love]. Problemi, 33(1–2), 53–63.
  • Freud, Sigmund (1997). Sexuality and the psychology of love. Philip Rieff (Ed.). New York, NY: Touchstone Edition.
  • Freud, Sigmund (2000). Three essays on the theory of sexuality (James Strachey, trans.). New York, NY: Basic Books.
  • Grimsley, Ronald (1973). The philosophy of Rousseau. Oxford, United Kingdom: Oxford University Press.
  • Guthrie, W. K. C. (1956). Plato: Protagoras and Meno. London, United Kingdom: Sage.
  • Kirk, Geoffrey S., & Raven, John E. (1984). The presocratic philosophers. Cambridge, United Kingdom: Cambridge University Press.
  • Kingsley, Peter (1995). Ancient philosophy, mystery, and magic: Empedocles and Pythagorean tradition. Oxford, United Kingdom: Oxford University Press.
  • Kant, Immanuel (2003). Utemeljitev metafizike nravnosti [The metaphysics of morals]. Ljubljana, Slovenia: Založba ZRC.
  • Lacan, Jacques (1994). Sections from his work on transference. Filozofija skozi psihoanalizo [Philosophy through psychoanalysis]. Ljubljana, Slovenia: Analecta.
  • Liao, S. Matthew (2006a). The idea of a duty to love. Journal of Value Inquiry 40(1): 1–22.
  • Liao, S. Matthew (2006b). The right of children to be loved. Journal of Political Philosophy 14(4), 420–440.
  • Liao, S. Matthew (2012). Why children need to be loved. Critical Review of International Social and Political Philosophy 15(3), 347–358.
  • Martin, Alain, & Primavesi, Oliver (1998). L’Empédocle de Strasbourg (P. Strasb. gr. Inv. 1665–1666). Berlin, Germany: Walter de Gruyter.
  • Nussbaum, Martha (1986). The fragility of goodness: Luck and ethics in Greek tragedy and philosophy. Cambridge, United Kingdom: Cambridge University Press.
  • Nussbaum, Martha (2001). Upheavals of thought: The intelligence of emotions. New York, NY: Cambridge University Press.
  • Nygren, Anders (1953). Agape and Eros. London, United Kingdom: S.P.C.K.
  • Plato (1960). Symposium (S. Groden, trans.). Amherst, MA: University of Massachusetts Press.
  • Plato (1963). Euthyphro and Phaedrus. In Edith Hamilton & Huntington Cairns (Eds.), The collected dialogues. Princeton, NJ: Princeton University Press.
  • Rousseau, Jean Jacques (1979). Emile: Or on education (Allan Bloom, trans.). London, United Kingdom: Basic Books.
  • Rousseau, Jean Jacques (1997). Julie, or the new Heloise: Letters of two lovers who live in a small town at the foot of the Alps (Philip Stewart, trans.). Lebanon, NH: University Press of New England.
  • Spinoza, Baruch (1992). The Ethics (Seymour Feldman, trans.). Indianapolis, IN: Hackett.
  • Starobinski, Jean (1988). Jean-Jacques Rousseau: Transparency and obstruction. Chicago, IL: University of Chicago Press.
  • Tobler, Waldo (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(2), 234–240.
  • Verhaeghe, Paul (1999). Love in a time of loneliness. London, United Kingdom: Rebus.

 

Author Information

Katarina Majerhold
Email: katarina.majerhold@gmail.com
Slovenia

Understanding in Epistemology

Epistemology is often defined as the theory of knowledge, and talk of propositional knowledge (that is, “S knows that p”) has dominated the bulk of modern literature in epistemology. However, epistemologists have recently started to turn more attention to the epistemic state or states of understanding, asking questions about its nature, relationship to knowledge, connection with explanation, and potential status as a special type of cognitive achievement. There is a common and plausible intuition that understanding might be at least as epistemically valuable as knowledge, if not more so, and relatedly that it demands more intellectual sophistication than other closely related epistemic states. For example, while it is easy to imagine a person who knows a lot yet seems to understand very little (think of the student who merely memorizes a stack of facts from a textbook), it is considerably harder to imagine someone who understands plenty yet knows hardly anything at all.

It is controversial just which epistemological issues concerning understanding should be central or primary—given that understanding is a relative newcomer in the mainstream epistemological literature. That said, this article nonetheless attempts to outline a selection of topics that have generated the most discussion and highlights what is at issue in each case and what some of the available positions are. To this end, the first section offers an overview of the different types of understanding discussed in the literature, though their features are gradually explored in more depth throughout later sections. Section 2 explores the connection between understanding and truth, with an eye to assessing in virtue of what understanding might be defended as ‘factive’. Section 3 examines the notion of ‘grasping’ which often appears in discussions of understanding in epistemology. Furthermore, Section 3 considers whether characterizations of understanding that focus on explanation provide a better alternative to views that capitalize on the idea of manipulating representations, also giving due consideration to views that appear to stand outside this divide. Section 4 examines the relationship between understanding and types of epistemic luck that are typically thought to undermine knowledge. Section 5 considers questions about what might explain the value of understanding; for example, various epistemologists have made suggestions focusing on transparency, distinctive types of achievement and curiosity, while others have challenged the assumption that understanding is of special value. Finally, Section 6 proposes various potential avenues for future research, with an eye towards anticipating how considerations relating to understanding might shed light on a range of live debates elsewhere in epistemology and in philosophy more generally.

Table of Contents

  1. Types of Understanding
  2. Is Understanding Factive?
    1. The Factivity of Understanding-Why
    2. A Weak Factivity Constraint on Objectual Understanding
    3. Moderate Views of Objectual Understanding’s Factivity
  3. Coherence and the Grasping Condition
    1. Understanding as Representation Manipulability
    2. Understanding and Knowledge of Causes
    3. Understanding, Abilities and Know-How
    4. Understanding as Explanation
    5. Understanding as Well-Connected Knowledge
  4. Understanding and Epistemic Luck
    1. Understanding as (Partially) Compatible with Epistemic Luck
    2. Newer Defenses of Understanding’s Compatibility with Epistemic Luck
  5. Understanding and Epistemic Value
    1. Transparency
    2. Cognitive Achievement
    3. Curiosity
  6. Future Research on Understanding
  7. References and Further Reading

1. Types of Understanding

We regularly claim that people can understand everything from theories to pieces of technology, accounts of historical events and the psychology of other individuals. Consequently, engaging with the project of clarifying and exploring the epistemic state or states attributed when we attribute understanding is a complex matter. As Zagzebski (2009: 141) remarks, different uses of understanding seem to mean so many different things that it is “hard to identify the state that has been ignored” (italics added). Zagzebski notes that this easily leads to a vicious circle because “neglect leads to fragmentation of meaning, which seems to justify further neglect and further fragmentation until eventually a concept can disappear entirely.”

It will accordingly be helpful to narrow our focus to the varieties of understanding that feature most prominently in the epistemological literature. For one thing, it is prudent to note up front that there are uses of ‘understanding’ that, while important more generally in philosophy, fall outside the purview of mainstream epistemology. Most notable here is what we can call linguistic understanding, namely the kind of understanding that is of particular interest to philosophers of language in connection with our competence with words and their meanings (see, for example, Longworth 2008). In addition, it is important to make explicit differences in terminology that can sometimes confuse discussions of some types of understanding.

An influential discussion of understanding is Kvanvig’s (2003). Firstly, Kvanvig introduces propositional understanding as what is attributed in sentences that take the form “I understand that X” (for example, John understands that he needs to meet Harold at 2pm). Some (for example, Gordon 2012) suggest that attributions of propositional understanding typically involve attributions of propositional knowledge or of a more comprehensive type of understanding—understanding-why, or objectual understanding (these types are examined more closely below).

A second variety of understanding that has generated interest amongst epistemologists is understanding-why. This type of understanding is ascribed in sentences that take the form ‘I understand why X’ (for example, “I understand why the house burnt down”). Some of Pritchard’s (for example, 2009) earlier work on understanding uses the terminology ‘atomistic understanding’ as synonymous with ‘understanding-why’, and indeed his more recent work shifts to using the latter term. There is debate about both (i) whether understanding-why might fairly be called explanatory understanding and (ii) how understanding-why might differ from propositional knowledge.

Thirdly, and perhaps most interestingly, objectual understanding is attributed in sentences that take the form “I understand X” where X is or can be treated as a body of information or subject matter. For example, Kvanvig describes it as obtaining “when understanding grammatically is followed by an object/subject matter, as in understanding the presidency, or the president, or politics” (2003: 191). Objectual understanding is equivalent to what Pritchard has at some points termed ‘holistic understanding’ (2009: 12). Grimm (2011) suggests that what we should regard as being understood in cases of objectual understanding—namely, the ‘object’ of the objectual attitude relation—can be helpfully thought of as akin to a “system or structure [that has] parts or elements that depend upon one another in various ways.”

With these three types of understanding in mind—propositional understanding, understanding-why and objectual understanding—the next section considers some of the key questions that arise when one attempts to think about when, and under what conditions, understanding should be ascribed to epistemic agents.

2. Is Understanding Factive?

Knowledge is almost universally taken to be factive (compare, Hazlett 2010). In other words, S knows that p only if p is true. But is understanding factive? This is not so obvious, or at least not as obvious as it is in the case of knowledge. This section considers the connection between understanding-why and truth, and then engages with the more complex issue of whether objectual understanding is factive.

a. The Factivity of Understanding-Why

There is little work focusing exclusively on the prospects of a non-factive construal of understanding-why; most authors, with a few exceptions, take it that understanding-why is obviously factive in a way that is broadly analogous to propositional knowledge. For example, Hills (2009: 4) says “you cannot understand why p if p is false” (compare: S knows that p only if p).  Pritchard (2008: 8) points out that—for example—if one believes that one’s house burned down because of the actions of an arsonist when it really burnt down because of faulty wiring, it just seems plain that one lacks understanding of why one’s house burned down.

However, Baker (2003) has offered an account on which at least some instances of understanding-why are non-factive. Her line is that understanding-why involves (i) knowing what something is, and (ii) making reasonable sense of it. If making reasonable sense merely requires that some event or experience make sense to the epistemic agent herself, Baker’s view appears open, as Grimm (2011) has suggested, to counterexamples according to which an agent knows that something happened and yet accounts for that occurrence by way of a poorly supported theory. For example, a self-proclaimed psychic might see someone trip and believe that he caused this person’s fall. Further, suppose that the self-proclaimed psychic even has reason to believe he is right to think he is psychic, as his friends and family deem that it is safer or kinder to buy into his delusions outwardly. A view on which the psychic’s epistemic position in this case qualifies as understanding-why would be unsatisfactorily inclusive. This is perhaps partially because there is a tendency to hold a person’s potential understanding to standards of objective appropriateness as well as subjective appropriateness.

A more charitable interpretation of Baker’s position would be to read “making reasonable sense” more strongly. For example, we might require that the agent make sense of X in a way that is reasonable—few would think that the psychic above is reasonable, though it is beyond the scope of the current discussion to stray into exploring accounts of reasonableness.

b. A Weak Factivity Constraint on Objectual Understanding

It is plausible that a factivity constraint would also be an important necessary condition on objectual understanding, but there is more nuanced debate about the precise sense in which this might be the case. A useful taxonomising question is the following: how strong a link does understanding demand between the beliefs we have about a given subject matter and the propositions that are true of that subject matter? One can split views on this question into roughly three positions that advocate varying strengths of a factivity constraint on objectual understanding.

On the weakest view, one can understand a subject matter even if none of one’s beliefs about that subject matter are true. Zagzebski (2001), whose view maintains that at least not all cases of understanding require true beliefs, gestures to something like this view. In addition, Zagzebski supports the provocative line that understanding can perhaps sometimes be more desirable when the epistemic agent does not have the relevant true beliefs. Her key thought here is that grasping the truth can actually impede the chances of one’s attaining understanding because such a grasp might come at too high a cognitive cost. Her main supporting example is of understanding the rate at which objects in a vacuum fall toward the earth (that is, 32 feet per second per second), a belief that ignores the gravitational attraction of everything except the earth and so is not true. Nonetheless, Zagzebski thinks that, owing to our cognitive limitations, believing this actually allows us more understanding for most purposes than the ‘vastly more complicated’ truth.

Zagzebski’s weak approach to a factivity constraint aligns with her broadly internalist thinking about what understanding actually does involve—namely, on her view, internal consistency and what she calls ‘transparency.’ A theoretical advantage to a weak factivity constraint is that it neatly separates propositional knowledge and objectual understanding as interestingly different. Nevertheless, distinguishing between the two in this manner raises some problems for her view of objectual understanding, which should be unsurprising given the aforementioned counterexamples that can be constructed against a non-factive reading of Baker’s construal of understanding-why.

For example, and problematically for any account of objectual understanding that relaxes a factivity constraint, people frequently retract previous attributions of understanding. Consider a student saying, “I thought I understood this subject, but my recent grade suggests I don’t understand it after all”. These retractions do not seem to make sense on the weak view. In addition, the weak view leaves it open that two agents might count as understanding some subject matter equally well in spite of the fact that for every relevant belief that one has, the other agent maintains its denial. In other words, each denies all of the other’s respective beliefs about the subject, and yet the weak view in principle permits that they might nonetheless understand the subject equally well. Furthermore, weakly factive accounts welcome the possibility that internally coherent delusions (for example, those that are drug-induced) that are cognitively disconnected from real events might nonetheless yield understanding of those events. Proponents of weak factivity must address all of these potentially problematic results.

There is arguably a further principled reason that an overly weak view of the factivity of understanding will not easily be squared with pretheoretical intuitions about understanding. Specifically, a very weak view of understanding’s factivity does not fit with the plausible and often expressed intuition that understanding is something especially epistemically valuable. For example, Kvanvig (2003: 206) observes that “we have an ordinary conception that understanding is a milestone to be achieved by long and sustained efforts at knowledge acquisition” and Whitcomb (2012: 8) reflects that “understanding is widely taken to be a “higher” epistemic good: a state that is like knowledge and true belief, but even better, epistemically speaking.” Yet, these observations do not fit with the weak view’s commitment to, for example, the claim that understanding is achievable in cases of delusional hallucinations that are disconnected from the facts about how the world is.

Elgin (2007), like Zagzebski, is sympathetic to a weak factivity constraint on objectual understanding, where the object of understanding is construed as “a fairly comprehensive, coherent body of information” (2007: 35). According to Elgin, a factive conception of understanding “neither reflects our practices in ascribing understanding nor does justice to contemporary science”.  Though her work on understanding is not limited to scientific understanding (for example, Elgin 2004), one notable argument she has made is framed to show that “a factive conception cannot do justice to the cognitive contributions of science and that a more flexible conception can” (2007: 32).

As Elgin (2007) notes, it is normal practice to attribute scientific understanding to individuals even when parts of the bodies of information that they endorse diverge somewhat from the truth. As we will see, a good number of epistemologists would agree that false beliefs are compatible with understanding. However, Elgin takes this line further and insists that—with some qualifications—false central beliefs, and not merely false peripheral beliefs, are compatible with understanding a subject matter to some degree. Consider here two cases she offers to this effect:

EVOLUTION: A second grader’s understanding of human evolution might include as a central strand the proposition that human beings descended from apes. A more sophisticated understanding has it that human beings and the other great apes descended from a common hominid ancestor (who was not, strictly speaking, an ape). The child’s opinion displays some grasp of evolution. It is clearly cognitively better than the belief that humans did not evolve. But it is not strictly true. Since it is central to her take on human evolution, factivists like Kvanvig must conclude that her take on human evolution does not qualify as understanding. (2007: 37)

COPERNICUS: A central tenet of Copernicus’s theory is the contention that the Earth travels around the sun in a circular orbit. Kepler improved on Copernicus by contending that the Earth’s orbit is not circular, but elliptical. Having abandoned the commitment to absolute space, current astronomers can no longer say that the Earth travels around the sun simpliciter, but must talk about how the Earth and the sun move relative to each other. Despite the fact that Copernicus’s central claim was strictly false, the theory it belongs to constitutes a major advance in understanding over the Ptolemaic theory it replaced. Kepler’s theory is a further advance in understanding, and the current theory is yet a further advance. The advances are clearly cognitive advances. With each step in the sequence, we understand the motion of the planets better than we did before. But no one claims that science has as yet arrived at the truth about the motion of the planets. Should we say that the use of the term ‘understanding’ that applies to such cases should be of no interest to epistemology? (2007: 37-8)

How should an account of objectual understanding incorporate these types of observations—namely, where the falsity of a central belief or central beliefs appears compatible with the retention of some degree of understanding? Pritchard (2007) has put forward some ideas that may obviate the need to adopt a weak view of understanding’s factivity while nonetheless maintaining the key thrust of Elgin’s insight. In particular, as Pritchard suggests, we might want to consider that agents working with the ideal gas law or other idealizations do not necessarily have false beliefs as a result, even if the content of the proposition expressed by the law is not strictly true. This is a point Elgin is happy to grant (see Elgin 2004 for some further discussion of the role of acceptance and belief in her account). In other words, even though there is no such gas as that referred to in the law, accepting the law need not involve believing the law to be true and thus believing there to be some gas with properties that it lacks.

The underlying idea in play here is that, in short, thinking about how things would be if an idealization were true is an efficacious way to get to further truths, an insight that has attracted endorsement in the philosophy of science (for example, Batterman 2009). Working hypotheses and idealizations need not, on this line, be viewed as representative of reality—idealizations can be taken as useful fictions, and working hypotheses are recognized as the most parsimonious theories on the table without thereby being dubbed as wholly accurate. Since, for instance, the ideal gas law (for example, Elgin 2007) is recognized as a helpful fiction and is named and taught as such, as is naïve Copernicanism or the simple view that humans evolved from apes, it is not only unnecessary but, moreover, contentious to suppose that a credible scientist would consider the ideal gas law true. It seems as though understanding would possibly be undermined in a case where someone relying on the ideal gas law failed to appreciate it as an idealization. That is, there would be something defective about a scientist’s would-be understanding of gas behavior were that scientist, unlike all other competent scientists, to deny that the ideal gas law is an idealization and instead embrace it as a fact. Putting this all together, a scientist who embraces the ideal gas law, as an idealization, would not necessarily have any relevant false beliefs. Therefore, the need to adopt a weak factivity constraint on objectual understanding—at least on the basis of cases that feature idealizations—looks at least initially to be unmotivated in the absence of a more sophisticated view about the relationship between factivity, belief and acceptance (however, see Elgin 2004).

Nevertheless, considering weakly factive construals of objectual understanding draws attention to an important point: there are also interesting epistemic states in the neighborhood of understanding. These similar states share some of the features we typically think understanding requires, but are not bona fide understanding specifically because a plausible factivity condition is not satisfied. A good example here is what Riggs (2003) calls intelligibility, a close cousin of understanding that also implies a grasp of order, pattern and connection, but does not seem to require a substantial connection to truth. Grimm (2011) calls this ‘subjective understanding.’ He describes subjective understanding as being merely a grasp of how specific propositions interlink—one that does not depend on their truth but rather on their forming a coherent picture. Since what Grimm is calling subjective understanding (that is, Riggs’s intelligibility) is by stipulation essentially not factive, the question of the factivity of subjective understanding simply does not arise. In light of this fact, though, it is not obvious that ‘understanding’ is the appropriate term for this state. Consider here an analogy: a false belief can be subjectively indistinguishable from knowledge. We could, for convenience, use the honorific term ‘subjective knowledge’ for false belief, though in doing so, we are no longer talking about knowledge in the sense that epistemologists are interested in, any more than we are when, as Allan Hazlett (2010) has pointed out, we say things like “Trapped in the forest, I knew I was going to die; I’m so lucky I was saved.” Perhaps the same should be said about alleged subjective understanding: to the extent that it is convenient to refer to non-factive states of intelligibility as states of ‘understanding’, we are no longer talking about the kind of valuable cognitive achievement of interest to epistemologists.

c. Moderate Views of Objectual Understanding’s Factivity

At the other end of the spectrum, we might consider an extremely strong view of understanding’s factivity, according to which understanding a subject matter requires that all of one’s beliefs about the subject matter in question are true. Such a constraint would preserve the intuition that understanding is a particularly desirable epistemic good and would accordingly be untroubled by the issues highlighted for the weakest view outlined at the start of the section. However, such a strong view would also make understanding nearly unobtainable and surely very rare—for example, on the extremely strong proposal under consideration, recognized experts in a field would be denied understanding if they had a single false belief about some very minor aspect of the subject matter. This is of course an unpalatable result, as we regularly attribute understanding in the presence of not just one, but often many, false beliefs. This point aligns with the datum that we often attribute understanding by degrees. That is, we often describe an individual as having a better understanding of a subject matter than some other person, perhaps when choosing whom to approach for advice or when looking for someone to teach us about a subject. While we would apply a description of ‘better understanding’ to agent A even if the major difference between her and agent B was that A had additional true beliefs, we would also describe A as having ‘better understanding’ than B if the key difference was that A had fewer false beliefs. If we sometimes attribute understanding to two people even when they differ only in terms of who has more false beliefs about a subject, this difference in degrees indicates that one can have understanding that includes some false beliefs. We can acknowledge this simply by regarding B’s understanding as, even if only marginally, relatively impoverished, rather than by claiming, implausibly, that no understanding persists in such cases. This leaves us, however, with an interesting question about the point at which there is no understanding at all, rather than merely weaker or poorer understanding.

Regarding factivity, then, it seems there is room for a view that occupies the middle ground here. We can accommodate the thought that not all beliefs relevant to an agent’s understanding must be true while nonetheless insisting that cases in which false beliefs run rampant will not count as understanding. Kvanvig (2003; 2009) offers such a view, according to which understanding of some subject matter is incompatible with false central beliefs about the subject matter. This view, while insisting that central beliefs must all be true, is flexible enough to accommodate that there are degrees of understanding—that is, that understanding varies not just according to numbers of true beliefs but also numbers of false, peripheral beliefs. It also allows attributions of understanding in the presence of peripheral false beliefs, without going so far as to grant that understanding is present in cases of internally consistent delusions—as such delusions will feature at least some false central beliefs. In this respect, then, Kvanvig’s view achieves the result of a middle ground.

However, advocates of moderate approaches to the factivity of understanding are left with some difficult questions to answer. Many of these questions have gone largely unexplored in the literature. For example:

  • In virtue of what does a belief count as ‘central’ in the relevant sense?
  • Moderate factivity implies that we should withhold attributions of understanding when an agent has a single false central belief, even in cases where the would-be understanding is of a large subject matter where all peripheral beliefs in this large subject matter are true. This consequence does not intuitively align with our practices of attributing understanding. The proponent of moderate factivity owes an explanation.
  • How should we distinguish between peripheral beliefs about a subject matter and beliefs that are not properly about the subject matter in question, while retaining a meaningful distinction between peripheral and central beliefs?

Although a moderate view of understanding’s factivity may look promising in comparison with competitor accounts, many important details remain to be spelled out.

3. Coherence and the Grasping Condition

When considering interesting features that might set understanding apart from propositional knowledge, the idea of grasping something is often mentioned. For example, Kvanvig (2003) holds that understanding is particularly valuable in part because it requires a special grasp of “explanatory and other coherence-making relationships.” Riggs (2003: 20) agrees, stating that understanding of a subject matter “requires a deep appreciation, grasp or awareness of how its parts fit together, what role each one plays in the context of the whole, and of the role it plays in the larger scheme of things” (italics added). Relatedly, Van Camp (2014) calls understanding a “higher level cognition” that involves recognizing connections between different pieces of knowledge, and Kosso (2007: 1) submits that inter-theoretic coherence is the hallmark of understanding, stating “knowledge of many facts does not amount to understanding unless one also has a sense of how the facts fit together.” While such remarks are made with objectual understanding (that is, understanding of a subject matter) in mind, there are similar comments about understanding-why (for example, Hills 2009) that suggest an overlapping need to consider connections between items of information, albeit on a smaller scale.

Such discussions, though they can be initially helpful, raise a nest of further questions. This is in part for three principal reasons. Firstly, ‘grasping’ is often used in such a way that it is not clear whether it should be understood metaphorically or literally. If the former, then this is unfortunate given the theoretical work the term is supposed to be doing in characterizing understanding. If the latter (that is, if we are to understand ‘grasping’ literally), then, also unfortunately, we are rarely given concrete details of its nature. A second reason that adverting to grasping-talk in the service of characterizing understanding raises further questions is that it is often not clarified just what relationships or connections are being grasped, when they are grasped in a way that is distinctive of understanding. And, thirdly, two questions about what is involved in grasping can easily be run together, but should be kept separate. Call these, for short, the ‘relation question’ and the ‘object question’.

Relation question: What is the grasping relationship? (For example, is it a kind of knowledge, another kind of propositional attitude, an ability, and so on?)

Object question: What kinds of things are grasped? (For example, propositions, systems, bodies of information, the relationships thereof, and so on?)

Take first the object question. Since Kvanvig claims that the coherence-making relationships that are traditionally construed as necessary for justification on a coherentist picture are the very relations that one grasps (that is, the objects of grasping) when one understands, the justification literature may be a promising place to begin. Put generally, according to the coherentist family of proposals about the structure of justified belief, “a belief or set of beliefs is justified, or justifiably held, just in case the belief coheres with a set of beliefs, the set forms a coherent system, or some variation on these themes” (Olsson 2012: 1). Of course, many interrelated questions then emerge regarding coherence. For the purposes of thinking about understanding, some of the most important will include: (i) what makes a system of beliefs coherent? and (ii) what qualifies a group of beliefs as a system in the sense that is at issue when it is claimed that understanding involves grasping relationships or connections within a system of beliefs? For example, we might suppose that a system of beliefs contains only beliefs about a particular subject matter, and that these beliefs will ordinarily be sufficient for a rational believer who possesses them to answer questions about that subject matter reliably. Such a theory raises questions of its own, such as precisely what answering reliably, in the relevant sense, demands.

Turn now to the relation question. What is the grasping relation? Is it a kind of knowledge, another kind of propositional attitude, an ability, and so forth? Kvanvig does not spell out what grasping might involve, in the sense now under consideration, either in his discussion of coherence or in the other remarks we considered above. He either leaves grasping at the level of metaphor or uses it literally but never develops it. Given the extent to which grasping is highly associated with understanding and left substantively unspecified, it is perhaps unsurprising that the matter of how to articulate grasping-related conditions on understanding has proven to be rather divisive. Kelp (2015) makes a helpful distinction between two broad camps here. On the one hand, we have manipulationists, who think understanding involves an ability (or abilities) to manipulate certain representations or concepts. On the other hand, there are explanationists, who argue that it is knowledge or evaluation of explanations that is doing the relevant work. However, it is not entirely clear that extant views on understanding fall so squarely into these two camps. Many seem to blend manipulationism with explanationism, suggesting for example that what is required for understanding is an ability associated with mentally manipulating explanations. To complicate matters further, some of the philosophers who appear to endorse one approach over the other can elsewhere be seen considering a more mixed view (for example, Khalifa 2013b).

The next section considers some of the most prominent examples of attempts to expand on or replace a grasping condition on understanding. Some focus on understanding-why while others focus on objectual understanding.

a. Understanding as Representation Manipulability

Wilkenfeld (2013) offers the account that most clearly falls under Kelp’s characterization of manipulationist approaches to understanding. As Wilkenfeld sees it, understanding should be construed as “representational manipulability,” which is to say that understanding is, essentially, the possessing of some representation that can be manipulated in useful ways. Unlike de Regt and Dieks (2005), Wilkenfeld aims to propose an inclusive manipulation-based view that allows agents to have objectual understanding even if they do not have a theory of the phenomenon in question. His conception of mental representations defines these representations as “computational structures with content that are susceptible to mental transformations.” Wilkenfeld constructs a necessary condition on objectual understanding around this definition. His view is that understanding requires the agent to, in counterfactual situations salient to the context, be able to modify their mental representation of the subject matter. This allows the agent to produce a slightly different mental representation of the subject matter that “enables efficacious inferences” pertaining to (or manipulations of) the subject matter.

What is it to have this ability to modify some mental representation? Wilkenfeld suggests that this ability consists at least partly in being able to correct minor mistakes in one’s mental representation and use it to make assessments in similar cases. The demandingness of this ability, though, need not be held fixed across practical circumstances. The context-sensitive element of Wilkenfeld’s account of understanding allows him to attribute adequate understanding to, for example, a student in an introductory history class and yet deny understanding to that student when the context shifts to place him in a room with a panel of experts.

There are three potential worries with this general style of approach. Firstly, Wilkenfeld’s context-sensitive approach is in tension with a more plausible diagnosis of the example just considered: rather than withholding understanding in the case where the student is surrounded by experts, why not instead, in a way that is congruous with the earlier observation that understanding comes in degrees, attribute understanding to the student surrounded by experts, but to a lesser degree (for example, Tim has some understanding of physics, while the professor has a much more complete understanding)? Carter (2014) argues that shifting to more demanding practical environments motivates attributing lower degrees of understanding rather than (as Wilkenfeld suggests) withholding understanding.

Secondly, one might wonder if Wilkenfeld’s account of understanding as representation manipulation is too inclusive—that it rules in, as cases of bona fide understanding, representations that are based on inaccurate but internally consistent beliefs. If so, then the internally consistent delusion objection typically leveled against weakly factive views raises its head. However, this concern might be abated with the addition of a moderate factivity constraint (for example, the constraint discussed in section two above) that rules out cases of mere intelligibility or subjective understanding.

Thirdly, Kelp (2015) has an objection that he thinks all who favor a manipulationist line should find worrying. Specifically, he points out that an omniscient agent who knows everything and intuitively therefore understands every phenomenon might do so while being entirely passive—not drawing inferences, making predictions or manipulating representations (in spite of knowing, for example, which propositions can be inferred from others). If Kelp’s thought experiment works, manipulation of representations cannot be a necessary condition of understanding after all. This objection is worth holding in mind when considering any further positions that incorporate representation manipulability as necessary. That said, for manipulationists who are not already inclined to accept the entailment from all-knowing to omni-understanding, the objection is defused, as the example does not get off the ground. One reason a manipulationist will be inclined to escape the result in this fashion (by denying that all-knowing entails all-understanding) is precisely because one already (qua manipulationist) is not convinced that understanding can be attained simply through knowledge of propositions. In this respect, it seems Kelp’s move against the manipulationist might get off the ground only if certain premises are in play which manipulationists as such would themselves be inclined to resist.

b. Understanding and Knowledge of Causes

Grimm (2011) also advocates for a fairly straightforward manipulationist approach in earlier work. He suggests that manipulating the “system” allows the understander to “see” the way in which “the manipulation influences (or fails to influence) other parts of the system” (2011: 11). He argues that we can gain some traction on the nature of grasping significant to understanding if we view it along such manipulationist lines. So, on Grimm’s (2011) view, grasping the relationships between the relevant parts of the subject matter amounts to possessing the ability to work out how changing parts of that system would or would not impact on the overall system. He considers that grasping might be a “modal sense or ability” that allows the understander, over and above registering how things are, to anticipate what would happen if things were relevantly different: namely, to make correct inferences about the ways in which relevant differences to the truth-values of the involved propositions would influence the inferences that obtain in the actual world. That said, Grimm’s more recent work (2014) expands on these earlier observations to form the basis of a view that spells out grasping in terms of a modal relationship between properties, objects or entities, a theory on which what is grasped when one has understanding-why will be how changes in one would lead (or fail to lead) to changes in the other. His central claim in his recent work is that understanding can be viewed as knowledge of causes, though appreciating how he is thinking of this takes some situating, given that the knowledge central to understanding is non-propositional.

Although a large number of epistemologists hold that understanding is not a species of knowledge (e.g. Kvanvig 2003; Zagzebski 2001; Riggs 2003; Pritchard 2010), Grimm’s view is rooted in a view that comes from the philosophy of science and traces originally to Aristotle. Essentially, this view traditionally holds that understanding why X is the case is equivalent to knowing why X is the case (which is in turn supposed to be equivalent to knowing that X is the case because of Y). In short: understanding is causal propositional knowledge. Sliwa (2015), however, defends a stronger view, according to which propositional knowledge is necessary and sufficient for understanding. Although many commentators suggest that understanding requires something further, that is, something in addition to merely knowing a proposition or propositions, Grimm thinks we can update the “knowledge of causes” view so that this intuition is accommodated and explained. In particular, he wants to propose a non-propositional view that has at its heart “seeing or grasping, of the terms of the causal relata, their modal relatedness”, which he suggests amounts to seeing or grasping “how things might have been if certain conditions had been different.” To be clear, the nuanced view Grimm suggests is that while understanding is a kind of knowledge of causes, it is not propositional knowledge of causes but rather non-propositional knowledge of causes, where the non-propositional knowledge is itself unpacked as a kind of ability or know-how.

Grimm develops this original position via parity of reasoning, taking as a starting point that the debate about a priori knowledge, for example, knowledge of necessary truths, makes use of metaphors of “grasping” and “seeing” that are akin to the ones in the understanding debate. An important observation Grimm makes is that merely assenting to necessary truths is insufficient for knowing necessary truths a priori; one must also grasp or see the necessity of the necessary truth. Grimm thinks the metaphor involves something like apprehending how things stand in modal space (that is, that there are no possible worlds in which the necessary truth is false). He argues that what is grasped or seen when one attains a priori knowledge is not a proposition but a certain modal relationship between properties, objects or identities. He suggests that the primary object of a priori knowledge is the modal reality itself that is grasped by the mind and that on this basis we go on to assent to the proposition that describes these relationships. Hence, he argues that any propositional knowledge is derivative.

In terms of parallels with the understanding debate, it is important to note that the knowledge of causes formula is not limited to the traditional propositional reading. The ambiguity between assenting to a necessary proposition and the grasping or seeing of certain properties and their necessary relatedness mirrors the ambiguity between assenting to a causal proposition and grasping or seeing of the terms of the causal relata: their modal relatedness. However, Grimm is quick to point out that defending one of these two similar views does not depend on the correctness of the other. His modal model of understanding fits with the intuition that we understand not propositions but “relations between parts to wholes” or “systems of various thoughts.”

Grimm has put his finger on an important commonality at issue in his argument from parity. However, Pritchard (2014) responds to Grimm’s latest proposal with a number of criticisms. Perhaps the strongest of these is his suggestion that while the faculty of rational insight is indispensable to the grasping account of a priori knowledge, it is actually essential to knowledge of causes that it not be grasped through rational insight. This is because we do not learn about causes a priori. On this basis Pritchard insists that Grimm’s analogy breaks down.

This aside, can we consider extending Grimm’s conception of understanding as non-propositional knowledge of causes to the domain of objectual understanding? While his view fits well with understanding-why, it is less obvious that objectual understanding involves grasping how things came to be. For one thing, abstract objects, such as mathematical truths and other atemporal phenomena, can plausibly be understood even though our understanding of them does not seem to require an appreciation of their coming to existence. For example, I can understand the quadratic formula without knowing, or caring, about who introduced it. But more deeply, atemporal phenomena such as mathematical truths have, in one clear sense, never come to be at all, but have always been, to the extent that they are the case at all. This holds regardless of whether we are Platonists or nominalists about such entities.

Secondly, even subject matters that traffic in empirical phenomena, rather than abstract atemporal phenomena (for example, pure mathematics), are not clearly such that understanding them should involve any appreciation for their coming to be, or their being caused to exist. Here is one potential example to illustrate this point: consider that it is not clear that people who desire to understand chemistry generally care about “the cause of chemistry”. A potential worry then is that the achievement one attains when one understands chemistry need not involve the subject working out the cause of the subject matter (in this case, chemistry).

Grimm anticipates this point and expresses a willingness to embrace a looser conception of dependence than causal dependence, one that includes (following Kim 1994) species of dependence such as mereological dependences (that is, dependence of a whole on its parts), evaluative dependences (that is, dependence of evaluative on non-evaluative), and so on. A restatement of Grimm’s view might accordingly be: understanding is knowledge of dependence relations. This broader interpretation seems well positioned to handle abstract object cases, for example, mathematical understanding, when the kind of understanding at issue is understanding-why. For, even if understanding why 2×2=4 does not require a grasp of any causal relation, it might nonetheless involve a grasp of some kind of more general dependence, for instance the kind of dependence picked out by the metaphysical grounding relation. However, it is less clear at least initially that retreating from causal dependence to more general dependence will be of use in the kinds of objectual understanding cases noted above. For example, when the issue is understanding mathematics, as opposed to understanding why 2×2=4, it is perhaps less obvious that dependence has a central role to play.

c. Understanding, Abilities and Know-How

Another seemingly promising line—one that engages with the relation question discussed above—views grasping as intimately connected with a certain set of abilities. Hills (2009) is an advocate of such a view of understanding-why in particular. Specifically, Hills outlines six different abilities that she takes to be involved in grasping the reasons why p—abilities which effectively constitute, on her view, six necessary conditions for understanding why p. These six abilities allow one to “be able to treat q as the reason why p, not merely believe or know that q is the reason why p.” They are as follows:

(i) an ability to follow another person’s explanation of why p,

(ii) an ability to explain p in one’s own words,

(iii) an ability to draw from the information that q the conclusion that p (or that probably p),

(iv) an ability to draw from the information q’ the conclusion that p’ (or probably p’),

(v) an ability to give q (the right explanation) when given the information that p, and

(vi) an ability to give q’ (the right explanation) when given the information p’.

On the most straightforward characterization of her proposal, one fails to possess understanding why, with respect to p, if one lacks any of the abilities outlined in (i-vi), with respect to p. Note that this is compatible with one failing to possess understanding why even if one possesses knowledge that involves, as virtue epistemologists will insist, some kinds of abilities or virtues. That said, Hills adds some qualifications. For one thing, she admits that these abilities can be possessed by degrees. Secondly, she concedes that it is possible that in some cases additional abilities must be added before the set of abilities will be jointly sufficient.

Hills thinks that mere propositional knowledge does not essentially involve any of these abilities even if (as per the point above) propositional knowledge requires other kinds of abilities. To defend the claim that possessing the kinds of abilities she draws attention to is not a matter of simply having extra items of knowledge, Hills notes that one could have the extra items of knowledge and still lack the ‘good judgment’ that allows one to form new, related true beliefs. The possession of such judgment plausibly lines up more closely with ability possession (that is, (i)-(vi)) than with propositional attitude possession.

If Hills is right about this connection between grasping and possessing abilities, it might seem as though understanding-why is, at the end of the day, very similar to knowing-how (see, however, Sullivan 2017 for resistance to this suggestion). This is a view to which Grimm (2010) is also sympathetic, remarking that the object of objectual understanding “can be profitably viewed along the lines of the object of know-how,” where Grimm has in mind here an anti-intellectualist interpretation of know-how according to which knowing how to do something is a matter of possessing abilities rather than knowing facts (compare, Stanley & Williamson 2001; Stanley 2011). Grimm (2014) also notes that his modal view of understanding fits well with the idea that understanding involves a kind of ability or know-how, as one who sees or grasps how certain propositions are modally related has the ability to answer a wide variety of questions about how things could have been different. Grimm does not make the further claim that understanding is a kind of know-how; he merely says that there is similarity regarding the object, which does not guarantee that the “activity” of understanding and know-how are so closely related.

However, if understanding-why actually is a type of knowing how, then this means that intellectualist arguments to the effect that knowing how is a kind of propositional knowledge might apply, mutatis mutandis, to understanding-why as well (see Carter and Pritchard 2013). Hills herself does not believe that understanding-why is some kind of propositional knowledge, but she points out that even if it is, there is nonetheless good cause to think that understanding-why is very unlike ordinary propositional knowledge. Drawing from Stanley and Williamson, she makes the distinction between knowing a proposition “under a practical mode of presentation” and knowing it “under a theoretical mode of presentation.” Stanley and Williamson admit that the former is especially tough to spell out (see Glick 2014 for a recent discussion), but it must surely involve having complex dispositions, and so it is perhaps possible to know some proposition under only one of these modes of presentation (that is, by lacking the relevant dispositions, or something else). Hills thinks that moral understanding, if it were any kind of propositional knowledge at all, would be knowing a proposition under a practical mode and “not necessarily under a theoretical mode.”

d. Understanding as Explanation

The group designated “explanationists” by Kelp (2015) share a general commitment to the idea that knowledge of explanations should play a key role in a theory of understanding (for example, Hempel 1965; Salmon 1989; Khalifa 2012; 2013). For those who wonder about whether the often-discussed “grasping” associated with understanding might just amount to the possession of further beliefs (rather than, say, the possession of manipulative abilities), this type of view may seem particularly attractive (and comparatively less mysterious). On such a view, grasping talk could simply be jettisoned altogether. However, the core explanationist insight also offers the resources to supplement a grasping account. On such an interpretation, explanationism can be construed as offering a simple answer to the object question discussed above: the object of understanding-relevant grasping would, on this view, be explanations. As it turns out, not all philosophers who give explanation a central role in an account of understanding want to dispense with talk of grasping altogether, and this is especially so in the case of objectual understanding.

Khalifa’s (2013) view of understanding is a form of explanatory idealism. While Khalifa prefers earlier accounts of scientific understanding to the more recent views that have been submitted by epistemologists, he is aware of some criticisms (for example, Lipton (2009) and Pritchard (2010)) to the effect that requiring knowledge of an explanation is too strong a necessary condition on understanding-why. His alternative suggestion is to propose explanation as the ideal of understanding, a suggestion that has as a consequence that one should measure degrees of understanding according to how well one “approximate[s] the benefits provided by knowing a good and correct explanation.” Khalifa submits that this line is supported by the existence of a correct and reasonably good explanation in the background of all cases of understanding-why that do not involve knowledge of an explanation, namely a background explanation that would, if known, provide a greater degree of understanding-why.

This line merits discussion not least because the idea that understanding-why comes by degrees is often ignored in favor of discussing the more obvious point that understanding a subject matter clearly comes by degrees. One issue worth bringing into sharper focus is whether knowing a good and correct explanation is really the ideal form of understanding-why. In particular, one might be tempted to suggest that some of the objections raised to Grimm’s non-propositional knowledge-of-causes model could be recast as objections to Khalifa’s own explanation-based view. For example, we might suppose an agent has a maximally complete explanation of how Michelangelo’s David came into existence between 1501 and 1504, what methods were used to craft it, what Michelangelo’s motivating reasons were at the time, how much marble was used, and so on. But when the object of understanding-why is essentially evaluative (for example, understanding why the statue is beautiful), it seems that the quality of one’s understanding could vary dramatically even when we hold fixed that one possesses a correct and complete explanation of how the statue came to be (that is, both a physical and social description of these causes). To the extent that this is correct, there is some cause for reservation about measuring degrees of understanding according to how well they “approximate the benefits provided by knowing a good and correct explanation.” A proponent of Khalifa’s position might, however, view the preceding response as question-begging. For if the view is correct, then an explanation for why one’s understanding why the statue is beautiful is richer, when it is, will simply be in terms of one’s possession of a correct answer to the question of why it is beautiful.

It is moreover of interest to note that Khalifa (2013b) also sees a potential place for the notion of ‘grasping’ in an account of understanding, though in a qualified sense. On the view he recommends, the ability to grasp explanatory or evidential connections is an ability that is central to understanding only if the relevant grasping ability is understood as involving reliable explanatory evaluation. Khalifa’s indispensability argument, which he calls the ‘Grasping Argument’, runs as follows:

  1. Understanding entails true beliefs of the form q explains p.
  2. Understanding entails that such beliefs must be the result of exercising reliable cognitive abilities.
  3. If understanding entails true beliefs of the form q explains p, and also entails that such beliefs must be the result of exercising reliable cognitive abilities, then these abilities involve evaluating (or discriminating between) explanations.
  4. So understanding entails that beliefs of the form q explains p are the result of exercising a reliable cognitive ability to evaluate explanations. (Khalifa 2013b: 5)

Khalifa is, in this argument, stipulating that (1) is a “ground rule for discussion” (2013b: 5). One point that could potentially invite criticism is the move from (1) and (2) to (3). A worry about this move can be put abstractly: consider that if understanding entails true beliefs of form <q explains p>, and that beliefs of form <q explains p> must themselves be the result of exercising reliable cognitive abilities, it might still be that one’s reliable <q explains p>-generating abilities are exercised in a bad environment, for example, an environment where one’s abilities so easily could generate false beliefs of form <q explains p> despite issuing (luckily) true beliefs of the form <q explains p> on this occasion. Contrary to premise (3), such abilities (of the sort referenced by Khalifa in premises (2) and (3)) arguably need not involve discriminating between explanations, so long as one supposes that discriminating between explanations is something one has the reliable ability to do only if one could not very easily form a belief of the form <q explains p> when this is false.

More generally, though, it is important to note that Khalifa, via his grasping argument, is defending reliable explanatory evaluation as merely a necessary—though not sufficient—component of grasping. In so doing, he notes that the reader may be inclined to add further internalist requirements to his reliability requirement, of the sort put forward by Kvanvig (2003). As such, Khalifa is not attempting to provide an analysis of grasping.

It is worth considering how and in what way a plausible grasping condition on understanding should be held to something like a factivity or accuracy constraint. If a grasping condition is necessary for understanding, does one satisfy this condition only when one exercises a grasping ability to reflect how things are in the world? Or should we adopt a more relaxed view of what would be required to satisfy this condition, namely a view that focuses on the way the agent connects information?

Strevens (2013) focuses on scientific understanding in his discussion of grasping. He also suggests, like Khalifa, that grasping be linked with correct explanations. As such, his commentary here is particularly relevant to the question of whether grasping is factive. Riggs (2003: 21-22) asks whether an explanation has to be true to provide understanding, and Strevens thinks that it is implied that grasping is factive.

However, Strevens nonetheless offers a rough outline of a parallel, non-factive account of grasping, what he calls ‘grasping*’. He wants us to suppose that grasping has two components: one that is a purely psychological (that is, narrow) component and one that is the actual obtaining of the state of affairs that is grasped. He gives the name ‘grasping*’ to the purely psychological component that would continue to be satisfied even if, say, an evil demon made it the case at the moment of your grasping that there was only an appearance of the thing that appears to you to be the case. This would be the non-factive parallel to the standard view of grasping. Strevens, however, holds that an explanation is only correct if its constitutive propositions are true, and therefore the reformulation of grasping that he provides is not intended to be used in an actual account of understanding. The idea of grasping* is useful insofar as it makes clearer the cognitive feat involved in intelligibility, which is similar to understanding in the sense that it “implies a grasping of order, pattern and connection” between propositions (Riggs, 2004), but it does not require those propositions to be true. Just as we draw a distinction between this epistemic state (that is, intelligibility, or what Grimm calls ‘subjective understanding’) and understanding (which has a much stricter factivity requirement), it makes sense to draw a line between grasping* and grasping where one is factive and the other is not.

Likewise, just as all understanding will presumably involve achieving intelligibility even though intelligibility does not entail understanding, so too will all grasping involve grasping* even though grasping* does not entail grasping. Consider, on this point, that a conspiracy theorist might very well grasp* the connection between (false) propositions so as to achieve a coherent, intelligible, though wildly off-base, picture. The conspiracy theorist possesses something which one who grasps (rather than grasps*) a correct theory also possesses, and yet one who fails to grasp* even the conspiracy theory (for example, a would-be conspiracy theorist who has yet to form a coherent picture of how the false propositions fit together) lacks.

e. Understanding as Well-Connected Knowledge

Assuming that we need an account of degrees of understanding if we are going to give an account of outright understanding (as opposed to working the other way around, as he thinks many others are inclined to do), Kelp (2015) suggests we adopt a knowledge-based account of objectual understanding according to which “maximal understanding of a given phenomenon” is to be cashed out in terms of fully comprehensive and maximally well-connected knowledge of that phenomenon. Kelp’s account, then, explains our attributions of degrees of understanding in terms of approximations to such well-connected knowledge. He says that knowledge about a phenomenon (P) is maximally well-connected when “the basing relations that obtain between the agent’s beliefs about P reflect the agent’s knowledge about the explanatory and support relations that obtain between the members of the full account of P” (2015: 12).

This view, he notes, can make sense of the example (see §3(b)), which he utilizes against manipulationist accounts, of the omniscient, omni-understanding agent who is passive (that is, an omni-understanding agent who is not actively drawing explanatory inferences), as one would likely attribute to this agent maximally well-connected knowledge in spite of that passivity. Meanwhile, when discussing outright (as opposed to ideal) understanding, Kelp suggests that we adopt a contextualist perspective. In a given context, then, one understands some subject matter P only if one approximates fully comprehensive and maximally well-connected knowledge of P “closely enough” that one is sufficiently likely to successfully perform any task relating to P that is determined by the context, assuming that one “has the skills needed to do so and to exercise them in suitably favorable conditions”. Kelp points out that this type of view is not so restrictive as to deny understanding to, for example, novice students and young children. It should be noted that Hills (2009: 7) is also sympathetic to a similar thought, suggesting that the threshold for understanding might be contextually determined. However, Kelp admits that he wonders how his account will make sense of the link between understanding and explanation, and one might also wonder whether it is too strict to say that understanding requires knowledge as opposed to justified belief or justified true belief.

4. Understanding and Epistemic Luck

With a wide range of subtly different accounts of understanding (both objectual and understanding-why) on the table, it will be helpful to consider how understanding interfaces with certain key debates in epistemology. One natural place to start will be to examine the relationship between understanding and epistemic luck. Many epistemologists have sought to distinguish understanding from knowledge on the basis of alleged differences in the extent to which knowledge and understanding are susceptible to being undermined by certain kinds of epistemic luck.

While the matter of how to think about the incompatibility of knowledge with epistemic luck remains a contentious point, with modal accounts (for example, Pritchard 2005) at odds with lack-of-control accounts (for example, Riggs 2007), few contemporary epistemologists dissent from the comparatively less controversial claim that knowledge excludes luck in a way that true beliefs and sometimes even justified true beliefs do not (see Hetherington (2013) for a dissenting position). That said, the question of whether, and if so to what extent, understanding is compatible with epistemic luck lacks any contemporary consensus, though this is an aspect of understanding that is receiving increased attention.

Zagzebski (2001) and Kvanvig (2003) have suggested that understanding’s immunity to being undermined by the kinds of epistemic luck which undermine knowledge is one of the most important ways in which understanding differs from knowledge. Riaz (2015), Rohwer (2014) and Morris (2012) have continued to uphold this line on understanding’s compatibility with epistemic luck and defend this line against some of the objections that are examined below. However, Pritchard’s work on epistemic luck (for example, 2005) and how it is incompatible with knowledge leads him to reason that understanding is immune to some but not all forms of malignant luck (that is, luck which is incompatible with knowledge). Finally, on the other side of the spectrum from Zagzebski and Kvanvig, and also in opposition to Pritchard, is the view that understanding’s immunity to epistemic luck is isomorphic to knowledge’s immunity to epistemic luck. This view, embraced by DePaul and Grimm (2009), implies that to the extent that understanding and knowledge come apart, it is not with respect to a difference in susceptibility to being undermined by epistemic luck.

a. Understanding as (Partially) Compatible with Epistemic Luck

Consider the view that the kinds of epistemic luck that suffice to undermine knowledge do not also undermine understanding. As Kvanvig sees it, knowing requires non-accidental links between (internal) mental states and external events in just the right way. But the chief requirement of understanding, for him, is instead that there be the right coherence-making relations in some agent’s collection of information (that is, that the agent has a grasp of how all this related information fits together). In order to illustrate this point, Kvanvig invites us to imagine a case where an individual reads a book on the Comanche tribe, and she thereby acquires a belief set about the Comanche. In such a case, Kvanvig says, this individual acquires an “historical understanding of the Comanche dominance of the Southern plains of North America from the late 17th until the late 19th century” (2003: 197). Kvanvig stipulates that there are no falsehoods in the relevant class of beliefs that this individual has acquired from the book, and also that she can correctly answer all relevant questions whilst confidently believing that she is expressing the truth. He claims that while we would generally expect her to have knowledge of her relevant beliefs, this is not essential for her understanding and as a result it would not matter if these true beliefs had been Gettiered (and were therefore merely accidentally true). In short, then, Kvanvig wants to insist that the true beliefs that one attains in acquiring one’s understanding can all be Gettiered, even though the Gettier-style luck which prevents these beliefs from qualifying as knowledge does not undermine the understanding this individual acquires. So, understanding is compatible with a kind of epistemic luck that knowledge excludes.

Pritchard, meanwhile, claims that the matter of understanding’s compatibility with epistemic luck can be appreciated only against the background of a distinction between two kinds of epistemic luck, intervening and environmental, both of which are incompatible with knowledge. Both are “veritic” types of luck on Pritchard’s view: they are present when, given how one came to have one’s true belief, it is a matter of luck that this belief is true (Pritchard 2005: 146). Intervening epistemic luck is the sort present in Gettier’s original cases (1963), which convinced most epistemologists to abandon the traditional account of knowledge as justified true belief. Cases of intervening luck take, to use a simple example, the familiar pattern of Chisholm’s “sheep in a field” case, where an agent sees a sheep-shaped rock which looks just like a sheep, and forms the belief “There is a sheep”. The agent’s belief is justified and true, thanks to the fact that there is a genuine sheep hiding behind the rock, but the belief is not knowledge, as it could easily have been false. It is just dumb luck that the genuine sheep happened to be in the field. By contrast, the paradigmatic case of environmental epistemic luck is the famous ‘barn façade’ case (for example, Ginet 1975; Goldman 1979), a case where what an agent looks at is a genuine barn which unbeknownst to the individual is surrounded by façades which are indistinguishable to the agent from the genuine barn. Here, and unlike in the case of intervening epistemic luck, nothing actually goes awry, and the fact that the belief could easily have been false is owed entirely to the agent’s being in a bad environment, one with façades nearby.

Armed with this distinction, Pritchard criticizes Kvanvig’s assessment of the Comanche case by suggesting that just how we should regard understanding as being compatible or incompatible with epistemic luck depends on how we fill out the details of Kvanvig’s case, which is potentially ambiguous between two kinds of readings. In order to make this point clear, Pritchard suggests that we first consider two versions of a case analogous with Kvanvig’s. In the first version, we are to imagine that the agent gets her beliefs from a faux-academic book filled with mere rumors that turn out to be luckily true. In this Gettier-style case, she has good reason to believe her true beliefs, but the source of these beliefs (for example, the rumor mill) is highly unreliable and this makes her beliefs only luckily true, in the sense of intervening epistemic luck. Contrast this (call it the ‘intervening reading’ of the case) with Pritchard’s corresponding environmental reading of the case, where we are to imagine that the agent is reading a reliable academic book which is the source of many true beliefs she acquires about the Comanche. But in this version of the case, suppose that, although the book is entirely authoritative, genuine and reliable, it is the only trustworthy book on the Comanche on the shelves: every book on the shelves nearby, which she easily could have grabbed rather than the genuine authoritative book, was filled with rumors and ungrounded suppositions. Pritchard’s verdict is that we should deny understanding in the intervening case and attribute it in the environmental case. Pritchard’s assessment then of whether understanding is compatible with epistemic luck that is incompatible with knowledge depends on which kind of epistemic luck incompatible with knowledge one is discussing.

While Pritchard’s point here is revealed in his diagnosis of Kvanvig’s reading of the Comanche case, he in several places prefers to illustrate the idea with reference to the case in which an agent asks a real (that is, genuine, authoritative) fire officer about the cause of a house fire and receives a correct explanation. Suppose further that the agent could have easily ended up with a made-up and incorrect explanation because (unbeknownst to the agent) everyone else in the vicinity of the genuine fire officer who is consulted is merely dressed up as a fire officer and would have given the wrong story (whilst failing to disclose that they were merely in costume). Pritchard maintains that it is intuitive that in the case just described understanding is attained—you have consulted a genuine fire officer, have received all the true beliefs required for understanding why your house burned down, and have acquired this understanding in the right way. Meanwhile, he suggests that were you to ask a fake fire officer who appeared to you to be a real officer and just happened to give the correct answer, it is no longer plausible (by Pritchard’s lights) that you have understanding-why.

For a less concessionary critique of Kvanvig’s Comanche case, however, see Grimm (2006). According to Grimm, cases like Kvanvig’s admit of a more general characterisation, depending on how the details are filled in. Grimm puts the template formulation as follows: “A Comanche-style case is one in which we form true beliefs on the basis of trusting some source, and either (a) the source is unreliable, or (b) the source is reliable, but in the current environment one might easily have chosen an unreliable source.” After analysing variations of the Comanche case so conceived, Grimm argues that in neither (a)- nor (b)-style Comanche cases do knowledge and understanding come apart. If this is right, then at least one prominent case used to illustrate a luck-based difference between knowledge and understanding does not hold up to scrutiny.

b. Newer Defenses of Understanding’s Compatibility with Epistemic Luck

In contrast with Pritchard’s “partial compatibility” view of the relationship between understanding and epistemic luck, where understanding is compatible with environmental but not with intervening luck, Rohwer (2014) defends understanding’s full compatibility with veritic epistemic luck (that is, of both intervening and environmental varieties). Rohwer argues that counterexamples like Pritchard’s intervening luck cases only appear plausible because the beliefs that make up the agent’s understanding come exclusively from a bad source. For example, Pritchard’s case of the fake fire officer—which, recall, is one in which he thinks understanding (as well as knowledge) is lacking—is one in which, Rohwer points out, all of the true beliefs and grasped connections between those beliefs are from a bad source. Rohwer’s inventive move involves a contrast case featuring “unifying understanding”, that is, understanding that is furnished from multiple sources, some good and some bad. Such cases, she claims, feature intervening luck that is compatible with understanding. While Pritchard can agree with Rohwer’s conclusion that understanding (and specifically, as Rohwer is interested in, scientific understanding) is not a species of knowledge, the issue of adjudicating between Rohwer’s intuition in the case of unifying understanding and the diagnosis Pritchard will be committed to in such a case is complicated. An important question is whether there are philosophical considerations, beyond simply intuition, that could settle in a principled way why we should think about unifying understanding cases in one way rather than the other.

Morris (2012), like Rohwer, also defends lucky understanding—in particular, understanding-why, or what he calls “explanatory understanding”. He argues that intuitions that rule against lucky understanding can be explained away. For example, he attempts to explain the intuitions in Pritchard’s intervening luck spin on Kvanvig’s Comanche case by noting that some of the temptation to deny understanding here relates to the writer of the luckily-true book himself lacking the relevant understanding. Morris challenges the assumption that hearers cannot gain understanding through the testimony of those who lack understanding, and accordingly, embraces a kind of understanding transmission principle that parallels the kind of knowledge transmission principle that is presently a topic of controversy in the epistemology of testimony. Morris suggests that the writer of the Comanche book might lack understanding due to failing to endorse the relevant propositions, while the reader might have understanding because she does endorse the relevant propositions. He claims further that this description of the case undermines the intuition that the writer’s lack of understanding entails the reader’s lack of understanding. Of course, though, just as Lackey (2007) raises ‘creationist teacher’ style cases against knowledge transmission principles, one might as well raise a parallel kind of creationist teacher case against the thesis that one cannot attain understanding from a source who herself lacks it. In such a parallel case, we simply modify Lackey’s original case and suppose that Stella, a creationist teacher, who does not believe in evolution, nonetheless teaches it reliably and in accordance with the highest professional standards. As Lackey thinks students can come to know evolutionary theory from this teacher despite the teacher not knowing the propositions she asserts (given that Stella fails the belief condition for knowledge), we might likewise think, and contra Morris, that Stella might fail to understand evolution. This is because Stella lacks beliefs on the matter, even though the students can gain understanding from her. To the extent that such a move is available, one has reason to resist Morris’s rationale for resisting Pritchard’s diagnosis of Kvanvig’s case.

5. Understanding and Epistemic Value

The topic of epistemic value has only relatively recently received sustained attention in mainstream epistemology. Even so, and especially over the past decade, there has been agreement amongst most epistemologists working on epistemic value that understanding is particularly valuable (though see Janvid 2012 for a rare dissenting voice). It is also becoming an increasingly popular position to hold that understanding is more epistemically valuable than knowledge (see Kvanvig 2003; Pritchard 2010). Although the analysis of the value of epistemic states has roots in Plato and Aristotle, this renewed and more intense interest was initially inspired by two coinciding trends in epistemology. On the one hand, there is the increasing support for virtue epistemology that began in the 1980s, and on the other there is growing dissatisfaction with the ever more complicated attempt to generate an account of knowledge that is appropriately immune to Gettier-style counterexamples (see, for example, DePaul 2009).

Unsurprisingly, the comparison between the nature of understanding and that of knowledge has coincided with comparisons of their respective epistemic value, particularly since Kvanvig (2003) first defended the superior epistemic value of the former over the latter. For example, in Whitcomb (2010: 8), we find the observation that “understanding is widely taken to be a ‘higher’ epistemic good: a state that is like knowledge and true belief, but even better, epistemically speaking.” Meanwhile, Pritchard (2009: 11) notes “as we might be tempted to put the point, we would surely rather understand than merely know.” A helpful clarification here comes from Grimm (2012: 105), who in surveying the literature on the value of understanding points out that the suggestion seems to be that understanding (of “a complex of some kind”) is better than the corresponding item of propositional knowledge. This type of view is a revisionist theory of epistemic value (see, for example, Pritchard 2010), which suggests that one would be warranted in turning more attention to an epistemic state other than propositional knowledge: specifically, according to Pritchard, understanding. The following sections consider why understanding might have such additional value.

a. Transparency

According to Zagzebski (2001), the epistemic value of understanding is tied not to elements of its factivity, but rather to its transparency. She claims, “it may be possible to know without knowing one knows, but it is impossible to understand without understanding one understands” (2001: 246) and suggests that this property of understanding might insulate it from skepticism. Zagzebski does not mean to say that to understand X, one must also understand one’s own understanding of X (as this threatens a psychologically implausible regress), but rather, that to understand X one must also understand that one understands X. Thus, given that understanding that p and knowing that p can in ordinary contexts be used synonymously (for example, understanding that it will rain is just to know that it will rain) we can paraphrase Zagzebski’s point with no loss as: understanding X entails knowing that one understands X. To the extent that this is right, Zagzebski is endorsing a kind of ‘KU’ principle (compare: KK).

Grimm (2006) and Pritchard (2010) counter that many of the most desirable instances of potential understanding, such as when we understand another person’s psychology or understand how the world works, are not transparent. In other words, they claim that one cannot always tell that one understands. Consider, for instance, the felicity of the question “Am I understanding this correctly?” and of the statement “I do not know if I understand my own defense mechanisms; I think I understand them, but I am not sure.” The other side of the coin is that one can often think that one understands things that one does not (for example, Trout 2007). Consider how some people think they grasp the ways in which their zodiac sign has an influence on their life path, yet their sense of understanding is at odds with the facts of the matter. More generally, as this line of criticism goes, sometimes we simply mistake mere (non-factive) intelligibility for understanding. As it were, from “the inside”, these can be indistinguishable much as, from the first-person perspective, mere true belief and knowledge can be indistinguishable. To the extent that these worries with transparency are apt, a potential obstacle emerges for the prospects of accounting for the value of understanding in terms of its transparency. Examples of the sort considered suggest that—even if understanding has some important internalist component to it—transparency of the sort Zagzebski is suggesting when putting forward the ‘KU’ claim is an accidental property of only some cases of understanding and not essential to understanding.

b. Cognitive Achievement

Pritchard’s (2010) account of the distinctive value of understanding is, in short, twofold: first, that understanding essentially involves a strong kind of finally valuable cognitive achievement, and second, that while knowledge comes apart from cognitive achievement in both directions, understanding does not. If, as robust virtue epistemologists have often insisted, cognitive achievement is finally valuable (that is, as an instance of achievements more generally), and understanding necessarily lines up with cognitive achievement but knowledge only sometimes does, then the result is a revisionary story about epistemic value. In other words, one mistakenly takes knowledge to be distinctively valuable only because knowledge often does have something—cognitive achievement—which is essential to understanding and which is finally valuable.

Firstly, achievement is often defined as success that is because of ability (see, for example, Greco 2007), where the most sensible interpretation of this claim is to see the ‘because’ as signifying a causal-explanatory relationship—this is, at least, the dominant view. The thought is that, in cases of achievement, the relevant success must be primarily creditable to the exercise of the agent’s abilities, rather than to some other factor (for example, luck). Achievements, unlike mere successes, are regarded as valuable for their own sake, mainly because of the way in which these special sorts of successes come to be.

It is helpful to consider an example. If we consider some goal—such as the successful completion of a coronary bypass—it is obvious that our attitude towards the successful coronary bypass is different when the completion is a matter of ability as opposed to luck. Assume that the surgeon is suffering from the onset of some degenerative mental disease and the first symptom is his forgetting which blood vessel he should be using to bypass the narrowed section of the coronary artery. The surgeon’s successful bypass is valued differently when one is made aware that it was by luck that he picked an appropriate blood vessel for the bypass. Given that the result is the same (that is, the patient’s heart muscle blood supply is improved) regardless of whether he successfully completes the operation by luck or by skill, the instrumental value of the action is the same. Given that the instrumental value is the same, our reaction to the two contrasting bypass cases seems to count in favor of the final value of successes because of ability—achievements. So too does the fact that one would rather have a success involving an achievement than a mere success, even when this difference has no pragmatic consequences. To borrow a case from Riggs, stealing an Olympic medal or otherwise cheating to attain it lacks the kind of value one associates with earning the medal, through one’s own skill. Achievements are thought of as being intrinsically good, though the existence of evil achievements (for example, skillfully committing genocide) and trivial achievements (for example, competently counting the blades of grass on a lawn) shows that we are thinking of successes that have distinctive value as achievements (Pritchard 2010: 30) rather than successes that have all-things-considered value.

Due to the possibility of overly simple or passive successes qualifying as cognitive achievements (for example, coming to truly believe that it is dark just by looking out of the window in normal conditions after 10pm), Pritchard cautions that we should distinguish between two classes of cognitive achievement—strong and weak:

Weak cognitive achievement: Cognitive success that is because of one’s cognitive ability.

Strong cognitive achievement: Cognitive success that is because of one’s cognitive ability where the success in question either involves the overcoming of a significant obstacle or the exercise of a significant level of cognitive ability.

On the basis of considerations Pritchard argues for in various places (2010; 2012; 2013; 2014), relating to cognitive achievement’s presence in the absence of knowledge (for example, in barn façade cases, where environmental luck is incompatible with knowledge but compatible with cognitive achievement) and the absence of cognitive achievement in the presence of knowledge (for example, in testimony cases in friendly environments, where knowledge acquisition demands very little on the part of the agent), he argues that cognitive achievement is not essentially wedded to knowledge (as robust virtue epistemologists would hold). In fact, he claims, the two come apart in both directions: yielding knowledge without strong cognitive achievement and—as in the case of understanding that lacks corresponding knowledge—strong cognitive achievement without knowledge. By contrast, Pritchard believes that understanding always involves strong cognitive achievement, that is, an achievement that necessarily involves either a significant exercise of skill or the overcoming of a significant obstacle. If Pritchard is right to claim that understanding is always a strong cognitive achievement, then understanding is always finally valuable if cognitive achievement is also always finally valuable, and moreover, valuable in a way that knowledge is not. See, however, Carter & Gordon (2014) for a recent criticism on the point of identifying understanding with strong cognitive achievement. See further Bradford (2013; 2015) for resistance to the very suggestion that there can be weak achievements in Pritchard’s sense—namely, achievements that do not necessarily involve great effort, regardless of whether they are primarily due to ability.

c. Curiosity

Taking curiosity to be of epistemic significance is not a new idea. Whitcomb (2010) notes that Goldman (1999) has considered that the significance or value of some item of knowledge might be at least in part determined by whether, and to what extent, it provides the knower with answers to questions that they are curious about. Whitcomb also cites Alston (2005) as endorsing a stronger view, according to which true belief or knowledge gets at least some of its epistemic value from its connection to, and satisfaction of, curiosity.

What is curiosity? According to Goldman (1991) curiosity is a desire for true belief; by contrast, Williamson views curiosity as a desire for knowledge. Kvanvig (2013) claims that both of these views are mistaken, and in the course of doing so, locates curiosity at the center of his account of understanding’s value. More specifically, Kvanvig aims to support the contention that objectual understanding has a special value knowledge lacks by arguing that the nature of curiosity—the “motivational element that drives cognitive machinery” (2013: 152)—underwrites a way of vindicating understanding’s final value.

The notion of curiosity that plays a role in Kvanvig’s line is a broadly inclusive one that is meant to include not just obvious problem-solving examples but also what he calls more “spontaneous” examples, such as turning around to see what caused a noise you just heard. He takes his account to be roughly in line with the layman’s concept of curiosity. His central claim is that curiosity “provides hope for a response-dependent or behaviour-centred explanation of the value of whatever curiosity involves or aims at”. This is explained in the following way: “If it is central to ordinary cognitive function that one is motivated to pursue X, then X has value in virtue of its place in this functional story.” Regarding the comparison between the value of understanding and the value of knowledge, then, he will say that if understanding is fundamental to curiosity then this provides at least a partial explanation for why its value is superior to that of knowledge. Kvanvig identifies the main opponent to his view, that the scope of curiosity is enough to support the unrestricted value of understanding, to be one on which knowledge is what is fundamental to curiosity. Specifically, he takes his opponent’s view to be that knowledge through direct experience is what sates curiosity, a view that traces to Aristotle. A central component of Kvanvig’s argument is negative: he argues that knowledge is ill-suited to play the role of satisfying curiosity, in particular by rejecting three arguments from Whitcomb for the opposing view. According to his positive proposal, objectual understanding is the goal and what typically sates the “appetite” associated with curiosity. He concedes, though, that sometimes curiosity on a smaller scale can be sated by epistemic justification, and that what seems like understanding, but is actually just intelligibility, can sate the appetite when one is deceived.

Grimm (2012) has wondered whether this view might get things “explanatorily backwards”. This is because we might be tempted to say instead that we desire to make sense of things because it is good to do so rather than saying that it is good to make sense of things because we desire it. He also suggests that what epistemic agents want is not just to feel like they are making sense of things but to actually make sense of them. Owing to Kvanvig’s use of the words “perceived achievement”, Grimm thinks that the curiosity account of understanding’s value suggests that subjective understanding (or what is referred to as ‘intelligibility’ above) can satisfy the desire to make sense of the world or “really marks the legitimate end of inquiry.”

6. Future Research on Understanding

Where should an investigation of understanding in epistemology take us next? Although the work of a range of epistemologists highlighting some of the important features of understanding-why and objectual understanding has been discussed, there are many interesting topics that warrant further research. For one thing, if understanding is both a factive and strongly internalist notion then a radical skeptical argument that threatens to show that we have no understanding is a very intimidating prospect (as Pritchard 2010: 86 points out). This skeptical argument is worth engaging with, presumably with the goal of showing that understanding does not turn out to be internally indistinguishable from mere intelligibility.

Secondly, there is plenty of scope for understanding to play a more significant role in social epistemology. For example, Carter and Gordon (2011) consider that there might be cases in which understanding, and not just knowledge, is the required epistemic credential to warrant assertion. Questions about when and what type of understanding is required for permissible assertion connect with issues related to expertise, in particular, how we might define expertise and who has it. And, relatedly in social epistemology, we might wonder what, if any, testimonial transmission principles hold for understanding, and whether there are any special hearer conditions demanded by testimonial understanding acquisition that are not shared in cases of testimonial knowledge acquisition.

Thirdly, even if one accepts something like a moderate factivity requirement on objectual understanding—and thus demands of at least a certain class of beliefs one has of a subject matter that they be true—one can also ask further and more nuanced questions about the epistemic status of these true beliefs. Must they be known or can they be Gettiered true beliefs? Or—and this is a point that has received little attention—even more weakly, can the true beliefs be themselves unreliably formed or held on the basis of bad reasons? Suppose, for example, that I competently grasp the relevant coherence-making and explanatory relations between propositions about chemistry which I believe and which are true, but which I believe on an improper basis: for example, by trusting someone I should not have trusted, or even worse, by reading tea leaves which happened to afford me true beliefs about chemistry. Would this impede my understanding? If so, why, and if not, why not? Relatedly, if framed in terms of credence, what credence threshold must be met, with respect to propositions in some set, for the agent to understand that subject matter? One helpful way to think about this is as follows: if one takes a paradigmatic case of an individual who understands a subject matter thoroughly, and manipulates the credence the agent has toward the propositions constituting the subject matter, how low can one go before the agent no longer understands the subject matter in question?

Fourthly, a relatively fertile area for further research concerns the semantics of understanding attribution. To what extent do the advantages and disadvantages of, for example, sensitive invariantist, contextualist, insensitive invariantist and relativist approaches to knowledge attributions find parallels in the case of understanding attributions? Is it problematic to embrace, for example, a contextualist semantics for knowledge attributions while embracing, say, invariantism about understanding?

Fifthly, what ramifications might active externalist approaches in epistemology (for example, extended mind and extended cognition), which have recently been brought to bear on the theory of knowledge (see Carter et al. 2014), have for understanding? Toon (2015) has recently suggested, with reference to the hypothesis of extended cognition, that understanding can be located partly outside the head. Are the prospects of extending understanding via active externalism on a par with the prospects for extending knowledge, or is understanding essentially internal in a way that knowledge need not be?

Finally, there is fruitful work to do concerning the relationship between understanding and wisdom. For example, in Whitcomb (2011) we find the suggestion that theoretical wisdom is a form of particularly deep understanding. Whether wisdom might be a type of understanding or understanding might be a component of wisdom is a fascinating question that can draw on both work in virtue ethics and epistemology.

7. References and Further Reading

  • Alston, W. Beyond ‘Justification’: Dimensions of Epistemic Evaluation. Ithaca, N.Y.: Cornell University Press, 2005.
    • Includes Alston’s view of curiosity, according to which the epistemic value of true belief and knowledge partially comes from a link to curiosity.
  • Baker, L. R. “Third Person Understanding” in A. Sanford (ed.), The Nature and Limits of Human Understanding. London: Continuum, 2003.
    • Outlines a view on which understanding something requires making reasonable sense of it.
  • Batterman, R. W. “Idealization and Modelling.” Synthese 169(3) (2009): 427-446.
    • Endorses the idea that when we consider how things would be if something was true, we increase our access to further truths.
  • Bradford, G. “The Value of Achievements.” Pacific Philosophical Quarterly, 94(2) (2013): 204-224.
    • Resists Pritchard’s claim that there can be weak achievements, that is, ones that do not necessarily involve great effort.
  • Bradford, G. Achievement. Oxford: Oxford University Press, 2015.
    • A monograph that explores the nature and value of achievements in great depth.
  • Carter, J. A. and Gordon, E. C. “Norms of Assertion: The Quantity and Quality of Epistemic Support.” Philosophia 39(4) (2011): 615-635.
    • Argues that a type of understanding might be the norm that warrants assertion in a restricted class of cases.
  • Carter, J. A. and Gordon, E. C. “On Pritchard, Objectual Understanding and the Value Problem.” American Philosophical Quarterly 51 (2014): 1-14.
    • Criticizes the claim that understanding-why should be identified with strong cognitive achievement.
  • Carter, J. A., Kallestrup, J., Palermos, S. O. and Pritchard, D. “Varieties of Externalism.” Philosophical Issues 24(1) (2014): 63-109.
    • Considers some of the ramifications that active externalist approaches might have for epistemology.
  • Carter, J. A. and Pritchard, D. “Knowledge-How and Epistemic Luck.” Noûs (2013).
    • Discusses whether intellectualist arguments for reducing know-how to propositional knowledge might also apply to understanding-why (if it is a type of knowing how).
  • DePaul, M. “Ugly Analysis and Value” in A. Haddock, A. Millar and D. Pritchard (eds.), Epistemic Value. Oxford: Oxford University Press, 2009.
    • Looks at the increasing dissatisfaction with ever-more complicated attempts to generate a theory of knowledge immune to counterexamples.
  • Elgin, C. Z. “True Enough.” Philosophical Issues 14(1) (2004): 113-131.
    • Includes further discussion of the role of acceptance and belief in her view of understanding.
  • Elgin, C. “Understanding and the Facts.” Philosophical Studies 132 (2007): 33-42.
    • Argues against a factive conception of scientific understanding.
  • Elgin, C. “Exemplification, Idealization, and Understanding” in M. Suárez (ed.), Fictions in Science: Essays on Idealization and Modeling. London: Routledge, 2009.
    • Explores the epistemological role of exemplification and aims to illuminate the relationship between understanding and scientific idealizations construed as fictions.
  • DePaul, M. and Grimm, S. “Review Essay: Kvanvig’s The Value of Knowledge and the Pursuit of Understanding.” Philosophy and Phenomenological Research 74 (2007): 498-514.
    • Includes criticism of Kvanvig’s line on epistemic luck and understanding.
  • De Regt, H. and Dieks, D. “A Contextual Approach to Scientific Understanding.” Synthese 144 (2005): 137-170.
    • Offers an account of understanding that requires having a theory of the relevant phenomenon.
  • Gettier, E. “Is Justified True Belief Knowledge?” Analysis 23(6) (1963): 121-123.
    • Contains the famous counterexamples to the Justified True Belief account of knowledge.
  • Ginet, C. Knowledge, Perception and Memory. Dordrecht: Reidel, 1975.
    • Contains the paradigmatic case of environmental epistemic luck (that is, the fake barn case).
  • Goldman, A. “What is Justified Belief?” In G. S. Pappas (ed.), Justification and Knowledge. Dordrecht: Reidel, 1979.
    • Often-cited discussion of the fake barn counterexample to traditional accounts of knowledge that focus on justified true belief.
  • Goldman, A. “Stephen P. Stich: The Fragmentation of Reason.” Philosophy and Phenomenological Research 51(1) (1991): 189-193.
    • Discusses the connection between curiosity and true belief.
  • Goldman, A. Knowledge in a Social World. Oxford: Oxford University Press, 1999.
    • Contains exploration of whether the value of knowledge may be in part determined by the extent to which it provides answers to questions one is curious about.
  • Gordon, E. C. “Is There Propositional Understanding?” Logos & Episteme 3 (2012): 181-192.
    • Examines reasons to suppose that attributions of understanding are typically attributions of knowledge, understanding-why or objectual understanding.
  • Greco, J. “The Nature of Ability and the Purpose of Knowledge.” Philosophical Issues 17 (2007): 57-69.
    • Discusses and defines ability in the sense often appealed to in work on cognitive ability and the value of knowledge.
  • Grimm, S. “Is Understanding a Species of Knowledge?” British Journal for the Philosophy of Science 57 (2006): 515-535.
    • Analyzes Kvanvig’s Comanche case and argues that knowledge and understanding do not come apart in this example.
  • Grimm, S. “Understanding” In S. Bernecker and D. Pritchard (eds.), The Routledge Companion to Epistemology. New York: Routledge, 2011.
    • An overview of the object, psychology, and normativity of understanding.
  • Grimm, S. “The Value of Understanding.” Philosophy Compass 7(2) (2012): 103-117.
    • Gives an overview of recent arguments for revisionist theories of epistemic value that suggest understanding is more valuable than knowledge.
  • Grimm, S. “Understanding as Knowledge of Causes” in A. Fairweather (ed.), Virtue Epistemology Naturalized: Bridges Between Virtue Epistemology and Philosophy of Science. Dordrecht: Springer, 2014.
    • A novel interpretation of the traditional view according to which understanding-why can be explained in terms of knowledge of causes.
  • Hazlett, A. “The Myth of Factive Verbs.” Philosophy and Phenomenological Research 80:3 (2010): 497-522.
    • Argues that the ordinary concept of knowledge is not factive and that epistemologists should therefore not concern themselves with said ordinary concept.
  • Hempel, C. Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. New York: Free Press, 1965.
    • Early defence of explanation’s key role in understanding.
  • Hetherington, S. “There Can be Lucky Knowledge” in M. Steup, J. Turri and E. Sosa (eds.), Contemporary Debates in Epistemology (2nd Edition). Oxford: Wiley-Blackwell, 2013.
    • A paper in which it is argued that (contrary to popular opinion) knowledge does not exclude luck.
  • Hills, A. “Moral Testimony and Moral Epistemology.” Ethics 120 (2009): 94-127.
    • In looking at moral understanding-why, outlines some key abilities that may be necessary to the “grasping” component of understanding.
  • Janvid, M. “Knowledge versus Understanding: The Cost of Avoiding Gettier.” Acta Analytica 27 (2012): 183-197.
    • Disputes the popular claim that understanding is more epistemically valuable than knowledge.
  • Kim, J. “Explanatory Knowledge and Metaphysical Dependence.” In his Essays in the Metaphysics of Mind. New York: Oxford University Press, 1994.
    • Contains Kim’s classic discussion of species of dependence (for example, mereological dependence).
  • Kelp, C. “Understanding Phenomena.” Synthese (2015).
    • Divides recent views of understanding according to whether they are “manipulationist” or “explanationist”; argues for a different view according to which understanding is maximally well-connected knowledge.
  • Khalifa, K. “Inaugurating Understanding or Repackaging Explanation?” Philosophy of Science 79(1) (2012): 15-37.
    • Argues that we should replace the main developed accounts of understanding with earlier accounts of scientific explanation.
  • Khalifa, K. “Is Understanding Explanatory or Objectual?” Synthese 190(6) (2013a): 1153-1171.
    • Proposes a framework for reducing objectual understanding to what he calls explanatory understanding.
  • Khalifa, K. “Understanding, Grasping and Luck.” Episteme 10 (1) (2013b): 1-17.
    • Argues against compatibility between understanding and epistemic luck.
  • Kvanvig, J. The Value of Knowledge and the Pursuit of Understanding. NY: Cambridge University Press, 2003.
    • The root of the recent resurgence of interest in understanding in epistemology. This book proposes a revisionist view of epistemic value and an outline of different types of understanding.
  • Kvanvig, J. “The Value of Understanding” In D. Pritchard, A. Haddock and A. Millar (eds.), Epistemic Value. Oxford: Oxford University Press, 2009.
    • Argues that the concerns plaguing theories of knowledge do not cause problems for a theory of understanding.
  • Kvanvig, J. “Curiosity and a Response-Dependent Account of the Value of Understanding.” In T. Henning and D. Schweikard (eds.), Knowledge, Virtue and Action. Boston: Routledge, 2013.
    • Proposes an account of understanding’s value that is related to its connection with curiosity.
  • Lackey, J. “Why We Don’t Deserve Credit for Everything We Know.” Synthese 156 (2007).
    • Contains Lackey’s counterexamples to the knowledge transmission principles.
  • Lipton, P. “Understanding Without Explanation” in H. de Regt, S. Leonelli, and K. Eigner (eds.), Scientific Understanding: Philosophical Perspectives. Pittsburgh, PA: University of Pittsburgh Press, 2009.
    • Argues that requiring knowledge of an explanation is too strong a condition on understanding-why.
  • Longworth, G. “Linguistic Understanding and Knowledge.” Nous 42 (2008): 50-79.
    • A discussion of whether linguistic understanding is a form of knowledge.
  • Morris, K. “A Defense of Lucky Understanding.” The British Journal for the Philosophy of Science 63 (2012): 357-371.
    • Attempts to explain away the intuitions suggesting that understanding is incompatible with epistemic luck.
  • Olsson, E. “Coherentist Theories of Epistemic Justification” in E. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
    • An overview of coherentism that can be useful when considering how theories of coherence might be used to flesh out the grasping condition on understanding.
  • Pritchard, D. Epistemic Luck. Oxford: Oxford University Press, 2005.
    • An in-depth exploration of different types of epistemic luck.
  • Pritchard, D. “Recent Work on Epistemic Value.” American Philosophical Quarterly 44 (2007): 85-110.
    • Looks at understanding’s role in recent debates about epistemic value and contains key arguments against Elgin’s non-factive view of understanding.
  • Pritchard, D. “Knowing the Answer, Understanding and Epistemic Value.” Grazer Philosophische Studien 77 (2008): 325-39.
    • Explores understanding as the proper goal of inquiry, in addition to discussing understanding’s distinctive value.
  • Pritchard, D. “Knowledge, Understanding and Epistemic Value” In A. O’Hear (ed.), Epistemology (Royal Institute of Philosophy Lectures). Cambridge: Cambridge University Press, 2009.
    • Argues that understanding (unlike knowledge) is a type of cognitive achievement and therefore of distinctive value.
  • Pritchard, D. “The Value of Knowledge: Understanding.” In A. Haddock, A. Millar and D. Pritchard (eds.), The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press, 2010.
    • A longer discussion of the nature of understanding and its distinctive value (in relation to the value of knowledge) than in his related papers.
  • Pritchard, D. “Knowledge and Understanding” in A. Fairweather (ed.), Virtue Epistemology Naturalized: Bridges Between Virtue Epistemology and Philosophy of Science. Dordrecht: Springer, 2014.
    • Criticizes Grimm’s view of understanding as knowledge of causes.
  • Riaz, A. “Moral Understanding and Knowledge.” Philosophical Studies 172(2) (2015): 113-128.
    • Argues against the view that moral understanding can be immune to luck while moral knowledge is not.
  • Riggs, W. “Understanding Virtue and the Virtue of Understanding” In M. DePaul and L. Zagzebski (eds.), Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford: Oxford University Press, 2003.
    • Introduces intelligibility as an epistemic state similar to understanding but less valuable.
  • Riggs, W. “Why Epistemologists Are So Down on Their Luck.” Synthese 158 (3) (2007): 329-344.
    • Defends a lack of control account of luck.
  • Rohwer, Y. “Lucky Understanding Without Knowledge.” Synthese 191 (2014): 945-959.
    • Claims that understanding is entirely compatible with both intervening and environmental forms of veritic luck.
  • Salmon, W. “Four Decades of Scientific Explanation.” In Minnesota Studies in the Philosophy of Science, vol. 13. Eds. Philip Kitcher and Wesley Salmon. Minneapolis: University of Minnesota Press, 1989.
    • Another significant paper endorsing the claim that knowledge of explanations should play a vital role in our theories of understanding.
  • Sliwa, P. “Understanding and Knowing.” Proceedings of the Aristotelian Society 115(1) (2015): 57-74.
    • Defends the strong claim that propositional knowledge is necessary and sufficient for understanding.
  • Stanley, J. Know How. Oxford: Oxford University Press, 2011.
    • Outlines and evaluates the anti-intellectualist and intellectualist views of know-how.
  • Stanley, J and Williamson, T. “Knowing How.” Journal of Philosophy 98(8) (2001): 411-444.
    • An earlier paper defending the intellectualist view of know-how.
  • Strevens, M. “No Understanding Without Explanation.” Studies in History and Philosophy of Science 44 (2013): 510-515.
    • Defends views that hold explanation to be indispensable for an account of understanding and discusses what a non-factive account of grasping would look like.
  • Sullivan, E. “Understanding: Not Know-How.” Philosophical Studies (2017). https://doi.org/10.1007/s11098-017-0863-z
    • Resists the alleged similarity between understanding and knowing-how.
  • Toon, A. “Where is the Understanding?” Synthese (2015).
    • Uses the hypothesis of extended cognition to argue that understanding can be located (at least partly) outside the head.
  • Trout, J.D. “The Psychology of Scientific Explanation.” Philosophy Compass 2(3) (2007): 564-591.
    • Contains a discussion of the fact that we often take ourselves to understand things we do not.
  • Van Camp, W. “Explaining Understanding (or Understanding Explanation).” European Journal for Philosophy of Science 4(1) (2014): 95-114.
    • Uses the concept of understanding to underwrite a theory of explanation.
  • Whitcomb, D. “Wisdom.” In S. Bernecker and D. Pritchard (eds.), The Routledge Companion to Epistemology. New York: Routledge, 2011.
    • An overview of wisdom, including its potential relationship to understanding.
  • Whitcomb, D. “Epistemic Value” In A. Cullison (ed.), The Continuum Companion to Epistemology. London: Continuum, 2012.
    • An overview of issues relating to epistemic value, including discussion of understanding as a “higher” epistemic state.
  • Wilkenfeld, D. “Understanding as Representation Manipulability.” Synthese 190 (2013): 997-1016.
    • Builds an account of understanding according to which understanding a subject matter involves possessing a representation that could be manipulated in a useful way.
  • Zagzebski, L. “Recovering Understanding” In M. Steup (ed.), Knowledge, Truth and Obligation. Oxford: Oxford University Press, 2001.
    • Includes arguments for the position that understanding need not be factive.
  • Zagzebski, L. On Epistemology. CA: Wadsworth, 2009.
    • An overview of the background, development and recent issues in epistemology, including a chapter on understanding as an epistemic good.

 

Author Information

Emma C. Gordon
Email: emma.gordon@ed.ac.uk
University of Edinburgh
Scotland, U.K.

Ayn Rand (1905—1982)

Ayn Rand was a major intellectual of the twentieth century. Born in Russia in 1905 and educated there, she immigrated to the United States after graduating from university. Upon becoming proficient in English and establishing herself as a writer of fiction, she became well-known as a passionate advocate of a philosophy she called Objectivism. This philosophy is in the Aristotelian tradition, with that tradition’s emphasis upon metaphysical naturalism, empirical reason in epistemology, and self-realization in ethics. Her political philosophy is in the classical liberal tradition, with that tradition’s emphasis upon individualism, the constitutional protection of individual rights to life, liberty, and property, and limited government. She wrote both technical and popular works of philosophy, and she presented her philosophy in both fictional and nonfictional forms. Her philosophy has influenced several generations of academics and public intellectuals, and has had widespread popular appeal.

Regarding human nature, Rand said, “Man is a being of self-made soul.” Rand believes human beings are not born in sin or with destructive desires; nor do they necessarily acquire them in the course of growing to maturity. Instead one is born morally tabula rasa (a blank slate), and through one’s choices and actions one acquires one’s character traits and habits. Having chronic desires to steal, rape, or kill others is the result of mistaken development and the acquisition of bad habits, just as are chronic laziness or the habit of eating too much junk food. And just as one is not born lazy but can by one’s choices develop oneself into a person of vigor or sloth, so also one is not born antisocial but can by one’s choices develop oneself into a person of cooperativeness or conflict.

Table of Contents

  1. Life
  2. Rand’s Ethical Theory: Rational Egoism
  3. Reason and Ethics
  4. Criticisms of Rand’s Ethics
  5. Conflicts of Interest
  6. Rand’s Influence
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Ayn Rand’s life was often as colorful as those of her heroes in her best-selling novels The Fountainhead and Atlas Shrugged. Rand first made her name as a novelist, publishing We the Living (1936), The Fountainhead (1943), and her magnum opus Atlas Shrugged (1957). These philosophical novels embodied themes she subsequently developed in nonfiction form in a series of essays and books written in the 1960s and 1970s.

Born in St. Petersburg, Russia, on February 2, 1905, Rand was raised in a middle-class family. As a child, she loved storytelling, and at age nine she decided to become a writer. In school she showed academic promise, particularly in mathematics. Her family was devastated by the communist revolution of 1917, both by the social upheavals that the revolution and the ensuing civil war brought and by her father’s pharmacy being confiscated by the Soviets. The family moved to the Crimea to recover financially and to escape the harshness of life the revolution brought to St. Petersburg. They later returned to Petrograd (the new name given to St. Petersburg by the Soviets), where Rand was to attend university.

At the University of Petrograd, Rand concentrated her studies on history, with secondary focuses on philosophy and literature. At university, she was repelled by the dominance of communist ideas and strong-arm tactics that suppressed free inquiry and discussion. As a youth, she had been repelled by the communists’ political program, and now, as an adult, she was also more fully aware of the destructive effects that the revolution had had on Russian society more broadly.

Having studied American history and politics at university, and having long been an admirer of Western plays, music, and movies, she became an admirer of American individualism, vigor, and optimism, seeing them as the opposites of Russian collectivism, decay, and gloom. Not believing, however, that she would be free under the Soviet system to write the kinds of books she wanted to write, she resolved to leave Russia and go to America.

Rand graduated from the University of Petrograd in 1924. She then enrolled at the State Institute for Cinema Arts in order to study screenwriting. In 1925, she finally received permission from the Soviet authorities to leave the country in order to visit relatives in the United States. Officially, her visit was to be brief; Rand, however, had already decided not to return to the Soviet Union.

After several stops in western European cities, Rand arrived in New York City in February 1926. From New York, she traveled on to Chicago, Illinois, where she spent the next six months living with relatives, learning English, and developing ideas for stories and movies. She had decided to become a screenwriter, and, having received an extension to her visa, she left for Hollywood, California.

On Rand’s second day in Hollywood, an event occurred that was worthy of her fiction. She was spotted by Cecil B. DeMille, one of Hollywood’s leading directors, while she was standing at the gate of his studio. She had recognized him as he was passing by in his car, and he had noticed her staring at him. He stopped to ask why she was staring, and Rand explained that she had recently arrived from Russia, that she had long been passionate about Hollywood movies, and that she dreamed of being a screenwriter. DeMille was then working on “The King of Kings,” and gave her a ride to his movie set and signed her on as an extra. During her second week at DeMille’s studio, another significant event occurred: Rand met Frank O’Connor, a young actor also working as an extra. Rand and O’Connor were married in 1929, and they remained married for fifty years until his death in 1979.

Rand worked for DeMille as a reader of scripts and struggled financially while working on her own writing. She also held a variety of non-writing jobs until in 1932 she was able to sell her first screenplay, “Red Pawn,” to Universal Studios. Also in 1932 her first stage play, “Night of January 16th,” was produced in Hollywood and later on Broadway.

Rand had been working for years on her first significant novel, We the Living, and finished it in 1933. However, for several years it was rejected by various publishers, until in 1936 it was published by Macmillan in the U.S. and Cassell in England. Rand described We the Living as the most autobiographical of her novels, its theme being the brutality of life under communist rule in Russia. We the Living did not receive a positive reaction from American reviewers and intellectuals. It was published in the 1930s, a decade sometimes called the “Red Decade,” during which American intellectuals were often pro-communist and respectful and admiring of the Soviet experiment.

Rand’s next major project was The Fountainhead, which she had begun to work on in 1935. While the theme of We the Living was political, the theme of The Fountainhead was ethical, focusing on individualist themes of independence and integrity. The novel’s hero, the architect Howard Roark, is Rand’s first embodiment of her ideal man, the man who lives on a principled and heroic scale of achievement.

As with We the Living, Rand had difficulties getting The Fountainhead published. Twelve publishers rejected it before it was published by Bobbs-Merrill in 1943. Again not well received by reviewers and intellectuals, the novel nonetheless became a best seller, primarily through word-of-mouth recommendation. The Fountainhead made Rand famous as an exponent of individualist ideas, and its continuing to sell well brought her financial security. Warner Brothers produced a movie version of the novel in 1949, starring Gary Cooper and Patricia Neal, for which Rand wrote the screenplay.

In 1946, Rand began work on her most ambitious novel, Atlas Shrugged. At the time, she was working part-time as a screenwriter for producer Hal Wallis. In 1951, she and her husband moved to New York City, where she began to work full-time on Atlas. Published by Random House in 1957, Atlas Shrugged is the most complete expression of her literary and philosophical vision. Dramatized in the form of a mystery about a man who stopped the motor of the world, the plot and characters embody the political and ethical themes first developed in We the Living and The Fountainhead and integrate them into a comprehensive philosophy including metaphysics, epistemology, economics, and the psychology of love and sex.

Atlas Shrugged was an immediate best seller and Rand’s last work of fiction. Her novels had expressed philosophical themes, although Rand considered herself primarily a novelist and only secondarily a philosopher. The creation of plots and characters and the dramatization of achievements and conflicts were her central purposes in writing fiction, rather than presenting an abstracted and didactic set of philosophical theses.

The Fountainhead and Atlas Shrugged, however, had attracted to Rand many readers who were strongly interested in the philosophical ideas the novels embodied and in pursuing them further. Among the earliest of those with whom Rand became associated and who later became prominent were psychologist Nathaniel Branden and economist Alan Greenspan, later Chairman of the Federal Reserve. Her interactions with these and several other key individuals were partly responsible for Rand’s turning from fiction to nonfiction writing in order to develop her philosophy more systematically.

From 1962 until 1976, Rand wrote and lectured on her philosophy, now named “Objectivism.” Her essays during this period were mostly published in a series of periodicals: The Objectivist Newsletter, published from 1962 to 1965; the larger periodical The Objectivist, published from 1966 to 1971; and then The Ayn Rand Letter, published from 1971 to 1976. The essays written for these periodicals form the core material for a series of nine nonfiction books published during Rand’s lifetime. These books develop Rand’s philosophy in all its major categories and apply it to cultural issues. Perhaps the most significant of these books are The Virtue of Selfishness, which develops her ethical theory; Capitalism: The Unknown Ideal, devoted to political and economic theory; Introduction to Objectivist Epistemology, a systematic presentation of her theory of concepts; and The Romantic Manifesto, a theory of aesthetics.

During the 1960s, Rand’s most significant professional relationship was with Nathaniel Branden. Branden, author of The Psychology of Self-Esteem and later known as a leader in the self-esteem movement in psychology, wrote many essays on philosophical and psychological topics that were published in Rand’s books and periodicals. He was the founder and head of the Nathaniel Branden Institute, the leading Objectivist institution of the 1960s. Based in New York City, the Nathaniel Branden Institute published with Rand’s sanction numerous periodicals and pamphlets and sponsored many lectures in New York that were then distributed on tape around the United States and the rest of the world. The rapid growth of the Nathaniel Branden Institute and the Objectivist movement came to a halt in 1968 when, for both professional and personal reasons, Rand and Branden parted ways.

Rand continued to write and lecture consistently until she stopped publishing The Ayn Rand Letter in 1976. Thereafter she wrote and lectured less as her husband’s health declined, leading to his death in 1979, and as her own health began to decline. Rand died on March 6, 1982, in her New York City apartment.

2. Rand’s Ethical Theory: Rational Egoism

The provocative title of Ayn Rand’s The Virtue of Selfishness matches an equally provocative thesis about ethics. Traditional ethics has always been suspicious of self-interest, praising acts that are selfless in intent and calling amoral or immoral acts that are motivated by self-interest. A self-interested person, on the traditional view, will not consider the interests of others and so will slight or harm those interests in the pursuit of his own.

Rand’s view is that the exact opposite is true: Self-interest, properly understood, is the standard of morality and selflessness is the deepest immorality.

Self-interest rightly understood, according to Rand, is to see oneself as an end in oneself. That is to say that one’s own life and happiness are one’s highest values, and that one does not exist as a servant or slave to the interests of others. Nor do others exist as servants or slaves to one’s own interests. Each person’s own life and happiness are their ultimate ends. Self-interest rightly understood also entails self-responsibility: One’s life is one’s own, and so is the responsibility for sustaining and enhancing it. It is up to each of us to determine what values our lives require and how best to achieve them, and then to act to achieve them.

Rand’s ethic of self-interest is integral to her advocacy of classical liberalism. Classical liberalism, more often called “libertarianism” in the twentieth century, is the view that individuals should be free to pursue their own interests. This implies, politically, that governments should be limited to protecting each individual’s freedom to do so. In other words, the moral legitimacy of self-interest implies that individuals have rights to their lives, their liberties, their property, and the pursuit of their own happiness, and that the purpose of government is to protect those rights. Economically, leaving individuals free to pursue their own interests implies in turn that only a capitalist or free market economic system is moral: Free individuals will use their time, money, and other property as they see fit, and will interact and trade voluntarily with others to mutual advantage.

3. Reason and Ethics

Fundamentally, the means by which humans live is reason. Our capacity for reason is what enables us to survive and flourish. We are not born knowing what is good for us; that is learned. Nor are we born knowing how to achieve what is good for us; that too is learned. It is by reason that we learn what is food and what is poison, what animals are useful or dangerous to us, how to make tools, what forms of social organization are fruitful, and so on.

Thus, Rand advocates rational self-interest: One’s interests are not whatever one happens to feel like; rather it is by reason that one identifies what is in one’s interest and what is not. By the use of reason one takes into account all of the factors one can identify, projects the consequences of potential courses of action, and adopts principled policies of action.

The principled policies a person should adopt are called virtues. A virtue is an acquired character trait; it results from identifying a policy as good and committing to acting consistently in terms of that policy.

One such virtue is rationality: Having identified the use of reason as fundamentally good, the virtue of rationality is being committed to acting in accordance with reason. Another virtue is productiveness: Given that the values one needs to survive must be produced, the virtue of productiveness is being committed to producing those values. Another is honesty: Given that facts are facts and that one’s life depends on knowing and acting in accordance with the facts, the virtue of honesty is being committed to awareness of the facts.

Independence and integrity are also core virtues for Rand’s account of self-interest. Given that one must think and act by one’s own efforts, being committed to the policy of independent action is a virtue. And given that one must both identify what is in one’s interests and act to achieve it, the virtue of integrity is a policy of being committed to acting on the basis of one’s beliefs. The opposite policy of believing one thing and doing another is of course the vice of hypocrisy; hypocrisy is a policy of self-destruction, on Rand’s view.

Justice is another core self-interested virtue: Justice, on Rand’s account, means a policy of judging people, including oneself, according to their value and acting accordingly. The opposite policy of giving to people more or less than they deserve is injustice. The final virtue on Rand’s list of core virtues is pride, the policy of “moral ambitiousness,” in Rand’s words. This means a policy of being committed to making oneself be the best one can be, of shaping one’s character to the highest level possible.

The moral person, in summary, on Rand’s account, is someone who acts and is committed to acting in their best self-interest. It is by living the morality of self-interest that one survives, flourishes, and achieves happiness.

4. Criticisms of Rand’s Ethics

Every aspect of Rand’s philosophy is subject to lively criticism and debate, but her normative views are the ones most focused upon.

From the broadly defined conservative right, the main criticisms are (a) that Rand’s metaphysical naturalism involves an atheism that undercuts religious metaphysics, (b) that her strong emphasis upon empirical data and reason undercuts epistemologies based on faith and tradition, and (c) that her normative individualism undercuts the commands of duty, obligation and selflessness that are necessary for achieving social values. From the left, again defined broadly, the main criticisms are (a) that Rand’s individualism atomistically isolates each of us from genuine society, (b) that her advocacy of free markets enables strong-versus-weak exploitation, and in left-postmodern critique (c) that her philosophical fundamentals commit her to an untenable foundationalism and absolutism.

Here we will focus only on the arguments over Rand’s account of self-interest, which is currently a minority position and subject to strong criticism from both the philosophical left and the philosophical right.

The contrasting view of self-interest typically pits it against morality, holding that one is moral only to the extent that one sacrifices one’s self-interest for the sake of others or, more moderately, to the extent one acts primarily with regard to the interests of others. For example, standard versions of morality will hold that one is moral to the extent one sets aside one’s own interests in order to serve God, or the weak and the poor, or society as a whole. On these accounts, the interests of God, the poor, or society as a whole are held to be of greater moral significance than one’s own, and so accordingly one’s interests should be sacrificed when necessary. These ethics of selflessness thus believe that one should see oneself fundamentally as a servant, as existing to serve the interests of others, not one’s own. “Selfless service to others” or “selfless sacrifice” are stock phrases indicating these accounts’ view of appropriate motivation and action.

One core difference between Rand’s self-interest view and the selfless view can be seen in the reason why most advocates of selflessness think self-interest is dangerous: conflicts of interest.

5. Conflicts of Interest

Most traditional ethics take conflicts of interest to be fundamental to the human condition, and take ethics to be the solution: Basic ethical principles are to tell us whose interests should be sacrificed in order to resolve the conflicts. If there is, for example, a fundamental conflict between what God wants and what humans naturally want, then religious ethics will make fundamental the principle that human wants should be sacrificed for God’s. If there is a fundamental conflict between what society needs and what individuals want, then some versions of secular ethics will make fundamental the principle that the individual’s wants should be sacrificed for society’s.

Taking conflicts of interest to be fundamental almost always stems from one of two beliefs: that human nature is fundamentally destructive or that economic resources are scarce. If human nature is fundamentally destructive, then humans are naturally in conflict with each other. Many ethical philosophies start from this premise—for example, Plato’s myth of Gyges, Jewish and Christian accounts of original sin, and Freud’s account of the id. If what individuals naturally want to do to each other is rape, steal, and kill, then in order to have society these individual desires need to be sacrificed. Consequently, a basic principle of ethics will be to urge individuals to suppress their natural desires so that society can exist. In other words, self-interest is the enemy, and must be sacrificed for others.

If economic resources are scarce, then there is not enough to go around. This scarcity then puts human beings in fundamental conflict with each other: For one individual’s need to be satisfied, another’s must be sacrificed. Many ethical philosophies begin with this premise. For example, Thomas Malthus’s theory that population growth outstrips growth in the food supply falls into this category. Karl Marx’s account of capitalist society is that brutal competition leads to the exploitation of some by others. Garrett Hardin’s famous use of the lifeboat analogy asks us to imagine that society is like a lifeboat with more people than its resources can support. And so, in order to solve the problem of destructive competition the lack of resources leads us to, a basic principle of ethics will be to urge individuals to sacrifice their interests in obtaining more, or even some, so that others may obtain more or some and society can exist peacefully. In other words, in a situation of scarcity, self-interest is the enemy and must be sacrificed for others.

Rand rejects both the scarce resources and destructive human nature premises. Human beings are not born in sin or with destructive desires; nor do they necessarily acquire them in the course of growing to maturity. Instead one is born morally tabula rasa (“blank slate”), and through one’s choices and actions one acquires one’s character traits and habits. As Rand phrased it, “Man is a being of self-made soul.” Having chronic desires to steal, rape, or kill others is the result of mistaken development and the acquisition of bad habits, just as are chronic laziness or the habit of eating too much junk food. And just as one is not born lazy but can by one’s choices develop oneself into a person of vigor or sloth, one is not born antisocial but can by one’s choices develop oneself into a person of cooperativeness or conflict.

Nor are resources scarce, according to Rand, in any fundamental way. By the use of reason, humans can discover new resources and how to use existing resources more efficiently, including recycling where appropriate and making productive processes more efficient. Humans have, for example, continually discovered and developed new energy resources, from animals to wood to coal to oil to nuclear fission to solar panels; and there is no end in sight to this process. At any given moment, the available resources are a fixed amount, but over time the stock of resources is and has been constantly expanding.

Because humans are rational they can produce an ever-expanding number of goods, and so human interests do not fundamentally conflict with each other. Instead, Rand holds that the exact opposite is true: Since humans can and should be productive, human interests are deeply in harmony with each other. For example, my producing more corn is in harmony with your producing more peas, for by our both being productive and trading with each other we are both better off. It is to your interest that I be successful in producing more corn, just as it is to my interest that you be successful in producing more peas.

Conflicts of interest do exist within a narrower scope. For example, in the immediate present available resources are more fixed, and so competition for those resources results, and competition produces winners and losers. Economic competition, however, is a broader form of cooperation, a social way to allocate resources without resorting to physical force and violence. By competition, resources are allocated efficiently and peacefully, and in the long run more resources are produced. Thus, a competitive economic system is in the self-interest of all of us.

Accordingly, Rand argues that her ethic of self-interest is the basis for personal happiness and free and prosperous societies.

6. Rand’s Influence

The impact of Rand’s ideas is difficult to measure, but it has been large. All her books were still in print as of 2017, had sold more than thirty million copies, and continued to sell approximately one million copies each year. A survey jointly conducted by the Library of Congress and the Book of the Month Club early in the 1990s asked readers to name the book that had most influenced their lives: Atlas Shrugged was second only to the Bible. Excerpts from Rand’s works are regularly reprinted in college textbooks and anthologies, and several volumes have been published posthumously containing her early writings, journals, and letters. Because she was an outsider with iconoclastic views, Rand’s influence within the academic world has been limited, though university press books and scholarly articles about her work continue to be published regularly. Outside the academic world are several institutes founded by those influenced by Rand. Noteworthy among these is the Cato Institute, based in Washington, D.C., the leading libertarian think tank. Rand, along with Nobel Prize-winners Friedrich Hayek and Milton Friedman, was highly instrumental in attracting generations of individuals to the libertarian movement. Also noteworthy are the Ayn Rand Institute, founded in 1985 by philosopher Leonard Peikoff and entrepreneur Edward Snider and based in California, and The Atlas Society, founded in 1990 by philosopher David Kelley and based in Washington, D.C.

7. References and Further Reading

a. Primary Sources

  • Rand, Ayn. Atlas Shrugged. Random House, 1957.
    • Rand’s magnum opus of fiction.
  • Rand, Ayn. Capitalism: The Unknown Ideal. New American Library, 1967.
    • A collection of twenty of Rand’s essays on politics, history, and economics. Also includes two essays by psychologist Nathaniel Branden, three by economist Alan Greenspan, and one by historian Robert Hessen.
  • Rand, Ayn. The Fountainhead. Bobbs-Merrill, 1943.
    • The novel of individualism, independence, and integrity that made Rand famous.
  • Rand, Ayn. Introduction to Objectivist Epistemology. New American Library, 1979.
    • Rand’s theory of concept-formation. Includes an essay by philosopher Leonard Peikoff on the analytic/synthetic distinction.
  • Rand, Ayn. Philosophy: Who Needs It. Bobbs-Merrill, 1982.
    • A collection of Rand’s essays on the nature and significance of philosophy, including her critiques of other thinkers such as Kant, Aristotle, Rawls, and Skinner.
  • Rand, Ayn. The Romantic Manifesto. World Publishing, 1969. Paperback edition: New American Library, 1971.
    • A collection of Rand’s essays on philosophy of art and aesthetics.
  • Rand, Ayn. The Virtue of Selfishness. New American Library, 1964.
    • A collection of fourteen of Rand’s essays on ethics. Also includes five essays by psychologist Nathaniel Branden.
  • Rand, Ayn. We the Living. Macmillan, 1936.
    • Rand’s first novel, set in the Soviet Union in the years following the Russian Revolution.

b. Secondary Sources

  • Badhwar, Neera, and Long, Roderick T. “Ayn Rand,” The Stanford Encyclopedia of Philosophy, 2010/2016.
    • Two philosophers present an overview of Rand’s life and work in the major areas of philosophy, with special attention to several major disagreements among philosophers working within Objectivism.
  • Binswanger, Harry. The Biological Basis of Teleological Concepts. Los Angeles, CA: A.R.I. Press, 1990.
    • Written by a philosopher, this is a scholarly work focused on the connection between biology and the concepts at the roots of ethics.
  • Branden, Nathaniel. The Vision of Ayn Rand: The Basic Principles of Objectivism. Cobden Press, 2009.
    • A comprehensive overview of Rand’s philosophy based on the lecture series presented under Rand’s auspices in the 1960s.
  • Branden, Nathaniel, and Branden, Barbara. Who Is Ayn Rand? New York: Random House, 1962.
    • This book contains essays on Objectivism’s moral philosophy, its connection to psychological theory, and a literary study of Rand’s novel methods. It contains an additional biographical essay, tracing Rand’s life from birth up until her mid-50s.
  • Burns, Jennifer. Goddess of the Market: Ayn Rand and the American Right. Oxford University Press, 2009.
    • Written by a historian, a scholarly discussion of Rand’s ambiguous relationship with free market, libertarian, and conservative movements.
  • Gotthelf, Allan and Salmieri, Gregory. A Companion to Ayn Rand. Wiley-Blackwell, 2016.
    • The editors have compiled a series of scholarly entries on all of the major elements of Rand’s philosophy.
  • Gotthelf, Allan and Lennox, James. Concepts and Their Role in Knowledge. University of Pittsburgh Press, 2013.
    • Ten philosophers debate Rand’s epistemology, with focused articles on her theories of perception, concepts, and scientific method.
  • Gotthelf, Allan and Lennox, James. Metaethics, Egoism, and Virtue: Studies in Ayn Rand’s Normative Theory. University of Pittsburgh Press, 2010.
    • Eight philosophers debate Rand’s ethical theory.
  • Hessen, Robert. In Defense of the Corporation. Stanford, CA: Hoover Institution, 1979.
    • An economic historian, Hessen argues and defends from an Objectivist perspective the moral and legal status of the corporate form of business organizations.
  • Hicks, Stephen. “Ayn Rand and Contemporary Business Ethics.” Journal of Accounting, Ethics, and Public Policy 3:1, 2003.
    • A philosopher explores the implications of Rand’s ethics for the foundations of business ethics.
  • Hicks, Stephen. “Egoism in Nietzsche and Ayn Rand.” Journal of Ayn Rand Studies 10:2, 2009.
    • A philosopher compares and contrasts the positions that underlie Nietzsche’s and Rand’s theses on egoism and altruism.
  • Kelley, David. The Evidence of the Senses. Baton Rouge: Louisiana State University Press, 1986.
    • Written by a philosopher working within the Objectivist tradition, this scholarly work in epistemology focuses on the foundational role the senses play in human knowledge.
  • Mayhew, Robert. Ayn Rand’s Marginalia. New Milford, CT: Second Renaissance Books, 1995.
    • This volume contains Rand’s critical comments on over twenty thinkers, including Friedrich Hayek, C. S. Lewis, and Immanuel Kant. Edited by a philosopher, the volume contains facsimiles of the original texts with Rand’s comments on facing pages.
  • Peikoff, Leonard. The Ominous Parallels: The End of Freedom in America. New York: Stein & Day, 1982.
    • A scholarly work in the philosophy of history, arguing Objectivism’s theses about the role of philosophical ideas in history and applying them to explaining the rise of National Socialism.
  • Peikoff, Leonard. Objectivism: The Philosophy of Ayn Rand. New York: Dutton, 1991.
    • This is the first comprehensive overview of all aspects of Objectivist philosophy, written by the philosopher closest to Rand during her lifetime.
  • Rasmussen, Douglas and Douglas Den Uyl, editors. The Philosophic Thought of Ayn Rand. Urbana, IL: University of Illinois Press, 1984.
    • A collection of scholarly essays by philosophers, defending and criticizing various aspects of Objectivism’s metaphysics, epistemology, ethics, and politics.
  • Reisman, George. Capitalism: A Treatise on Economics. Ottawa, IL: Jameson Books, 1996.
    • A scholarly work by an economist, developing free-market capitalist economic theory, especially that coming out of the Austrian tradition, and connecting it to Objectivist philosophy.
  • Sciabarra, Chris Matthew. Ayn Rand, The Russian Radical. University Park: Pennsylvania State University Press, 1995.
    • A work in history of philosophy, this book attempts to trace the influence upon Rand’s thinking of dialectical approaches to philosophy prevalent in 19th century Europe and Russia. Also an introduction and overview of the major branches of Objectivist philosophy.
  • Smith, Tara. Ayn Rand’s Normative Ethics: The Virtuous Egoist. Cambridge University Press, 2006.
    • A scholarly work by a philosopher on Rand’s meta-ethics and its application in normative ethics.
  • Wilkinson, Will, editor. “What’s Living and Dead in Ayn Rand’s Moral and Political Thought?” Cato Unbound, 2010.
    • Four professors of philosophy—Douglas B. Rasmussen, Michael Huemer, Neera K. Badhwar, and Roderick T. Long—discuss and debate the current state of Rand scholarship.
  • Zwolinski, Matthew. “Is Ayn Rand Right about Rights?” Learn Liberty, April 2017.
    • A philosophy professor argues that Rand’s theory of individual rights is subject to three major criticisms.

 

Author Information

Stephen R. C. Hicks
Email: shicks@rockford.edu
Rockford University
U. S. A.

The African Predicament

The African predicament is a concept that explains the aggregate of plights that threaten the African people. It is also an account that combines methods from various disciplines since the robustness of the theme is not limited to the field of philosophy alone but serves as a problem for consideration in the social sciences, sciences, and the humanities; thus it interrogates the predicaments of the African from all these perspectives. This task of interrogation becomes more demanding because of the critical analytic outlook of philosophy in embracing various methods that are relevant in the African predicament. Although it places the African as being on the defensive side of reality having been bedeviled with numerous plagues, it does not exempt the African race in the execution of the problematic situations in which they find themselves. While constantly searching for scapegoats to apportion blame in order to gain psychological relief, Africans are also a threat to themselves; hence a people who have been trained to laugh at themselves bear the greater burden of ensuring liberation, not from the clutches of an alien, but from the enemy that lies within. Clarke (1991, 24) puts it succinctly when he says that a people who have been dehumanized have among them a separate group who are at odds within themselves. It is worth noting that this article aims to present what is basic and common knowledge insofar as the theme of the African predicament is concerned rather than to demonize any particular race/people.

Table of Contents

  1. Introduction
  2. Various Dimensions of the African Predicament
    1. Economic Enslavement and the Crisis of Leadership
    2. Mis-education of Africans and the Falsification of History
      1. Why is the Study of African History Necessary?
    3. Philosophy and Western Historiography
    4. Culture and Identity Crisis
    5. Religion
  3. Conclusion
  4. References and Further Readings

1. Introduction

The concept of the African predicament is a holistic one that can be viewed through various lenses depending on the approach(es) with which scholars decide to begin their debate. So no single scholar exhausts in totality this captivating yet problematic theme, but even from their relative perspectives, there is a meeting point on the consensus of that which depicts the African predicament. Stanislav Andreski succinctly captures the purpose of the theme that largely lies in exploring those obstacles that bedevil the African continent and thus hinder it on the road to wholesome prosperity, internal peace, and basic/fundamental freedom (Andreski, 1968, 11). Obi Oguejiofor attests to the relativity of such claims but goes further to opine that in the midst of divergent opinions of the African plague, there is a general consensus that much of Africa is in a precarious state, and this concern runs very deep in the mind of an African (Oguejiofor, 2001, 7). Although the predicament of the African people ranges from cultural, political, economic, religious, historical, and psychological factors, there is a single thread that binds all of these together in the collective psyche of the African. Until the African is able to mentally decolonize himself or herself, there would be a constant race-to-the-bottom approach away from all other factors that make the liberation and prosperity of the African race possible. The African predicament becomes a relevant theme in African philosophy because while the sociologist, psychologist, historian, artist, scientist, and so on create ideological superstructures in their various disciplines, philosophy, which thrives mainly on objectivity, harmonizes experiences and views from all fields of study in a critically acclaimed manner without bias/preference to any; therefore, philosophy is that necessary stem that should hold all other branches in an objectified manner. In not laying claim to any discipline, philosophy lays claim to all disciplines.

When we make reference to the African predicament, “African” in this context is not limited to a black individual on the African soil. This is because over the years, the dispersal of the African people has been made possible by the event of colonialism and imperialism. Even before the advent of colonialism, Africans were evidently residing in parts of the Western hemisphere antedating the Columbus-discovery conspiracy theory (Imhotep, 2012, 17). Both in Africa and every other continent where the black individual is located, the plights are similar. Wilson attests to the fact that the black race is wholly bonded not necessarily because they came from the womb of the same woman but because the shared experience coupled with the long history of collective parentage brings them together (Wilson, 2014, 50).

2. Various Dimensions of the African Predicament

One of the greatest challenges Africans have is the location of affirmation of their true identity. They carry such burden into economics, culture, religion, and every facet of their being. Thus, having been washed white as snow from their sins by a white Jesus, because sin which is immaterial has been bestowed with a Lockean secondary quality of darkness/blackness, they also wait for a white-washing of their economy, education, culture, religion, and so on. Hence, the richness of their identity as Africans is not dependent on what they make of it but is dependent on what the American or European says it is. The immateriality of salvation by a white savior and the materiality of socio-political, economic, and cultural redemption from the racialisation of colour bear some semblance (Wilson 2014, 38). And since even the location of heaven and hell is determined by the white individual, they are left with the consolation of both heaven and hell as alternative possibilities so as to distract them from their obligations in the material world, all to the advantage of the colonizers who are then left to extract from the natural universe. Since they have no unique identity of their own, they become whoever anybody says they are. At that point when they forget their location in time and space for the benefit of heaven alone, social amnesia sets in (Wilson, 2014, 41, 58).

It comes to the fore that the African predicament began with a racial distinction of colours; however, it is now evident that this predicament has transcended from raciality into the mind. Hence, it is evidently possible to see black individuals in white souls carrying a burden of identity everywhere they go; this burden of identity further complicates the problem because while the average African sees himself or herself as black, he or she fails to also see beyond this Lockean secondary quality of colours. One of the plights that bedevil the African continent is the inability to rally round a race-centred consciousness that sees the black individual first before any other: “No race of people has triumphed without these vital motivational, mental and behavioral orientations, for they are the keystones in the construction of liberated and prosperous peoples” (Wilson, 2014, 375).

The African predicament is explicated within five major themes. The first theme, which is the notion of economic enslavement and the crisis of leadership, x-rays how the problem of lack of leadership on the part of Africans has been exploited to imperialize the African continent and control its resources. The second theme is the mis-education of Africans and the falsification of history and how it has further led to collective historical dementia. Because the education that is received is largely a transmission from the occidental world, the third theme explicates how philosophy and western historiography interrogate this quest to further expunge the African continent from the development of the history of philosophy. The fourth theme is the culture and identity crisis where there is a gradual effort to show how the craving for an identity, other than African, depletes the collective humanity of the African people. Finally, the fifth theme exposes how religion is used to enslave, rather than liberate, the black individual. The onus rests on no one else but the African people to disturb the equilibrium in order to regain independence in all spheres of their being.

a. Economic Enslavement and the Crisis of Leadership

Long before Oguejiofor wrote his Philosophy and the African Predicament, Andreski had written a piece centred on the theme, The African Predicament. However, it is worth noting that in no page of his work did Oguejiofor make reference to the work of Andreski, yet in all their submissions, the problems noted by Andreski in the ’60s are still the same problems, bearing the same form with different matter (matter and form are used here in the philosophical sense). Similarly in writing his book, The Destruction of Black Civilization, Chancellor Williams made reference to the fact that the subtitle of the book, Great Issues of a Race from 4500 B.C. to 2000 A.D., represents a present continuous usage since “the main obstacles which confronted us in the past and are with us today will still be with us in the year 2000 and after, but also that for the rest of this century it is very likely that the Blacks will still be meeting, listening to and applauding fiery, soul-stirring speeches, protesting and denouncing injustices or happily relying on politics as the ultimate solution of our problems” (Williams, 1987, 320). Moreover, Wilson reflects the same thinking in his work, The Falsification of Afrikan Consciousness, that the problems of the black race today in seeking recognition predate the present moment. This cry has been going on since the nineteenth century, and even in American society there had always been inclusion of the black race in the activities of the nation, but this inclusivity is neutralized with a white-dominated ideology (Wilson, 2014, 7-12).

One of the most evident precarious situations of the African people is an imperial-centered economics. It is so-called because Africa is given little or no freedom to take its own destiny necessary for economic prosperity into its own power. Even under conditions where it appears to have economic policies, most of such policies are directed towards the prosperity of other continents, particularly Europe and America. Against this backdrop, the black race is miseducated into believing that the workability of an African centred economics using the African banking system is not feasible, even when such a system has proven to produce the best possible outcomes when properly utilized (Wilson, 2014, 738-740; Wilson, 2014, 45-51). In the 21st century, where emphasis is placed on ideological wars, every nation and continent is in a quick rush to out-compete the other. It would therefore be against Africa’s best interests to wait patiently for other continents to decide its fate instead of taking revolutionary steps.

In the era of colonization, peasant farmers did their jobs under the supervision of the colonial masters. Even on the peasant farmers’ own soil, the land on which they labored to sustain themselves and their families, the colonial supervisors decided how much the peasant farmers earned and determined the quantity of food that accrued to them on accomplishment of their daily tasks. Even in time of harvest, the colonial masters determined the price of the goods from the owners of the land (Andreski 26). This practice was only a foretaste of what was to come several years after the colonial masters left Africa. This was a process of the satellisation of Africa, where every economic activity on African soil is directed towards the needs of Europe (Oguejiofor 39). This was not unconnected with the doctrine of the divine right of the white people to exploit the black race. Hence, even a King of France named Francis sought for a clause in “Adam’s Will” that prevented him from his own share of the wealth in Africa (Du Bois, 1999, 27; Clarke 29).

Because it is these very developed nations that control the economic realities in the world, the international market forces have consistently provided the wrong outcomes for poor nations. Problems begin largely when developed nations cajole developing nations to sign up to rules that make sense and are meant for the prosperity of the already developed ones. One evident example that spans through history is the patenting of products and natural resources that emanate from African soil. King Henry VI of England is alleged to have given a letter patent to a Belgian in 1449 for a twenty-year monopoly in the production of stained glass (Bolton, 2008, 208). Some of those other natural resources patented are a hardy grain from Ethiopia called teff, which is a basic ingredient for the diet of the whole nation; an extract of the Aloe Ferox plant from Lesotho, which helps to lighten the skin; an enzyme from Lake Nakuru in Kenya; and brazzein (a protein that is said to be 500 times sweeter than sugar and gotten from a plant in Gabon). The enzyme acquired from Kenya remains controversial (as of 2017), however, because the Kenyan government denied granting permission for research in Lake Nakuru since it did not derive any benefits from this research (Bolton 209). Another evident example is the problem of patenting HIV/AIDS drugs in the guise of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement) (Peter Mugyenyi 2008). Even lending agencies such as the International Monetary Fund and the World Bank are imperial constructions, established partly to exert control over developing nations within Africa. Most of the staff of these banks and lending institutions reside in Europe and America and dictate economic policies for the African continent. They fly into African capitals just occasionally on mission, after which they return to Europe and America and pronounce economic solutions for the countries in Africa in a one-size-fits-all approach. Is it possible to understand the intricacies of a country’s, or a continent’s, economy in this way? Instead, the countries are left with long-range prescriptions that have little or no bearing on the practical realities of the African people: “It’s all but akin to a doctor trying to carry out a major medical checkup by phone” (Bolton 137). Africa is always carried away by the charade of grants and loans that emerge from these institutions but are at the same time tied to conditions that are beneficial primarily to the imperialists.

“These ‘grants’ which amount to hardly two percent of the West’s Gross National Product and inflated in their importance and size are made to appear to be generous and voluntarily given from the heart. Western aid is propagandistically made to appear to be designed to support the development and growth of the social and economic infrastructure of recipient Afrikan nations” (Wilson, 2014, 381).

They are therefore lured into adjustment programmes that lead to currency devaluation, high interest rates, privatization of state enterprise, liberalization of imports, and so on (Wilson 368-369). Eventually, the monies that go out of the African continent annually far surpass the grants and loans that come into the continent (Wilson 373). What all this then portends for the African continent is the determination of the prices of its exported goods by Europe and America, with these economies also fixing the prices of their imports into the African continent. And because the profits that stem out of these international transactions are from the processing/manufacturing stage, the imperialists control that aspect of production too. The poor nations are left unprotected as they rely on one or two exports for their economy, which the developed nations can refuse to buy in order to force prices down. It is not enough for these developing economies to produce; they must also have markets where the goods can be sold, so that the imperialists, who primarily control the forces of demand and supply, do not have the full power to decide to close a developing economy’s markets.

The lack of vision and creativity on the part of African leaders, coupled with the fact that their bellies have become their gods, makes it impossible to take initiative to open local markets to engage in transnational trade among fellow African nations. The practical result of all these nonresident tribulations is the West trying to eat its cake and have it at the same time, without in the first place giving the chefs enough ingredients to bake it (Bolton 133).

All these set the foundation for various economic Structural Adjustment Programmes that created room for a fixed economy instead of a free one. It seems, therefore, that the economic structure created by the West is unchangeable because of its inherent correctness. Indeed, it is a structure that is considered sufficient and necessary for any economy, notwithstanding the peculiarity of a people: “Consequently, all economic systems must submit to only one law; the law of ‘adjustment’ to an infallible and immutable economic structure” (Ramose, 2002, 4). In this new law of economics, Social Darwinism is imported into the law of capitalism, and in this law, it is insignificant how the poor end up because only the fittest survive. Each nation that then labours under a structural adjustment programme is a perpetual borrower and a consequent debtor: “that debt is expected to be paid to the faceless bankers of the West by peasant farmers who have neither health facilities, schools for their children, nor adequate shelter over their heads” (Awoonor 2006, 264).

The population of sub-Saharan Africa is fast rising, and it is expected that the growth of the economy should match the rising population. Unfortunately, the staggering realities in Africa’s economy show the contrary. With the Gross National Product of the African continent south of the Sahara (this excludes South Africa) estimated to be about the same as that of Belgium (a country which had a population of about 10 million in 1991), it is startling to think about the possibility of the African continent having to take care of a population that is estimated to reach 1.6 billion by the year 2030, when it was only 600 million in 1991.

It is more appalling that Africa has refused to take responsibility for its own development. The constant reliance on attractive stipend loans and grants from international agencies and institutions such as the IMF and World Bank makes African leaders even more unresponsive to the plight of their continent. Some African elites, as well as Western imperialists, have been beneficiaries of this corporate greed since the West deliberately makes deals with corrupt African elites, who will always fall for the profit they stand to gain from such businesses. The cumulative effect is the West singing anthems of corruption and ineptitude on the part of African leaders, even when the whole scene is a plot from both parties; both Western imperialists and African leaders have been paddlers of the same boat that sinks the African continent. This has led to agitations from Afro-centrists who demand a move from the current state of equilibrium, requesting an exit date for these international institutions from Africa, because coupled with the crisis of irresponsibility on the part of Africans, these institutions are mere watchdogs of the progress of the African continent. The unfortunate scenario that plays out for Africans is that they provide the capital that is required to finance their oppression.

No programme can truly be considered African if it is fashioned towards the advancement of non-African cultures.

b. Mis-education of Africans and the Falsification of History

i. Why is the Study of African History Necessary?

  • The psychology of individuals and groups could partly be formed from “historical and experiential amnesia,” which arises when an individual or group is compelled by various circumstances to repress important segments of their formative history.
  • “To manipulate history is to manipulate consciousness; to manipulate consciousness is to manipulate possibilities; and to manipulate possibilities is to manipulate power” (Wilson 2014, 2).
  • History brings liberation.
  • It creates a sense of self-identity. History creates identity and in creating identity, it also distorts identity. Part of the study of Egyptology is all about taking back Africa’s history that has been distorted, or rather stolen, by the European. Through historiography Africans can engage in the study of European history without knowing that these are mere projected myths of the reality of the African people. Hence, they can be reading history about the African people while at the same time giving credence to another race without knowing that such stories have African roots. If we don’t know ourselves, we become a puzzle unto ourselves and other persons become puzzles to us as well; therefore, we carry on a wrong identity everywhere we go.
  • Socio-Political Role of History: Sometimes we ask ourselves: “Why is the study of history necessary?” “What relevance does the study of the people and cultures of Africa have for them when it cannot be translated into economic prosperity that puts bread on their tables?” It is worthy to note that there is a direct, or indirect, relationship between history and money, history and economics, history and power. The question they need to ask themselves in this regard is: “If there was no direct relation between history and power or economics, then why is it that the Europeans rewrote history?” This question calls for serious reflection.
  • History and Psychology: People who have faint knowledge of history are more susceptible to manipulation than those who are knowledgeable in history. When we don’t have interest in history, all we do is to merely follow orders. We need not only study computer science or mathematics; we must also understand the psychology of the people who run this world. Since the study of European history is not at the centre of the high school educational curriculum, many Africans are persuaded into believing that the study of their own history is not worth it. Indeed, their ignorance of their history has a different consequence that is more injurious than the West’s ignorance of its own history. For example, a church building is a historical event, and so too is a bank; this is an indication that even if individuals from the Caucasian race do not study their history, they still live with and see such historical edifices on a daily basis. History is, therefore, not only written in books, but also becomes an unwritten piece as it is lived daily. It is the past and the future all embedded in a mobile present.
  • Language and Power: If it is indeed true that a creator made the universe and gave humankind the power to name things, then by having such power, humans also had dominion. There is a relationship between naming and dominion, between naming and reality. When a people then relinquish to others the authority to name and define, they permit them to have dominion and control over their being; therefore, there is a close relationship between dominion and history, as history, when actualized, guarantees dominion.

Africans have been presented with a misconstrued history about themselves and their cultures and have been labeled with several derogatory titles that have their basis in colonialism. The history of the African peoples and their cultures is not the history of European plunder of African soil. They have a rich history and culture that predate colonialism.

No nation can afford to treat with levity the education of its citizens, since the kind of education a people receive either makes or unmakes them. This is very important because a weak educational system translates into a frail economy in the future. A people’s economy is largely a reflection of the education bequeathed to them, and the quality of their economy cannot be better than the quality of their education.

Unfortunately, while the education received in other advanced nations prepares them to meet the demands of each age, the education bequeathed to Africans kept them in a static position. Such industrial education that was necessary to meet the challenges of every season that was and is still being received by the blacks was merely to master skills already relegated in progressive societies (Woodson, 2010, 15). This was a deliberate attempt to perpetually relegate the blacks to the Stone Age, and even the study of history has been distorted to remove Africans from the scene of events. European historiography presents the African people as a race that was once upon a time not a people, and has since been grafted into the siblinghood of humanity through Western benevolence. It is partly this distorted knowledge they have about Africa that constantly and consistently depletes their identity. No one expects masters to reproduce their own history, while at the same time exalting slaves. History, as long as it is written and taught by the conqueror, will always be written and taught to the conqueror’s advantage. Thus, when people hear names of towns and historical events such as the University of Djenne (University of Sankhore in Timbuktu), the story of the scramble for Babatus, the ploy of King Necho II, which dates as far back as 600 B.C.E. (Babatus was later renamed Cameroons), and Goshen, which is the alleged birthplace of the Bible character Moses (which is actually in Egypt), what they simply do is project whiteness into them, without at the same time knowing that the renaming of cities and events in Africa was a preconceived ideology by the colonial masters since it aids in the distortion of African history. Therefore, the African people could clearly be reading history about themselves, attributing greatness to those who accomplished such historical projects, without at the same time knowing that reference is being made to their ancestors.

Looking through history and viewing some alleged treaties that Africans supposedly made with the Europeans, it is important to note that there were some forms of historical forgery made at a time when African kings had no knowledge of the English language, yet these African leaders made and signed treaties in English (Jochannan, 1991, 18). The fact that Africans along the Nile valley were already in their 13th dynastic period, even when the Biblical Abraham was born, goes to show to a large extent how they have been white-washed by the falsification of African history in order to promote and maintain Eurocentric dominance: “Colonialism brings us to a kind of history written by the conqueror for the conquered to read and enjoy. When the conquered looks around and finds that even God speaks from the heart of the conqueror, the conquered then becomes suspicious of God” (Jochannan 60).

Nobody speaks of the plunder of a virgin land, for humans only struggle for domination in a world where fellow humans live. So when Hegel referred to Africa as a continent that is unhistorical, without movement or development, it is a contradiction of the scramble for Africa that started as far back as 1675 B.C.E. (Jochannan 16) and is still being scrambled for in the present era. This same continent without a history is the same continent that transported a life of contemplative devotion into Europe; a people that had a conception of resurrection and immortality long before the Christianization of those ideas in Europe (Diop 32); and a continent that pioneered foundational patriotism in an organized form (Diop 19) with an organized political, economic, and even educational system. Even in music the African continent creates a lasting influence because one of the authentic forms of art in America is jazz, which could not have been feasible without the rhythmic structure of Africa and the drums. Even the tales and folk-lore of the people of America are not original to them (Awoonor 86; Du Bois, 1994, 7).

Therefore, there is a great need for a new historiography of the African race by the African people themselves, not a history that is based on the adventures of Europe but one that is instead premised on the life of the African and the very things that characterize the essence of their being.

c. Philosophy and Western Historiography

Philosophy is literally the love of wisdom. The nature of philosophical discourse makes it possible for philosophy to thrive on criticality. Until recently, the study of philosophy in many of Africa’s colleges has been the study of the history of Western thought. Thus, those responsible for the teaching of such disciplines in universities are always in a hurry to limit their findings to the beginning of philosophy as bequeathed to them by the Europeans, all in a bid to justify a failed deduction whose conclusion, having been made, must necessarily be supported by every premise, not minding the wrongness of such a foundation. Since the West therefore began its study of history with an epigram that Africa is without history, progress, and development, every study therefore must tend towards proving the said statement. It had therefore been wrongly pre-established from the onset that it was impossible for the task of philosophizing to have been done in Africa before the advent of the West. And because sometimes people misconstrue history to be the study of the past, the past for an African always begins with its plunder by Europe. Their notion of slavery takes their mind back to Africa and to the black race in general, without averting their minds to the fact that some other races outside the African race were also colonized. The history of Africa must and can only be rewritten by African scholars, as the history of a people that predates the existence of the white man on African soil. But this is possible only within a reinvented mind.

It is worth noting that students were being instructed by the Egyptian priests before the invasion of Egypt by Alexander the Great of Greece. It was within this period that the libraries in Egypt were converted into research centres by the school of Aristotle. The philosophers and scientists of this period, apart from the studies they received directly in Egypt, were also beneficiaries of the invasion as citizens of Greece. It is therefore not surprising that these philosophers were always victims of the Athenian government that always persecuted them because of ideologies that were alien to the people of Athens and were then considered unacceptable. How did they then arrive at a conclusion that ancient Greek philosophers, who were vilified for importation of thoughts that were alien to the Greek world, were actually the inventors of those ideas? The history of the life and thought of many of these prominent ancient Greek philosophers is filled with contradictions that can only be solved when we resort to faith as the arbiter of everything. Aristotle, for instance, who studied under the tutelage of Plato, became a great scientist while his tutor was a known philosopher. The fact that a philosopher could turn out a graduate in the sciences proper can only be resolved mysteriously. That Plato could keep Aristotle for a period of twenty years, tutoring him of that, which he is ignorant of himself, becomes questionable. It is therefore noteworthy that although Aristotle studied in Egypt, he made a library for himself from books that were carried from Egypt by Alexander. Plato who then studied philosophy and learnt the ten virtues in Egypt, from which he developed his own four cardinal virtues, could not have been the tutor of Aristotle (James, 2015, 1-3). This fiction continues with the fact that Theophrastus and Eudemus had studied physics, geometry, astronomy, arithmetic and theology all under Aristotle. 
The idea that different persons could at the same time specialize in different disciplines, some of which are distantly related, under a single tutor, is only acceptable in societies where myth is equated with reality. The fact that even the birth date of Thales is in contest bears semblance with modern day history in some societies where illiteracy makes it impossible for people to give a precise date of birth of their kindred, yet we are made to believe that this struggle among the Greeks is common with literate people. Even the positing of water as the foundation of everything that exists, or an indeterminate substance, including the theory surrounding fire, air, and so on as the primary stuff underlying everything, had long been held as a debate in Egypt; it is needful also to mention the famous Socratic dictum: “man know thyself,” which is actually an Egyptian dictum that has long been Europeanized (Obenga 2004). Since the foundation they build on is held tightly with litany of lies, it may therefore become impossible for their house to stand. People should not also forget that the history of the Greeks, during the time which the Greek philosophers were engaged in their so-called philosophical thoughts, was filled with wars and counter wars, which anyone would be quick to tell that it would be a misnomer to conceive of great intellectual strides in war-ridden areas of Iraq, Syria, Nigeria, and so on. However, it was normal for the Greeks of the ancient period under such violent and unstable conditions to be engaged in serious academic discourse. Nevertheless, like the origin of humankind that is written in favour of the white race: “this unfortunate position of the African continent and its peoples appears to be the result of misrepresentation upon which the structure of race prejudice has been built…that the African is backward, that its people are backward, and that their civilization is also backward” (James 5). 
The Copernican revolution used to characterize the achievement of Immanuel Kant in philosophy is another case in focus. While the name is derived from the heliocentric movement of the earth around the sun, as proposed by Nicolaus Copernicus in the 16th century, the ancestors of the Africans in Egypt had long held this heliocentric discovery in Egypt before Christ was born (B.C.) (Jackson, 1985, 24-25).

d. Culture and Identity Crisis

The very notion of the miseducation of the African people takes people to the question of identity crisis. One of the features of identity is the fact that it bears a relational character with that very concept, which a thing is identified with. To speak of an African identity is to have a character that is truly African and such character is not individuated, as it is shared among the same species. When a people have no identity that is uniquely theirs, they crave for any form of identity and because such identity comes from without and not within, it denigrates a people to a second order position. Because Africans carry an identity that is alien to them, there is an inbuilt inferiority complex in them. Hence, it does not matter whether they think, or feel that they are inferior or not, the inferiority is created by the existence of such institution that denies a people an identity that they can rightly call theirs. But this deliberate inferiority complex that is institutionalized has an aim, for it leads to dangerous brainwashing that further leads to self-erosion, shame, and self-alienation. Its negative impact is very broad since it erodes the relational character which identity seeks to create (Oguejiofor, 2007, 69). The war on race therefore creates a false biological determinism that makes people think that nature has endowed some races better than others and made them superior to some others. In terms of achievement, a Negro has no identity of his or her own, but they see themselves through the lens of another.

The colonial masters knew this very well; they had to destroy the foundation that made this relational character possible. For instance, the slaves in the United States of America had to be disconnected from family ties to make revolution impossible, thus destroying their loyalty system. Awoonor (2006) gives a vivid account of this identity crisis as it was and is still exemplified in the lives of the black individual. It all began with the history that was taught to the black individual and the various myths about the African race. The oppressed were taught that Africa was a land of barbarous instincts and primitive ties, where the consumption of human flesh featured prominently. The African therefore came to learn that civilization, modernization, and exposure are defined in terms of a total disconnect from African culture. This then went a long way to create an internal phobia for blackness, especially in the African-American and by such attitude, the African-Americans had phobia even for themselves. They were therefore programmed to be ashamed of their ancestry and should therefore count themselves lucky specimens that were dragged out of their homeland in chains. Being a lucky specimen, the best that could happen to the African individual is that he or she was merely worthy of observation. But the danger inherent in this type of specimen is that it is treacherous to that which is under observation, not minding whether the outcome is either positive or negative since it is still an instrument to be used, even if it is of a meritorious type. And because the debate on who is and who is not rational is the foundation of all racism, the colonial masters were in a hurry to justify the thesis of Hegel, that Africans, being a people without history but paradoxically have a history of darkness, it was illogical to conceive of them as a people with reason. 
They could therefore be worthy of being slaves, but they must be slaves of a different kind because of their incapacity to reason, slaves of a subhuman kind that could have neither will nor freedom. Any attempt to bestow on them any knowledge that only humans were capable of grasping would be a contradiction of their irrationality.

What is most pathetic is that slavery has taken another form that has bestowed on the Africans the position of reformed serfdom. In colonial times, most slaves never saw their masters do some field work, they came to equate work with slavery while the ability to live in affluence and not work at the same time was equated with lordship. Many people of the black race therefore struggle to acquire material resources, which for them are synonymous with freedom. This is a distortion that was learnt from the colonial masters. It is worthy to note that the circumstance under which the masters acquired wealth was totally different from the thinking of the slaves. Having utilized their brains to exploit the resources on the African soil, the Europeans lived in a way that the oppressed saw as ostentatious. But since the oppressor was physically present in Africa temporarily, such exalted lifestyle was meant to shield them and their associates from the public in order to guarantee security. The oppressed having been physically free, must now struggle to live in such a gigantic way since the presence of the colonial masters on the African continent suggests that to be free, a person must be wealthy and wealth in this sense proceeds from the materiality of resources. Since the gorgeously dressed slaves who served the slave masters were the envy of fellow slaves, the black individual now equated elegance to being a master. Thus, even after the physical chains have been let loose off their arms and legs, they must still struggle to appear like the slave masters. Eventually, this desire to always look like the slave master always reminds them of their position as slaves that they truly are in their current thinking. 
While the blacks who can actually afford such exorbitant way of life end up as celebrities who must perform to the admiration of the masters and their households, what therefore was done under duress during colonial times due to the fear of being killed is now done willingly by those, who once oppressed, now take such acts of performance as a profession (Akbar 2014). The oppressed, therefore, moves around with the illusion that they have the mind of the master. It is a clear indication that they are far from freedom. To be ultimately free, they must understand that wealth proceeds first from the intangibility of resources, which can only be acquired from decolonizing the mind.

This was the clear-cut difference between the slaves that were prevalent in African society before colonialism and those introduced by the oppressors. In pre-colonial black Africa, slaves were in hierarchy; however, even in their hierarchical status, they were slaves of a human and not a sub-human kind. Masters, therefore, never exploited their slaves should the occasion arise since they were of different social status; while slaves exploited slaves, those of a higher social status exploited each other (Diop, 1987, 6, 10-11). This is not to downplay the role of the African in this whole predicament because when greed sets in and African leaders began to sell fellow blacks to Arabs, it opened a new page in the undermining of the identity of the black race. African leaders became more enthusiastic in the lucrativeness of such trade. This template was to be followed by the Europeans when they seized the task of enslavement from the Arabs (Williams 55). In all of this, there is still an inherent irrationality in the psyche of the conqueror, which is the fact that there was a victor. Victory in this sense arises from battle or struggle and a people must be in contest over an event or issue with certain rivals. No people need to claim victory over another as long as the preys are not armed in any sense to fight the predators; what the predators ought to simply do is to invade and not merely conquer. Victory for the conqueror should only be a natural result of the irrationality of the prey. However, this was not taken into consideration in the denial of rationality thesis to the African people as argued by the Europeans. But even the scramble for Africa in the nineteenth century was not without its own resistance. A people who have been brutally handicapped by the stroke of the pen should therefore be allowed to carve their own destiny and not be compelled to engage in a race with the rest of the developed world.

e. Religion

The whole concept of religion is not a coinage of the supernatural, but a formulation of humankind to bring the human person closer to the consciousness of the supersensible. The Christian religion becomes a victim in this research because large parts of Africa were colonized by nations that opined that salvation comes only through the Christian route. Thus, one of the criteria for slavery was the fact that a people or a nation was unchristian. Religious authorities operating under the form of religious tyranny gave permission to every nation in Europe to reduce every person in Africa to servitude who never accepted the Christian religion. It is an indicator that Africans have been a spiritual people before slavery since one of the primary reasons for Africa’s enslavement was that the people were not Christians. Not minding the form of religion being practiced by a people in Africa, its basis of reducing an entire race to the level of animals was the fact that they believed in the Supreme Being in a way that the Europeans did not believe. The justification for the enslavement of the African people was further given a biblical foundation. Was it therefore not a preconceived coincidence that the Bible, which was used as a tool for colonialism, was given to the black individual with various verses that endorsed slavery as a divine order? It would be incongruous, and a deliberate denial that has its foundation in irrationality, to conclude that the idea of religion was unknown to the African people before the emergence of the white individual. History, however, has revealed that there is virtually nothing that is found in Christianity that is new to the human person, except for the untrained mind. Since the African people had a vivid knowledge of the supersensible before colonialism, there must be reason(s) for the establishment of an organized religion as imported into various colonies.

When we go through the various mythologies surrounding the concept of death and the afterworld of the Bantu people of Africa, the Masarwas, the Ashanti, the Nandi and Wabende peoples of East Africa, and so on, we would understand that pre-colonial Africa did not only have organized religions, but it also had an elated form of spirituality. Thus, when Lightfoot and Ussher (Jackson 5) announced that the creation of the world dates back to 4004 B.C.E. in justification of the biblical foundations of the universe, it becomes clear that its main task was to justify a theocratic system that is based on European establishment. This is because a careful study of history shows that the 13th dynastic period of the African race predates the failed thesis of the existence of Adam and Eve. When even history records about twenty-five pre-Christian saviour-gods, all born of virgins (Jackson 38), the whole project of Christianity and its subsequent projection of the white race and the quick rush to vilify the black race is an indication that raciality and distortion are found in scripture. When even scripture is used by the oppressor to project whiteness, then it becomes evident that Christianity came in the first place, not primarily because there was a messiah to be advertised to the Africans that the conqueror wanted the oppressed to be beneficiaries, but because it fosters economic manipulation of the black race. Christianity, which therefore teaches that the end does not justify the means, uses its own end as a guide to all other means. This is seen in the use of the Babylonian baal to promote the Christian religion, while at the same time condemning such practices as pagan (Jackson 43). The worship of baal is therefore wrong in Christian theology, but the legend of the Babylonian Bel is right for propagating of Christian ideology. 
This use of double standards in Christian doctrine is the very standard that is used to condemn traditional religious practices in African society while at the same time using the myths of the same African religion to justify Christology. Black becomes demonic while white is used to represent everything that is honourable. It therefore leaves them amazed that while the image of God and his angelic hosts are Caucasians, the pictorial representation of the devil is black. Those who conceived this notion never averted their minds to the fact that even the devil was part of the angelic hosts before the supposed fall from grace. If everything evil has a black tag, did the devil who is the architect and bearer of evil turn black after the fall, or was he not Caucasian while he was among the angelic hosts? This Christological chauvinism has always been used to cage the African mind in a box, where slavery is justified in order to condemn everything that is African. It becomes laughable that almost everyone who encounters the divine in a vision always sees the heavenly hosts dressed in the racial colour of his or her oppressor. This gives them more reasons to justify godliness in their oppressor, while at the same time perceiving godlessness in their fellow black individuals. And since the person who controls your mind also controls your destiny, in this battle for ideological control of the civilization, the African civilization risks extinction.

Since extraction was the primary driving motive of the emergence of Christianity in Africa, one could therefore go into a church building in the guise that minds are being renewed. Renewal becomes the imperial masters’ coating of re-colonization. Where are Africans in all of these? They end up being nothing better than emancipated slaves. By this emancipation, slavery was transformed but not eliminated. An emancipated slave is one who is given the privilege of carrying the Lockean secondary quality of emancipation, while at the same time neglecting the fact that being a slave is the primary motive that makes a slave who he or she is; he or she is therefore nothing better than an exalted servant. Emancipation then becomes a profound adjective used to glorify a treacherous noun (slave). It is therefore evident from the above that Christianity and other various forms of worship, apart from being kinds of religions, are also ideologies, and like every ideology, one of its main features is to make it sellable to the masses. Real emancipation of the African would not be possible until every trace of Caucasian association with divinity is unveiled. This does not also require its substitution with black pictorial images. They must understand that ultimate liberation recognizes the supremacy of God transcending both Caucasian and black flesh, whom both races aspire to resemble in perfection and not permanently locked in a material pictorial form created by naïve minds (Akbar 68).

3. Conclusion

Freedom is not freely given, for it is always demanded with some form of revolution. In this predicament that faces the African people, the task of regaining humanity does not just lie with those whose humanity has been stolen, but also with the very persons who stole it. Dehumanization makes some persons beasts while regarding others as sub-humans. So both the beasts and the sub-humans are under a form of distortion, and they must be transformed back to a human state. This is very vital to the idea of wholesome decolonization because both the oppressed and the oppressor are afraid of freedom, for while the former fears being free, the latter fears losing the freedom to oppress (Freire, 2005, 46). Although Africans recognize the ills brought upon their colonies by their masters, the future of the African continent largely depends on the possibility of getting Africans who will drive a community-based ideology as against the egocentric lifestyle that led to its downfall in the first place and still puts the continent below other world powers. Lamentations must therefore be shifted from the West and centred on the role Africans have played over the years in the financing of its downfall. There is a need for a rigorous reconstruction of the landscape of the African mind. This cannot be done by mere bodily repatriation of Africans abroad but only by the repatriation of the African mind.

4. References and Further Readings

  • Akbar, Na’im. Breaking the Chains of Psychological Slavery. Tallahassee: Mind Productions Associates Inc. 1996.
    • It gives an account of mental decolonization of the black race.
  • Andreski, Stanislav. The African Predicament. London: Michael Joseph Ltd; 1968.
    • It is a summary of the plights of the African race from numerous perspectives.
  • Appiah, Kwame. In my Father’s House. Africa in the Philosophy of Culture. Oxford: Oxford University Press, 1992.
    • A thought provoking discourse centred on race, culture and identity of the African.
  • Awoonor, Kofi. The African Predicament. Legon: Sub-Saharan Publishers, 2006.
    • Utilizing practical experiences in providing a detailed account of the plights of the African at home and abroad.
  • Bernal, Martin. Black Athena. The Afroasiatic Roots of Classical Civilization. Vol. 1. New Jersey: Rutgers University Press, 1987.
    • A detailed account of the fabrications and reordering of history.
  • Clarke, John Henrik. Christopher Columbus and the Afrikan Holocaust. New York: A&B Books Publishers, 1992.
    • A brief summary of the glorious and inglorious past of Africa.
  • Diop, Cheikh Anta. Precolonial Black Africa. New York: Lawrence Hill Books, 1987.
    • A summary of the life and practice of the African people prior to their contact with Europe.
  • Du Bois, W.E.B. The Souls of Black Folk. New York: Dover Publications Inc; 1994.
    • Largely borne out of personal experience, the book is a reflection on the plight of the black race as evident primarily in the lives of the African American.
  • Du Bois, W.E.B. Darkwater: Voices from Within the Veil. New York: Dover Publications Inc; 1999.
    • An autobiographical essay largely centred around black race predicament and consciousness.
  • Du Bois, W.E.B. The Education of Black People. New York: Monthly Review Press, 2001.
    • A critical study on how black people can acquire power from education.
  • Foreman, Christopher ed. The African American Predicament. Washington D.C: Brookings Institution Press, 1999.
    • This work gives an insight into the plights of the black American.
  • Freire, Paulo. Pedagogy of the Oppressed. New York: Continuum International Publishing Group Inc; 2005.
    • It is a detailed experiential account of oppression and how the oppressed could be truly free indeed.
  • Harrison, Hubert. When Africa Awakes. Baltimore: Black Classic Press, 1997.
    • Centred around the liberation of black people, it is an essay on race consciousness.
  • Imhotep, David. The First Americans were Africans. Bloomington: AuthorHouse, 2012.
    • It gives a justification of the occupation of the land of America by Africans before any other race.
  • Jackson, John. Christianity Before Christ. Texas: American Atheist Press, 1985.
    • In interrogating the existence of the Christian religion and how it emanates from different cultural/religious practices, it claims that nothing that Christianity projects is new.
  • James, George. Stolen Legacy. The Egyptian Origins of Western Philosophy. San Bernardino, CA: A Traffic Output Publication, 2015.
    • It is a summary of the existence of Greek philosophy from ancient Egypt and ways to decolonize the continent.
  • Jochannan, Yosef ben & Clarke John Henrik. From the Nile Valley to the New World. Science, Invention & Technology: New Dimensions in African History. New Jersey: Africa World Press Inc; 1991.
    • An historical account of the achievement of Africans in the field of science and technology.
  • Mugyenyi, Peter. Genocide by Denial. Kampala: Fountain Publishers, 2008.
    • A detailed account of how patenting of drugs has been used to exploit the African continent.
  • Obenga, Theophile. African Philosophy. The Pharaonic Period: 2780-330 BC. Per Ankh, 2004.
    • Highlighting some of the philosophical thought of ancient Egypt, it traces the origin of philosophy and presents Africa as the foundation of philosophy.
  • Oguejiofor, J. Obi. Philosophy and the African Predicament. Ibadan: Hope Publications Ltd; 2001.
    • On the use of philosophy in the liberation of the African continent.
  • Ramose, Mogobe. African Philosophy Through Ubuntu. Harare: Mond Books, 2002.
    • Using the Bantu people of Africa, the book gives an account of what African philosophy entails.
  • Sertima, Ivan Van. They Came Before Columbus. The African Presence in Ancient America. New York: Random House Trade Paperbacks, 2003.
    • Using archeological findings, it tells the history of the presence of the African in America prior to any other race.
  • Williams, Chancellor. The Destruction of Black Civilization: Great Issues of Race from 4500 B.C. to 2000 A.D. Chicago: Third World Press, 1987.
    • A thorough historical piece of the black race with a detailed account of the problems and prospects of the continent.
  • Wilson, Amos. Blueprint for Black Power: A Moral, Political and Economic Imperative for the Twenty-First Century. New York: Afrikan World InfoSystems, 2014.
    • An extensive book on black oppression by white supremacy and the wholesome liberation of the black race.
  • Wilson, Amos. The Falsification of Afrikan Consciousness. Eurocentric History, Psychiatry and the Politics of White Supremacy. New York: Afrikan World InfoSystems, 2014.
    • Journeying into the mind of the African, it suggests ways to liberate the African from the bondage of Eurocentrism.
  • Woodson, Carter. The Mis-Education of the Negro. New York: Seven Treasures Publications, 2010.
    • It is a concise essay on the liberation of the black man from mental and physical slavery.

 

Author Information

Isaiah Aduojo Negedu
Email: negedu.isaiah@fulafia.edu.ng
Federal University Lafia
Nigeria

Thomas Aquinas (1224/6—1274)

St. Thomas Aquinas was a Dominican priest and Scriptural theologian. He took seriously the medieval maxim that “grace perfects and builds on nature; it does not set it aside or destroy it.” Therefore, insofar as Thomas thought about philosophy as the discipline that investigates what we can know naturally about God and human beings, he thought that good Scriptural theology, since it treats those same topics, presupposes good philosophical analysis and argumentation. Although Thomas authored some works of pure philosophy, most of his philosophizing is found in the context of his doing Scriptural theology. Indeed, one finds Thomas engaging in the work of philosophy even in his Biblical commentaries and sermons.

Within his large body of work, Thomas treats most of the major sub-disciplines of philosophy, including logic, philosophy of nature, metaphysics, epistemology, philosophical psychology, philosophy of mind, philosophical theology, the philosophy of language, ethics, and political philosophy. As far as his philosophy is concerned, Thomas is perhaps most famous for his so-called five ways of attempting to demonstrate the existence of God. These five short arguments constitute only an introduction to a rigorous project in natural theology—theology that is properly philosophical and so does not make use of appeals to religious authority—that runs through thousands of tightly argued pages. Thomas also offers one of the earliest systematic discussions of the nature and kinds of law, including a famous treatment of natural law. Despite his interest in law, Thomas’ writings on ethical theory are actually virtue-centered and include extended discussions of the relevance of happiness, pleasure, the passions, habit, and the faculty of will for the moral life, as well as detailed treatments of each one of the theological, intellectual, and cardinal virtues. Arguably, Thomas’ most influential contribution to theology and philosophy, however, is his model for the correct relationship between these two disciplines, a model which has it that neither theology nor philosophy is reduced one to the other, where each of these two disciplines is allowed its own proper scope, and each discipline is allowed to perfect the other, if not in content, then at least by inspiring those who practice that discipline to reach ever new intellectual heights.

In his lifetime, Thomas’ expert opinion on theological and philosophical topics was sought by many, including at different times a king, a pope, and a countess. It is fair to say that, as a theologian, Thomas is one of the most important in the history of Western civilization, given the extent of his influence on the development of Roman Catholic theology since the 14th century. However, it also seems right to say—if only from the sheer influence of his work on countless philosophers and intellectuals in every century since the 13th, as well as on persons in countries as culturally diverse as Argentina, Canada, England, France, Germany, India, Italy, Japan, Poland, Spain, and the United States—that, globally, Thomas is one of the 10 most influential philosophers in the Western philosophical tradition.

Table of Contents

  1. Life and Works
    1. Life
    2. Works
  2. Faith and Reason
  3. Philosophy of Language: Analogy
  4. Epistemology
    1. The Nature of Knowledge and Science
    2. The Extension of Science
    3. The Four Causes
      1. The Efficient Cause
      2. The Material Cause
      3. The Formal Cause
      4. The Final Cause
    4. The Sources of Knowledge: Thomas’ Philosophical Psychology
  5. Metaphysics
    1. On Metaphysics as a Science
    2. On What There Is: Metaphysics as the Science of Being qua Being
  6. Natural Theology
    1. Some Methodological Considerations
    2. The Way of Causation: On Demonstrating the Existence of God
    3. The Way of Negation: What God is Not
      1. God is Not Composed of Parts
      2. God is Not Changeable
      3. God is Not in Time
    4. The Way of Excellence: Naming God in and of Himself
  7. Philosophical Anthropology: The Nature of Human Beings
  8. Ethics
    1. The End or Goal of Human Life: Happiness
    2. Morally Virtuous Action as the Way to Happiness
      1. Morally Virtuous Action as Pleasurable
      2. Morally Virtuous Action as Perfectly Voluntary and the Result of Deliberate Choice
      3. Morally Virtuous Action as Morally Good Action
      4. Morally Virtuous Action as Arising from Moral Virtue
    3. Human Virtues as Perfections of Characteristically Human Powers
      1. Infused Virtues
      2. Human Virtues
    4. The Logical Relations between the Human Virtues
    5. Moral Knowledge
    6. The Proximate and Ultimate Standards of Moral Truth
  9. Political Philosophy
    1. Law
      1. The Nature of Law
      2. The Different Kinds of Law
        1. The Eternal Law
        2. The Natural Law
        3. The Divine Law
        4. Human Law and its Relation to Natural Law
    2. Authority: Thomas’ Anti-Anarchism
    3. The Best Form of Government
  10. References and Further Reading
    1. Thomas’ Works
    2. Secondary Sources and Works Cited
    3. Bibliographies and Biographies

1. Life and Works

a. Life

St. Thomas Aquinas was born sometime between 1224 and 1226 in Roccasecca, Italy, near Naples. Thomas’ family was fairly well-to-do, owning a castle that had been in the Aquino family for over a century. One of nine children, Thomas was the youngest of four boys, and, given the customs of the time, his parents considered him destined for a religious vocation.

In his early years, from approximately 5 to 15 years of age, Thomas lived and served at the nearby Benedictine abbey of Monte Cassino, founded by St. Benedict of Nursia himself in the 6th century. It is here that Thomas received his early education. Thomas’ parents probably had great political plans for him, envisioning that one day he would become abbot of Monte Cassino, a position that, at the time, would have brought even greater political power to the Aquino family.

Thomas began his theological studies at the University of Naples in the fall of 1239. In the 13th century, training in theology at the medieval university started with additional study of the seven liberal arts, namely, the three subjects of the trivium (grammar, logic, and rhetoric) and the four subjects of the quadrivium (arithmetic, geometry, music, and astronomy), as well as study in philosophy. As part of his philosophical studies at Naples, Thomas was reading in translation the newly discovered writings of Aristotle, perhaps introduced to him by Peter of Ireland. Although Aristotle’s Categories and On Interpretation (with Porphyry’s Isagoge, known as the ‘old logic’) constituted a part of early medieval education, and the remaining works in Aristotle’s Organon, namely, Prior Analytics, Posterior Analytics, Topics, and Sophismata (together known as the ‘new logic’) were known in Europe as early as the middle of the 12th century, most of Aristotle’s corpus had been lost to the Latin West for nearly a millennium. By contrast, Arab philosophers such as Ibn Sina or Avicenna (c. 980-1037) and Ibn Rushd or Averroes (1126-1198) not only had access to works such as Aristotle’s De Anima, Nicomachean Ethics, Physics, and Metaphysics, but also produced sophisticated commentaries on those works. The Latin West’s increased contact with the Arabic world in the 12th and 13th centuries led to the gradual introduction of these lost Aristotelian works (as well as the writings of the Arabic commentators mentioned above) into medieval European universities such as Naples. Philosophers such as Peter of Ireland had not seen anything like these Aristotelian works before; they were capacious and methodical but never strayed far from common sense. However, there was controversy too, since Aristotle seemed to teach things that contradicted the Christian faith, most notably that God was not provident over human affairs, that the universe had always existed, and that the human soul was mortal. Thomas would later try to show that such theses either represented misinterpretations of Aristotle’s works or else were founded on probabilistic rather than demonstrative arguments and so could be rejected in light of the surer teaching of the Catholic faith.

It was in the midst of his university studies at Naples that Thomas was stirred to join a new (and not altogether uncontroversial) religious order known as the Order of Preachers or the Dominicans, after their founder, St. Dominic de Guzman (c. 1170-1221), an order which placed an emphasis on preaching and teaching. Although Thomas received the Dominican habit in April of 1244, Thomas’ parents were none too pleased with his decision to join this new evangelical movement. In order to talk some sense into him, Thomas’ mother sent his brothers to bring him to the family castle sometime in late 1244 or early 1245. Back at the family compound, Thomas continued in his resolve to remain with the Dominicans. Having resisted his family’s wishes, he was placed under house arrest. A famous story has it that one day his family members sent a prostitute up to the room where Thomas was being held prisoner. Apparently, they were thinking that Thomas would, like any typical young man, satisfy the desires of his flesh and thereby “come back down to earth” and see to his familial duties. Instead, Thomas supposedly chased the prostitute out of the room with a hot poker, and as the door slammed shut behind her, traced a black cross on the door. Eventually, Thomas’ mother relented and he returned to the Dominicans in the fall of 1245. Despite these family troubles, Thomas remained dedicated to his family for the rest of his life, sometimes staying in family castles during his many travels and even acting late in his life as executor of his brother-in-law’s will.

Recognizing his talent early on, the Dominican authorities sent Thomas to study with St. Albert the Great at the University of Paris for three years, from 1245-1248. Thomas made such an impression on Albert that, having been transferred to the University of Cologne, Albert took Thomas along with him as his personal assistant.

From 1252-1256, Thomas was back at the University of Paris, teaching as a Bachelor of the Sentences. We might think of Thomas’ position at Paris at this time as roughly equivalent to an advanced graduate student teaching a class of his or her own. In addition to his teaching duties, Thomas was also required, in accord with university standards of the time, to work on a commentary on Peter Lombard’s Sentences. We might think of Thomas’ commentary on the Sentences as roughly equivalent to his doctoral dissertation in theology.

At 32 years of age (1256), Thomas was teaching at the University of Paris as a Master of Theology, the medieval equivalent of a university professorship. After he had taught at Paris for three years, the Dominicans moved Thomas back to Italy, where he taught in Naples (from 1259-1261), Orvieto (1261-1265), and Rome (1265-1268). It was during this period, perhaps in Rome, that Thomas began work on his magisterial Summa theologiae.

Thomas was ordered by his superiors to return to the University of Paris in 1268, perhaps to defend the mendicant way of life of the Dominicans and their presence at the university. (Like the Franciscans, the Dominicans depended upon the charity of others in order to continue their work and survive. This sometimes meant they had to beg for their food. In doing so, the members of the mendicant orders consciously saw themselves as living after the pattern of Jesus Christ, who, as the Gospels depict, also depended upon the charity of others for things to eat and places to rest during his public ministry.) Thomas ended up teaching at the University of Paris again as a regent Master from 1268-1272. While he was at the University of Paris, Thomas also famously disputed with philosophers who contended on Aristotelian grounds—wrongly in Thomas’ view—that all human beings shared one intellect, a doctrine that Thomas argued was incompatible with personal immortality and moral responsibility, not to mention our experience of ourselves as individual knowers.

In 1272, the Dominicans moved Thomas back to Naples, where he taught for a year. In the middle of composing his treatise on the sacraments for the Summa theologiae around December of 1273, Thomas had a particularly powerful religious experience. After the experience, despite constant urging from his confessor and assistant Reginald of Piperno, Thomas refused any longer to write. Called to be a theological consultant at the Second Council of Lyon, Thomas died in Fossanova, Italy, on March 7, 1274, while making his way to the council.

Canonized in 1323, Thomas was later proclaimed a Doctor of the Church by Pope St. Pius V in 1567. In 1879, Pope Leo XIII published the encyclical Aeterni Patris, which, among other things, holds up Thomas as the supreme model of the Christian philosopher. Through his voluminous, insightful, and tightly argued writings, Thomas continues to this day to attract numerous intellectual disciples, not only among Catholics, but among Protestants and non-Christians as well.

b. Works

Thomas is famous for being extremely productive as an author in his relatively short life. For example, he authored four encyclopedic theological works, commented on all of the major works of Aristotle, authored commentaries on all of St. Paul’s letters in the New Testament, and put together a verse by verse collection of exegetical comments by the Church Fathers on all four Gospels called the Catena aurea. Such examples constitute only the beginning of a comprehensive list of Thomas’ works. His literary output is as diverse as it is large. Thomas’ body of work can be usefully split up into nine different literary genera: (1) theological syntheses, for example, Summa theologiae and Summa contra gentiles; (2) commentaries on important philosophical works, for example, Commentary on Aristotle’s Nicomachean Ethics and Commentary on Pseudo-Dionysius’ De divinis nominibus; (3) Biblical commentaries, for example, Literal Commentary on Job and Commentary and Lectures on the Epistles of Paul the Apostle; (4) disputed questions, for example, On Evil and On Truth; (5) works of religious devotion, for example, the Liturgy of Corpus Christi and the hymn Adoro te devote; (6) academic sermons, for example, Beata gens, sermon for All Saints; (7) short philosophical treatises, for example, On Being and Essence and On the Principles of Nature; (8) polemical works, for example, On the Eternity of the World against Murmurers; and (9) letters in answer to requests for an expert opinion, for example, On Kingship. For present purposes, this article focuses on the first four of these literary genera. This should be enough to demonstrate the capaciousness of Thomas’ thought.

Thomas’ most famous works are his so-called theological syntheses. Thomas composed four of these during his lifetime: his commentary on Peter Lombard’s Sentences, Summa contra gentiles, Compendium theologiae, and Summa theologiae. Although each of these works was composed for different reasons, they are nonetheless similar insofar as each of them attempts to communicate clearly and defend the substance of the Catholic faith in a manner that can be understood by someone who has the requisite education, that is, training in the liberal arts and Aristotle’s philosophy of science. Although Thomas aims at both clarity and brevity in these works, he also aims to speak about all the issues integral to the teaching of the Catholic faith, and so the works are quite long (for example, Summa theologiae, although unfinished, numbers 2,592 pages in the English translation of the Fathers of the English Dominican Province).

Thomas’ Summa contra gentiles (SCG), his second great theological synthesis, is split up into four books: book I treats God; book II treats creatures; book III treats divine providence; book IV treats matters pertaining to salvation. Whereas the last book treats subjects the truth of which cannot be demonstrated philosophically, the first three books are intended by Thomas as what we might call works of natural theology, that is, theology that from first to last does not defend its conclusions by citing religious authorities but rather contains only arguments that begin from premises that are or can be made evident to human reason apart from divine revelation and end by drawing logically valid conclusions from such premises. SCG is thus Thomas’ longest and most ambitious attempt at doing what he is probably most famous for—arguing philosophically for various theses concerning the existence of God, the nature of God, and the nature of creatures insofar as they are creatures of God. Although Thomas cites Scripture in these first three books in SCG, such citations always come on the heels of Thomas’ attempt to establish a point philosophically. In citing Scripture in the SCG, Thomas thus aims to demonstrate that faith and reason are not in conflict, that those conclusions reached by way of philosophy coincide with the teachings of Scripture.

Summa theologiae (ST) is Thomas’ most well-known work, and rightly so, for it displays all of Thomas’ intellectual virtues: the integration of a strong faith with great learning; acute organization of thought; judicious use of a wide range of sources, including pagan and other non-Christian sources; an awareness of the complexity of language; linguistic economy; and rigorous argumentation. However, ST is not a piece of scholarship as we often think of scholarship in the early 21st century, that is, a professor showing forth everything that she knows about a subject. Rather, it is the work of a gifted teacher, one intended by its author, as Thomas himself makes clear in the prologue, to aid the spiritual and intellectual formation of his students. It was once thought that Thomas meant ST to replace Lombard’s Sentences as a university textbook in theology, which, incidentally, did begin to happen as early as one hundred and fifty years after Thomas’ death. Recent scholarship has suggested that Thomas rather composed the work for Dominican students preparing for priestly ministry. This thesis is consistent with what Thomas actually does in ST, which may surprise people who have not examined the work as a whole.

What of the method and content of ST? Like Lombard’s Sentences, Thomas’ ST is organized according to the neo-Platonic schema of exit from and return to God. This is no accident. Thomas thinks it is fitting that divine science should imitate reality not only in content but in form. ST is split into three parts. Part one (often abbreviated “Ia.”) treats God and the nature of spiritual creatures, that is, angels and human beings. Part two treats the return of human beings to God by way of their exercising the virtues, knowing and acting in accord with law, and the reception of divine grace. Given the Fall of human beings, part three (often abbreviated “IIIa.”) treats the means by which human beings come to embody the virtues, know the law, and receive grace: (a) the Incarnation, life, passion, death, resurrection, and ascension of Christ, as well as (b) the manner in which Christ’s life and work is made efficacious for human beings, through the sacraments and life of the Church.

Of the three parts of ST, the second part on ethical matters is by far the longest, which is one reason recent scholarship has suggested that Thomas’ interest in composing ST is more practical than theoretical. We might think of ST as a work in Christian ethics, designed specifically to teach those Dominican priests whose primary duties were preaching and hearing confessions. In fact, part two of ST is so long that Thomas splits it into two parts, where the length of each one of these parts is approximately 600 pages in English translation. The first part of the second part is often abbreviated “IaIIae”; the second part of the second part is often abbreviated “IIaIIae.”

The fundamental unit of ST is known as the article. It is in the article that Thomas works through some particular theological or philosophical issue in considerable detail, although not in too much detail. (Recall Thomas is training priests for ministry, not scholars. For Thomas’ most detailed discussions of a topic, readers should turn to his treatment in his disputed questions, his commentary on the Sentences, SCG, and the Biblical commentaries.) Thomas treats a very specific “yes” or “no” question in each article in accord with the method of the medieval disputatio. That is to say, each article within the ST is, as it were, a mini-dialogue. Each article within ST has five parts. First, Thomas raises a very specific question, for example, “whether law needs to be promulgated.” Second, Thomas entertains some objections to the position that he himself defends on the specific question raised in the article. In other words, Thomas is here fielding objections to his own considered position. Third, Thomas cites some authority (in a section that begins, on the contrary) that gives the reader the strong impression that the position defended in the objections is, in fact, untenable. Oftentimes the authority Thomas cites is a passage from the Old or New Testament; otherwise, it is some authoritative interpreter of Scripture or science such as St. Augustine or Aristotle, respectively. It should be noted the authority cited is in no way, shape, or form Thomas’ final word on the subject at hand. Thomas is well aware that authorities need to be interpreted. Fourth, Thomas develops his own position on the specific topic addressed in the article. This part of the article is oftentimes referred to as the body or the respondeo, literally, I respond. Here, Thomas offers arguments in defense of his own considered position on the matter at issue. 
Sometimes Thomas examines various possible positions on the question at hand, showing why some are untenable whereas others are defensible. At other times, Thomas shows that much of the problem is terminological; if we appreciate the various senses of a term crucial to the science in question, we can show that authorities that seem to be in conflict are simply using an expression with different intended meanings and so do not disagree after all. Fifth, Thomas returns to the objections and answers each of them in light of the work he has done in the body of the article. It should be noted that Thomas often adds interesting details in these answers to the objections to the position he has defended in the body of the article.

In addition to his theological syntheses, Thomas composed numerous commentaries on the works of Aristotle and of neo-Platonic philosophers. For example, Thomas commented on all of Aristotle’s major works, including Metaphysics, Physics, De Anima, and Nicomachean Ethics. These are line-by-line commentaries, and contemporary Aristotle scholars have remarked on their insightfulness, despite the fact that Thomas himself did not know Greek (he was working from Latin translations of Greek editions of Aristotle’s text). The focus in Thomas’ commentaries is certainly explaining the mind of Aristotle. That being said, given that Thomas sometimes corrects Aristotle in these works (see, for example, his commentary on Physics, book 8, chapter 1), it seems right to say that Thomas’ commentaries on Aristotle are usefully consulted to elucidate Thomas’ own views on philosophical topics as well.

Thomas is often spoken of as an Aristotelian. This is particularly so when speaking of Thomas’ philosophy of language, metaphysics of material objects, and philosophy of science. When it comes to Thomas’ metaphysics and moral philosophy, though, Thomas is equally influenced by the neo-Platonism of Church Fathers and other classical thinkers such as St. Augustine of Hippo, Pope St. Gregory the Great, Proclus, and the Pseudo-Dionysius. One way to see the importance of neo-Platonic thought for Thomas’ own thinking is by noting the fact that Thomas authored commentaries on a number of important neo-Platonic works. These include commentaries on Boethius’ On the Hebdomads, Boethius’ De trinitate, Pseudo-Dionysius’ On the Divine Names, and the anonymous Book of Causes. (The last work Thomas correctly identified as the work of an Arab philosopher who borrowed greatly from Proclus’ Elementatio Theologica and the work of Dionysius; previously it had been thought to be a work of Aristotle’s).

Although Thomas commented on a number of philosophical works, Thomas probably saw his commentaries on Scripture as his most important. (Thomas commented on Job, Isaiah, Jeremiah, Lamentations, Psalms 1-51 (this commentary was interrupted by his death), Matthew, John, Romans, 1 and 2 Corinthians, Galatians, Ephesians, Philippians, Colossians, 1 and 2 Thessalonians, 1 and 2 Timothy, Titus, Philemon, and Hebrews. Thomas also composed a running gloss on the four gospels, the Catena aurea, which consists of a collection of what various Church Fathers have to say about each verse in each of the four gospels.) Thomas understood himself to be, first and foremost, a Catholic Christian theologian. Indeed, theology professors at the University of Paris in Thomas’ time were known as Masters of the Sacred Page. In addition, Thomas was a member of the Dominican order, and the Dominicans have a special regard for teaching the meaning of Scripture.

A reader might wonder why one would mention Thomas’ commentaries on Scripture in an article focused on his contributions to the discipline of philosophy. It is important to mention Thomas’ Scripture commentaries since Thomas often does his philosophizing in the midst of doing theology, and this is no less true in his commentaries on Scripture. To give just one example of the importance of Thomas’ Scripture commentaries for understanding a philosophical topic in his thought, he has interesting things to say about the communal nature of perfect happiness in his commentaries on St. Paul’s letters to the Corinthians and to the Ephesians. A reader who focused merely on Thomas’ treatment of perfect happiness in, for example, the Summa theologiae, would get an incomplete picture of his views on human happiness.

Where talk of Thomas’ philosophy is concerned, there is a final literary genus worth mentioning, the so-called disputed question. Like ST, the articles in Thomas’ disputed questions are organized according to the method of the medieval disputatio. However, whereas a typical article in ST fields three or four objections, it is not uncommon for an article in a disputed question to field 20 objections to the position the master wants to defend. Consider, for example, the question of whether there is power in God. Whereas the article in ST that treats this question fields four objections, the corresponding article in Thomas’ Disputed Questions on the Power of God fields 18 objections. Nonetheless, it would be a mistake to think that Thomas’ disputed questions necessarily represent his most mature discussions of a topic. Although the disputed questions can be regarded as Thomas’ most detailed treatments of a subject, he sometimes changed his mind about issues over the course of his writing career, and the disputed questions do not necessarily represent his last word on a given subject.

2. Faith and Reason

Thomas’ views on the relationship between faith and reason can be contrasted with a number of contemporary views. Consider first an influential position we can label evidentialism. For our purposes, the advocate of evidentialism believes that one should proportion the strength of one’s belief B to the amount of evidence one has for the truth of B, where evidence for a belief is construed either (a) as that belief’s correspondence with a proposition that is self-evident, indubitable, or immediately evident from sense experience, or (b) as that belief’s being supported by a good argument, where such an argument begins from premises that are self-evident, indubitable, or immediately evident from sense experience (see Plantinga [2000, pp. 67-79] and Rota [2012]). Evidentialism, so construed, is incompatible with a traditional religious view that Thomas holds about divine faith: if Susan has divine faith that p, then Susan has faith that p as a gift from God, and Susan reasonably believes that p with a strong conviction, not on the basis of Susan’s personally understanding why p is true, but on the basis of Susan’s reasonably believing that God has divinely revealed that p is true. In other words, divine faith is a kind of certain knowledge by way of testimony for Thomas.

Fideism is another position with which we can contrast Thomas’ views on faith and reason. For our purposes, consider fideism to be the view that states that faith is the only way to apprehend truths about God. Put negatively, the fideist thinks that human reason is incapable of demonstrating truths about God philosophically.

Finally, consider the position on faith and reason known as separatism. According to separatism, philosophy and natural science, on the one hand, and revealed theology, on the other, are incommensurate activities or habits. Any talk of conflict between faith and reason always involves some sort of confusion about the nature of faith, philosophy, or science.

In contrast to the views mentioned above, Thomas not only sees a significant role for both faith and reason in the best kind of human life (contra evidentialism), but he thinks reason apart from faith can discern some truths about God (contra fideism), as epitomized by the work of a pagan philosopher such as Aristotle (see, for example, SCG I, chapter 3). Thomas also recognizes that revealed theology and philosophy are concerned with some of the same topics (contra separatism). Although treating some of the same topics, Thomas thinks it is not possible in principle for there to be a real and significant conflict between the truths discovered by divine faith and theology on the one hand and the truths discerned by reason and philosophy on the other. In fact, Thomas thinks it is a special part of the theologian’s task to explain just why any perceived conflicts between faith and reason are merely apparent and not real and significant conflicts (see, for example, ST Ia. q. 1, a. 8). Indeed, showing that faith and reason are compatible is one of the things Thomas attempts to do in his own works of theology. A diverse group of subsequent religious thinkers have looked to Thomas’ modeling the marriage of faith and reason as one of his most important contributions.

One place where Thomas discusses the relationship between faith and reason is SCG, book I, chapters 3-9. Thomas notes there that there are two kinds of truths about God: those truths that can be apprehended by reason apart from divine revelation, for example, that God exists and that there is one God (in the Summa theologiae, Thomas calls such truths about God the preambles to the faith) and those truths about God the apprehension of which requires a gift of divine grace, for example, the doctrine of the Trinity (Thomas calls these the articles of faith). Although the truth of the preambles to the faith can be apprehended without faith, Thomas thinks human beings are not rationally required to do so. In fact, Thomas argues that three awkward consequences would follow if God required that all human beings apprehend the preambles to the faith by way of philosophical argumentation.

First, very few people would come to know truths about God and, since human flourishing requires certain knowledge of God, God wants to be known by as many people as possible. Not everyone has the native intelligence to do the kind of work in philosophy required to understand an argument for the existence of God. Among those who have the requisite intelligence for such work, many do not have the time it takes to apprehend such truths by philosophy, being engaged as they are in other important tasks such as taking care of children, manual labor, feeding the poor, and so forth. Finally, among those who have the natural intelligence and time required for serious philosophical work, many do not have the passion for philosophy that is also required to arrive at an understanding of the arguments for the existence of God.

Second, the very few who could come to know truths about God philosophically would apprehend these truths with anything close to certainty only late in their life, and Thomas thinks that people need to apprehend truths such as the existence of God as soon as possible. (Compare here with a child learning that it is wrong to lie; parents wisely want their children to learn this truth as soon as possible.) In order to understand why Thomas thinks that the existence of God is a truth discernible by way of philosophy only late in life, we need to appreciate his view of philosophy, metaphysics, and natural theology. Philosophy is a discipline we rightly come to only after we have gained some confidence in other disciplines such as arithmetic, grammar, and logic. Among the philosophical disciplines, metaphysics is the most difficult and presupposes competence in other philosophical disciplines such as physics (as it is practiced, for example, in Aristotle’s Physics, that is, what we might call philosophical physics: reflections on the nature of change, matter, motion, and time). Finally, demonstrating the existence of God is the hardest part of metaphysics. If we are to apprehend with confidence the existence of God by way of philosophy, this will happen only after years of intense study and certainly not during childhood, when we might think that Thomas believes it is important, if not necessary, for it to happen.

Third, let us suppose Susan has the native intelligence, time, passion, and experience requisite for apprehending the existence of God philosophically and that she does, in fact, come to know that God exists by way of a philosophical argument. Thomas maintains that such an apprehension is nonetheless going to be deficient for it will not allow Susan to be totally confident that God exists, since Susan is cognizant—being the philosopher she is—that there is a real possibility she has made a mistake in her philosophical reasoning. However, the good life, for example, living like a martyr, requires that we possess an unshakeable confidence that God exists. Since God wants as many people as possible to apprehend his existence, and to do so as soon as possible and with the kind of confidence enjoyed by the Apostles, saints, and martyrs, Thomas argues that it is fitting that God divinely reveals to human beings—even to theologians who can philosophically demonstrate the existence of God—the preambles to the faith, that is, those truths that can be apprehended by human reason apart from divine faith, so that people from all walks of life can, with great confidence, believe that God exists as early in life as possible.

However, does it make sense to believe things about God that exceed the natural capacity of human reason? Thomas thinks the answer is “yes,” and he defends this answer in a number of ways. Two are mentioned here. First, Thomas thinks it sensible of God to ask human beings to believe things about God that exceed their natural capacities since to do so reinforces in human beings an important truth about God, namely, that God is such that He cannot be completely understood by way of our natural capacities. If we say we completely understand God by way of our natural capacities, then we do not understand what “God” means. Talk about God, for Thomas, requires that we recognize our limitations with respect to such a project. God’s asking us to believe things about Him that we cannot apprehend philosophically makes sense for Thomas because it alerts human beings to the fact that we cannot know God in the same way we know the objects of other sciences.

Thomas also notes that believing things about God by faith perfects the soul in a manner that nothing else can. Here Thomas draws on the testimony of Aristotle, who thinks that even a little knowledge of the highest and most beautiful things perfects the soul more than a complete knowledge of earthly things. Although we cannot understand the things of God that we apprehend by faith in this life, even a slim knowledge of God greatly perfects the soul. Just as a bit of real knowledge of human beings is better for Susan’s soul than Susan’s knowing everything there is to know about carpenter ants, Susan’s possessing knowledge about God by faith is better for Susan’s soul than Susan’s knowing scientifically everything there is to know about the cosmos.

Still, we might wonder why Thomas thinks it is reasonable to accept the Catholic faith as opposed to some other faith tradition that, like the Catholic faith, asks us to believe things that exceed the capacity of natural reason. One thing Thomas says is that some non-Catholic religious traditions ask us to believe things that are contrary to what we can know by natural reason. Thomas accepts the medieval maxim that “grace does not destroy nature or set it aside; rather grace always perfects nature.” Although the Catholic faith takes us beyond what natural reason by itself can apprehend, according to Thomas, it never contradicts what we know by way of natural reason. Therefore, any real conflicts between faith and reason in non-Catholic religious traditions give us a reason to prefer the Catholic faith to non-Catholic faith traditions.

In addition, Thomas thinks there are good—although non-demonstrative—arguments for the truth of the Catholic faith. Thomas begins with the accounts of healings, the resurrection of the dead, and miraculous changes in the heavenly bodies, as contained in the Old and New Testaments. These accounts of miracles—which Thomas takes to be historically reliable—offer confirmation of the truthfulness of the teaching of those who perform such works by the grace of God. Even more significant, thinks Thomas, is the fact that simple fishermen were transformed overnight into apostles, that is, eloquent and wise men. Thomas takes this to be a miracle that provides confirmation of the truth of the Catholic faith the apostles preached. Most powerful of all, according to Thomas, the Catholic faith spread throughout the world in the midst of great persecutions. As Thomas notes, the Catholic faith was not initially embraced because it was economically advantageous to do so; nor did it spread—as other religious traditions have—by way of the sword; in fact, people flocked to the Catholic faith—as Thomas notes, both the simple and the learned—despite the fact that it teaches things that surpass the natural capacity of the intellect and demands that people curb their desires for the pleasures of the flesh. Given human nature, Thomas thinks that such conversions were miraculous and so testify to the truth of the faith that such people came to adopt.

3. Philosophy of Language: Analogy

Any discussion of Thomas’ views concerning what something is, for example, goodness or knowledge or form, requires some stage-setting. Much of contemporary analytic philosophy and modern science operates under the assumption that any discourse D that deserves the honor of being called scientific or disciplined requires that the terms employed within D not be used equivocally. Thomas agrees, but with a very important caveat. Thomas distinguishes two different kinds of equivocation: uncontrolled (or complete) equivocation and controlled equivocation (or analogous predication). While the former is incompatible with a discourse being scientific or disciplined, according to Thomas, the latter is not. Thomas therefore distinguishes three different ways words are used: univocally, equivocally (in a sense that is complete or uncontrolled), and analogously, that is, equivocally but in a manner that is controlled. When we use a word univocally, we predicate of two things (x and y) one and the same name n, where n has precisely the same meaning when predicated of x and y. For example, think of the locutions, “the cat is an animal” and “the dog is an animal.” Here, the same word “animal” is predicated of two different things, but the meaning of “animal” is precisely the same in both instances. By contrast, when we use a word equivocally, two things (x and y) are given one and the same name n, where n has one meaning when predicated of x and a different meaning when predicated of y. For example, we use the very same word “bank” to refer to a place where we save money and that part of the land that touches the edge of a river.

Importantly, Thomas notices that some instances of equivocation are controlled, or instances of analogous predication, whereas other instances of equivocal naming are complete or uncontrolled. In a case of complete or uncontrolled equivocation, we predicate of two things (x and y) one and the same name n, where n has one meaning when predicated of x and n has a completely different meaning when predicated of y. English usage of the word “bank” is a good example of complete or uncontrolled equivocation; here the use of the same name is totally an accident of language. It is a matter of linguistic chance that “bank” has these two totally different and unrelated meanings in English.

By contrast, in a case of controlled equivocation or analogous predication, we predicate of two things (x and y) one and the same name n, where n has one meaning when predicated of x, n has a different but not unrelated meaning when predicated of y, and one of these meanings is primary whereas the other derives its meaning from the primary meaning. For example, consider the manner in which we use the word “good.” We sometimes speak of “good dogs,” and sometimes we say things such as “Doug is a good man.” The meanings of “good” in these two locutions obviously differ one from another, since no moral commendation is implied in the first whereas moral commendation is implied in the second. However, it also seems right to say that “good” is not being used in completely different and unrelated ways in these locutions. Rather, our speaking of “good dogs” derives its meaning from the primary meaning of “good” as a way to offer moral commendation of human beings. We thus use the word “good” as an analogous expression in Thomas’ sense. To take an example Aristotle uses, “healthy” is used in the primary sense in a locution such as “Joe is healthy.” We might also say “Joe’s urine is healthy,” which uses “healthy” to pick out a sign of Joe’s health (in the primary sense of that term), or “exercise is healthy,” which uses “healthy” to pick out a cause of health (again, in the primary sense).

Thomas takes analogous predication or controlled equivocation to be sufficient for good science and philosophy, assuming, of course, that the other relevant conditions for good science or philosophy are met. Although the most famous use to which Thomas puts his theory of analogous naming is his attempt to make sense of a science of God, analogous naming is relevant where many other aspects of philosophy are concerned, Thomas thinks. For example, we also use words analogously when we talk about being, knowledge, causation, and even science itself. Thomas therefore sees a significant difference between complete equivocation and controlled equivocation or analogous naming. Whereas the scientist qua scientist must avoid the former, a discipline that uses words in the latter sense can properly be understood to be scientific or disciplined.

4. Epistemology

a. The Nature of Knowledge and Science

Thomas is aware of the fact that there are different forms of knowledge. One form of knowledge that is particularly important to a 13th-century professor such as Thomas is scientific knowledge (scientia). However, Thomas recognizes that scientific knowledge itself depends upon there being non-scientific kinds of knowledge, for example, sense knowledge and knowledge of self-evident propositions (about each of which, there is more below). We can begin to get a sense of what Thomas means by scientia by way of his discussion of faith, which is a form of knowledge he often contrasts with scientia (see, for example, ST IIaIIae. q. 1, aa. 4-5; q. 2, a. 1). According to Thomas, faith and scientia are alike in being subjectively certain. If I believe that p by faith, then I am confident that p is true. It is likewise with scientific knowledge. However, the reason for one’s being confident that p differs in the cases of faith and scientia. If I know that p by way of science, then I not only have compelling reasons that p, but I understand why those reasons compel me to believe that p. In contrast to scientia, the certainty of faith that p is grounded for Thomas in a rational belief that someone else has scientia or intellectual vision with respect to p. Thus, the certainty of faith is grounded in someone else’s testimony—in the case of divine faith, the testimony of God. For Thomas, faith can and, at least for those who have the time and talent, should be supported by reasons. However, if Susan believes p by faith, Susan may see that p is true, but she does not see why p is true. Susan’s belief that p is ultimately grounded in confidence concerning some other person, for example, Jane’s epistemic competence, where Jane’s competence involves seeing why p is true, either by way of Jane’s having scientia of p, because Jane knows that p is self-evidently true, or because Jane has sense knowledge that p.

We should note that, for Thomas, scientia itself is a term that we rightly use analogously. For example, in speaking of science, we could be talking about an act of inquiry whereby we draw certain conclusions, not previously known, from things we already know, that is, starting from first principles, where these principles are themselves known by way of (reflection upon our) sense experiences, we draw out the logical implications of such principles. We can contrast science as an act of inquiry with another kind of speculative activity that Thomas calls contemplation. Both science (in the sense of engaging in an act of inquiry) and contemplation are acts of speculative intellect according to Thomas, that is, they are uses of intellect that have truth as their immediate object. (In contrast, practical uses of intellect are acts of intellect that aim at the production of something other than what is thought about, for example, thinking at the service of doing the right thing, in the right way, at the right time, and so forth, or thinking at the service of bringing about a work of art.) Thomas thinks that, whereas an act of scientific inquiry aims at discovering a truth not already known, an act of contemplation aims at enjoying a truth already known.

We can speak of science not only as an act of inquiry, but also as a particularly strong sort of argument for the truth of a proposition that Thomas calls a scientific demonstration. If a person possesses a scientific demonstration of some proposition p, then he or she understands an argument that p such that the argument is logically valid and he or she knows with certainty that the premises of the argument are true.

In addition to the senses of science mentioned above, Thomas also recognizes the Aristotelian sense of scientia as a particular kind of intellectual habit or disposition or virtue, which habit is the fruit of scientia as scientific inquiry and requires the possession of scientific demonstrations. But science in the sense of a habit is more than the fruit of inquiry and the possession of arguments. Science as a habit is a person’s possession of an organized body of knowledge of and demonstrative argumentation about some subject matter S, where possessing an organized body of knowledge of and demonstrative argumentation about some subject matter is a function of knowing (a) the basic facts about S, that is, the characteristic properties or powers of things belonging to S, as well as (b) the principles, causes, or explanations of these properties or powers of S, and (c) the logical connections between (a) and (b). For example, according to this model of science, I have a scientific knowledge of living things qua living things only if I know the basic facts about all living things, for example, that living things grow and diminish in size over time, nourish themselves, and reproduce, and I know why living things have these characteristic powers and properties. According to Thomas, a science as habit is a kind of intellectual virtue, that is, a habit of knowledge about a subject matter, acquired from experience, hard work, and discipline, where the acquisition of that habit usually involves having a teacher or teachers. A person who possesses a science s knows the right kind of starting points for thinking about s, that is, the first principles or indemonstrable truths about s, and the scientist can draw correct conclusions from these first principles. In other words, if one has a science of s, one’s knowledge of s is systematic and controlled by experience, and so one can speak about s with ease, coherence, clarity, and profundity.

Thomas notes that the first principles of a science are sometimes naturally known by the scientist, for example in the cases of arithmetic and geometry (ST Ia. q. 1, a. 2). According to Thomas, the science of sacred theology does not fit this characterization of science since the first principles of sacred theology are articles of faith and so are not known by the natural light of reason but rather by the grace of God revealing the truth of such principles to human beings. Of course, contemporary philosophers of science would not find sacred theology’s inability to fit neatly into a well-defined univocal conception of science to be a problem for the scientific status of sacred theology. Think of the demarcation problem, that is, the problem of identifying necessary and sufficient conditions for some discourse counting as science. The demarcation problem suggests that science is a term we use analogously. This is what Thomas thinks. For example, Thomas recognizes that, even among those sciences whose first premises are known to some human beings by the natural light of reason, there are some sciences (call them “the xs”) such that scientists practicing the xs, at least where knowledge of some of the first principles of the xs is concerned, depend upon the testimony of scientists in disciplines other than their own. For example, optics makes use of principles treated in geometry, and music makes use of principles treated in mathematics. If, for example, all musicians had to be experts at mathematics, most musicians would never get to practice the science of music itself. Thus, musicians take the principles and findings of mathematics as a starting point for the practice of their own science. Like optics and music, therefore, sacred theology draws on principles known by those with a higher science, in this case, the science possessed by God and the blessed (see, for example, ST Ia. q. 1, a. 2, respondeo). 
Unlike optics, music, and other disciplines studied at the university, the principles of sacred theology are not known by the natural light of reason. However, sacred theology is nonetheless a science, since those who possess such a science can, for example, draw logical conclusions from the articles of faith, argue that one article of faith is logically consistent with the other articles of faith, and answer objections to the articles of faith, doing all of these things systematically, clearly, and with ease by drawing on the teachings of other sciences, including philosophy (ST Ia. q. 1, a. 8).

b. The Extension of Science

Given his notion of science (whether taken as activity, demonstrative argument, or intellectual virtue), we might think that Thomas understands the extension of science to be wider than what most of our contemporaries would allow. There is a sense in which this is true. Although there is certainly disagreement among our contemporaries over the scientific status of some disciplines studied at modern universities, for example, psychology and sociology, all agree that disciplines such as physics, chemistry, and biology are to be counted among the sciences. The demarcation problem notwithstanding, we tend to think of science as natural science, where a natural science constitutes a discipline that studies the natural world by way of looking for spatio-temporal patterns in that world, where “the way of looking” tends to involve controlled experiments (Artigas 2000, p. 8). Thomas would have known something of science in this sense from his teacher St. Albert the Great (c. 1206-1280). However, for Thomas (for whom science is understood as a discipline or intellectual virtue), disciplines such as mathematics, music, philosophy, and theology count as sciences too, since those who practice such disciplines can talk about the subjects studied in those disciplines in a way that is systematic, orderly, capacious, and controlled by common human experience (and, in some cases, in the light of the findings of other sciences).

On the other hand, there is a sense in which Thomas’ understanding of science is more restrictive than the contemporary notion. Thomas follows Aristotle in thinking that we know something x scientifically only if our knowledge of x is certain. That is to say, we have demonstrative knowledge of x, that is, our knowledge begins from premises that we know with certainty by way of reflection upon sense experience, for example, all animals are mortal or there cannot be more in the effect than in its cause or causes, and ends by drawing logically valid conclusions from those premises. However, it seems to be a hallmark of the modern notion of science that the claims of science are, in fact, fallible, and so, by definition, uncertain.

c. The Four Causes

No account of Thomas’ philosophy of science would be complete without mentioning the doctrine of the four causes. Following Aristotle, Thomas thinks the most capacious scientific account of a physical object or event involves mentioning its four causes, that is, its efficient, material, formal, and final causes. Of course, some things (of which we could possibly have a science of some sort) do not have four causes for Thomas. For example, immaterial substances will not have a material cause. However, Thomas thinks that material objects—whether natural or artificial—do have four causes. For example, for any material object O, O has four causes, the material cause (what O is made of), the formal cause (what O is), the final cause (what the end, goal, purpose, or function of O is), and the efficient cause (what brings—or conserves—O in(to) being). One has a scientific knowledge of O (or O’s kind) only if one knows all four causes of O or the kind to which O belongs. Here follows a more detailed account of each of the four causes as Thomas understands them.

i. The Efficient Cause

An efficient cause of x is a being that acts to bring x into existence, preserve x in existence, perfect x in existence, or otherwise bring about some feature F in x. For example, Michelangelo was the efficient cause of the David. Thomas thinks that there are different kinds of efficient causes, which kinds of efficient causes may all be at work in one and the same object or event, albeit in different ways. For example, Thomas thinks that God is the primary efficient cause of any created being, at every moment in which that created being exists. That is, if it were not for God’s timelessly and efficiently causing a creature to exist at some time t, that creature would not exist at t. God’s act of creation and conservation with respect to some creature C does not rule out that C also simultaneously has creatures as secondary efficient causes of C. This is because God and creatures are efficient causes in different and yet analogous senses. God is the primary efficient cause as creator ex nihilo, timelessly conserving the very existence of any created efficient cause at every moment that it exists, whereas creatures are secondary efficient causes in the sense that they go to work on pre-existing matter such that matter that is merely potentially F actually becomes F. For example, we might say that a sperm cell and female gamete work on one another at fertilization and thereby function as secondary efficient causes of a human being H coming into existence. To continue with this example, Thomas thinks that God, too, is at work as the primary efficient cause of H’s coming into existence, since, for example, (a) God is the creating and conserving cause of (i) any sperm cell as long as it exists, (ii) any female gamete as long as it exists, and (iii) all aspects of the environment necessary for successful fertilization. In addition, Thomas thinks (b) God is the creating and conserving cause of the existence of H itself as long as H exists.

ii. The Material Cause

Thomas thinks that “material cause” (or simply “matter”) is an expression that has a number of different but related meanings. Perhaps the most obvious sense of “matter” is what “garden-variety” objects and their “garden-variety” parts are made of. In this sense of “matter,” the material cause of an axe is some iron and some wood.

There is one sense of “matter” that is very important for an analysis of change, thinks Thomas. Matter in this sense explains why x is capable of being transformed into something that x currently is not. The material cause in this sense is the subject of change—that which explains how something can lose the property not-F and gain the property F. For example, the material cause for an accidental change is some substance. Socrates himself is the material cause of the change that consists in Socrates’ losing the property of not-standing and gaining the property of standing. Such a change is accidental since the substance we name Socrates does not in this case go out of existence in virtue of losing the property of not-standing and gaining the property of standing.

The material cause for a substantial change is what medieval interpreters of Aristotle such as Thomas call prima materia (prime or first matter). Prime matter is that cause of x that is intrinsic to x (we might say, is a part of x) that explains why x is subject to substantial change. For Thomas, substances are unified objects of the highest order. Substances, for example, living things, are thus to be directly contrasted with heaps or collections of objects, for example, a pile of garbage or an army. Thomas thinks that if substantial changes had actual substances functioning as the ultimate subjects for those substantial changes, then it would be reasonable to call into question the substantial existence of those so-called substances that are (supposedly) composed of such substances. If Socrates were composed, say, of Democritean atoms that were substances in their own right, then Socrates, at best, would be nothing more than an arrangement of atoms. He would merely be an accidental being—an accidental relation between a number of substances—instead of a substance. At worst, Socrates would not exist at all (if we think the only substances are fundamental entities such as atoms, and Socrates is not an atom). Since Thomas thinks of Socrates as a paradigm case of a substance, he thus thinks that the matter of a substantial change must be something that is in and of itself not actually a substance but is merely the ultimate material cause of some substance. Thomas calls this ultimate material cause of a substance that can undergo substantial change prime matter. For example, consider that a bear eats a bug at t, so that the bug exists in space s, that is, the bear’s stomach, at t. Some prime matter therefore is configured by the substantial form of a bug in s at t such that there is a bug in s at t. 
At time t+1, when the bug dies in the bear’s stomach, the prime matter in s loses the substantial form of a bug and that prime matter comes to be configured by a myriad of substantial forms such that the bug no longer exists at t+1. What exists in s at t+1 is a collection of substances, for example, living cells arranged bug-wise, where the cells themselves will soon undergo substantial changes so that what will exist is a collection of non-living substances, for example, the kinds and numbers of atoms and molecules that compose the living cells of a living bug.

That being said, Thomas thinks prime matter never exists without being configured by some form. First of all, matter always exists under dimensions, and so this prime matter (rather than that prime matter) is configured by the accidental form of quantity, and more specifically, the accidental quantity of existing in three dimensions (see, for example, Commentary on Boethius’ De trinitate q. 4, a. 2, respondeo). In addition, it is never the case that some prime matter exists without being configured by some substantial form. For example, some quantity of prime matter m might be configured by the substantial form of an insect at t, be configured by the substantial forms of a collection of living cells at t+1 (for example, some moments after the insect has been eaten by a frog), be configured by the substantial forms of a collection of chemical compounds at t+2, and be incorporated into the body of a frog as an integral part of the frog such that it is configured by the frog’s substantial form at t+3. A portion of prime matter is always configured by a substantial form, though not necessarily this or that substantial form.

Note the theoretical significance of the view that material substances are composed of prime matter as a part. Prime matter is the material causal explanation of the fact that a material substance S’s generation and (potential) corruption are changes that are real (contra Parmenides of Elea), substantial (contra atomists such as Democritus), natural (contra those who might say that all substantial changes are miraculous), and intelligible (contra Heraclitus of Ephesus and Plato of Athens).

iii. The Formal Cause

Like the material cause of an object, the expression formal cause is said in many ways. There are at least three for Thomas. First, formal cause might mean “the nature or definition of a thing,” that is, what-it-is-to-be S. The formal cause of a primary substance x in this sense is the substance-sortal that picks out what x is most fundamentally or the definition of that substance-sortal. For example, for Socrates this would be human being, or, what-it-is-to-be-a-human being, and, given that human beings can be defined as rational animals, rational animal. Although Socrates certainly belongs to other substance-sortals, for example, animal, living thing, rational substance, and substance, such substance-sortals only count as genera to which Socrates belongs; they do not count as Socrates’ infima species, that is, the substance-sortal that picks out what Socrates is most fundamentally. Of course, Socrates can be classified in many other ways, too, for example, as a philosopher or someone who chose not to flee his Athenian prison. However, such classifications are not substantial for Thomas, but merely accidental, for Socrates need not be (or have been) a philosopher—for example, Socrates was not a philosopher when he was two years old, nor someone who chose not to flee his Athenian prison, for even Socrates might have failed to live up to his principles on a given day.

A second sense that formal cause can have for Thomas is that which is intrinsic to or inheres in x and explains that x is actually F. There are two kinds of formal cause in this sense for Thomas. First, there are accidental forms (or simply, accidents). Accidental forms inhere in a substance and explain that a substance x actually is F, where F is a feature that x can gain or lose without x’s ceasing to exist, for example, Socrates’ being tan, Socrates’ weighing 180 lbs, and so forth. Second, there are substantial forms. According to Thomas, substantial forms are particulars (each individual substance has its own individual substantial form), and the substantial form of a substance is the intrinsic formal cause of (a) that substance’s being and (b) that substance’s belonging to the species that it does. A substantial form is a form intrinsic to x that explains the fact that x is actually F, where F is a feature that x cannot gain or lose without ceasing to exist, for example, Socrates’ being an animal.

A third sense of formal cause for Thomas is the pattern or definition of a thing insofar as it exists in the mind of the maker. Thomas calls this the exemplar formal cause. For example, the form of a house can exist insofar as it is instantiated in matter, for example, in a house. However, the form of (or plan for) a house can also exist in the mind of the architect, even before an actual house is built. It is the form in this latter mode of existence that counts as the exemplar formal cause. For Thomas, following St. Augustine, some of the ideas of God are exemplar formal causes in this sense; for example, God’s idea of the universe in general, God’s idea of what-it-is-to-be a human being, and so forth function, as it were, as plans or archetypes in the mind of the Creator for created substances.

iv. The Final Cause

The final cause of an object O is the end, goal, purpose, or function of O. Some material objects have functions as their final causes, namely, artifacts and the parts of organic wholes. For example, the function of a knife is to cut, and the purpose of the heart is to pump blood. Therefore, the final cause of the knife is to cut; the final cause of the heart is to pump blood. Thomas thinks that all substances have final causes. However, Thomas (like Aristotle) thinks of the final cause in a manner that is broader than what we typically mean by function. It is a mistake, therefore, to think that all substances for Thomas have functions in the sense that artifacts or the parts of organic wholes have functions as final causes (we might say that all functions are final causes, but not all final causes are functions). For example, Thomas does not think that clouds have functions in the sense that artifacts or the parts of organic wholes do, but clouds do have final causes. In the broadest sense, that is, in a sense that would apply to all final causes, the final cause of an object is an inclination or tendency to act in a certain way, where such a way of acting tends to bring about a certain range of effects. For example, a knife is something that tends to cut. A cloud is a substance that tends to interact with other substances in the atmosphere in certain ways, ways that are not identical to the ways that either oxygen per se or nitrogen per se tends to interact with other substances.

For Thomas, the final cause is “the cause of all causes” (On the Principles of Nature, ch. 4) and so the final, formal, efficient, and material causes go “hand in hand.” If an object has a tendency to act in a certain way, for example, frogs tend to jump and swim, that tendency (final causality) requires that the frog has a certain formal cause, that is, it is a thing of a certain kind. In addition, things that jump and swim must be composed of certain sorts of stuffs and certain sorts of organs. Frogs, since they are by nature things that flourish by way of jumping and swimming, are composed of bone, blood, and flesh, as well as limbs that are good for jumping and swimming. Finally, a frog’s jumping is something the frog does insofar as it is a frog, given the frog’s form and final cause. That is to say, it is clear that the frog acts as an efficient cause when it jumps, since a frog is the sort of thing that tends to jump (rather than fly or do somersaults). Contrast the frog that is unconscious and pushed such that it falls down a hill. In so falling, the frog is not acting as an efficient cause.

As we have seen, some final causes are functions, whereas it makes better sense to say that some final causes are not functions but rather ends or goals or purposes of the characteristic efficient causality of the substances that have such final causes. In closing this section, we can note that some final causes are intrinsic whereas others are extrinsic. According to Thomas, each and every substance tends to act in a certain way rather than other ways, given the sort of thing it is; such goal-directedness in a substance is its intrinsic final causality. However, sometimes an object O acts as an efficient cause of an effect E (partly) because of the final causality of an object extrinsic to O. Call such final causality extrinsic. For example, John finds Jane attractive, and thereby John decides to go over to Jane and talk to her. John’s own desire for happiness, happiness that John currently believes is linked to Jane, is part of the explanation for why John moves closer to Jane and is a good example of intrinsic final causality, but Jane’s beauty is also a final cause of John’s action and is a good example of extrinsic final causality.

d. The Sources of Knowledge: Thomas’ Philosophical Psychology

Thomas thinks there are different kinds of knowledge, for example, sense knowledge, knowledge of individuals, scientia, and faith, each of which is interesting in its own right and deserving of extended treatment where its sources are concerned. For present purposes, we shall focus on what Thomas takes to be the sources of knowledge requisite for knowledge as scientia, and, since Thomas recognizes different senses of scientia, what Thomas takes to be the sources for knowledge as a scientific demonstration of a proposition in particular.

As we have seen, if a person possesses scientia with respect to some proposition p for Thomas, then he or she understands an argument that p such that the argument is logically valid and he or she knows the premises of the argument with certainty. Therefore, one of the sources of scientia for Thomas is the operation of the intellect that Thomas calls reasoning (ratiocinatio), that is, the act of drawing a logically valid conclusion from other propositions (see, for example, ST Ia. q. 79, a. 8). Thomists sometimes call reasoning the third act of the intellect.

How do we come to know the premises of a demonstration with certainty? Our coming to know with certainty the truth of a proposition, Thomas thinks, potentially involves a number of different powers and operations, each of which is rightly considered a source of scientia. Before we speak of the intellectual powers and operations (in addition to ratiocination) that are at play when we come to have scientia, we must first say something about the non-intellectual cognitive powers that are sources of scientia for Thomas.

Thomas agrees with Aristotle that the intellectual powers differ in kind from the sensitive powers such as the five senses and imagination. Nonetheless, Thomas also thinks that all human knowledge in this life begins with sensation. Even our knowledge of God begins, according to Thomas, with what we know of the material world. Since God, for Thomas, is immaterial, the claim that “knowledge… begins in sense” (Disputed Questions on Truth, q. 1, a. 11, respondeo) should not be thought to mean that knowledge of x requires that we can form an accurate image of x. Thomas’ claim rather means that knowledge of any object x presupposes some (perhaps prior) activity on the part of the senses. Indeed, Thomas thinks that sensation is so tightly connected with human knowing that we invariably imagine something when we are thinking about anything at all. Of course, if God exists, that means that what we imagine when we think about God bears little or no relation to the reality, since God is not something sensible. Given the importance of sense experience for knowledge for Thomas, we must mention certain sense powers that are preambles to any operation of the human intellect.

In addition to the five exterior senses (see, for example, ST Ia. q. 78, a. 3), Thomas argues that a capacious account of human cognition requires that we mention various interior senses as preambles to proper intellectual activity (see, for example, ST Ia. q. 78, a. 4). For in order for perfect animals (that is, animals that move themselves, such as horses, oxen, and human beings [see, for example, Commentary on Aristotle’s De Anima, n. 255]) to make practical use of what they cognize by way of the exterior senses, they must have a faculty that senses whether or not they are, in fact, sensing, for the faculties of sight, hearing, and so forth themselves do not confer this ability. In addition, none of the exterior senses enables their possessor to distinguish between the various objects of sense, for example, the sense of sight does not cognize taste, and so forth. Therefore, the animal must have a faculty in addition to the exterior senses by which the animal can identify different kinds of sensations, for example, of color, smell, and so forth with one particular object of experience. We might think that it is some sort of intellectual faculty that coordinates different sensations, but not all animals have reason. Therefore, animals must have an interior sense faculty whereby they sense that they are sensing, and that unifies the distinct sensations of the various sense faculties. Thomas calls this faculty, following Avicenna, the common sense (not to be confused, of course, with common sense as that which most ordinary people know and professors are often accused of not possessing). Since, for Thomas, human beings are animals too, they also possess the faculty of common sense.

In addition to the common sense, Thomas argues that we also need what philosophers have called phantasy or imagination to explain our experience of the cognitive life of animals (including human beings). For, clearly, perfect animals sometimes move themselves to a food source that is currently absent. Therefore, such animals need to be able to imagine things that are not currently present to the senses but have been cognized previously in order to explain their movement to a potential food source. On the assumption that, in corporeal things, to receive and retain are reduced to diverse principles, Thomas argues the faculty of imagination is thus distinct from the exterior senses and the common sense. He also notes that imagination in human beings is interestingly different from that of other animals insofar as human beings, but not other animals, are capable of imagining objects they have never cognized by way of the exterior senses, or objects that do not in fact exist, for example, a golden mountain.

In Thomas’ view, we cannot explain the behavior of perfect animals simply by speaking of the pleasures and pains that such creatures have experienced. Thus, we need to posit two additional powers in those animals. The estimative power is that power by which an animal perceives certain cognitions instinctively, for example, the sheep’s cognition that the wolf is an enemy or the bird’s cognition that straw is useful for building a nest (for neither the sheep nor the bird knows this simply by way of what it cognizes by way of the exterior senses). The memorative power is that power that retains cognitions produced by the estimative power. Since (a) the estimative sense and common sense are different kinds of powers, (b) the common sense and the imagination are different kinds of powers, and (c) the estimative power can be compared to the common sense whereas the memorative power can be compared to the imagination, it stands to reason that the estimative power and the memorative power are different powers.

Just as intellect in human beings makes a difference in the functioning of the faculty of imagination for Thomas, so also does the presence of intellect in human beings transform the nature of the estimative and memorative powers in human beings. As Thomas notes, this is why the estimative and memorative powers have been given special names by philosophers: the estimative power in human beings is called the cogitative power and the memorative power is called the reminiscitive power. The cogitative power in human beings is that power that enables human beings to make an individual thing, event, or phenomenon, qua individual thing, event, or phenomenon, an object of thought. For example, if Joe comes to believe “this man is wearing red,” he does so partly in virtue of an operation of the cogitative power, since Joe is thinking about this man and his properties (and not simply man in general and redness in general, both of which, for Thomas, are cognized by way of an intellectual and not a sensitive power; see below). Similarly, if I come to think, “I should not steal,” I do so partly by way of my cogitative power according to Thomas insofar as I am ascribing a property to an individual thing, in this case, myself. As for the reminiscitive power, it enables its possessor to remember cognitions produced by the cogitative power. In other words, it helps us to remember intellectual cognitions about individual objects. For example, say that I am trying to remember the name of a particular musician. I employ the reminiscitive power when I think about the names of other musicians who play on recordings with the musician whose name I cannot now remember but want to remember.

Having said something about the non-intellectual, cognitive sources of scientia for Thomas, we can return to speaking of the properly intellectual powers and activities of human beings necessary for scientia. According to Thomas, there are two powers of the intellect, powers Thomas calls the active intellect and the passive intellect, respectively. Thomas thinks that the intellect has what he calls a passive power since human beings come to know things they did not know previously (see, for example, ST Ia. q. 79, a. 2). In being able to do this, human beings are unlike the angels, Thomas thinks, since, according to Thomas, the angels are created actually knowing everything they will naturally know. (According to Thomas, the blessed angels do come to have supernatural knowledge, namely, knowledge of the essence of God in the beatific vision.) Following Aristotle, Thomas believes that the intellect of a human being, in contrast to that of an angel, is a tabula rasa at the beginning of its existence. The passive intellect of a human being is that which receives what a person comes to know; it is also the power by which a human being retains, intellectually, what is received. For Thomas, therefore, the passive intellect plays the role of memory where knowledge of the nature of things is concerned (see, for example, ST Ia. q. 79, a. 7). For example, say John does not know what a star is at time t. He reads about stars at t+1 and in doing so comes to know the nature of a star. Since John’s intellect has been altered such that he knows something he did not know before, there must be a power that explains this ability to receive knowledge; for Thomas, it is John’s passive intellect, that is, the intellect insofar as John can come to know something he did not know before.

Whereas the passive intellect is that which receives and retains an intelligible form, what Thomas calls the active intellect is the efficient cause intrinsic to the knowing agent that makes what is potentially knowable actually so. In Thomas’ view, anything that is understood is understood in virtue of its form. However, the forms of material things, although potentially intelligible, are not actually intelligible insofar as they configure matter, but human beings can understand material things. Therefore, since that which is brought from potency to act is done so only by that which is appropriately actual, we do not know things innately, and we sometimes experience ourselves actually understanding things, there must be a power in human beings that can cause the forms of material objects to become actually intelligible. That power is what Thomas calls the active intellect.

We can round out our discussion of Thomas’ account of the sources of scientia by speaking of the three activities of the powers of the intellect. The first act of the intellect is what Thomists call the act of simple apprehension; this is the intellect’s act of coming to understand the essence of a thing (see, for example, Commentary on Aristotle’s On Interpretation, Proeemium, n. 1). The intellectual act of simple apprehension is simple in the sense that it does not yet imply a judgment on the part of an intellect about the truth or falsity of a proposition. For example, it is by the intellect’s act of simple apprehension that a person cognizes what a thing is, that is, its quiddity, without forming true or false propositions about that quiddity such as, it exists, or it is F rather than not-F.

According to Thomas, the intellect’s simple act of apprehension is the termination of a process that involves not only the activities of intellectual powers but sensory powers, too, both exterior and interior. As we have seen, Thomas thinks that all intellection begins with sensation. Therefore, when we come to understand the essence of a material object, say a bird, the form of the bird is first received spiritually in a material organ, for example, the eye. To say that the form of the bird is received spiritually is simply to say that what is received is received as a form, where the form in question does not exist in the sense organ as it exists extra-mentally. As Stump (2003, p. 253) notes, we might think of this form, as it exists in the sense organ, as encoded information. Thomas calls this immaterial reception of the bird in the eye “the sensible species” of the object cognized. We do not, as of yet, have enough to explain an animal’s conscious awareness of what is sensed. In order for this to occur, Thomas speaks of the need of the sensible species being worked on by the power of phantasia. At that point, the agent has a phantasm of the bird; she is at least conscious of a blue, smallish object with wings. From the phantasm, including experiences of similar phantasms stored in phantasia or the reminiscitive power, the power of active intellect abstracts what Thomas calls the intelligible species from the phantasm(s), that is, leaves to one side those features the agent recognizes are accidental to the object being cognized in order to focus on the quiddity, nature, or essence of what is being cognized. The resulting quiddity is received in the possible (that is, passive) intellect. Finally, the intelligible species is transformed into an “inner word” or “concept,” that is, there is conscious awareness of the quiddity of what has been cognized such that the quiddity is recognized as corresponding to a word such as “bird.”

So far we have spoken of the third and first acts of the intellect. The second activity of the intellect is what Thomists call judgment, but Thomas himself typically speaks of the intellect’s composing and dividing (see, for example, Commentary on Aristotle’s On Interpretation, Proeemium, n. 1, and ST Ia. q. 85, a. 5). In this act of the intellect, the intellect compares quiddities and judges whether or not this property or accident should be attributed to this quiddity. For example, Joe comes to know the quiddity of mammality and animality through the first act of intellect and judges (correctly) that all mammals are animals by way of the second act of understanding.

Since scientia for Thomas involves possessing arguments that are logically valid and whose premises are obviously true, one of the sources of scientia for Thomas is the second act of intellect, composing and dividing, whereby the scientist forms true premises, or propositions, or judgments about reality. Since such judgments have the intellect’s first act of understanding as a prerequisite (one cannot truly judge that all mammals are animals until one apprehends animality and mammality), acts of simple apprehension are also a source of scientific knowledge for Thomas. This brings us back to where we started, with the third act of intellect, namely, ratiocination, the intellect’s ability to derive a logically valid conclusion from some other proposition or propositions. For example, judging that all mammals are animals and all animals are living things, we reason to the conclusion that all mammals are living things. To take a more interesting example, if we judge that all human beings have intellectual souls and all intellectual souls are by nature incorruptible, it follows that any human being has a part that survives the biological death of that human being.

We would be remiss not to mention God as a source of all forms of knowledge for Thomas. For all human intellection involves many instances of change, of going from a state of not-knowing that p to knowing that p, and each and every change, Thomas thinks, requires as part of its sufficient explanation the action of one being that is itself absolutely immutable (see, for example, Thomas’ so-called first way of demonstrating the existence of God at ST Ia. q. 2, a. 3, respondeo). Thomas believes (by faith) that the God of Abraham, Isaac, and Jacob is this one immutable being. Therefore, in Thomas’ view God is the primary uncaused cause of each and every act of human intellection. However, all of this is consistent, Thomas thinks, with human intellects also being real and active secondary causes of their own acts of knowing. Unlike some of his forerunners in philosophical psychology, Thomas thinks that each and every human being has his or her own agent intellect by which he or she can “light up” the phantasms in order to actually understand a thing. (Here we can contrast Thomas’ views with those of St. Augustine of Hippo, Ibn Sina [Avicenna], and Ibn Rushd [Averroes], all of whom think God or some non-human intellect plays the role of agent intellect). Although God’s act of creating and sustaining any intellectual activity is a necessary condition and the primary efficient cause for any human act of coming to know something not previously known, it is neither a sufficient condition nor the sole cause of such activity, Thomas thinks. For a human being, too, is a secondary, efficient cause of his or her coming to know something.

5. Metaphysics

a. On Metaphysics as a Science

In Thomas’ Aristotelian understanding of science, a science S has a subject matter, and a scientist with respect to S knows the basic facts about the subject matter of S, the principles or starting points for thinking about the subject matter of S, the causes of the subject matter of S, and the proper accidents of the subject matter of S. Following Aristotle, Thomas thinks of metaphysics as a science in this sense. For Thomas, the subject matter of the science of metaphysics is being qua being or being in common, that is, being insofar as it can be said of anything that is a being. (Contrast, for example, the narrower subject matters of philosophical physics, which studies physical being insofar as it can be investigated philosophically, and natural theology, which studies immaterial being insofar as it can be studied by the power of natural reason alone.) Thomas also thinks intelligent discussion of the subject matter of metaphysics requires that one recognize that “being is said in many ways,” that is, that there are a number of different but non-arbitrarily related meanings for being, for example, being as substance, quality, quantity, or relation, being qua actual, being qua potential, and so forth. The metaphysician, minimally, can speak intelligently about the proper relationships between these many different but related meanings of “being.”

The principles of being qua being include those principles that are ever and always employed but are never themselves considered carefully in all disciplines, for example, the principle of identity and the principle of non-contradiction. The causes of being qua being are the efficient, formal, and final causes of being qua being, namely, God. Finally, the proper accidents of being qua being are “one,” “good,” “beautiful,” “same,” “whole,” “part,” and so forth. For Thomas, metaphysics involves not only disciplined discussion of the different senses of being but rational discourse about these principles, causes, and proper accidents of being.

Note that Thomas therefore thinks about the subject matter of metaphysics in a manner that differs from that of contemporary analytic philosophers. Contemporary analytic philosophers tend to think about metaphysics as the philosophical discipline that treats a collection of questions about ultimate reality (see, for example, Van Inwagen 2015, p. 3). However, this contemporary understanding of the subject matter of metaphysics is too broad for Thomas since he thinks there are philosophical disciplines distinct from metaphysics that treat matters of ultimate reality. For example, the ultimate causes of being qua movable are treated in philosophical physics or natural philosophy, and the ultimate principles of human being are treated in philosophical anthropology.

b. On What There Is: Metaphysics as the Science of Being qua Being

For Thomas, when we think wisely about the meaning of being, we recognize that we use it analogously and not univocally. Thus, one of the things the metaphysician does, thinks Thomas, is identify, describe, and articulate the relationship between the different senses of being. Let us catalogue some of the ways Thomas uses “being,” ways that are best understood by attending to Thomas’ examples.

In one place Thomas distinguishes four different senses of being (Disputed Questions on Truth q. 21, a. 4, ad4). Being in the primary sense is substantial being, for example, Socrates, or a particular tree. However, there are also extended senses of being; there is being in the sense of the principles of substances, that is, form and matter, being in the sense of the dispositions or accidents of a substance, for example, a quality of a substance, and being in the sense of a privation of a disposition of a substance, for example, a man’s blindness. Again, although the same word is used to speak of these four realities, the term being does not have precisely the same meaning in these four cases, although all four meanings are related to the primary meaning of being as substance.

Another distinction Thomas makes where being is concerned is the distinction between being in act and being in potency. Being in potency does not actually exist now but is such that it can exist at some point in the future, given the species to which that being in potency belongs. In contrast, being in act exists now. For example, say Socrates is not tan right now but can be tan in the future, given that he is a rational animal, and rational animals are such that they can be tan. Socrates is therefore not tan in act, but rather tan in potency (see, for example, On the Principles of Nature, ch. 1). The distinction between being in act and being in potency is important because it helps solve a puzzle raised by Parmenides, namely, how something can change. If “being” can only refer to what exists in act, then there can be no change. However, if being is said in many ways, not only of what actually is but also what can be in the sense of what can become what it is not, then change can be understood as something intelligible (see, for example, Commentary on Aristotle’s Physics, lec. 6, n. 39). The viability of the distinction between being in act and being in potency can be confirmed by thinking about the way we commonly speak and think. For example, compare a rock and a very young person who is not yet old enough to see. Neither of them actually sees, but not in the same sense. For we rightly negate the ability to see of a rock; it does not actually have the ability to see, nor does it potentially have such an ability, given the sort of thing that it is. However, although a very young human person, like the rock, does not actually have the ability to see, that young person is nonetheless potentially something that sees.

If a being were fully actual, then it would be incapable of change. If a being were purely potential, then it would not, by itself, actually exist. Thus, actually existent beings capable of change are composites of act and potency. The principle of actuality in a composite being explains that the being in question actually exists or actually has certain properties whereas the principle of potentiality in a composite being explains that the being in question either need not exist—it is not in the nature of that thing to exist—or is a thing capable of substantial change such that its matter can become part of some numerically distinct substance.

Where act and potency are concerned, Thomas also distinguishes, with Aristotle, between first and second act on the one hand and active and passive potency on the other. A substance s is in first act or actuality insofar as s, with respect to some power P, actually has P. For example, the newborn Socrates, although actually a human being, only potentially has the power to philosophize and so is not in first act with respect to the power to philosophize. On the other hand, Socrates, when awaiting his trial, and being such that he is quite capable of defending the philosophical way of life, is in first act with respect to the habit of philosophy, that is, he actually has the power to philosophize. A substance s is in second act insofar as, with respect to some power P, s not only actually has P but is currently making use of P. For example, imagine that Socrates is sleeping, say, the night before he makes his famous defense of the philosophical way of life. When he is sleeping, although Socrates is in first act with respect to the power to philosophize, he is not in second act with respect to that power (although he is in potency to the second act of philosophizing). Socrates, when he is actually philosophizing at his trial, is not only in first act with respect to the power to philosophize, but also in second act.

Consider now the difference between active and passive potency. Imagine Socrates is not now philosophizing. He is resting. Nonetheless, he is potentially philosophizing. However, his potency with respect to philosophizing is an active potency, for philosophizing is something one does; it is an activity. Insofar as Socrates is not now philosophizing, but is potentially philosophizing, he has an active potency.

Now imagine Socrates is hit by a tomato at time t at his trial. Socrates can be hit by a tomato at t because he has, among other passive potencies, the ability to be hit by an object. Having the ability to be hit by an object is not an ability (or potentiality) Socrates has to F, but rather an ability (or potentiality) to have F done to him; hence, being able to be hit by an object is a passive potentiality of Socrates.

Where being is concerned, Thomas also distinguishes between beings in nature and intentional beings or beings of reason (see, for example, Commentary on Aristotle’s Metaphysics IV, lec. 4, n. 574). Thomas thinks that nothing can be understood, save insofar as it has being. Natural being is what philosophers (and empirical scientists) study, for example, non-living things, plants, animals, human beings, colors, virtues, and so forth. However, some beings that we think about follow upon the consideration of thinking about beings of nature, notions such as genus, species, and difference. These are the sorts of beings studied in logic, Thomas thinks. In addition to logical beings, we could also mention fictional beings such as Hamlet as an example of a being of reason.

Where the meanings of being are concerned, Thomas also recognizes the distinction between being in the sense of the essentia (essence or nature or form) or quod est (what-it-is) of a thing on the one hand and being in the sense of the esse or actus essendi or quo est (that-by-which-it-is) of a thing on the other hand (see, for example, SCG II, ch. 54). To say that a being B’s essentia differs from its esse is to say that B is composed of essentia and esse, which is just to say that B’s esse is limited or contracted by a finite essentia, which is also to say that B’s esse is participated esse, which itself is to say that B receives its esse from another. If esse and essentia do not differ in a being B1, then B1’s esse is not limited by a finite essentia, B1’s esse is not participated and so uncreated, and B1’s esse is unreceived. For Thomas, only in God are God’s esse and essentia identical.

According to Thomas, all created substances are composed of essentia and esse. The case where there is the clearest need to speak of a composition of essentia and esse is that of the angels. In speaking of act and potency in the angels, Thomas does not speak in terms of form and matter, since for Thomas matter as a principle of potentiality is always associated with an individual thing existing in three dimensions. Thomas’ Franciscan colleague at the University of Paris, St. Bonaventure, did indeed argue that angels were composed of form and spiritual matter. However, Thomas thinks the notion of spiritual matter is a contradiction in terms, for to be material is to be spread out in three dimensions, and the angels are not spread out in three dimensions. Angels are essentially immaterial beings, thinks Thomas. (This is not to say that angels cannot on occasion make use of a body by the power of God; this is how Thomas would make sense of the account of the angel Gabriel talking with the Blessed Virgin Mary in the Gospel according to Luke; whatever Mary saw when she claimed to talk to the angel Gabriel, according to Thomas, it was not a part of Gabriel. Compare the notion that angels are purely immaterial beings that nonetheless make use of bodies as instruments with Plato’s view (at least in the Phaedo) that the human body is not a part of a human being but only an instrument that the soul uses in this life.) However, because angels are not pure act—this description is reserved for the first uncaused efficient cause alone for Thomas—there is need to make sense of the fact that an angel is a composite of act and potency. Thus, Thomas speaks of a composition of essentia (being in the sense of what something is) and esse (being in the sense that a thing is) in the angels, for it does not follow from what an angel is that it exists. 
In other words, where we can distinguish essentia and esse in a thing, that thing is a creature, that is, it exists ever and always because God creates and conserves it in being. Of course, substances composed of form and matter, for example, human beings, non-rational animals, plants, minerals, are creatures too and so they are also composed of essentia and esse. In general, talk of essentia/esse composition in created substances is Thomas’ way of making sense of the fact that such substances do not necessarily exist but depend for their existence, at every moment that they exist, upon God’s primary causal activity.

6. Natural Theology

a. Some Methodological Considerations

Thomas thinks there are two kinds of truths about God: (a) those truths that can be demonstrated philosophically and (b) those truths that human beings can come to know only by the grace of divine revelation. Although Thomas has much of great interest to say about (b)—see, for example, SCG, book IV, ST Ia. qq. 27-43, and ST IIIa.—this article focuses on (a): those truths that according to Thomas can be established about God by philosophical reasoning.

Thomas thinks there are at least three mutually reinforcing approaches to establishing truths about God philosophically: the way of causation, the way of negation, and the way of perfection (or transcendence). Thomas makes use of each one of these methods, for example, in his treatment of what can be said truly about God by the natural light of reason in ST.

b. The Way of Causation: On Demonstrating the Existence of God

Thomas offers what he takes to be demonstrations of the existence of God in a number of places in his corpus. (On the meaning of the term “demonstration,” see the section on Thomas’ epistemology). His most complete argument is found in SCG, book I, chapter 13. There is also an argument that Brian Davies (1992, p. 31) calls “the existence argument,” which can be found at, for example, ST Ia. q. 65, a. 1, respondeo. The most famous of Thomas’ arguments for the existence of God, however, are the so-called “five ways,” found relatively early in ST.

There are a number of things to keep in mind about the five ways. First, the five ways are not complete arguments; for example, we should expect to find some suppressed premises in these arguments. To see this, we can compare the first way of demonstrating the existence of God in ST Ia. q. 2, a. 3, which is an argument from motion, with Thomas’ complete presentation of the argument from motion in SCG, book I, chapter 13. Whereas the former is offered in one paragraph, the latter is given in 32 paragraphs.

Second, Thomas’ arguments do not try to show that God is the first mover, first efficient cause, and so forth in a temporal sense, but rather in what we might call an ontological sense, that is, in the sense that things other than God depend ultimately upon God causing them to exist at every moment that they exist. Indeed, as we shall see, Thomas does not think that God could be first in a temporal sense because God exists outside of time.

Third, as Thomas makes clear in SCG I, 13, 30, his arguments do not assume or presuppose that there was a first moment in time. As he notes there, given that the universe has a beginning, it is easier to show there is a God: “the most efficacious way to prove that God exists is on the supposition that the world is eternal. Granted this supposition, that God exists is less manifest” (Anton Pegis, trans.). Nor do the five ways attempt to prove that there was a first moment of time. Although Thomas believes there was a first moment of time, he is very clear that he thinks such a thing cannot be demonstrated philosophically; he thinks that the temporal beginning of the universe is a mystery of the faith (see, for example, ST Ia. q. 46, a. 2). Thus, if we should assume anything, for the sake of argument, about time or the duration of the world where Thomas’ arguments for the existence of God are concerned, we should assume that there is no first moment of time, that is, that the universe has always existed. Interestingly, even on such a supposition, Thomas thinks he can demonstrate philosophically that there is a God.

Fourth, as will be seen, the five ways are simply five ways of beginning to demonstrate God’s existence. For example, in ST the demonstrations of God’s existence continue beyond Ia. q. 2, a. 3, as Thomas attempts to show that a first mover, first efficient cause, first necessary being, first being, and first intelligence is also ontologically simple (q. 3), perfect (q. 4), good (qq. 5-6), infinite (q. 7), ontologically separate from finite being (q. 8), immutable (q. 9), eternal (q. 10), one (q. 11), knowable by us to some extent (q. 12), nameable by us (q. 13), knowledgeable (q. 14), such that there are ideas in that being’s mind (q. 15), such that life is properly attributed to that being (q. 18), such that will is properly attributed to that being (q. 19), and such that love is properly attributed to that being (q. 20). However, as Thomas says at the end of each of the five ways, such a being is what everyone calls “God.”

For our purposes, let us focus on one of Thomas’ five ways (ST Ia. q. 2, a. 3), the second way. Here is Thomas’ text (note that numbers have been inserted in the following text, corresponding to premises in the detailed formulation of the second way that follows):

The second way is from the nature of the efficient cause. [(1)] In the world of sense we find there is an order of efficient causes. [(3)] There is no case known (neither is it, indeed, possible) in which a thing is found to be the efficient cause of itself; for so it would be prior to itself, which is impossible. Now [(12)] in efficient causes it is not possible to go on to infinity, because [(6)] in all efficient causes following in order, the first is the cause of the intermediate cause, and the intermediate is the cause of the ultimate cause, whether the intermediate cause be several, or only one. Now [(7)] to take away the cause is to take away the effect. Therefore, [(8)] if there be no first cause among efficient causes, there will be no ultimate, nor any intermediate cause. But [(9)] if in efficient causes it is possible to go on to infinity, there will be no first efficient cause, [(10)] neither will there be an ultimate effect, nor any intermediate efficient causes; [(11)] all of which is plainly false. Therefore, [(13)] it is necessary to admit a first efficient cause, [(14)] to which everyone gives the name of God (Fathers of the English Dominican Province, trans.).

 This argument might be formulated as follows:

  1. In the world that can be perceived by the senses, there is an order of efficient causes, for example, there is something E that is an effect of an efficient cause or causes at a time t, for example, there is an animal whose existence at t is an effect of a number of efficient causes, for example, the warmth of the earth’s atmosphere at t, there being oxygen in the atmosphere for the animal to breathe at t, and the proper functioning of biological systems within the animal at t, and so forth, and some of those efficient causes of E are themselves effects of other efficient causes at t, for example, the warmth of the earth’s atmosphere at t is an effect of the sun’s warming the atmosphere of the earth at t and the proper functioning of biological systems within the animal at t is an effect of the action of certain bio-chemicals within those biological systems at t, and so forth [assumption].
  2. If there is an order of efficient causes, for example, there is some effect E that has x as an efficient cause at t, and x itself has y as an efficient cause at t, and y itself has z as an efficient cause at t, and so forth, then (a) there is an order of efficient causes of E at t that is infinite, (b) there exists something (E or a cause of E) that is the efficient cause of itself at t, or (c) there is an absolutely first efficient cause of E’s existence at t, that is, E’s existence has an efficient cause at t where that efficient cause itself does not itself have an efficient cause [assumption].
  3. Nothing can be the efficient cause of itself, all by itself, otherwise it would be metaphysically prior to itself, which is impossible [assumption].
  4. Therefore, if there is an order of efficient causes, for example, there is some effect E that has x as an efficient cause of its existence at t, and x itself has y as an efficient cause at t, and so forth, then (a) there is an order of efficient causes of E at t that is infinite or (c) there is an absolutely first efficient cause of E’s existence at t [from (2) and (3), conditional introduction].
  5. (a) There is an order of efficient causes of E at t that is infinite or (c) there is an absolutely first efficient cause of E’s existence at t [from (1) and (4), MP].
  6. In an order of efficient causes such that a is an efficient cause of b and b is an efficient cause of an effect c, a is a first cause of b and c and b is an intermediate cause of the effect c [assumption].
  7. To take away the cause is to take away the effect [assumption].
  8. Therefore, if it is not the case that there is an absolutely first efficient cause of an effect E’s existence at t, then there are no intermediate causes and so no effect E at t [from (6) and (7)].
  9. If there is an order of efficient causes of E at t that is infinite, then it is not the case that there is an absolutely first efficient cause of E’s existence at t [assumption].
  10. Therefore, if there is an order of efficient causes of E at t that is infinite, then there are no intermediate causes and no effect E [from (8) and (9), HS].
  11. It is not the case that there are no intermediate causes and no effect E [from (1)].
  12. Therefore, it is not the case that there is an order of efficient causes of E at t that is infinite [from (10) and (11), MT].
  13. Therefore, there is an absolutely first efficient cause of E’s existence at t [from (5) and (12), DS].
  14. An absolutely first efficient cause of E’s existence at t is what everyone calls “God” [assumption].
  15. Therefore, there is a God [from (13) and (14)].
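
The propositional core of steps (1), (4), (5), and (8) through (13) can be checked mechanically. The following is a minimal sketch in Lean 4, not part of Thomas’ text or of the formulation above: the letters O, I, F, and N are labels introduced here for illustration, and the premises are simply taken as hypotheses, so the sketch shows only that the inferences labeled MP, HS, MT, and DS are valid, not that the premises are true.

```lean
-- Labels introduced for this sketch (assumptions, not in the source):
--   O : there is an order of efficient causes of E at t
--   I : there is an infinite order of efficient causes of E at t
--   F : there is an absolutely first efficient cause of E's existence at t
--   N : there are no intermediate causes and no effect E at t
theorem second_way_core (O I F N : Prop)
    (h1  : O)            -- premise (1)
    (h4  : O → I ∨ F)    -- step (4)
    (h8  : ¬F → N)       -- step (8)
    (h9  : I → ¬F)       -- premise (9)
    (h11 : ¬N)           -- step (11)
    : F := by
  -- (5): from (1) and (4) by modus ponens
  have h5 : I ∨ F := h4 h1
  -- (10): from (8) and (9) by hypothetical syllogism
  have h10 : I → N := fun hI => h8 (h9 hI)
  -- (12): from (10) and (11) by modus tollens
  have h12 : ¬I := fun hI => h11 (h10 hI)
  -- (13): from (5) and (12) by disjunctive syllogism
  exact h5.elim (fun hI => absurd hI h12) id
```

Steps (14) and (15) are left out of the sketch because they turn on what the term “God” names rather than on propositional form.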

The second premise, third premise, seventh premise, the inference to the eighth premise, and the fourteenth premise likely require further explanation. As for premise (2), we should note that Thomas assumes the truth of a principle often called the principle of causality. The principle of causality states that every effect has a cause. The principle of causality is a piece of common sense that arguably also plays a pivotal role in all scientific inquiry. If, for example, Susan was eating Wheaties for breakfast and suddenly a blueberry appeared on the top of her cereal, it would be reasonable for Susan to ask, “What caused the blueberry to be there?” We would not accept the following answer as a legitimate response to that question: “Nothing caused it to be there.” Of course, we might not be able to find out precisely what caused the blueberry to be there. However, we should not therefore conclude that the blueberry’s coming to be on the top of Susan’s cereal bowl does not have a cause. The principle of causality is also being invoked when scientists ask a question such as, “What causes plants to grow?” A scientist assumes the principle of causality when he or she assumes there is an answer to this question that involves causes. Of course, when it comes to our understanding of the nature of ultimate causes, it may be that we run into certain limits to human understanding. This is something Thomas admits, as will be seen below. However, we get premise two of the formulation of Thomas’ second way by applying the principle of causality to the case of the existence of some effect. Given the importance of the principle of causality in everyday life and scientific work, to deny the principle of causality in the context of doing metaphysics would seem to be ad hoc (see Feser 2009, p. 51ff. for more discussion of this point).

Premise (3) is a metaphysical principle. Consider a scenario that would constitute a denial of premise (3): there is an x such that, absolutely speaking, x causes itself to exist. However, this is not possible. Although x can be the efficient cause of itself in one respect, for example, an organism is an efficient cause of its own continued existence insofar as it nourishes itself, it cannot be the efficient cause of itself in every respect. This is easiest to see in the case of something bringing itself into existence. In order for x to perform the act of bringing x into existence at time t, x must already exist at t in order to perform such an act. However, if x already exists at t to perform the act of bringing x into existence at t, then x does not bring itself into existence at t, for x already exists at t. However, the same kind of reasoning works if x is a timelessly eternal being. To say that x is timelessly the efficient cause of its own existence is to offer an explanatory circle as an efficient causal explanation for x’s existence, which for Thomas is not to offer a good explanation of x’s existence, since circular arguments or explanations are not good arguments or explanations.

Premise (7) shows that Thomas is not in this argument offering an ultimate efficient causal explanation of what is sometimes called a per accidens series of efficient causes, that is, a series of efficient causes that stretches (perhaps infinitely) backward in time, for example, Rex the dog was efficiently caused by Lassie the dog, and Lassie the dog was efficiently caused by Fido the dog, and so forth. If he did have such a per accidens causal series in mind, then premise (7) would be subject to obvious counter-examples, for example, a sculptor is the efficient cause of a sculpture. However, it routinely happens that a sculpture outlives its sculptor. In such a case, we can take away the efficient cause (the sculptor) without taking away the effect of its efficient causation (the sculpture). Unless we are comfortable assigning to Thomas a view that is obviously mistaken, we will look for a different interpretation of premise (7).

A typical and more charitable interpretation of premise (7) is that Thomas is talking here about concurrent efficient causes and their effects, for example, in a case where a singer’s song exists only as long as the singer sings that song. This interpretation of premise (7) fits well with what we saw Thomas say about the arguments for the existence of God in SCG, namely, that it is better to assume (at least for the sake of argument) that there is no beginning to time when arguing for the existence of God, for, in that case, it is harder to prove that God exists.

With such an interpretation of premise (7) in the background, we are in a position to make sense of the inference from premises (6) and (7) to premise (8). If there were no absolutely first cause in the order of efficient causes of any effect E, then there would be nothing that ultimately existentially “holds up” E, since none of the supposed intermediate causes of E would themselves exist without an efficient cause that is not itself an effect of some efficient cause.

Finally, premise (14) simply records the intuition that if there is an x that is an uncaused cause, then there is a God. Of course, Thomas does not think he has proved here the existence of the Triune God of Christianity (something, in any case, he does not think it possible to demonstrate). Rather, Thomas believes by faith that the absolutely first efficient cause is the Triune God of Christianity. However, to show philosophically that there is a first uncaused efficient cause is enough to show that atheism is false. To put this point another way, Thomas thinks Jews, Muslims, Christians, and pagans such as Aristotle can agree upon the truth of premise (14). As will be seen, Thomas thinks it possible, upon reflection, to draw out interesting implications about the nature of an absolutely first efficient cause from a few additional plausible metaphysical principles. The more inferences Thomas draws out regarding the nature of the absolutely first efficient cause, the easier it will be to say with him (whether or not we think his arguments sound), “But this is what people call ‘God’.”

c. The Way of Negation: What God is Not

As we saw in discussing his philosophical psychology, Thomas thinks that when human beings come to know what a material object is, for example, a donkey, they do so by way of an intelligible species of the donkey, which intelligible species is abstracted from a phantasm by a person’s agent intellect, where the phantasm itself is produced from a sensible species that human beings receive through sense faculties that cognize the object of perception. Thomas thinks I can know what a thing is, for example, a donkey, since the form of a donkey and my intelligible species of a donkey are identical in species (see, for example, SCG III, ch. 49, 5). However, in Thomas’ view, we cannot possess an idea of the first cause, that is, God, in this life that is isomorphic with God’s essence, for he thinks any likeness of God that we have in our minds in this life is derived from what we know of material objects, and such a likeness is not the same in species as the form or essence of God Himself (for reasons that will become clear in what follows). Therefore, we cannot naturally know what God is. (Thomas thinks this is true even of the person who is graced by the theological virtues of faith, hope, and charity in this life; knowing the essence of God is possible for human beings, Thomas thinks, but it is reserved for the blessed in heaven, the intellects of whom have been given a special grace called the light of glory [see, for example, ST Ia. q. 12, a. 11, respondeo].) Although we cannot know what God is in this life, by deducing propositions from the conclusions of the arguments for the existence of God, Thomas thinks we can, by natural reason, come to know what God is not. For our purposes, let us focus on three pieces of negative theology in Thomas’ natural theology: that God is not composed of parts; that God is not changeable; that God does not exist in time.

i. God is Not Composed of Parts

To say that God is not composed of parts is to say that God is metaphysically simple (see, for example, ST Ia. q. 3), for whatever has parts has a cause of its existence, that is, is the sort of thing that is put together or caused to exist by something else. Since nothing can cause itself to exist all by itself, whatever is composed of parts has its existence caused by another. However, God, the first uncaused cause, does not have God’s existence caused by another. Therefore, God does not have parts.

As Thomas notes, the denial that God the Creator has parts shows how much God is unlike those things God creates, for all the things with which we are most familiar are composed of parts of various kinds. However, there are a number of ways in which something might be composed of parts. The most obvious sense is being composed of quantitative parts, for example, there is the top inch of me, the rest of me, and so forth. Since God is not composed of parts, God is not composed of quantitative parts.

Thomas thinks that material objects, at any given time, are also composed of a substance and various accidental forms. The substance of an object explains why that object remains numerically one and the same through time and change. For example, Thomas would say that a human being, say, Sarah, is numerically the same yesterday and today because she is numerically the same substance today as she was yesterday. However, Sarah is not absolutely the same today compared to yesterday, for today she is cheerful, whereas yesterday she was glum. Thomas calls such characteristics—forms a substance can gain or lose while remaining numerically the same substance—accidental forms or accidents. At any given time, Sarah is a composite of her substance and some set of accidental forms. Now, we have shown that God is not composed of parts. Therefore, God also is not a composite of substance and accidental forms.

ii. God is Not Changeable

God’s not being composed of substance and accidental forms shows that God does not change, for if a being changes, it has a feature at one time that it does not possess at another. However, features that a being has at one time that it does not have at another are accidental forms. Thus, beings that change are composed of substance and accidental forms. However, God is not composed of substance and accidents. Therefore, God does not change (see, for example, ST Ia. q. 9).

Indeed, the fact that God is not composed of parts shows that God is not only unchanging, but also immutable (unchangeable), for if God can change, then God has properties or features that he can gain or lose without going out of existence. However, properties or features that a being can gain or lose without going out of existence are accidental forms. Therefore, if God can change, then God is composed of substance and accidental forms. However, God is not composed of parts, including the metaphysical parts that we call substance and accidental forms. Therefore, God cannot change, that is, God is immutable.

iii. God is Not in Time

Thomas contends that God does not exist in time (see, for example, ST Ia. q. 10). To see why he thinks so, consider what he thinks time is: a measurement of change with respect to before and after. (Thomas thinks time is neither a wholly mind-independent reality—hence it is a measurement—nor is it a purely subjective reality—it exists only if there are substances that change.) Therefore, if something does not change, it is not measured by time, that is, it does not exist in time. However, as has been seen, God is unchanging. Therefore, God does not exist in time.
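
The three pieces of negative theology above form a short chain of inferences from divine simplicity. As a minimal sketch in Lean 4, with the predicate names HasParts, Changes, and InTime introduced here purely for illustration, the chain can be written as follows. The hypotheses restate claims made in the text; the sketch checks only that the conclusions follow from them.

```lean
-- Labels introduced for this sketch (assumptions, not in the source):
--   HasParts : the being is composed of parts (including substance and accidents)
--   Changes  : the being changes
--   InTime   : the being exists in (is measured by) time
theorem not_changing_not_in_time (HasParts Changes InTime : Prop)
    (h_change : Changes → HasParts)   -- what changes is composed of substance and accidents
    (h_time   : ¬Changes → ¬InTime)   -- what does not change is not measured by time
    (h_simple : ¬HasParts)            -- God is not composed of parts
    : ¬Changes ∧ ¬InTime := by
  -- God does not change: otherwise God would be composed of parts
  have hc : ¬Changes := fun h => h_simple (h_change h)
  -- God is not in time: apply the claim about time to the unchanging being
  exact ⟨hc, h_time hc⟩
```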

d. The Way of Excellence: Naming God in and of Himself

Thomas thinks that we can not only know that God exists and what God is not by way of philosophy, but we can also know—insofar as we know God is the first efficient cause of creatures, exemplar formal cause of creatures, and final cause of creatures—that it is reasonable and meaningful to predicate of God certain positive perfections such as being, goodness, power, knowledge, life, will, and love. Nonetheless, in knowing that, for example, God is good is a correct and meaningful thing to say, we still do not know the essence of God, Thomas thinks, and so we do not know what God is good means with the clarity by which we know things such as triangles have three sides, mammals are animals, or this tree is flowering right now. Why this is the case will become clear in what follows.

In Thomas’ view, words are signs of concepts and concepts are likenesses of things. (For Thomas, concepts are not [usually] the objects of understanding; they are rather that by which we understand things [see, for example, ST Ia. q. 85, a. 2], like a window in a house is that by which we see what is outside the house.) Therefore, words relate to things through the medium of intellectual conception. We can therefore meaningfully name a thing insofar as we can intellectually conceive it. Although we cannot know the essence of God in this life, we can know that God exists as the absolutely first efficient cause of creatures, we can know what God is not, and, insofar as we know God as the absolutely first efficient cause of creatures and what God is not, we can know God by way of excellence. It is this last way of knowing God that allows us to meaningfully predicate positive perfections of God, thinks Thomas. Knowing God by way of excellence requires some explanation.

First, whatever perfection P exists in an effect must in some way exist in its cause or causes, otherwise P would come from absolutely nothing, and ex nihilo nihil fit (from nothing, nothing comes). (Note that the traditional theological doctrine of creation ex nihilo, which Thomas accepts, does not contradict the Greek axiom, ex nihilo nihil fit. Whereas the latter means that nothing can come from absolutely nothing, the former does not mean that creatures come from absolutely nothing. Rather, creation ex nihilo is shorthand for the view that creatures do not have a first material cause; according to the traditional doctrine of creation ex nihilo, creatures do, of course, have a first efficient, exemplar formal, and extrinsic final cause, that is, God.) Some perfections are pure and others are impure. A pure perfection is a perfection the possession of which does not imply an imperfection on the part of the one to which it is attributed; an impure perfection is a perfection that does imply an imperfection in its possessor. For example, being able to hit a home run is an impure perfection; it is a perfection, but it implies imperfection on the part of the one who possesses it, since something that can hit a home run is not an absolutely perfect being: being able to hit a home run entails being mutable, and an absolutely perfect being is not mutable since a mutable being has a cause of its existence.

Second, creatures possess perfections such as justice, wisdom, goodness, mercy, power, and love. However, justice, wisdom, goodness, mercy, power, and love are pure perfections.

Third, God is the absolutely first efficient cause, which cause is simple, immutable, and timeless. Therefore, whatever pure perfections exist in creatures must pre-exist in God in a more eminent way (ST Ia. q. 4, a. 2, respondeo). Therefore, we can apply positive predicates to God, for example, just, wise, good, merciful, powerful, and loving, although not in such a way that defines the essence of God and not in a manner that we can totally understand in this life (ST Ia. q. 13, a. 1).

Not only can we meaningfully apply positive predicates to God, some such predicates can be applied to God substantially, Thomas thinks (see, for example, ST Ia. q. 13, a. 2, respondeo). One applies a name substantially to x if that name refers to x in and of itself and not merely because of a relation that things other than x bear to x. For example, the terms “Creator” and “Lord” are not said substantially of God, Thomas thinks, since such locutions imply a relation between creatures and God, and, for Thomas, it is not necessary that God bring about creatures (God need not have created and so need not have been a Creator, a Lord, and so forth). Although we come to know God’s perfection, goodness, and wisdom through reflecting upon the existence of creatures, Thomas thinks we can know that predicates such as perfect, good, and wise apply to God substantially and do not simply denote a relation between God and creatures since, as we saw above, God is the absolutely first efficient cause of the perfection, goodness, and wisdom in creatures, and there cannot be more in the effect than in the cause.

However, given the radical metaphysical differences between God and creatures, what is the real significance of substantially applying words such as good, wise, and powerful to God? Thomas knows of some philosophers, for example, Moses Maimonides (1138-1204), who take positive predications with respect to God to be meaningful only insofar as they are interpreted simply as statements of negative theology. For example, on Thomas’ reading, Maimonides thinks “God is good” should be understood simply as “God is not evil.” Thomas notes that other theologians take statements such as “God is good” to simply mean “God is the first efficient cause of creaturely goodness.” Thomas thinks there are a number of problems with these reductive theories of God-talk, but one problem that both of them share, he thinks, is that neither of them does justice to the intentions of people when they speak about God. Thomas states, “For in saying that God lives, [people who speak about God] assuredly mean more than to say that He is the cause of our life, or that He differs from inanimate bodies” (ST Ia. q. 13, a. 2, respondeo; English Dominican Fathers, trans.). According to Thomas, positive predicates such as God is good “are predicated substantially of God, although they fall short of a full representation of Him. . . So when we say, God is good, the meaning is not God is the cause of goodness, or, God is not evil, but the meaning is, Whatever good we attribute to creatures, pre-exists in God, and in a more excellent and higher way” (ST Ia. q. 13, a. 2, respondeo; English Dominican Fathers, trans.). Although it is correct to say that goodness applies to God substantially and that God is good “in a more excellent and higher way” than the way in which we attribute goodness to creatures, given that we do not know the essence of God in this life, we do not comprehend the precise meaning of “good” as applied substantially to God.

As has been seen, Thomas thinks that even within the created order, terms such as “being” and “goodness” are “said in many ways” or used analogously. Thus, we should not be surprised that Thomas thinks that a proper use of positive predications when it comes to God, for example, in the phrase, “God is wise,” involves predicating the term wise of God and human beings analogously and not univocally or equivocally (ST Ia. q. 13, a. 5). Why can we not properly predicate the term wise of God and human beings univocally? When we attribute perfections to creatures, the perfection in question is not to be identified with the creature to which we are attributing it. For example, when we say, John is wise, we do not mean to imply John is wisdom. However, given the divine simplicity, the perfections of God are to be identified with God’s very existence so that when we say God is wise, we should also say God is wisdom itself. In fact it is important to say both God is wise and God is wisdom itself when speaking of the wisdom of God, Thomas thinks. For if we say only the latter, then we may fall into the trap of thinking that God is an abstract entity such as a number (which is false, as the ways of causality, negation, and excellence imply). If we say only the former, we run the risk of thinking about God’s wisdom as though it were like our own, namely, imperfect, acquired, and so forth (which the ways of causality, negation, and excellence also show is false). Thus, when we use the word wise of John and God, we are not speaking univocally, that is, with precisely the same meaning in each instance.

On the other hand, if we merely equivocate on wise when we speak of John and God, then it would not be possible to know anything about God, which, as Thomas points out, is against the views of both Aristotle and the Apostle Paul, that is, both reason and faith. Rather, Thomas thinks we predicate wise of God and creatures in a manner between these two extremes; the term wise is not completely different in meaning when predicated of God and creatures, and this is enough for us to say we know something about the wisdom of God. Although we do name God from creatures, we know God’s manner of being wise super-exceeds the manner in which creatures are wise. It is correct to say, for example, God is wise, but because it is also correct to say God is wisdom itself, the wisdom of God is greater than human wisdom; in fact, it is greater than human beings can grasp in this life. That being said, we can grasp why it is that God’s wisdom is greater than we can grasp in this life, namely, because God is the simple, immutable, and timelessly eternal uncaused cause of creaturely perfections, including creaturely wisdom, and that is to know something very significant about God, Thomas thinks.

7. Philosophical Anthropology: The Nature of Human Beings

Thomas attributes to Plato of Athens the following view:

(P) A human being, for example, Socrates, is identical to his soul, that is, an immaterial substance; the body of Socrates is no part of him.

Thomas thinks (P) is false. In fact, in his view there are good reasons to think a human being is not identical to his or her soul. To take just one of his arguments, Thomas thinks the Platonic view of human beings does not do justice to our experience of ourselves as bodily beings. For Thomas, Plato is right that we human beings do things that do not require a material organ, namely, understanding and willing (for his arguments that acts of understanding do not make use of a material organ per se, see, for example, ST Ia. q. 75, aa. 2, 5, and 6). However, anything that sees, hears, touches, tastes, and smells is clearly also a bodily substance. We experience ourselves as something that sees, hears, touches, tastes, and smells. In short, I smell things, therefore, I am not an immaterial substance (see, for example, ST Ia. q. 76, a. 1, respondeo).

Although Thomas does not agree with Plato that we are identical to immaterial substances, it would be a mistake—or at least potentially misleading—to describe Thomas as a materialist. Like Aristotle, Thomas rejects the atomistic materialism of Democritus. In other words, Thomas would also reject the following view:

(M) Human beings are composed merely of matter.

For Thomas, (M) is false since human beings, like all material substances, are composed of prime matter and substantial form, and forms are immaterial. In fact, even non-living things such as instances of water and bronze are composed of matter and form for Thomas, since matter without form has no actual existence.

However, Thomas thinks (M) is false in the case of human beings for another reason: the substantial form of a human being—what he calls an intellect or intellectual soul—is a kind of substantial form specially created by God, one that for a time continues to exist without being united to matter after the death of the human being whose substantial form it is. To make some sense of Thomas’ views here, note that Thomas thinks a kind of substantial form is the more perfect insofar as the features, powers, and operations it confers on a substance are, to use a contemporary idiom, “emergent,” that is, features of a substance that cannot be said to belong to any of the integral parts of the substance that is configured by that substantial form, whether those integral parts are considered one at a time or as a mere collection. Here is Thomas:

It must be considered that the more noble a form is, the more it rises above (dominatur) corporeal matter, the less it is merged in matter, and the more it exceeds matter by its operation or power. Hence, we see that the form of a mixed body has a certain operation that is not caused by [its] elemental qualities (ST Ia. q. 76, a. 1, respondeo; English Dominican Fathers, trans.).

In other words, a substance’s substantial form is something above and beyond the properties of that substance’s integral parts. Why think a thing like that? Substances have powers and operations that are not identical to any of the powers and operations of that substance’s integral parts taken individually, nor are the powers conferred by a substantial form of a substance x identical to a mere summation of the powers of the integral parts of x. Thus, a mixed body such as a piece of bronze has certain powers that none of its elemental parts have by themselves nor when those elemental parts are considered as a mere sum.

Consider that Thomas thinks substantial forms fall into the following sort of hierarchy of perfection. The least perfect kind of substantial form corresponds with the least perfect kind of material substance, namely, the elements (for Thomas, elemental substances are individual instances of the kinds water, air, earth, and fire; for us they might be fundamental particles such as quarks and electrons). Thomas says that the substantial forms of the elements are wholly immersed in matter, since the only features that elements have are those that are most basic to matter. In contrast, the substantial forms of compounds, that is, instances of those non-living substance-kinds composed of different kinds of elements, for example, blood, bone, and bronze, have operations that are not caused by their elemental parts. Above the substantial forms of compounds, the substantial forms of living things, including plants, reach a level of perfection such that they get a new name: “soul” (see, for example: Disputed Question on the Soul [QDA] a. 1; ST Ia. q. 75, a.1; and ST Ia. q. 76, a.1.). For those of the 21st century, soul almost always means “immortal substance.” Thomas rather uses soul (anima) in Aristotle’s deflationary sense of “a substantial form which is the explanation for why a substance is alive rather than dead.” To see this, consider the English word “animate.” Soul (anima), for Thomas, is the principle or explanation for life or animation in a living substance. Souls are therefore substantial forms that enable plants and animals to do what all living things do: move, nourish, and reproduce themselves, things non-living substances cannot do. Next in line come the souls or substantial forms of non-human animals, which have emergent properties to an even greater degree than the souls of plants, since in virtue of these substantial forms non-human animals not only live, move, nourish themselves, and reproduce, but also sense the world.
Finally, the substantial forms of human beings have operations (namely, understanding and willing) that do not require bodily organs at all in order to operate, although such operations are designed to work in tandem with bodily organs (see, for example, SCG II, ch. 68). Since human souls do not require matter for their characteristic operations, given the principle that something’s activity is a reflection of its mode of existence (for example, if something acts as a material thing, it must be a material thing; if something acts as an immaterial thing, it must be an immaterial thing), human souls can exist apart from matter, for example, after biological death. In contrast, the substantial forms of non-human material substances are immersed in matter such that they go out of existence whenever they are separated from it (see, for example, ST Ia. q. 75, a. 3).

Since the human soul is able to exist apart from the matter it configures, the soul is a subsistent thing for Thomas, not simply a principle of being as are material substantial forms (see, for example: QDA a. 1; QDA a. 14; and ST Ia. q. 75, a. 2). However, even when it is separated from matter, a human soul remains the substantial form of a human being. As Thomas states (see, for example, ST Ia. q. 75, a. 4), a human being such as Socrates is not identical to his soul (for human beings are individual members of the species rational animal). Nonetheless, the individual soul can preserve the being and identity of the human being whose soul it is. In other words, although the soul is not identical to the human person, a human person can be composed of his or her soul alone. Thomas explains the point as follows: God creates the human soul such that it shares its existence with matter when a human being comes to exist (see, for example, SCG II, ch. 68, 3). Because the being of the human soul is numerically the same as that of the composite—again, the soul shares its being with the matter it configures whenever the soul configures matter—when the soul exists apart from matter between death and the general resurrection, the being of the composite is preserved insofar as the soul remains in existence (see, for example: SCG IV, ch. 81, 11; ST Ia. q. 76, a. 1, ad5; and ST IaIIae. q. 4, a. 5, ad2).

Consider an analogy: say Ted loses his arms and legs in a traffic accident but survives the accident. After the accident, Ted is not identical to the parts that compose him. Otherwise, we would have to say, by the law of the transitivity of identity, that Ted’s arms and legs (or the simples that composed them) were not parts of Ted before the accident. Composition is not identity. Something analogous can be said about Thomas’ views on the human soul and the human person. Although the human soul is never identical to the human person for Thomas, it is the case that after death and before the general resurrection, some human persons are composed merely of their soul.

Although the human soul can exist apart from matter between death and the general resurrection, existing separately from matter is unnatural for the human soul. The human soul, by its very nature, is a substantial form of a material substance (see, for example, SCG II, chs. 68 and 83). Given Thomas’ belief in a good and loving God, he thinks such a state can only be temporary (see, for example, SCG IV, ch. 79). Indeed, as a Catholic Christian, Thomas believes by faith that it will be only temporary, since the Catholic faith teaches there will one day be a general resurrection of the dead in which all human beings rise from the dead, that is, all intellectual souls will reconfigure matter. At that time not only will all separated souls configure matter again, by a miracle the separated soul of each human being will come to configure matter such that each human being will have numerically the same human body that he or she did in this life (see, for example: ST Suppl. q. 79, a. 1; and SCG IV, chs. 80 and 81). Human beings will then be restored to their natural state as embodied beings that know, will, and love.

Finally, since human souls are immaterial, subsistent entities, they cannot have their origin in matter (see, for example, SCG II, ch. 86). Thus, unlike material substantial forms, human souls only come to exist by way of a special act of creation on the part of God (see, for example, SCG II, ch. 87). Therefore, for Thomas, the beginning of the existence of every human person is both natural (insofar as the human parents of that person supply the matter of the person) and supernatural (insofar as God creates a person’s substantial form or intellectual soul ex nihilo).

8. Ethics

Thomas has one of the most well-developed and capacious ethical systems of any Western philosopher, drawing as he does on Jewish, Christian, Greek, and Roman sources, and treating topics such as axiology, action-theory, the passions, virtue theory, normative ethics, applied ethics, law, and grace. His ST alone devotes some 1,000 pages in English translation to ethical issues. Where many philosophers have been content to treat topics in meta-ethics and ethical theory, Thomas also devotes the largest part of his efforts in ST, for example, to articulate the nature and relations between the particular virtues and vices. In this summary of his ethical thought, we treat, only in very general terms, what Thomas has to say about the ultimate end of human life, the means for achieving the ultimate end, the human virtues as perfections of the characteristic human powers, the logical relationship between the virtues, moral knowledge, and the ultimate and proximate standards for moral truth.

a. The End or Goal of Human Life: Happiness

Thomas argues that in order to make sense of any genuine action in the universe we must distinguish its end or goal from the various means that a being employs in order to achieve such an end, for if a being does not act for an end, then that being’s acting in this or that way would be a matter of chance. In that case there would be no reason why the being acted as it did. In other words, the act would be unintelligible. However, for any act A in the universe, A is intelligible. Therefore, every being acts for an end (see, for example, SCG III, ch. 2). An end of an action is something (call it x) such that a being is inclined to x for its own sake and not simply as a means to achieving something other than x. A means to an end refers to something (call it y) such that a being is inclined to y for the sake of something other than y. However, some ends are what Thomas calls “ultimate.” An ultimate end is an end of action such that a being is inclined to it merely for its own sake, not also as a means to some further end.

Thomas thinks we can apply this general theory of action to human action. For example, although wealth might be treated as an end by a person relative to the means that a person employs to achieve it, for example, working, Thomas thinks it is obvious that wealth is not an ultimate end, and even more clearly, wealth is not the ultimate end. This distinction between an ultimate end and the ultimate end is important and does not go unnoticed by Thomas. He is willing to take seriously the possibility that human life might have several ultimate ends (see, for example, ST IaIIae. q. 1, a. 5). For example, we might think that knowledge, virtue, and pleasure are each ultimate ends of human life, that is, things we desire for their own sake and not also as means to some further end. However, Thomas thinks it is clear that a human being really has only one ultimate end. This is because the ultimate end, as Thomas understands the term, is more than simply something we seek merely for its own sake; it is something such that all by itself it entirely satisfies one’s desire. Say that John desires pleasure and virtue as ends in themselves, and pleasure and virtue do not necessarily come and go together in this life (some things that are pleasant are not compatible with a life of virtue; sometimes the virtuous life entails doing what is unpleasant). Thus, neither of these could be equivalent to the ultimate end for John; for if John had one without the other, there would still be something that John desires, whereas possession of the ultimate end sates all of one’s desires. In that case, if pleasure and virtue are both ends in themselves, then at most they must be component parts of an ultimate end construed as a complex whole.

Thus, for Thomas, each and every human being (like all beings) has one ultimate end. However, do all human beings have the same ultimate end? Thomas thinks so, and he believes that, in one sense, this should not be controversial. All human beings think of happiness as the ultimate end of human beings. Of course, Thomas recognizes that to speak about the ultimate end as “happiness” is still to speak about the ultimate end in very abstract terms, or, as Thomas puts it, to speak merely of the “notion of the ultimate end” (rationem ultimi finis) (ST IaIIae. q. 1, a. 7). Four people might agree that their goal in life is to be happy but disagree with one another (greatly) about that in which a happy life consists. For Thomas, this claim is not the same as the claim that human beings choose different means to achieving happiness. Although this is undoubtedly true, what Thomas means to say here is that people disagree about the nature of the happy life itself, for example, some think the ultimate end itself is the acquisition of wealth, others enjoying certain pleasures, whereas others think the happy life is equivalent to a life of virtuous activity. To see Thomas’ point, compare John and Jane, both of whom plan to rob a bank. John (unthinkingly) takes the acquisition of a great sum of wealth to be his ultimate end. Jane realizes that wealth is really merely an instrumental good and has already planned to retire to a vacation resort, which she (still shortsightedly) takes to be the object of human happiness.

Although people certainly disagree about what happiness is in the concrete, Thomas maintains that there are objective truths about the nature of happiness. (It is important to emphasize here that if one thinks that there are ways in which all of us must live if we are to be counted as genuinely happy, for example, by displaying and acting in accord with the moral virtues, then one can also think there are nearly an infinite number of ways that we can manifest those virtues, for example, as doctors, lawyers, teachers, artists, mechanics, engineers, priests, lay persons, and so forth.) If we take Thomas’ manner of speaking about human happiness in ST as demonstrative of his own position—what we have here, after all, is one long chain of arguments—Thomas also thinks that it is possible to offer a convincing argument for what it is that, objectively, fulfills a human being qua human being. However, Thomas also shows sensitivity to the role that our moral habits play in forming our beliefs—and so which arguments we will find convincing—regarding the nature of the good life for human beings (see, for example, ST IaIIae. q. 1, a. 7).

Before leaving the subject of the ultimate end of human action, we should note two other respects in which Thomas thinks the expression “ultimate end” (or “happiness”) is ambiguous. First, it is one thing to speak about the happiness that human beings can possess in this life, what Thomas sometimes calls “imperfect human happiness,” and another to speak about the happiness possessed by God, the angels, and the blessed, which Thomas considers to be perfect (see, for example, ST IaIIae. q. 4, a. 5). Thomas calls this-worldly human happiness imperfect not only because he thinks it pales by comparison with the perfect happiness enjoyed by the saints in heaven, but also because he reads Aristotle, whose discussion of happiness is very important for Thomas’ own, as thinking of this-worldly human happiness as imperfect. Thomas notes that, after Aristotle identifies the general characteristics of human happiness in NE, book I, ch. 7, Aristotle goes on to note in chapter 10 that human beings cannot be happy in this life, absolutely speaking, or perfectly, since human beings in this life can lose their happiness, and not being able to lose their happiness is something human beings desire. Thus, Aristotle himself thinks of human happiness in this life as imperfect in comparison to the conditions he lays out in NE, book I, ch. 7. Aristotle thinks humans are happy in this life merely as human beings, that is, as beings whose nature is mutable.

Second, Thomas recognizes two different kinds of questions we might wish to raise when we think about the nature of human happiness (see, for example, ST IaIIae. q. 1, a. 8 and q. 2, a. 7). When asking about the nature of human happiness, we might be asking what is true about the person who is happy. As Thomas puts it, this is to focus our attention on the use, possession, or attainment of happiness by the one whom we are describing as (at least hypothetically) happy. To speak about happiness in this sense is to make claims about what has to be true about the soul of the person who is happy, for example, that happiness is an activity of the soul and not merely a state of the soul or an emotion, that it is a speculative rather than a practical activity, that this activity does not require a body, and so forth. However, in asking about the happiness of human beings, we might rather be asking about the object of happiness, or as Thomas puts it, “the thing itself in which is found the aspect of good” (ST IaIIae q. 1, a. 8). For example, the end of a hungry man in the sense of the object of his desire is food; the end of the hungry man in the sense of attainment is eating.

What constitutes happiness for Thomas? Thomas agrees with Aristotle that the attainment of happiness consists in the soul’s activity expressing virtue and, particularly, the best virtue of contemplation where the object of such contemplation is the best possible object, that is, God. Thus, the object of human happiness, whether perfect or imperfect, is the cause of all things, namely, God, for human beings desire to know all things and desire the perfect good. However, this is just another way to talk about God. Therefore, whether they consciously know it or not, all human beings desire contemplative union with God. Thomas thinks that human beings in this life—even those who possess the infused virtues, whether theological or moral (about which more is said below)—at best attain happiness only imperfectly since their contemplation and love of God is, at best, imperfect. For Thomas, only human happiness in heaven is perfect insofar as God brings it about that persons in heaven enjoy a perfect intellectual and volitional union with God. Thomas calls such a union the beatific vision.

b. Morally Virtuous Action as the Way to Happiness

Thomas thinks that happiness is the goal of all human activity. That suggests that human beings normally achieve happiness by means of human actions, that is, embodied acts of intellect and will (see, for example, ST IaIIae. q. 6, prologue). However, Thomas also thinks there are certain kinds of human actions that conduce to happiness. One complication, however, arises from the fact that Thomas thinks that we can speak about both imperfect and perfect happiness, the latter of which is a happiness that human beings can only possess by God’s grace helping us transcend (but not setting aside) human nature. This latter happiness culminates for the saints in the beatitudo (blessedness) of heaven. Thus, according to Thomas, there are, in reality, two mutually reinforcing stories to tell about those human actions that lead to happiness. Since our focus here is on Thomas’ philosophy, we shall focus in what follows on what Thomas has to say about the relation between virtuous actions and imperfect happiness in this life. (We will nonetheless have occasion to discuss a few things about Thomas’ views on perfect happiness.)

Thomas’ primary concern in the place where he provides his most detailed outline of the good human life (ST IaIIae.) is explaining how human beings achieve happiness by means of virtuous human actions, especially morally virtuous actions (for more on the difference between intellectual virtue and moral virtue, see the section below on Human Virtues as Perfections of Characteristically Human Powers). Thomas, like Aristotle and Jesus of Nazareth (see, for example, Matthew 5:48), is a moral perfectionist in the sense that the means to human happiness comes not by way of merely good human actions, but by way of perfect or virtuous moral actions. Thus, in order to understand Thomas’ view of morality and the good life, we have to say something about his understanding of virtuous moral activity. However, what are morally virtuous human actions? In general terms, Thomas thinks virtuous human actions are actions that perfect the human agent that performs them, that is, good human actions are actions that conduce to happiness for the agent that performs them. An act is perfective of an agent relative to the kind to which the agent belongs. Since human beings are rational animals by nature, virtuous human actions are actions that perfect the rationality and animality of human beings. Of course, this is still to speak about actions that conduce to happiness in very abstract terms. Thomas has much to say about the specific characteristics of virtuous human action, especially morally virtuous action.

i. Morally Virtuous Action as Pleasurable

First of all, good or happiness-conducive human actions are pleasant for Thomas. Thomas goes so far as to say that intellectual pleasure (or delight) is even a necessary or proper accident of human activity in heaven (see, for example, ST IaIIae. q. 4, a. 1; and ST IaIIae. q. 34, a. 3). Thomas also sees pleasure as a necessary feature of the kind of happiness humans can have in this life, if only because virtuous activity, which is at the center of the good life for Thomas, involves taking pleasure in those virtuous actions (see, for example, ST IaIIae. q. 31, a. 4; ST IaIIae. q. 31, a. 5, ad1; and ST IaIIae. q. 35, a. 5). Both intellectually and morally virtuous actions are pleasant in themselves, thinks Thomas; in fact, he thinks they are the most pleasant of activities in themselves (ST IaIIae. q. 31, a. 5).

However, it is not just intellectual pleasure that belongs to virtuous human action in this life for Thomas, but bodily pleasure, too. For we are bodily creatures and not simply souls, and so human perfection (happiness) must make reference to the body (ST IaIIae. q. 59, a. 3). Thomas rejects the view, held by some Stoics, that all bodily pleasures are evil. As Thomas notes, it is natural for human beings to experience bodily and sensitive pleasures in this life (ST IaIIae. q. 34, a. 1). Therefore, the perfection of a bodily nature such as ours will involve not only intellectual pleasures, but bodily and sensitive pleasures, too.

Nonetheless, Thomas thinks it is true that bodily pleasure tends to hinder the use of reason, and this for three reasons (ST IaIIae. q. 33, a. 3). First, bodily pleasures, as powerful as they are, can distract us from the work of reason. Second, bodily pleasures can be contrary to reason, particularly those that are enjoyed in excess. Third, bodily pleasures can weaken or fetter the reason in a way analogous to how the drunkard’s use of reason is weakened. However, despite all of this, Thomas does not think that bodily pleasure is something evil by definition, and this for two reasons. First, pleasure is taking repose in an apparent good; but if we take repose in a manner that is consistent with reason, such pleasure is good, otherwise, it is not. Second, taking pleasure in an action is more akin to that action than a desire to act since the desire to act precedes the act whereas the pleasure in acting does not. However, desiring to do good is something good, whereas desiring to do evil is itself evil. A fortiori, taking pleasure in doing good is itself something good whereas taking pleasure in evil is something evil.

However, perhaps some bodily pleasures are evil by definition. For example, there have been philosophers and religious teachers who teach that sexual pleasure is evil insofar as it hinders reason. Although Thomas agrees that sexual pleasure hinders reason, he disagrees that sexual pleasure is bad per se. Recall that a bodily pleasure hinders reason for one of three reasons: it distracts us from using reason, it is inconsistent with reason, or it weakens reason. Thomas does not think that sexual pleasure per se is inconsistent with reason, for it is natural to feel pleasure in the sexual act (indeed, Thomas says that, before the Fall, the sexual act would have been even more pleasurable [see, for example, ST Ia. q. 98, a. 2, ad3]), and performing the sexual act within marriage is, all other things being equal, something natural and good. Thus, sexual pleasure must hinder reason insofar as it distracts us from using reason or weakens reason. However, this need not be morally evil, even a venial sin, as long as it is not inconsistent with reason, just as sleep, which hinders reason, is not necessarily evil, for as Thomas notes, “Reason itself demands that the use of reason be interrupted at times” (ST IaIIae. q. 34, a. 1, ad1).

ii. Morally Virtuous Action as Perfectly Voluntary and the Result of Deliberate Choice

Although virtuous actions are pleasant for Thomas, they are, more importantly, morally good as well. What does this mean for Thomas? We can begin with the fact that, according to Thomas, morally good actions are moral rather than amoral. However, moral actions have being voluntary as a necessary condition. Voluntary acts are acts that arise (a) from a principle intrinsic to the agent and (b) from some sort of knowledge of the end of the act on the part of the agent (see, for example, ST IaIIae. q. 6, a. 1). For example, the movements of a plant do not meet the necessary condition of being voluntary, according to Thomas. This is because plants do not have cognitive powers and so have no apprehension of the end of their actions. To take another example, insofar as a squirrel moves towards an object on the basis of apprehending that object by way of its sense faculties, the squirrel’s act is, in a sense, a voluntary one (see, for example, ST IaIIae. q. 6, a. 2).

However, an action’s being voluntary is not a sufficient condition for that action counting as a moral action according to Thomas. More than being voluntary, moral actions must be perfectly voluntary in order to count as moral actions. A perfectly voluntary action is an action that arises (a) from knowledge of the end of an action, understood as an end of action, and (b) from knowledge that the act is a means to the end apprehended (see, for example, ST IaIIae. q. 6, a. 2). This is just to say that perfectly voluntary actions are caused by rational appetite, or will, for Thomas. Therefore, although irrational animals (such as squirrels) can be said, in a sense, to act voluntarily, they cannot be understood to be acting morally, since they do not cognize the end as an end and do not understand their actions to be a means to such an end. Indeed, insofar as an act of a human being does not arise from an act of will, for example, when someone moves his or her arm while he or she is asleep, that action is not perfectly voluntary and so is not a moral action for Thomas (see, for example, ST IaIIae. q. 1, a. 1).

Morally virtuous action is moral (rather than amoral) action, and so it is perfectly voluntary. However, morally virtuous activity is also intentional and deliberate. Here we see a connection between the virtue of prudence and the other moral virtues. Prudence is that virtue that enables one to make a virtuous decision about what, for example, courage calls for in a given situation, which is often (but not always) acting in a mean between extremes. In other words, prudence is the virtue of rational choice (see, for example, ST IaIIae. q. 57, a. 5). Without prudence, human action may be good but not virtuous, since virtuous activity is a function of rational choice about what to do in a given set of circumstances. As we shall see, although virtuous action arises from a virtuous habit, it is not habitual in the sense that we “do it without even thinking about it.”

iii. Morally Virtuous Action as Morally Good Action

Although morally virtuous action is more than simply morally good action, it is at least that. However, how does Thomas distinguish morally good actions from bad or indifferent ones? First of all, Thomas thinks that some kinds of actions are bad by definition. As Thomas would put it, such actions are bad according to their genus or species, no matter the circumstances in which those actions are performed. For example, an act of adultery is a species of action that is immoral in and of itself insofar as such acts necessarily have the agent acting immoderately with respect to sexual passion as well as putting preexisting or potential children at great risk of being harmed (ST IIaIIae. q. 154, a. 8, respondeo). An action, therefore, that counts as morally good—and so is conducive to living what we might call a good life—cannot be an action that is morally bad according to its genus or species.

Second, there are circumstances surrounding an action that affect the moral goodness or badness of an action. For example, Thomas thinks that it is morally permissible for a community to put a criminal to death on the authority of the one who governs that community. However, if those in authority in a community have set a timetable for an execution, say, that it should occur no sooner than Wednesday at 5 PM, and John the executioner, on his own authority, kills the prisoner on Wednesday at 10 AM (where John is not also an authority in the community), then the circumstances of John’s act of killing make what might otherwise have been a morally permissible act an immoral one. Sometimes circumstances make an action that is bad according to its species even worse. For example, it is morally wrong to murder. However, if someone murders his father, he commits patricide, which is a more grievous act than the act of murdering a stranger.

Third, motivations count as another form of circumstance that can make an action bad, good, better, or worse than another. Suppose Jane obeys her parents because of her love for God while Joan does so because she is afraid of being punished. Although Joan’s act can still be morally praiseworthy, it is not as praiseworthy as Jane’s, since Jane’s motivation for moral action is better than Joan’s.

In putting these three “sources” for offering a moral evaluation of a particular human action together (kind of action, circumstances surrounding an action, and motivation for action), Thomas thinks we can go some distance in determining whether a particular action is morally good or bad, as well as how good or bad that action is. For example, Thomas thinks lying by definition is morally bad (see, for example, ST IIaIIae. q. 110, a. 3). However, not all lies are equally bad. If someone lies in order to get an innocent person killed, one commits a mortal sin (the effect of which is, if one dies without repenting of such a sin, one will go to hell). However, if one tells a lie in order to save an innocent person’s life, one does something morally wrong, but such moral wrongdoing counts only as a venial sin, where venial sins harm the soul but do not kill charity or grace in the soul (see, for example, ST IIaIIae. q. 110, a. 4).

iv. Morally Virtuous Action as Arising from Moral Virtue

Morally virtuous action, therefore, is minimally morally good action: morally good or neutral with respect to the kind of action, good in the circumstances, and well-motivated. However, it is also action that arises from a good moral habit, that is, a moral virtue; such habits make it possible to act with moral excellence easily and gracefully. To be sure, in many cases, moral virtues are acquired by way of good actions. However, one morally good action is not necessarily a morally virtuous act. This is because virtuous actions arise from a habit such that one wills to do what is virtuous with ease. The person who does what the virtuous person does, but with great difficulty, is at best continent or imperfectly virtuous (a good state of character compared to being incontinent or vicious, to be sure) but not perfectly virtuous.

One way that Thomas often sums up the conditions for morally virtuous action we have been discussing is to say that morally virtuous action consists in a mean between extremes (see, for example, ST IaIIae. q. 60, a. 4). In acting temperately, for example, one must eat the right amount of food in a given circumstance, for the right reason, in the right manner, and from a temperate state of moral character. If, for example, John eats the right amount of food on a day of feasting (where John rightly eats more on such days than he ordinarily does), but does so for the sake of vainglory, his eating would nonetheless count as excessive. If, on the other hand, John eats the right amount of food on a day of mourning (where John rightly eats less on such days than he ordinarily does) for the sake of vainglory, this would be deficient (compare ST IaIIae. q. 64, a. 1, ad 3). Of course, John might also eat too much on a given day, or too little, for example, on a day marked for feasting and celebration. Such actions would also be excessive and deficient, respectively, and not morally virtuous.

c. Human Virtues as Perfections of Characteristically Human Powers

So far we have discussed Thomas’ account of the nature of the means to happiness as moral virtue bearing fruit in morally virtuous action. One might wonder how we acquire the virtues. Although we have a natural desire for some of the virtues, the actual possession of the virtues is not in us by nature. How do we come to possess the virtues according to Thomas? Here, it is again worth pointing out that there are two stories to tell, since Thomas thinks there are really two different kinds of virtue, one which disposes us to act perfectly in accord with human nature and one which disposes us to perform acts which transcend human nature (see, for example, ST IaIIae. q. 54, a. 3). These two kinds of virtues correspond with the two different ends of human beings for Thomas, one that is natural, that is, the imperfect happiness attainable by human beings in this life by the natural light of reason and the natural inclination of the will, and one that is supernatural and comes to us only by grace, that is, the perfect happiness of the saints in heaven, in which happiness Christians can begin to participate even in this life, Thomas thinks.

According to Thomas, human beings can acquire virtues that perfect human beings according to their natural end by repeatedly performing the kinds of acts a virtuous person performs, that is, by habituation. Thomas calls such virtues human (see, for example, ST IaIIae. q. 54, a. 3; ST IaIIae. q. 55, aa. 1-3; and ST IaIIae. q. 61, a. 1, ad 2) in order to distinguish such virtues from “infused” (or, to use concepts Thomas finds in Aristotle, “god-like,” “heroic” or “super-human”) virtues, which are virtues we have only by way of a gift from God, not by habituation. For example, we can imagine that, apart from any special gift of God, Socrates was courageous in the sense that Socrates acquired the ability to habitually say “yes” to pains that are in accord with right reason in much the same way that an athlete or a musician voluntarily becomes more skilled or proficient in what they do through practice, that is, by doing (or at least approximating) what good athletes and virtuosi do. Before saying more about human virtue, which is our focus here, it will be good to say a few things about infused virtue since this is an important topic for Thomas, and Thomas’ views on infused virtue are historically very important.

i. Infused Virtues

Like human virtues, infused virtues are perfections of our natural powers that enable us to do something well and to do it easily. For example, the virtue of faith enables its possessor, on a given occasion, to believe that “God exists and rewards those who seek Him” (Hebrews 11:6) and to do so confidently and without also thinking it false that God exists, and so forth. In addition, as in the case of human virtues, we are not born with the infused virtues; virtues, for Thomas, are acquired.

However, infused virtues differ from human virtues in a number of interesting ways. First, unlike human virtues, which enable us to perfect our powers such that we can perform acts that lead to a good earthly life, infused virtues enable us to perfect our powers such that we can perform acts in this life commensurate with—and/or as a means to—eternal life in heaven (ST IaIIae. q. 62, a. 1).

Second, whereas a human virtue, for example, human temperance, is acquired by habituation, that is, by repeatedly performing the kinds of actions that are performed by the temperate person, infused virtues are wholly gifts from God. Thomas cites St. Augustine in this regard: “Virtue is a good quality of the mind, by which we live righteously, of which no one can make a bad use, which God works in us, without us” (ST IaIIae. q. 55, a. 4, obj. 1; emphasis mine). To see clearly this difference between human and infused virtue according to Thomas, note that Thomas thinks that neither infused nor human virtue makes a human being impervious to committing mortal sin. (For Thomas, a mortal sin is a sin that kills supernatural life in the soul, where such supernatural life makes one fit for the supernatural reward of heaven. Mortal sins require intentionally and deliberately doing what is grievously morally wrong. Contrast a mortal sin with a venial sin. Although venial sin can lead to mortal sin, and so ought to be avoided, a venial sin does not destroy supernatural life in the human soul.) Does Socrates lose his human virtue, for example, his courage, if he commits a mortal sin? Thomas thinks the answer is “no.” This is because naturally acquired virtues are virtues acquired through habituation, and one sinful act does not destroy a habit acquired by way of the repetition of many acts of one kind (see, for example, ST IaIIae. q. 63, a. 2, ad 2). However, since infused virtues are not acquired through habituation but are rather a function of being in a state of grace as a free gift from God, and sinning mortally causes one to no longer be in a state of grace, just one mortal sin eliminates the infused virtues in the soul (although imperfect forms of them can remain, for example, unformed faith and hope [see below]). Of course, such mortal sins can be forgiven, Thomas thinks, by God’s grace through the sacrament of penance, thereby restoring a soul to the state of grace (see, for example, ST IIIa. q. 86, a. 1, respondeo).

Thomas speaks of at least two different kinds of infused virtue. First, there are the well-known theological virtues of faith, hope, and charity (see, for example, St. Paul’s First Letter to the Corinthians, ch. 13). In general, the theological virtues direct human beings toward their supernatural end, specifically in relation to God himself. In other words, they are gifts of God that enable human beings to look to God himself as the object of a happiness that transcends the natural powers of human beings. Faith is the infused virtue that enables its possessor to believe what God has supernaturally revealed. Hope is the infused virtue that enables its possessor to look forward to God Himself—and not some created image of God—being the object of his or her perfect bliss. Finally, the virtue of charity creates a union of friendship between the soul of its possessor and God—a union that is not natural to human beings but requires that God raise up the nature of its possessor to God. In comparison to charity, faith and hope are imperfect infused virtues, since, unlike charity, faith and hope connote the lack of complete possession of God (see, for example, ST IaIIae. q. 66, a. 6, respondeo). As has been seen, perfect human happiness (qua possession) consists of the beatific vision. However, if we have faith, we do not have vision. If we have hope, we do not yet possess that for which we hope. Therefore, among the theological virtues, only charity remains in the saints in heaven. Thomas thinks this is one reason why St. Paul says, “The greatest of these [three virtues, that is, faith, hope, and charity] is charity.”

Unlike the intellectual and moral virtues—whether infused or human—the theological virtues do not observe the mean where their proper object, that is, God, is concerned, for Thomas thinks it is not possible to put faith in God too much, to hope too much in God, or to love God more than one should (see, for example, ST IaIIae. q. 64, a. 4).

Second, in addition to the theological virtues, there are also the infused versions of the intellectual and moral virtues (see, for example, ST IaIIae. q. 63, a. 3; on the distinction between intellectual and moral virtue, see below). Why infused virtues of this type? Whereas the theological virtues direct human beings to God Himself as object of supernatural happiness, the infused intellectual and moral virtues are those virtues that are commensurate with the theological virtues—and thus direct us to a supernatural perfection—where things other than God are concerned. Just as human beings are naturally directed to both God and creatures through their natural desires and through virtues that can be acquired naturally, so human beings, by the grace of God, can be supernaturally directed both to God and creatures through the theological and the infused intellectual and moral virtues, respectively. As Thomas says in one place, where the human moral virtues, for example, enable human beings to live well in a human community, the infused moral virtues make human beings fit for life in the kingdom of God (see, for example, ST IaIIae. q. 63, a. 4).

ii. Human Virtues

Thomas thinks there are a number of human virtues, and so in order to offer an account of what he has to say about humanly virtuous activity (and its relationship to the imperfect human happiness we can have in this life), we need to mention the different kinds of human virtues. In order to do this, we have to examine the various powers that human beings possess, since, for Thomas, mature human beings possess various powers, and virtues in human beings are perfections of the characteristically human powers (see, for example, ST IaIIae. q. 55, a. 1).

First, there are the rational powers of intellect and will. Although Thomas thinks that intellect enables human beings to do a number of different things, most important for the moral life is intellect’s ability to allow a human being to think about actions in universal terms, that is, to think about an action as a certain kind of action, for example, a voluntary action, or as a murder, or as one done for the sake of loving God. Our ability to do this—which separates us from irrational animals, Thomas thinks—is a requisite condition for being able to act morally. Since a gorilla, we might suppose, cannot think about actions in universal terms, it cannot perform moral actions.

Second, Thomas also distinguishes between the apprehensive powers of the soul, that is, powers such as sense and intellect that are productive of knowledge of some sort, and the appetitive powers of the soul, which are powers that incline creatures to a certain goal or end in light of how objects are apprehended by the senses and/or intellect as desirable or undesirable. The will, according to Thomas, is an appetitive power always linked with the operation of intellect. For Thomas, intellect and will always act in tandem. Since the object of will (that is, what it is about) is being insofar as the intellect presents it as desirable, Thomas thinks of will as rational appetite. The will is therefore an inclination in rational beings towards an object or act because the intellect of that being presents that object or act as something desirable or good in some way.

In addition to the appetitive power of the will, there are appetitive powers in the soul that produce acts that by nature require bodily organs and therefore involve bodily changes, namely, the acts of the soul that Thomas calls passions or affections. These include not only emotions such as love and anger, but pleasure and pain, as well (see, for example, ST IaIIae. q. 31, a. 1).

Thomas thinks there are two different kinds of appetitive powers that produce passions in us, namely, the concupiscible power and the irascible power. The object of the concupiscible power is sensible good and evil insofar as a creature desires/wants to avoid such sensible goods/evils in and of themselves. Thus, the concupiscible power produces in us the passions of love, hate, pleasure, and pain or sorrow. By contrast, the object of the irascible power is sensible good and evil insofar as such good/evil is difficult to acquire/avoid. Thomas therefore associates the passions of anger, fear, and hope with the irascible power.

In contrast to Socrates of Athens, who, according to Thomas, thinks all human virtues are intellectual virtues (see, for example, ST IaIIae. q. 58, a. 2), Thomas distinguishes intellectual and moral virtues since he thinks human beings are both intellectual and appetitive beings. Since virtues are dispositions to make a good use of one’s powers, Thomas distinguishes virtues perfecting the intellect—called the intellectual virtues—from those that perfect the appetitive powers, that is, the moral virtues. Unlike the moral virtues, which automatically confer the right use of a habit, intellectual virtues merely confer an aptness to do something excellently (ST IaIIae. q. 57, a. 1). For example, John might have an intellectual virtue such that he can easily solve mathematical problems. However, John might use such a habit for evil purposes. On the other hand, if John is courageous, he cannot make use of his habit of courage to do what is wrong. If John were to do what is morally wrong, it would be in spite of his moral virtues, not because of them.

Following Aristotle, Thomas mentions five intellectual virtues: wisdom (sapientia), understanding (intellectus), science (scientia), art (ars), and prudence (prudentia). First, there are the purely speculative intellectual virtues. These intellectual virtues do not essentially aim at some practical effect but rather aim simply at the consideration of truth. Understanding is the speculative intellectual virtue concerning the consideration of first principles, that is, those propositions that are known through themselves and not by way of deduction from other propositions, for example, the principle of non-contradiction, and propositions such as all mammals are animals and it is morally wrong to kill an innocent person intentionally. Wisdom is the intellectual virtue that involves the ability to think truly about the highest causes, for example, God and other matters treated in metaphysics. As we saw in the section on the nature of knowledge and science above, science (considered as a virtue) is the intellectual ability to draw correct conclusions from first principles within a particular subject domain, for example, there is the science of physics, which is the ability to draw correct conclusions from the first principles of being qua material being.

Second, there are two intellectual virtues, namely, art and prudence, to which it belongs essentially to bring about some practical effect. Thomas defines art as “right reason about certain works to be made” (ST IaIIae. q. 57, a. 3, respondeo). Art is therefore unlike the first three of the intellectual virtues mentioned—which virtues are purely speculative—since art necessarily involves the practical effect of bringing about the work of art (if I simply think about a work of art without making a work of art, I am not employing the intellectual virtue of ars). Thomas considers art nonetheless to be an intellectual virtue because the goodness or badness of the will is irrelevant where the exercise of art itself is concerned. (Beethoven may or may not have been a morally bad man all the while he composed the 9th symphony, but we need not consider the moral status of Beethoven’s appetites when we consider the excellence of his 9th symphony qua work of art).

Finally, there is prudence. Prudence is the habit that enables its possessor to recognize and choose the morally right action in any given set of circumstances. As Thomas puts it: “Prudence is right reason of things to be done” (ST IaIIae. q. 57, a. 4, respondeo). Prudence is not a speculative intellectual virtue for the same reason ars is not: the human being exercising the virtue of prudence is not simply thinking about an object but engaged in bringing about some practical effect (so, for example, the philosopher who is simply thinking about the right thing to do without actually doing the morally right thing is not exercising the virtue of prudence, even if said philosopher is, in fact, prudent). Prudence also differs from ars in a crucial way: whereas one can exercise the virtue of ars without rectitude in the will, for example, one can bring about a good work of art by way of a morally bad action, one cannot exercise the virtue of prudence without rectitude in the will. Indeed, we do not find prudence in a person without also finding in that person the moral virtues of justice, courage, and temperance. Thus, not only is prudence necessarily practical, its exercise necessarily involves someone (a) habitually acting with a good will and (b) possessing appetites for food, drink, and sex that are habitually measured by right reason.

Why, then, is prudence an intellectual virtue for Thomas? Recall that Thomas thinks that virtue is the perfection of some power of the soul. Thomas therefore thinks the essential difference between the intellectual and moral virtues concerns the kinds of powers they perfect. Intellectual virtues perfect the intellect while moral virtues are perfections of the appetitive powers. Prudence is essentially a perfection of intellect, and so it is an intellectual virtue. Nonetheless, it “has something in common with the moral virtues” (ST IaIIae. q. 58, a. 3, ad 1), Thomas says, insofar as it is concerned with things to be done. This is why, Thomas thinks, prudence is also reckoned among the moral virtues by authors such as Cicero and St. Augustine. Indeed, some philosophers call prudence a “mixed” virtue, partly intellectual and partly moral.

According to Thomas, moral virtue “perfects the appetitive part of the soul by directing it to good as defined by reason” (ST IaIIae. q. 59, a. 4, respondeo). Since the moral virtues are perfections of human appetitive powers, there is a cardinal or hinge moral virtue for each one of the appetitive powers (recall that prudence is the cardinal moral virtue that perfects the intellect thinking about what is to be done in particular circumstances). As has been seen, Thomas thinks there are three appetitive powers: the will, the concupiscible power, and the irascible power. Thus, there are three cardinal moral virtues: justice (which perfects the faculty of will), temperance (perfecting the concupiscible power), and fortitude (perfecting the irascible power). Where prudence perfects intellect itself thinking about what is to be done, justice is intellect disposing the will such that a person is “set in order not only in himself, but also in regard to another” (ST IaIIae. q. 66, a. 4). According to Thomas, temperance is the virtue whereby the passions of touch participate in reason so that one is habitually able to say “no” to desires of the flesh that are not in accord with right reason (ST IaIIae. q. 61, a. 3). Finally, fortitude is the virtue whereby the desire to avoid suffering participates in reason such that one is habitually able to say “yes” to suffering insofar as right reason summons us to do so (ST IaIIae. q. 61, a. 3).

This is just the tip of the iceberg of what Thomas has to say by way of characterizing the human virtues and their importance for the good life. In addition, Thomas has a lot to say about the parts of the cardinal virtues and the virtues connected to the cardinal virtues, not to mention the vices that correspond with these virtues (see, for example, his treatment of these issues in ST IIaIIae).

d. The Logical Relations between the Human Virtues

Virtue ethicists have traditionally been interested in defending a position on the logical relations between the human virtues. For example, we might wonder whether one can really be courageous without also being temperate. Thomas is no exception to this rule. As has been seen, there are two kinds of human virtues, intellectual and moral. Where specifying the relations between the human moral virtues is concerned, Thomas thinks it important to distinguish two senses of human moral virtue, namely, perfect human moral virtue and imperfect human moral virtue (see, for example, ST IaIIae. q. 65, a. 1). An imperfect human moral virtue, for example, imperfect courage, is a disposition such that one simply has a strong inclination or desire to do good deeds, in this case, courageous deeds. Perfect human moral virtues, by contrast, are dispositions such that one is inclined to do good deeds well, that is, in the right way, at the right time, for the proper motive, and so forth. Where imperfect human moral virtues are concerned, these can be possessed independently of the others. For example, Joe is inclined (by nature or by acquired habit) to perform deeds that would be rightly (if loosely) described as just, but Joe is not inclined to virtuous activity where his desires for eating, drinking, and sex are concerned. By contrast, perfect human moral virtues cannot be possessed apart from one another. If Joe is perfectly just, then he also is perfectly temperate. Thomas has two reasons for accepting this “unity of the virtues” thesis. As he notes, these two reasons correspond with two different ways we can distinguish the cardinal virtues from one another (ST IaIIae. q. 65, a. 1, respondeo).

First, we might distinguish the virtues “according to certain general properties of the virtues: for instance, by saying that discretion belongs to prudence, rectitude to justice, moderation to temperance, and strength of mind to courage” (ST IaIIae. q. 65, a. 1, respondeo). Given this way of distinguishing the virtues, discretion is not perfectly virtuous without strength of mind, strength of mind is not virtuous without moderation, and so forth. Thomas notes that it is for this sort of reason that, for example, Pope St. Gregory the Great and St. Augustine believe the unity of the virtues thesis.

Second, we might distinguish the cardinal virtues as Thomas himself prefers to do, after the example of Aristotle, namely, insofar as the different virtues perfect different powers. Given this way of distinguishing the virtues, it still follows that one cannot have any one of the perfect cardinal virtues without also possessing the others. This is because one cannot have courage, temperance, or justice without prudence, since part of the definition of a perfect virtue is acting in accord with rational choice, where rational choice is a function of being prudent. For example, if I am able to act courageously in a given situation, not only does my irascible power need to be perfected, that is, I have to perfectly desire to act rationally when experiencing the emotion of fear, but I need to know just what courageous action calls for in that given situation. For example, it may be that the prudent thing to do in that situation is to “run away in order to fight another day.” However, knowing just what to do in a given situation where one feels afraid is a function of the virtue of prudence. Thus, one cannot be perfectly courageous without having perfect prudence (ST IaIIae. q. 65, a. 1; see also ST IaIIae. q. 58, a. 4).

However, according to Thomas, it is also the case that one cannot be perfectly prudent unless one is also perfectly temperate, just, and courageous. This is because the prudent person has a perfected intellect where deciding on the virtuous thing to do in any given situation is concerned. Such knowledge requires a perfected knowledge about the rational ends or principles of human action, for one cannot perfectly know how to apply the principles of action in a given situation if one does not perfectly know the principles of action. A perfect knowledge of the ends or principles of human action, in turn, requires the possession of those virtues that perfect the irascible appetite, the concupiscible appetite, and the will; otherwise, one will have a less than perfect, that is, a distorted, picture of what ought to be pursued or avoided. For example, if John is a coward, then he will be inclined to think that one always ought to avoid what causes pain. If John is inclined to believe such a thing, then he will not be able to think rightly, that is, prudently, about just what he should do in a particular situation that potentially involves him suffering pain. What goes for courage goes for temperance and justice, too. Therefore, the perfectly prudent person has the perfect virtues of courage, temperance, and justice.

Finally, we can also note that, for Thomas, Joe cannot be perfectly temperate if he is not also perfectly courageous and just (where we are speaking about perfect human virtue). This is because Joe cannot be temperate if he is not also prudent. However, for Thomas, Joe cannot be prudent if he is not also temperate, courageous, and just. Therefore, Joe cannot be temperate if he is not also courageous and just. For the same kinds of reasons, it follows, according to Thomas, that all of the human cardinal virtues come with one another. It is for these sorts of reasons that Thomas affirms the truth of the “unity of the virtues” thesis.

Where perfect human virtue is at issue, what of the relation between the human intellectual virtues and the human moral virtues for Thomas? Since prudence is a mixed virtue—at once moral and intellectual—there is at least one human intellectual virtue that requires possession of the moral virtues and one intellectual virtue that is required for possession of the moral virtues. In addition, since the possession of prudence requires a knowledge of the principles of human action that are naturally known, that is, natural law precepts (see the section on moral knowledge below), and understanding is the virtue whose possessor has knowledge of, among other things, the principles of human action that are naturally known, possession of the moral virtues requires possession of the intellectual virtue of understanding (although one may have understanding without possessing the moral virtues, if only because one can have understanding without prudence).

As for the other intellectual virtues—art, wisdom, and science—none of these virtues can be possessed without the virtue of understanding. To give Thomas’ example, if one does not know a whole is greater than one of its parts—knowledge of which is a function of having the intellectual virtue of understanding—then one will not be able to possess the science of geometry. Aside from its dependence on understanding, the possession of the virtue of art does not require the moral virtues or any of the other intellectual virtues. The possession of science with respect to a particular subject matter seems to be similar to the virtue of art in this regard, that is, although it requires possessing the virtue of understanding, it does not require the possession of moral virtues or any other intellectual virtues.

The possession of the intellectual virtue of wisdom (habitual knowledge of the highest causes) seems to differ for Thomas from science and art insofar as possession of wisdom presupposes the possession of other forms of scientific knowledge (see, for example, SCG I, ch. 4, sec. 3). Nonetheless, as with art and the other sciences, one can possess the virtue of wisdom without possessing prudence and the other moral virtues. That being said, Thomas seems to suggest that possession of the virtue of wisdom is less likely if one lacks the moral virtues (SCG I, ch. 4, sec. 3).

e. Moral Knowledge

In order to make sense of Thomas’ views on moral knowledge, it is important to distinguish between different kinds of moral knowledge, which different kinds of moral knowledge are produced by the (virtuous) working of different kinds of powers.

Thomas thinks that all human beings who have reached the age of reason and received at least an elementary moral education have a kind of moral knowledge, namely, a knowledge of universal moral principles. One place he says something like this is in his famous discussion of law in ST. In that place he argues that there are at least three different kinds of universal principles of the natural law, that is, principles that apply in all times, places, and circumstances, which principles can be learned by reflecting on one’s experiences by way of the natural light of human reason, apart from faith (although Thomas notes that knowledge of these principles often is inculcated in human beings immediately through divinely infused faith [see, for example, ST IaIIae. q. 100, a. 3, respondeo]).

First, there are those universal principles of the natural law that function as the first principles of the natural law, for example, one should do good and avoid evil (ST IaIIae. q. 100, a. 3, respondeo). Such universal principles are known to be true, without fail, by every human person who has reached the age of reason. Of course, most people (unless they are doing theology or philosophy) will not make such principles of practical action explicit. Because they are usually implicit in our moral reasoning, Thomas compares the first principles of the natural law with the first principles of all reasoning, for example, the principle of identity and the principle of non-contradiction.

Second, there are those universal principles of the natural law that, with just a bit of reflection, can be derived from the first principle of the natural law (ST IaIIae. q. 100, a. 3, respondeo). We can call these the secondary universal precepts of the natural law. For example, we all know we should do good and avoid evil. We also know, when we reflect upon it, that failing to honor those who have given us extremely valuable gifts we cannot repay would be to do evil. Now, we all know that our father and mother have given us extremely valuable gifts we cannot repay, for example, life and a moral education. Therefore, we can naturally know that we ought to honor our mother and our father. Of course, most of us do not need to make such reasoning explicit in order to accept such moral principles as absolute prescriptions or prohibitions. Like the first universal principles of the natural law, these secondary universal precepts of the natural law are immediately obvious to us, whether we know them by the natural light of reason, insofar as the truth of such propositions is obvious to us as soon as we understand the meaning of the terms in those propositions, or we immediately know them to be true by the light of faith (see, for example, ST IaIIae. q. 100, a. 1, respondeo). Thomas thinks that (at least abstract formulations of) the commandments of the Decalogue constitute good examples of the secondary, universal principles of the natural law (see, for example, ST IaIIae. q. 100, a. 3, respondeo). To know the primary and secondary universal precepts of the natural law is to have what Thomas calls the human virtue of understanding with respect to the principles of moral action. Moral knowledge of other sorts is built on the back of having the virtue of understanding with respect to moral action. As we have seen, it is possible to have the virtue of understanding (say, with respect to principles of action) without otherwise being morally virtuous, for example, prudent, courageous, and so forth (see, for example, ST IaIIae. q. 58, a. 5).

Third, Thomas thinks there are also universal principles of the natural law that are not immediately obvious to all but which can be inculcated in students by a wise teacher (see, for example, ST IaIIae. q. 58, a. 5; ST IaIIae. q. 100, a. 1, respondeo; and ST IaIIae. q. 100, a. 3, respondeo). We might call this third kind of universal principle of the natural law the tertiary precepts of the natural law. Thomas gives as an example of such a principle a precept from Leviticus 19:32: “Rise up before the hoary head, and honor the person of the aged man,” that is, respect your elders (ST IaIIae. q. 100, a. 1, respondeo). Other examples Thomas would give of tertiary precepts of the natural law are that one ought to give alms to those in need (ST IIaIIae. q. 32, a. 5, respondeo), that one must not intentionally spill one’s seed in the sex act (ST IIaIIae. q. 154, a. 11, respondeo), and that one should not lie with a person of the same sex (ST IIaIIae. q. 154, a. 11, respondeo).

It is easy to be confused by what Thomas says here about natural law as conferring moral knowledge if we think Thomas means that all people have good arguments for their moral beliefs. People sometimes say that they “just see” that something is morally wrong or right. Thomas thinks it is possible to know the general precepts of the moral law without possessing a scientific kind of moral knowledge (which, as has been seen, does require having arguments for a thesis). One way to talk about this “just seeing” that some moral propositions are true is by making reference to what Thomas calls natural law. People do not typically argue their way to believing the general norms of morality, for example, it is wrong to murder, one should not lie. Rather, the truth of these norms is “self-evident” (per se nota) to us, that is, we understand such norms to be true as soon as we understand the terms in the propositions that correspond to such norms (see, for example, ST IaIIae. q. 94, a. 2). Of course, that does not mean that arguments cannot be given for the truth of such norms, at least in the case of the secondary and tertiary precepts of the natural law, if only for the sake of possessing a science of morals. The truth of such basic moral norms is thus analogous to the truth of the proposition “God exists” for Thomas, which for most people is not a proposition one (needs to) argue(s) for, although the theologian or philosopher does argue for the truth of such a proposition for the sake of scientific completeness (see, for example, ST Ia. q. 2, a. 2, ad2).

So far we have simply talked about the fact that, in Thomas’ view, human beings have some knowledge of universal moral principles. However, unless such knowledge is joined to knowledge of particular cases in the moral agent or there is a knowledge of particular moral principles in the agent, then the moral agent will not know what he or she ought to do in a particular circumstance. For example, all human beings know they should seek happiness, that is, they should do for themselves what will help them to flourish. Suppose, however, that in a particular case Joe really wants to go to bed with Mike’s wife. In fact, given his passions and lack of temperance, it seems to Joe that going to bed with Mike’s wife will help him to flourish as an individual human being. That is, it seems good to Joe to commit adultery. Thomas thinks that ordinarily a person such as Joe has knowledge of the universal principles of the natural law, that is, he understands not only that he should not commit adultery but also that committing adultery will not help him flourish. In addition, Joe knows that going to bed with Mike’s wife would be an example of an adulterous act. However, such knowledge can be destroyed or rendered ineffective (and perhaps partly due to Joe’s willingness that it be so) in a particular case by his passion, which reflects the lack of a virtuous moral disposition in Joe, namely, temperance, which would support the judgment of Joe’s reason that adultery is not happiness-conducive. Thus, it may seem genuinely good to Joe to go to bed with Mike’s wife. In this particular case, (we are supposing) Joe lacks effective moral knowledge of the wrongness of going to bed with Mike’s wife. (Again, Joe could be morally responsible for his lack of temperance, and so for his lack of resolve to act in accord with what he knows about the morality of going to bed with Mike’s wife; in that case, his passion would simply render him vincibly ignorant of the principles of this particular case and so would not excuse his moral wrongdoing, although it would make intelligible why he wills as he does.) In order for knowledge of the universal principles of the natural law to be effective, the agent must have knowledge of moral particulars, and such knowledge, Thomas thinks, requires possessing the moral virtues. Without the virtues, a person will have at best a deficient, shallow, or distorted picture of what is really good for himself or herself, let alone others (see, for example, ST IaIIae. q. 58, a. 5, respondeo).

Finally, we should mention another kind of knowledge of moral particulars that is important for Thomas, namely, knowing just what to do in a particular situation such that one does the right thing, for the right reason, in the right way, to the proper extent, and so forth. This is knowledge had by way of the possession of prudence. As we noted above, the knowledge that comes by prudence has the agent’s possession of the other moral virtues as a necessary condition, for the knowledge we are speaking of here is knowing just how to act courageously in this situation; to know this, one must have one’s passions ordered such that, whatever one chooses to do, one knows one always ought to act courageously. However, the prudent person is also able to decide to act in a particular way in a given situation. Such deciding, of course, involves a sort of knowing just what the situation in question calls for, morally speaking. In order for one’s temperance, for example, to be effective, one needs not only to have a habit of desiring food, drink, and sex in a manner consistent with right reason, but one needs to decide how to use that power in a particular situation. For example, the prudent person knows what temperate eating will look like on this given day, at this given time, and so forth. The moral knowledge that comes by prudence is another kind of moral knowledge, Thomas thinks, one necessary for living a good human life.

f. The Proximate and Ultimate Standards of Moral Truth

According to Thomas, the proximate measure for the goodness and badness of human actions is human reason insofar as it is functioning properly, or to put it in Thomas’ words, right reason (recta ratio) (see, for example, ST IaIIae. q. 34, a. 1). Thomas sometimes speaks of this proximate measure of what is good in terms of that in which the virtuous person takes pleasure (see, for example, ST IaIIae. q. 1, a. 7; and ST IaIIae. q. 34, a. 4).

However, since right reason in human beings is a kind of participation in God’s mind (see, for example, ST IaIIae. q. 91, a. 2, respondeo), we can also speak of the mind of God as the ultimate standard for whether a human action is morally good or bad. In fact, given Thomas’ doctrine of divine simplicity, we can say simply that God is the ultimate measure or standard of moral goodness.

One way Thomas speaks about God being the measure of morally good acts is by using the language of law. According to Thomas, God’s idea regarding His providential plan for the universe has the nature of a law (ST IaIIae. q. 91, a. 1; see the section below on political philosophy for more on Thomas on law). This idea of how the universe ought to go, like any other of God’s ideas, is not, in reality, distinct from God Himself, for by the divine simplicity God’s intellect and will are in reality the same as God Himself. God’s own infinite and perfect being (we might even say “God’s character,” if we keep in mind that applying such terms to God is done only analogously in comparison to the way we use them of human moral agents) is the ultimate rule or measure for all creaturely activity, including normative activity. This is why Thomas can say that none of the precepts of the Decalogue are dispensable (ST IaIIae. q. 100, a. 8), for each one of the Ten Commandments is a fundamental precept of the natural law, thinks Thomas. However, it would be a contradiction in terms for God to will that a fundamental precept of the natural law be violated, since the fundamental precepts of the natural law are necessary truths (we could say that they are true in all possible worlds) that reflect God’s own necessary, infinite, and perfect being. For God to will to dispense with any of the Ten Commandments, for example, for God to will that someone murder, would be tantamount to God’s willing in opposition to His own perfection. Since God’s will and God’s perfection (being) are the same, for God to will in opposition to His own perfect being would be a contradiction in terms.

9. Political Philosophy

a. Law

i. The Nature of Law

For Thomas, law is (a) a rational command (b) promulgated (c) by the one or ones who have care of a perfect community (d) for the sake of the common good of that community (ST IaIIae. q. 90, a. 4). First, a law is a command. It is not simply a suggestion or an act of counsel. If John merely suggests a course of action A to Mike, or Mike asks John what to do about some moral decision D, and John merely offers counsel to Mike about what to do where D is concerned, then, all other things being equal, Mike is not morally obligated to perform A or follow John’s advice where D is concerned, even if John is related to Mike as Mike’s moral or political superior. Mike may indeed be likely to perform A or follow John’s advice about D out of fear or out of respect for John, but Mike would not necessarily do something morally wrong if he did not perform A or follow John’s counsel about D. On the other hand, if John commands Mike to do something (and all the other conditions for a law are met), then Mike does something morally wrong if he fails to act in accord with John’s command. According to Thomas, law morally obligates those to whom it is directed. That being said, not all violations of law are equally morally wrong for Thomas. It may be that Susan’s breaking a law in a given situation merely counts as a venial sin. (For the distinction between venial and mortal sin, see the section on infused virtue above.)

A law is also a rational command. That means that, minimally, John’s command must be coherent. In addition, for John’s command to have the force of law, it must not contradict any pre-existing law that has the force of law. Such a pre-existing law could be a higher law. For example, if John (a mere human being) commands that all citizens sacrifice to him as an act of divine worship once a year, Thomas would say that such a command does not have the force of law insofar as (Thomas thinks) such a command is in conflict with a natural law precept that ordains that only divine beings deserve to be worshiped by way of an act of sacrifice. One is not obliged to obey a human being’s ordinance that is in conflict with the commands of a higher power (see, for example, ST IIaIIae. q. 104, a. 5). In his Letter from the Birmingham Jail, Martin Luther King Jr. invokes precisely this aspect of Thomas’ understanding of law in arguing that segregation ordinances are unjust when he notes that, according to Thomas, “an unjust law is a human law that is not rooted in eternal law and natural law” (1963, p. 82).

A command C of a human being could also be in conflict with a pre-existing human law. C would not, in such a case, have the force of law. Take an example: John’s mother commands him to run some errands for her. As John is about to do so, John’s father says to him: “Stop what you’re doing right now and do your homework!” Assuming that John’s mother and father have equal authority in John’s home, and that both of these commands meet all of the other relevant conditions for a law, the command issued by John’s father does not have the force of law for John, since it contradicts a pre-existing law.

Second, commands that get to count as laws must have as their purpose the preservation and promotion of the common good of a particular community. When Thomas speaks about the common good of a community, he means to treat the community itself as something that has conditions for its survival and its flourishing. For example, if a tyrant issues an edict that involves taxing the community’s citizens so heavily that the workers in that community would not be able to feed themselves or their families, such an edict would violate the very purpose of law, since the edict would, in short order, lead to the destruction of the community.

Third, in addition to being a rational command that promotes the common good of a community, a law must be issued by those who have true political authority in that community. There is no need to think that the authority figures in question here have to be political authorities in the sense that we take elected officials or kings to be. Within the confines of a household, for example, parents have the authority to make laws, that is, rational commands that morally obligate those to whom the laws are addressed. It is worth stressing that a command’s being issued by the requisite authority is a necessary but not sufficient condition for that command’s having the force of law. The political authorities in Birmingham, Alabama may have been genuine authorities and enjoyed real power to make laws. However, if Martin Luther King Jr. was right that segregation ordinances were unjust—and so irrational—then such ordinances, despite the fact that they were issued by authorities that were legitimate, did not have the force of law and so did not morally obligate those who, in their conscience, recognized that such segregation ordinances were unjust.

Finally, a command must be promulgated in order to have the force of law, that is, to morally bind in conscience those to whom it is directed. Thomas accepts the principle that ignorance of the law excuses, but not just any kind of ignorance does so. For ignorance comes in at least two varieties, invincible and vincible. If I am invincibly ignorant of p, it is not reasonable to expect me to know p, given my circumstances. For example, say John has been extremely ill for a year, and in that time a law was passed of which, under normal circumstances, John should have made himself aware. Because of John’s circumstances, however, it would be correct to say he remains invincibly ignorant of the law. For John, then, the law does not bind in conscience (at least as long as John remains invincibly ignorant of it). If John were to transgress the law, John would not be morally culpable for such a transgression. On the other hand, someone might really be ignorant of a law but still be culpable for transgressing it. Such a person would be vincibly ignorant of that law. Someone is vincibly ignorant of a law just in case that person does not know about the law but should have taken actions so as to know about it.

ii. The Different Kinds of Law

1. The Eternal Law

In his famous discussion of law in ST, Thomas distinguishes four different kinds of law: eternal, natural, human, and divine. The eternal law is “God’s idea of the government of things in the universe” (ST IaIIae. q. 91, a. 1, respondeo). This description of the eternal law follows Thomas’ definition of law in general, which definition mentions the four causes of law. Recall that, according to Thomas, a law is a rational command (this is a law’s formal cause) made by the legitimate authority of a community (a law’s efficient cause) for the common good of that community (the final cause) and promulgated (the material cause). The community in question here is the whole universe of creatures, the legitimate authority of which is God the creator. In Thomas’ view, God the creator is provident over, that is, governs, his creation (see, for example, ST Ia. q. 22, aa. 1-2). Since God is perfect Being and Goodness itself (see, for example, ST Ia. q. 4, a. 2; and ST Ia. q. 13, a. 2, respondeo), God’s governing of the universe is perfectly good, and so God’s idea of how the universe should be is a rational command for the sake of the common good of the universe.

How does God promulgate the eternal law? God communicates the eternal law to creatures in accord with their capacity to receive it. Now, God’s eternal law is not distinct from God, and God is perfection itself. Therefore, God communicates Himself, that is, perfection itself, to creatures insofar as this is possible, that is, insofar as God creates things as certain reflections of God’s own perfection.

For example, God communicates His perfection to non-rational, non-living creatures insofar as God creates each of these beings with a nature that is inclined to perfect itself simply by exhibiting those properties that are characteristic of its kind. For example, a carbon atom reflects the divine perfection, and so has God’s eternal law communicated to it, insofar as God gives a carbon atom a nature such that it tends to exhibit the properties characteristic of a carbon atom, for example, being such that it can form such and such bonds with such and such atoms, and so forth. God communicates the eternal law to plants not only insofar as God creates plants with a nature such that they tend to exhibit certain properties, each of which is a certain limited reflection of the Creator, but also insofar as plants are inclined by nature to perfect themselves by nourishing themselves, growing, and maturing so as to contribute to the perpetuation of their species through reproduction. Non-rational animals, of course, have all of these perfections plus the added perfection of being conscious of other things, thereby having the eternal law communicated to them in an even more perfect sense than in the case of non-living things and plants. Finally, rational creatures, whether human beings or angels, have the eternal law communicated to them in the most perfect way available to a creature, that is, in a manner analogous to how human beings promulgate the law to other human beings, namely, insofar as they are self-consciously aware of being obligated by said law. In other words, God gives rational creatures a nature such that they can naturally come to understand that they are obligated to act in some ways and refrain from acting in other ways. This reception of the law by rational creatures is what Thomas calls the natural (moral) law (see, for example, ST IaIIae. q. 91, a. 2, respondeo).

2. The Natural Law

More specifically, by natural law Thomas understands that aspect of the eternal law that has to do with the flourishing of rational creatures insofar as it can be naturally known by rational creatures—in contrast to that aspect of the eternal law insofar as it is communicated by way of a divine revelation. (In this section, we are interested in natural law only insofar as it is relevant for the development of a political philosophy; for the importance of natural law where moral knowledge is concerned, see the discussion of that topic in the ethics section above.) To put this another way, the natural law implies a rational creature’s natural understanding of himself or herself as a being that is obligated to do or refrain from doing certain things, where he or she recognizes that these obligations do not derive their force from any human legislator. As we saw Martin Luther King Jr. say above, there are some moral laws that constitute the foundation of any just human society; if such laws are transgressed, or legislated against, we act or legislate unjustly. This set of moral laws that transcends the particularities of any given human culture is what Thomas and King call the natural law.

There is another way to think about natural law in the context of politics that is commensurate with what was said above. As in the case of all creatures, the nature possessed by human beings represents a certain way of participating in God, a certain finite degree of perfection that is therefore limited and imperfect in comparison to God’s absolute, infinite perfection. As Thomas famously says in one place, “The natural law is nothing else than the rational creature’s participation of the eternal law” (ST IaIIae. q. 91, a. 2, respondeo). Now, like all created beings, human beings are naturally inclined to perfect themselves, since their nature is an image of the eternal law, which is absolutely perfect. One way in which all creatures show that they are creatures, that is, created by Perfection itself, is in their natural inclination toward perfecting themselves as members of their species. However, human beings are rational creatures and rational creatures participate in the eternal law in a characteristic way, that is, rationally; since the perfection of a rational creature involves knowing and choosing, rational creatures are naturally inclined to know and to choose, and to do so well. In addition, like other animals, human beings must move themselves (with the help of others) from merely potentially having certain perfections to actually having perfections that are characteristic of flourishing members of their species. Although everything is perfect to some extent insofar as it exists (since existence itself is a perfection that reflects Being itself), actually possessing a perfection P is a greater form of perfection than merely potentially possessing P. Therefore, the natural law is a human being’s natural understanding of his or her inclination to perfect himself or herself according to the kind of thing he or she naturally is, that is, a rational, free, social, and physical being. Thus, we know naturally that we should act rationally, protect life, educate our children, increase liberty for ourselves and others, work for the common good of the community, and, given the precept act rationally, apply all these principles in a rational manner, a manner that reflects a natural understanding that we are animals of a certain sort. We therefore are naturally inclined to pursue those goods that are consistent with human flourishing, as we understand it, that is, the flourishing of a rational, free, social, and animal being. Insofar as we conclude that a particular activity or apparent good is a real good for us, we conclude that it is a good we can, or ought to, seek. Insofar as we see that a particular activity or apparent good undermines human flourishing, we conclude that such an activity or apparent good is something bad and so should not be sought, but rather avoided.

3. The Divine Law

The chief reason the natural law is called natural is that it is that aspect of the eternal law that rational creatures can (given the right sort of circumstances) discern to be true by unaided human reason, that is, apart from a special divine revelation. What human beings can know of God’s eternal law only by way of a special divine revelation from God is what Thomas calls divine law (ST IaIIae. q. 91, a. 4, respondeo and ad2). Thomas also contrasts the divine law with the natural law by noting that the natural law directs us to perform those actions we must habitually perform if we are to flourish in this life as human beings (what Thomas calls our natural end, that is, our end qua created). The divine law, on the other hand, directs us to perform actions that are proportionate with living an eternal life with God (what Thomas calls our supernatural end, that is, our end qua grace and glory). It is not as though the natural law is irrelevant where our supernatural end is concerned since, as Thomas often says, “grace perfects nature; it does not destroy it” (see, for example, ST Ia. q. 1, a. 8, ad2). Therefore, living in a manner that violates the natural law is inconsistent with a human being’s achieving his or her supernatural end too. That being said, to live merely in accord with the natural law is not proportionate to the life that human beings live in heaven, which life, by the grace of God, human beings can, in a limited sense, begin to live even in this life. Thus, one reason God gives the divine law is to instruct human beings about which acts are proportionate to a supernatural life, that is, flourishing in heaven, so as to make human beings fit for heaven (see, for example, ST IaIIae. q. 91, a. 4, respondeo).

4. Human Law and its Relation to Natural Law

Thomas develops his account of human law by way of an analogy (see ST IaIIae. q. 91, a. 3). He posits that the human law is to the natural law what the conclusions of the speculative sciences (for example, metaphysics and mathematics) are to the indemonstrable principles of those sciences. Just as all science begins from premises the truth of which cannot themselves be demonstrated, for example, the law of non-contradiction, and proceeds by the work of reason to particular conclusions, so, in practical matters (such as politics), authorities begin with the knowledge of indemonstrable precepts, for example, good should be rewarded and evil punished and the punishment must fit the crime, and proceed to apply those precepts in light of the particular circumstances, needs, and realities of the communities of which they are the rightful leaders. These particular practical applications of the natural law, as long as they meet the conditions of law, have the force of law. Such laws Thomas calls human laws. For example, the relevant authorities in community A might decide to enact a law that theft should be punished as follows: the convicted thief must return all that was stolen and refrain from going to sea for one day for each ducat that was stolen. On the other hand, community B enacts the following law: the thief will be imprisoned for up to one day for each dollar stolen.

Thomas would want us to notice a couple of things about these human laws. First, neither of these laws follows logically from the precepts of the natural law. Just as one cannot deduce empirical truths from the law of non-contradiction alone, one cannot deduce human laws simply from the precepts of the natural law. That being said, the natural law functions as a kind of control on what can count as a legitimate (morally and legally binding) law. Just as any scientific theory that contradicts itself is not a good theory, although a number of proposed theories meet this minimal condition of rationality, so no binding law contradicts the precepts of the natural law, although there may be any number of proposed human laws that are consistent with the natural law.

Second, notice that the human laws addressing the appropriate punishment of thievery mentioned above reflect the circumstances in which the members of those communities find themselves. For example, say the members of community A belong to a society where sea-faring is important, and so restriction of such sea-faring is appropriately painful. On the other hand, the members of community B, say, do not live in circumstances where it is so important to travel at sea, and so the punishment for thievery reflects that. Some human laws, Thomas thinks, will be different in different times and places, if only because they are enacted in times and places where there are different geographical, moral, political, and religious circumstances and needs.

b. Authority: Thomas’ Anti-Anarchism

Unlike some political philosophers, who see the need for human authority as, at best, a consequence of some moral weakness on the part of human beings, Thomas thinks human authority is logically connected with the natural end of human beings as rational, social animals. Thomas, therefore, rejects anarchism in all of its forms, and he does so for philosophical reasons. Human authority is in itself good and is necessary for the good life, given the kind of thing human beings are. One place where we can see clearly that Thomas holds this position is in his discussion of what human life would have been like in the Garden of Eden had Adam and Eve (and their progeny) not fallen into sin.

In a section of ST where he is discussing what life was (and in some cases would have been) like for the first human beings in the state of innocence, that is, before the Fall, Thomas entertains questions about human beings as authorities over various things in that state of innocence (ST Ia. q. 96). Particularly relevant for our purposes are articles three and four.

In article three, Thomas asks whether all human beings would have been equal in the state of innocence. Thomas answers this question by saying, “In some senses, human beings would have been equal in the state of innocence, but in other senses, they would not have been equal.” Thomas thinks human beings would have been equal, that is, the same, in the state of innocence in two significant senses: (a) all human beings would have been free of defects in the soul, for example, all human beings would have been equal in the state of innocence insofar as none would have sinned, and (b) all human beings would have been free of defects in the body, that is, no human beings would have experienced bodily pain, suffered disease, and so forth in the state of innocence. It is worth mentioning that Thomas believes that the state of innocence was an actual state of affairs, even if it probably did not last very long. However, it certainly could have lasted a long time. In fact, assuming Adam and Eve and their progeny chose not to sin, the state of innocence could have been perpetual or could have lasted until God translated the whole human race into heaven (see, for example, ST Ia. q. 102, a. 4, respondeo).

Interestingly, Thomas thinks that there are a number of different ways in which human beings would have been unequal (by which he simply means, not the same) in the state of innocence. First of all, since God intended there to be families in the state of innocence, some would have been male and others female, since human sexual reproduction, which was intended by God in the state of innocence, requires diversity of the sexes. In addition, some people would have been older than others, since children would have been born to their parents in the state of innocence.

Second, there would have been inequalities having to do with the souls of those in the state of innocence. For example, although none would have had a defect in the soul, some would have had more knowledge or virtue than others. Thomas mentions the following sort of reason: those in the state of innocence had free choice of the will. Thus, some would have freely chosen to make a greater advance in knowledge and virtue than others. In addition, although the first human persons were created with knowledge and all the virtues, at least in habit (see ST Ia. q. 95, a. 3), those born as children in paradise would not have had knowledge and the virtues, being too young (ST Ia. q. 101, aa. 1 and 2). Therefore, adult human persons in the state of innocence would have had more knowledge and virtue than children born in paradise.

Third, since human bodies would not have been exempt from the influence of the laws of nature, the bodies of those in paradise would have been unequal, for example, some would have been stronger or more beautiful than others, although, again, all would have been without bodily defect. Since those in the state of innocence have the virtues—or at the very least, have no defects in the soul—such disparity in knowledge, virtue, bodily strength, and beauty among those in paradise would not have necessarily occasioned jealousy and envy.

In the fourth article in this question on authority in the state of innocence, Thomas asks whether some human beings would be master of other human beings in the state of innocence. In answering this question, Thomas distinguishes two senses of “mastership.” First, there is the sense of “mastership” that is involved in the master/slave relationship. Second, there is a broader sense of “mastership” where one person is in authority over another, for example, a father in relation to his child.

Thomas argues that “mastership” in the first sense would not exist in the state of innocence. According to Thomas, a slave is contrasted with a politically free person insofar as the slave, but not the free person, is compelled to yield to another something he or she naturally desires, and ought, to possess himself or herself, namely, the liberty to order his or her life according to his or her own desires, insofar as those desires are in accord with reason. This provides Thomas with two reasons for thinking there would be no slavery in the state of innocence. First, since all persons naturally desire political freedom, not having it would be painful. However, there is no pain in the state of innocence. Second, all persons ought to enjoy political freedom, and slaves do not have it, so keeping a person enslaved would involve wrongdoing. However, there is no sin in the state of innocence. Therefore, there is no “mastership” in the state of innocence that implies the existence of slavery.

Nonetheless, Thomas argues there would have been human authorities, that is, some human beings governing others, in the state of innocence. Why? Thomas offers two reasons. He begins from the belief that human beings are by nature rational and social creatures, and so would have led a social life with other human beings, ordered by reason, in the state of innocence. This means that, in the state of innocence, human beings would seek not just their own good but the common good of the society of which those individuals are a part. However, where there are many reasonable individuals, there will be many reasonable but irreconcilable ideas about how to proceed on a variety of different practical matters. For the sake of the common good, there must therefore be those who have the authority to decide which of many reasonable and irreconcilable ideas will have the force of law in the state of innocence. Therefore, there would have been some human beings in authority over other human beings in the state of innocence.

Thomas’ second reason that there would have been human authorities in the state of innocence has him drawing on positions he established in ST Ia. q. 96, a. 3. Recall that he argues there that human beings would have been unequal in the state of innocence insofar as some would have been wiser and more virtuous than others. However, it would be unfitting if the wiser and more virtuous did not share their gifts with others for the sake of the common good, namely, as those who have political authority. Given that (as Thomas believes) human beings are not born with knowledge and virtue, it seems obvious that this would have been true in the case of the relation between parents and their children. However, Thomas sees that human authorities would have been necessary and fitting at all levels of society.

Since law is bound up with authority for Thomas, what has been said about authority has an interesting consequence for Thomas’ views on law too. It is not essential to law that there be evil-doers. Given that human beings are rational and social creatures, that is, they were not created to live independently and autonomously with respect to other human beings, even in a perfect society a human society will have human laws. (This also assumes that God has willed to share His authority with others; this is precisely what Thomas thinks; in fact, Thomas thinks that having authority over others is part of what it means to be created in the image of God.) Recall the definition of law—it says nothing about curbing appetites or protecting the innocent. In a world where the strong try to take advantage of the weak, law, of course, does do these things. However, the fact that law protects the weak from the strong is accidental to law for Thomas.

c. The Best Form of Government

Thomas thinks that a just government is one in which the ruler or rulers work(s) for the common good and not simply for the good of one class of citizens. In his view, there are a number of unmixed forms of government that are, in principle, legitimate or just, for example, kingship (regnum), that is, rule by one virtuous man, aristocracy, that is, rule by a few virtuous men, and polity, rule by a large number of citizens. Following Aristotle in Politics, book III, chapter 7, Thomas identifies three unjust forms of unmixed government that are opposed to these just forms, namely, tyranny, that is, rule by one man who looks after his own benefit rather than the common good, oligarchy, that is, rule by a few wealthy men who look after their own good rather than the common good, and democracy, rule by the many poor people for their own good rather than the common good (see, for example, De regno ad regem Cypri, I, ch. 2 [chapter 1 in some editions]).

Of the various just unmixed forms of government, Thomas thinks that a kingship is, in principle, the best form of government. He offers a number of arguments for this thesis. Consider just one of these. Thomas thinks the chief concern of a good ruler is to secure the unity and peace of the community. Therefore, the better able a form of government is to secure unity and peace in the community, the better is that form of government, all other things being equal. What itself has the nature of unity and peace is better able to secure unity and peace than what is many. However, kingship has the nature of unity and peace more so than rule by many men (whether or not these men are virtuous; recall from our discussion of authority above that Thomas does not think that a group of virtuous people will necessarily agree on a course of action). Therefore, all other things being equal, kingship is better able to secure unity and peace than rule by many. Therefore, kingship is the best unmixed form of government (De regno, book I, ch. 3 [ch. 2]; compare this argument with Thomas’ argument at SCG IV, ch. 76 that there needs to be one bishop, that is, the Pope, functioning as the visible head of the Church in order to secure the unity and peace of the Church).

Thomas is aware of the possibility that a good man can become a tyrant (De regno, book I, ch. 7 [ch. 6 in some editions]). Furthermore, since the contrary of the best is the worst, and tyranny is the contrary of kingship, tyranny is the worst form of government (De regno, book I, ch. 4 [ch. 3 in some editions]). Thomas therefore thinks kingship should be limited in a number of ways in order to ensure a ruler will not be(come) a tyrant.

First, in a limited kingship the king is selected by others who have the authority to do so (De regno, book I, ch. 7 [ch. 6]); such authorities should choose a king with a moral character such that it is unlikely he will become a tyrant. In one place Thomas speaks of an ideal situation where the king is selected from among the people, presumably for his virtue, and by the people (ST IaIIae q. 105, a. 1, respondeo). Second, in order to ensure the king does not become a tyrant, the government should be arranged (and its constitution written) so as to limit the power of the king (De regno, book I, ch. 7 [ch. 6]). Finally, Thomas thinks kingship ideally should be limited in that the community has a right to depose or restrict the power of the king if he becomes a tyrant (De regno, book I, ch. 7 [ch. 6]). Although early in his career he seems to sanction tyrannicide (In Sent. Book II, d. 44, qu. 2, ad 5), by the time he writes De regno (book I, ch. 7 [ch. 6]) Thomas rejects that view not only as imprudent, but also as inconsistent with the teaching of the Apostles (compare 1 Peter 2:19). Rather, those who have the authority to appoint the king have the authority and responsibility to depose him if need be (De regno, book I, ch. 7 [ch. 6]). If no human authorities can or are willing to help a community ruled by a tyrant, Thomas counsels that the people should have recourse to God. However, in doing so, they should first look to expiating their own sins, since God sometimes allows a people to be ruled by the impious as a punishment for sin (De regno, book I, ch. 7 [ch. 6]).

Notably, in one place in ST, Thomas argues that a certain kind of mixed government is really the best form of government (ST IaIIae. q. 105, a. 1, respondeo). Thomas notes there that both Aristotle (Politics, book iii) and divine revelation (Deuteronomy 1:15; Exodus 18:21; and Deuteronomy 1:13) agree that the ideal form of government combines kingship, aristocracy, and democracy insofar as one virtuous man rules as king, the king has a few virtuous men under him as advisors, and not only are all eligible to govern (the virtuous can come from the populace and not simply from the wealthy class), but all also participate in governance insofar as all participate in choosing who will be the king.

Thomas argues that this form of mixed government—part kingship, part aristocracy, and part democracy—is the best form of government as follows. As Aristotle states in Politics ii, 6, a form of government where all take some part in the government ensures peace among the people, commends itself to all, and is most enduring. However, a form of government that ensures peace among the people, commends itself to all, and is most enduring is, all other things being equal, the best form of government. Therefore,

(G1) A form of government where all take some part in the government is, all other things being equal, the best form of government.

However, given the soundness of the kind of argument for the superiority of kingship as a form of government we noted above, and the importance of virtuous politicians for a good government, we have the following:

(G2) The best non-mixed form of government is kingship.

(G3) The second-best form of non-mixed government is an aristocracy.

Now, there is a mixed form of government (call it a limited kingship or limited democracy) that is part kingship, since a virtuous man presides over all, part aristocracy, since the king takes to himself a set of virtuous advisors and governors, and part democracy, since the rulers can be chosen from among the people and the people have a right to choose their rulers.

However, there is no form of government other than a limited kingship or limited democracy that takes the truths of (G1), (G2), and (G3) into account. Therefore, the best form of government is a limited kingship or limited democracy. Thus, interestingly, we have in Thomas a 13th-century theologian advocating for a limited form of democracy as the best form of government.

10. References and Further Reading

a. Thomas’ Works

Thomas authored an astonishing number of works during his short life. Other than the first entry below, which cites the ongoing project of providing a critical edition of Thomas’ Opera Omnia (entire body of work), the entries mentioned here are those works of Thomas’ cited in the body of this article. For a complete list of Thomas’ works, see Torrell 2005, Stump 2003, or Kretzmann and Stump 1998.

  • Opera Omnia (Complete Works), 1248-1273. Ed. Leonine Commission, S. Thomae Aquinatis Doctoris Angelici. Opera Omnia. Iussu Leonis XIII, P.M. edita, Rome: Vatican Polyglot Press, 1882- (on-going).
  • De principiis naturae, ad fratrem Sylvestrum (On the Principles of Nature, for Brother Sylvester), 1248-1252 or 1252-1256.
    • English translation: Eleonore Stump and Stephen Chanderbhan, trans. In The Hackett Aquinas: Basic Works. Jeffrey Hause and Robert Pasnau, eds. (Indianapolis: Hackett Publishing Company, 2014), pp. 2-13.
  • De ente et essentia, ad fratres et socios suos (On Being and Essence, for His Brothers and Companions), 1252-1253.
    • English translation: Peter King, trans. In The Hackett Aquinas: Basic Works. Jeffrey Hause and Robert Pasnau, eds. (Indianapolis: Hackett Publishing Company, 2014), pp. 14-35.
  • Scriptum super libros Sententiarum (Commentary on [Lombard’s] Sentences), 1252-1256.
  • Quaestiones disputatae de veritate (Disputed Questions on Truth), 1256-1259.
    • English translation: Mulligan, Robert W., James V. McGlynn, and Robert W. Schmidt, trans. Truth. 3 vols. Library of Living Catholic Thought (Chicago: Regnery, 1952-1954; reprint, Indianapolis: Hackett, 1994).
  • Beata gens (Sermon on the Feast of All Saints, the First of November), ca. 1256-1259 or 1268-1272?
    • English translation: Mark-Robin Hoogland, trans. In The Fathers of the Church: Medieval Continuation. Vol. II. Thomas Aquinas: The Academic Sermons (Washington, DC: The Catholic University of America Press, 2010), pp. 295-312.
  • Expositio super librum Boethii De trinitate (Commentary on Boethius’ De trinitate), 1257-1258 or 1259 (incomplete).
    • English translation: Maurer, Armand, trans. Faith, Reason and Theology: Questions I-IV of his Commentary on the De Trinitate of Boethius. Mediaeval Sources in Translation, 32 (Toronto: Pontifical Institute of Mediaeval Studies, 1987). Maurer, Armand, trans. The Division and Methods of the Sciences: Questions V and VI of his Commentary on the De Trinitate of Boethius. 4th rev. ed. Mediaeval Sources in Translation, 3 (Toronto: Pontifical Institute of Mediaeval Studies, 1986).
  • Summa contra gentiles (Synopsis [of Christian Doctrine] Directed against Unbelievers) [SCG], 1259-1265.
    • English translation: Pegis, Anton C., James F. Anderson, Vernon J. Bourke, and Charles J. O’Neil, trans. Summa contra gentiles (1955; reprint, Notre Dame, IN: University of Notre Dame Press, 1975).
  • Glossa continua super Evangelia (Catena aurea) (A Continuous Gloss on the Evangelists [collected from the writings of the Church Fathers]), 1262-1265 (Matthew); 1265-1268 (Mark, Luke, and John).
    • English translation: M. Pattison, J. D. Dalgairns, and T. D. Ryder, trans. John Henry Newman, ed. 4 vols. (1841-1845; reprint, Boonville, NY: Preserving Christian Publications, 2009).
  • Expositio super Iob ad litteram (Literal Commentary on Job), 1263-1265.
    • English translation: Yaffe, Martin D., and Anthony Damico, trans. The Literal Exposition on Job: A Scriptural Commentary Concerning Providence. Classics in Religious Studies, 7 (Atlanta, GA: Scholars Press, 1989).
  • Expositio et lectura super Epistolas Pauli Apostoli (Commentary and lectures on the Epistles of Paul the Apostle), 1263-1265 (1 Cor. 11 to Philemon); 1271-1272 (Romans), and 1272-1273 (Hebrews).
    • English translations: multiple.
  • Officium de festo Corporis Christi ad mandatum Urbani Papae (The Office of the Feast of the Body of Christ, Commissioned by Pope Urban), 1264.
    • English translation: The Aquinas Prayer Book: The Prayers and Hymns of St. Thomas Aquinas, R. Anderson and J. Moser, trans. (Manchester, NH: Sophia Institute Press, 2000).
  • Adoro te devote (Hymn) (Humbly I Adore Thee), 1264 or 1274.
    • English translation: The Aquinas Prayer Book: The Prayers and Hymns of St. Thomas Aquinas, R. Anderson and J. Moser, trans. (Manchester, NH: Sophia Institute Press, 2000).
  • Quaestiones disputatae de potentia (Disputed Questions on [the] Power [of God]), 1265-1266.
    • English translation: The English Dominican Fathers, trans. (1932; reprint, Eugene, OR: Wipf and Stock, 2004).
  • Compendium theologiae, ad fratrem Reginaldum socium suum (A Compendium of Theology, for Brother Reginald, his Companion), 1265-1267 (incomplete).
    • English translation: Vollert, Cyril, trans. Light of Faith: The Compendium of Theology (1947; reprint, Manchester, NH: Sophia Institute, 1993).
  • Expositio super librum Dionysii De divinis nominibus (Commentary on Pseudo-Dionysius’ De divinis nominibus), 1265-1268.
    • English translation: Marsh, Harry C., trans. “A Translation of Thomas Aquinas’ In Librum beati Dionysii de divinis nominibus expositio.” In his “Cosmic Structure and the Knowledge of God: Thomas Aquinas’ In Librum beati Dionysii de divinis nominibus expositio,” 265–549. Ph.D. diss. (Vanderbilt University, 1994).
  • Summa theologiae (Synopsis of Theology) [ST], 1265-1268 (Prima Pars); 1271 (Prima Secundae); 1271-1272 (Secunda Secundae), and 1271-1273 (Tertia Pars) (incomplete).
    • English translation: Fathers of the English Dominican Province, trans. (1911; reprint, Allen, TX: Christian Classics, 1981).
  • Quaestiones disputatae de anima (Disputed Questions on the Soul), 1266-1267.
    • English translation: Robb, James H., trans. Questions on the Soul. Mediaeval Philosophical Texts in Translation, 27 (Milwaukee: Marquette University Press, 1984).
  • De regno [or De regimine principum], ad regem Cypri (On Kingship [or On the Governance of Rulers], for the King of Cyprus), 1266-1267.
    • English translation: Phelan, Gerald B., and I.T. Eschmann, trans. On Kingship to the King of Cyprus. Mediaeval Sources in Translation, 2 (Toronto: Pontifical Institute of Mediaeval Studies, 1949).
  • Sententia super De anima (Commentary on Aristotle’s De anima), 1267-1268.
    • English translation: Pasnau, Robert C., trans. Commentary on Aristotle’s De anima (New Haven: Yale University Press, 1999).
  • Expositio Libri Physicorum (Commentary on Aristotle’s Physics), 1268-1270.
    • English translation: Blackwell, Richard J., Richard J. Spath, and W. Edmund Thirlkel, trans. Commentary on Aristotle’s Physics. Rare Masterpieces of Philosophy and Science (New Haven: Yale University Press, 1963; reprint, Aristotelian Commentary Series. Notre Dame, IN: Dumb Ox Books, 1999).
  • Quaestiones disputatae de malo (Disputed Questions on Evil), 1269-1271.
    • English translation: Trans. Jean Oesterle (Notre Dame, IN: The University of Notre Dame Press, 1995).
  • Expositio Libri Peryermenias (Commentary on Aristotle’s De interpretatione), 1270-1271.
    • English translation: Oesterle, Jean, trans. Aristotle on Interpretation: Commentary by St. Thomas and Cajetan. Mediaeval Philosophical Texts in Translation, 11 (Milwaukee: Marquette University Press, 1962. Reprinted, with a new introduction, as Commentary on Aristotle’s On Interpretation, Notre Dame, IN: Dumb Ox Books, 2004).
  • Sententia super Metaphysicam (Commentary on Aristotle’s Metaphysics), 1270-1273.
    • English translation: Rowan, John P., trans. Commentary on the Metaphysics of Aristotle. 2 vols (Chicago: Regnery, 1964; reprinted in one volume with revisions as Commentary on Aristotle’s Metaphysics, Aristotelian Commentary Series, Notre Dame, IN: Dumb Ox Books, 1995).
  • De aeternitate mundi, contra murmurantes (On the Eternity of the World against Murmurers), 1271.
    • English translation: In St. Thomas, Siger de Brabant, and St. Bonaventure, On the Eternity of the World, Cyril Vollert, Lottie Kendzierski, and Paul M. Byrne, trans. Mediaeval Philosophical Texts in Translation, 16 (Milwaukee: Marquette University Press, 1964).
  • Sententia libri Ethicorum (Commentary on Aristotle’s Nicomachean Ethics), 1271-1272.
    • English translation: Litzinger, C.I., trans. Commentary on the Nicomachean Ethics. 2 vols. Library of Living Catholic Thought (Chicago: Regnery, 1964; reprinted in 1 vol. with revisions as Commentary on Aristotle’s Nicomachean Ethics, Aristotelian Commentary Series, Notre Dame, IN: Dumb Ox Books, 1993).
  • Expositio super librum Boethii De hebdomadibus (Commentary on Boethius’ De hebdomadibus), 1271-1272?
    • English translation: Schultz, Janice L., and Edward A. Synan, trans. An Exposition of the ‘On the Hebdomads’ of Boethius. Thomas Aquinas in Translation (Washington, DC: The Catholic University of America Press, 2001).
  • Expositio super librum De causis (Commentary on Liber de causis), 1272-1273.
    • English translation: Guagliardo, Vincent A., Charles R. Hess, and Richard C. Taylor, trans. Commentary on the Book of Causes. Thomas Aquinas in Translation (Washington, DC: The Catholic University of America Press, 1996).

b. Secondary Sources and Works Cited

The secondary literature on Thomas is vast. Here follow just a few important studies of Thomas’ thought in English that will be particularly helpful to someone who wants to learn more about Thomas’ philosophical thought as a whole. Also included in this section are works cited within the article (other than Thomas’ own).

  • Artigas, Mariano. The Mind of the Universe: Understanding Science and Religion (Philadelphia: Templeton Foundation Press, 2000).
  • Chesterton, G. K. The Dumb Ox (New York: Image Books, 1956).
    • Originally published in 1933, this is a wryly written study by the famous English journalist that attempts to convey the spirit and significance of Thomas’ thought. The eminent 20th-century Thomas scholar Etienne Gilson once called it “the best book ever written on St. Thomas.” The book is readily available in many different editions.
  • Clarke, W. Norris. The One and the Many: A Contemporary Thomistic Metaphysics (Notre Dame: University of Notre Dame Press, 2001).
    • An excellent attempt to articulate Thomas’ metaphysical views in light of the phenomenological and personalist traditions of 20th-century philosophy.
  • Copleston, F.C. Aquinas. (London: Penguin Books, 1955).
    • A still classic study that attempts to explain Thomas’ views with an eye toward analytic philosophical idioms.
  • Davies, Brian. The Thought of Thomas Aquinas (Oxford: Clarendon Press, 1992).
    • A clear and philosophically interesting summary of Thomas’ theological and philosophical thought, one that follows the structure of Thomas’ Summa theologiae.
  • Davies, Brian and Eleonore Stump, eds. The Oxford Handbook of Aquinas (Oxford: Oxford University Press, 2012).
    • A recent and excellent collection of scholarly articles on all aspects of Thomas’ thought.
  • Eberl, Jason. The Routledge Guidebook to Aquinas’ Summa Theologiae (London: Routledge, 2015).
    • A close reading and explanation of the philosophical views contained in Thomas’ greatest work.
  • Feser, Edward. Aquinas: A Beginner’s Guide (Oxford: Oneworld, 2009).
    • Despite the title, this is a sophisticated, very readable, articulation and defense of ideas central to Thomas’ thought.
  • Gilson, Etienne. The Christian Philosophy of St. Thomas Aquinas. Trans. L. K. Shook (1956; reprint, Notre Dame, IN: University of Notre Dame Press, 1994).
    • A classic study by the famous 20th-century Thomist and scholar of medieval philosophy. Among other things, Gilson argues that Thomas’ concept of actus essendi is the key to understanding his thought and its unique contribution to the history of Western philosophy.
  • King, Jr., Martin Luther. “Letter from the Birmingham Jail,” in Why We Can’t Wait (New York: Signet Books, 1963).
  • Kretzmann, Norman and Eleonore Stump, eds. The Cambridge Companion to Aquinas (Cambridge: Cambridge University Press, 1993).
    • An excellent collection of scholarly introductions to all the major facets of Thomas’ thought.
  • Pieper, Josef. A Guide to Thomas Aquinas. Trans. Richard and Clara Winston (San Francisco: Ignatius, 1991).
    • Gives a helpful introduction to Thomas’ thought by way of clearly presenting the historical context in which Thomas lived and taught.
  • Plantinga, Alvin. Warranted Christian Belief (New York: Oxford University Press, 2000).
  • Rota, Michael W. “What Aristotelian and Thomistic philosophy can contribute to Christian theology,” in Theology and Philosophy: Faith and Reason, eds. O. Crisp, G. D’Costa, M. Davies, and P. Hampson (London: T & T Clark, 2012), pp. 102-115.
  • Stump, Eleonore. Aquinas. Arguments of the Philosophers (London: Routledge, 2003).
    • A detailed presentation of Thomas’ philosophical thought, one that articulates and defends Thomas’ views in light of contemporary analytic philosophical discussions in metaphysics, epistemology, the philosophy of religion, the philosophy of mind, and ethics.
  • Torrell, Jean-Pierre. Aquinas’s Summa: Background, Structure, and Reception. Trans. Benedict M. Guevin (Washington, DC: The Catholic University of America Press, 2005).
    • Helpfully explains the context, content, and the history of the reaction to Thomas’ greatest work.
  • Van Inwagen, Peter. Metaphysics. 4th ed. (Boulder, CO: Westview Press, 2015).

c. Bibliographies and Biographies

  • Ingardia, Richard. Thomas Aquinas: International Bibliography 1977-1990 (Bowling Green, KY: The Philosophical Documentation Center).  
  • Kretzmann, Norman and Eleonore Stump. “Aquinas, Thomas,” in The Routledge Encyclopedia of Philosophy. Vol. 1. Edward Craig, ed. (London: Routledge, 1998), pp. 326-350.
    • A scholarly, concise, and very informative account of Thomas’ life and works. Also contains a good bibliography.
  • Miethe, T. L. and Vernon Bourke. Thomistic Bibliography 1940-1978 (Westport, CT: Greenwood Press, 1980).
  • Torrell, Jean-Pierre. Saint Thomas Aquinas: The Person and His Work. Trans. Robert Royal. Revised Edition (Washington, DC: The Catholic University of America Press, 2005).
    • The most up-to-date, scholarly, book-length treatment of Thomas’ life and works.
  • Tugwell, Simon. Albert and Thomas: Selected Writings. The Classics of Western Spirituality (Mahwah, NJ: Paulist Press, 1988).
    • The introduction to this work contains a concise and helpful account of Thomas’ life and works.
  • Weisheipl, J. Friar Thomas D’Aquino: His Life, Thought, and Works (Washington, DC: The Catholic University of America Press, 1983).
    • A classic study, which is nonetheless superseded by Torrell 2005.

 

Author Information

Christopher M. Brown
Email: chrisb@utm.edu
University of Tennessee at Martin
U. S. A.

Scotus: Knowledge of God

Any discussion of John Duns Scotus (1266-1308) on our knowledge of God has to be a discussion of Scotus’s thesis that we have concepts univocal to God and creatures. By this, Scotus means that some one idea can equally represent both God and other types of things. This is striking even to modern ears and was perhaps more so for Scotus’s contemporaries. There are religious objections. Some call Scotus an idolater. But beyond this, as Scotus himself pointed out, the metaphysical ramifications of his thesis threaten to “destroy all philosophy.” By this, he means Aristotle’s thought, which did much to set the philosophical terrain of the thirteenth century. For Aristotle, words that refer to things that are different yet somehow related are analogical, words like ‘healthy’ said of both persons and medicine. Medievals adopted Aristotle’s scheme to make sense of the meaning of religious language, which uses words like ‘good’ to talk about God and creatures. For thinkers in the Latin West, concerns did not focus so much on whether God talk is analogical as they did on exactly what type of analogy was at play. Imagine the reception when Scotus insisted that analogy (and with analogy, religious language) in fact rests tacitly on concepts univocal to God and creatures. In an Aristotelian universe, this would seem to require that God and creatures really do have something in common, that they differ only in kind, like cats and persons. But everyone, including Scotus, agreed that this was not so. Hence, Scotus and Aristotle would seem to be irremediably at odds with one another and Scotus would indeed destroy all philosophy. As bad as things seem for Scotus, these difficulties recede in light of the fact that his univocity thesis is about religious language, not things. Yes, we can think of radically distinct types of things using just one concept, but this does not mean that they really share in some feature. Thoughts and things need not line up that neatly.

This is the way that it is for Scotus. The univocal concepts that represent God and creatures are high-level abstractions, mental constructs formed through experience and conceived apart from the limits that attended the things that gave rise to them. These concepts are sufficiently vague to conceive God and creatures, provided that we see the concepts for the abstractions that they are. They really do not refer to anything, because every being is either finite (creatures) or infinite (God), and this makes all the difference. Scotus recognizes that the complex concept formed when a univocal concept is linked with the concept of God’s infinite being is about something that is metaphysically distinct from any creature. But the genesis of the concept lies in concepts that are of creatures, and creatures imitate God as effects imitate their causes. Therefore, to the extent that imperfect creatures imperfectly imitate the perfect creator, univocal concepts are of God. But only to this extent, which medieval thinkers, including Scotus, agree falls far short of the perfection of the divine essence.

Table of Contents

  1. Introduction
  2. Preliminaries
    1. Scotus’s Writings and Early Thought
    2. The Aristotelian Paradigm: Thirteenth-Century Categorial Metaphysics
  3. Contemporary Scholarship
  4. Scotus on our Natural Knowledge of God
  5. Univocity
    1. Univocity and Natural Theology
    2. Illumination Theory and Abstraction
    3. Analogy and Univocity
  6. Metaphysics as Natural Theology
    1. Metaphysics and the Transcendentals
    2. Does Scotus “Destroy All Philosophy?”
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

John Duns Scotus (1266-1308) defends the existence of concepts univocal to God and creatures (on the medieval understanding of concepts, see below, section 2a), most importantly the concept of being. In so doing, Scotus helped to expand medieval thinkers’ understanding of the scope of metaphysics, which, following Aristotle, was conceived of as the science of being qua being or being as such (Metaphysics (Metaph.) 4.1). Because for Scotus being as such pertains to both God and creatures, he thinks of metaphysics as a natural theology (Wolter 1946). More specifically, Scotus believes that certain attributes characterize everything that exists (examples of such attributes include one, true and good) (Ordinatio (Ord.) I, d. 8, q. 3, Vat. IV). These transcendental attributes transcend the then generally accepted classification of types of beings (see below, sections 2b and 6a) and thereby apply to everything. Metaphysics is then for Scotus the science of the transcendentals.

Scotus’s work was vulnerable to a variety of objections on several fronts. First, on strictly philosophical grounds, medievals did not think that there was any concept broad enough to take in everything (see below, section 2b and King 2003). Again, were God similar to and different from creatures, God would be metaphysically complex. For medieval thinkers, this would mean that God is not God, but rather a contingent thing like any other and hence unable to function as the uncaused cause of all things. Scotus recognizes that his univocity thesis threatens “to destroy all philosophy” (Lectura (Lect.) 1, d. 3, n. 105) and is at pains to answer these various charges. His strategy is to insist that the univocity thesis is about concepts, not things (Lect. 1, d. 8, n. 129, Vat. XVII:46). More specifically, the informational or intensional content of concepts that are common to God and creatures does not involve the respectively infinite and limited being of either. The concepts really do not apply to anything until these considerations are introduced (or, strictly speaking, reintroduced in the case of creatures, whence concepts are derived). This introduction renders a composite concept comprising the univocal concept and a concept of degree (limited or unlimited as regards creatures or God, respectively). This composition does not alter the content of the univocal concept, which therefore carries the same content in each application and, with that, the unity of meaning that is requisite for it to support the types of sound proof that interest most theologians (see below, section 5a). As the thesis is about how we talk about things and not how they are, univocal terms can apply to all things, and the fact that we use these terms to speak of God does not entail any real metaphysical complexity of the divine essence. But, inasmuch as univocal concepts are drawn from creatures and creatures imperfectly represent or imitate God (Ord. I, d. 3, pt. 1, qq. 1-2, n. 56, Vat. III:38-39; pt. 2, q. 2, q. un., n. 294, Vat. III:179), univocal concepts are also of God (see below, sections 3 and 6b). The conceptual apparatuses that Scotus deploys in support of his thesis regarding religious language along with his construal of metaphysics as a natural theology have deeply influenced the Western tradition. For instance, Scotus’s thesis that things that do not share in any real feature may fall under one univocal concept opens the door to William of Ockham’s nominalism, which allows for universal concepts absent common natures (see Summa Logicae I.14 and Klima 2010). Scotus’s influence is likewise seen in Francisco Suárez, David Hume, Immanuel Kant, Charles Sanders Peirce and Martin Heidegger, to name just a few.

Scotus’s earliest disciples were divided over how to understand the Subtle Doctor on the key issue of the univocity of being. Antonius Andreas (d. 1320 C.E.), for instance, looks for some reality common to God and creatures as the real basis for Scotus’s univocal concepts; whereas Peter of Navarre (d. 1347 C.E.) finds in Scotus a weak sense of univocity, whereby concepts univocal to God and creatures are such only inasmuch as they are so indifferent as to apply properly to neither. Peter thereby works to bring Scotus’s thought into line with the common opinion (which he links to Thomas Aquinas, d. 1274 C.E.) that denies that there are any concepts univocal to God and creatures, maintaining rather that theological discourse is analogical (Dumont 1992). This view recognizes that any idea that we have of God is proper rather to creatures, because ideas are in their genesis of creatures, not God. In light of this generally empiricist outlook, these theologians acknowledge that religious language is analogical. As creatures imitate or represent the creator, we are justified in assigning various perfections to God. But, creatures are limited and imperfect. Therefore, barring any special revelation, our grasp of God in this life must be inaccurate (see below, section 5c).

Debates such as those that unfolded soon after his death over whether Scotus held that concepts univocal to God and creatures pick out a reality common to both persist, though a consensus has emerged that Scotus’s univocal concepts are vague abstractions that apply properly to neither God nor creatures absent certain modal considerations relevant to finite and infinite being that serve to delimit the scope of these concepts (Cross 2001). Scotus was led to this account out of his belief that (1) analogy (and with analogy, religious language) tacitly relies on univocity (see below, section 5c) and (2) God and creatures do not share in any common reality.

Two facts hamper scholars’ efforts to present a clear picture of Scotus’s considered opinion on the univocity of being: his thought on the matter shifts over the course of his career, and his sudden passing in 1308, around the age of 43, left the bulk of his writings in a state of partially edited disarray that can obscure its development, thereby lending credence to conflicting readings. Recent scholarship has done much to ameliorate this difficulty, but an understanding of Scotus’s development must also be parsed along the lines of the Aristotelian categorial metaphysics that dictate his approach. Accordingly, section 2 of this essay is devoted to such preliminary considerations. Section 2a lays out a chronology of Scotus’s writings, presenting his early view on the univocity of being (which appears to have reflected the view then standard amongst Oxford scholars). This shows where Scotus’s mature thought on univocity is to be found and helps to resolve difficulties posed by the decidedly mixed and confused treatment that the univocity thesis receives in Scotus’s commentary on Aristotle’s Metaphysics. Section 2b is a study of the thirteenth-century Aristotelian categorial metaphysics that led medieval theologians to rely on analogy to speak of God and creatures and determined the trajectory of Scotus’s thought.

2. Preliminaries

a. Scotus’s Writings and Early Thought

What we take to be Scotus’s mature position on the univocity of the concepts by which we conceive both God and creatures is drawn primarily from his Ordinatio. (Scotus’s Ordinatio is a revised version of the lectures on the Sentences of Peter Lombard (d. 1160) that Scotus delivered in partial fulfillment of the curriculum of the faculty of Theology. Scotus was at work revising his Ordinatio up until his death.) Yet, Scotus’s thought on the univocity of concepts appears to shift over the course of his career. Whereas the Ordinatio presents Scotus’s considered opinion, other texts (such as his earliest, logical writings) conform to what we now know to be the standard, mid-thirteenth-century Oxford tradition that looks on the term ‘being’ as either equivocal (from the point of view of the logician, who deals with concepts as such, that is, independently of the entities they conceive) or analogous (for the metaphysicians and natural philosophers (that is, physicists), who consider mind-independent realities) (Pini 2005a). Hence there is a striking dissonance between early and late Scotus as regards the univocity of the concept of being, and we find Scotus in his early-period commentary on Aristotle’s Categories (construed by medievals as a work on the properties of terms) stating that the term ‘being’ is in fact equivocal or analogical, from the respective standpoints of logic, on the one hand, and metaphysics and physics, on the other:

This term ‘being’ is simply equivocal . . . . However an utterance that for the logician is simply equivocal . . . is analogous for the metaphysician or the natural philosopher . . . Thus . . . ‘being’ is posited by the metaphysician as analogous . . . But for the logician, it is simply equivocal (Questions on the “Categories” of Aristotle Q. 4, sect. 37-38, trans. mine). (Unless otherwise noted, all translations are my own.)

Medieval semiotic theories develop along the lines laid down by Aristotle in his logical treatise on the properties of statements (the Perihermenias), better known today by its Latin title De Interpretatione. The first chapter presents a semiotic that remained open to a variety of interpretations throughout the Middle Ages as to whether it construes words primarily as signs of ideas or of extramental things. Scotus allows that there are good reasons to hold either opinion and leaves open the matter (Quaestiones in libros Perihermenias Aristotelis 1.2.51; see also Buckner and Zupko 186-87; Read 16-18). Aristotle’s account is as follows:

Spoken sounds are symbols of affections in the soul, and written marks symbols of spoken sounds . . . What these [that is, spoken sounds] are in the first place signs of – affections in the soul – are the same for all; and what these affections are likenesses of – actual things – are also the same (trans. Ackrill 16a3-8).

Affections in the soul, or concepts, are for medievals the components of mental propositions. Ultimately, concepts are traceable to impressions that things make on our minds and hence the medieval account is an empiricist one. Concepts are likenesses of the entities that they represent and are thereby that by which things are cognitively present to one who conceives. The property of a term whereby it brings to mind a concept is described by medieval thinkers as the term’s signification (Read 9-10).

Logic, for Scotus, considers, among other things, the signification of terms. Signification should not be understood in light of our contemporary notion of meaning, which allows us to speak loosely of the meaning of a term as what it brings to mind. The meaning of a term can be looked up, whereas an expression’s signification is ultimately traceable to something. Again, Scotus conceives of signification as a mental event, a state (or, in scholastic terminology, derived from Aristotle, an ‘accident’) of the mind through which an entity is cognitively present for the person who is conceiving (Pini 2015). On this view, significative utterances are said to be either univocal or equivocal depending on what concepts they bring to mind. Drawing from Aristotle’s account in Categories 1, univocal (or synonymous) things have a name and definition or account in common. By extension, a term is univocal over two or more entities when it signifies the same idea for each. Again, things are equivocal (or homonymous) when they only have a name in common, the corresponding definition or account differing in each case. Accordingly, an equivocal term signifies differently in its different applications. Hence the term ‘man’ is univocal to Socrates and Plato, whereas it is equivocal when applied to Socrates and a painted image. As this example suggests, equivocity is not always the result of happenstance, absent any real relation, as it is with, for instance, the word ‘bank’, which refers to financial institutions and spots along rivers. Borrowing from Boethius’s (d. c. 524 C.E.) influential commentary on Aristotle’s Categories, medievals termed these types of equivocity ‘deliberate (a consilio)’ and ‘chance (a casu)’, respectively. Analogy is an instance of deliberate equivocity. Interestingly, the Oxford tradition of the late thirteenth and early fourteenth centuries does not allow that analogy derived from real relationships carries over to the semantic level. Other thinkers, such as Thomas Aquinas, who was active in Paris and Cologne, believe that the signification of analogous terms reflects real relations.

So, the term ‘healthy’ is analogous, signifying either an individual, her complexion or the medicine she takes:

This mode of community of idea is a mean between pure equivocation and simple univocation. For in analogies the idea is not, as it is in univocals, one and the same, yet it is not totally diverse as in equivocals; but a term which is thus used in a multiple sense signifies various proportions to some one thing; thus ‘healthy’ applied to urine signifies the sign of animal health, and applied to medicine signifies the cause of the same health (Summa Theologica (ST) Ia.13.5c) (All translations of ST are from the Fathers of the English Dominican Province).

The Oxford logicians, on the other hand, tend toward decoupling ontology and semantics. Terms that bring to mind different things (a person and her complexion) elicit discrete concepts and are therefore simply equivocal, regardless of the real relation that holds between the things signified. For instance, the term ‘being’ could be used to signify subsistent entities (substances) and inherent entities (accidents such as complexion that are parasitic upon substances for their existence). Substances and accidents stand in a real relationship to one another in reality; but, for the logician, this relationship is not reflected in the discrete concepts of each. The expression ‘being’ said of a substance signifies differently than the same expression said of an accident and hence, for the logician, the expression used in these ways is equivocal. For the metaphysician and the physicist, who are concerned primarily with things in the world, on the other hand, the expression is analogous. Perhaps this distinction between concepts and things proved influential for the later Scotus, inasmuch as the univocity thesis considers concepts apart from real features of the extramental things that they conceive. Be that as it may, the influence of the Oxford tradition explains why early in his career Scotus holds that the term ‘being’ is equivocal in logic and analogous in metaphysics and physics.

In contrast to the logical writings’ clear rejection of the univocity of the concept of being, the account of the Questions on the Metaphysics of Aristotle is confused and indecisive. The unedited state of Scotus’s writings at the time of his early death is likely to blame and Giorgio Pini’s recent scholarship (2005a) persuasively casts the work as a conflation of early and late drafts that upon completion would have rendered an account in keeping with Scotus’s mature theory. Nevertheless, the impression remains that Scotus was never satisfied with the univocity thesis. In his Quodlibetal Questions (a record of Scotus’s participation in a public theological inquiry, composed near the end of Scotus’s life in either 1306 or 1307), Scotus states that it does not matter for the purpose of the investigation whether the concept of being at issue is analogical or univocal, provided that we allow that being is somehow common to God and creatures (14.39). Hence Steven Marrone (1983) suggests that perhaps Scotus was ready to give up univocity should anything better come along.

Owing at least in part to these conflicting claims and open questions, contemporary interpretations of Scotus’s thought on the univocity of concepts have reflected something of the dichotomy seen in Antonius and Peter over whether Scotus’s account entails a reality common to God and creatures (see above, section 1). And yet, with the recent completion of the critical editions of both Scotus’s Ordinatio (in 2013) and his philosophical writings (the Opera philosophica, in 2006), we are better able to chart the evolution of his thought on the univocity of the concept of being. And certain long-standing difficulties seem to have disappeared. The aforementioned logical writings along with Scotus’s earliest draft of the Questions on the Metaphysics endorse the denial of the univocity of being standard at the time and place of their composition. His Ordinatio, by contrast, reflects Scotus’s theological studies in Paris, where he was introduced to conceptual devices that led him to rethink the univocity of being. Scotus was at work revising his early writings in light of his mature position on univocity right up until his untimely death in 1308, hence the odd character of his Questions on the Metaphysics (Pini 2005a). Contemporary research then paints a developmental picture of Scotus’s thought. That said, as has been noted, there remains evidence that even late into his career Scotus was never completely satisfied with the univocity thesis, which threatened “to destroy all philosophy.” By ‘all philosophy’ Scotus means Aristotelian philosophy and more specifically the thirteenth-century categorial metaphysics that developed out of Aristotle’s thought. Section 2b is an account of this system.

b. The Aristotelian Paradigm: Thirteenth-Century Categorial Metaphysics

Medieval thinkers view Aristotle’s Categories as a work on the properties of terms. Under this broad classification, there was much debate over whether Aristotle was discussing: (1) mental acts or concepts; (2) linguistic entities; (3) extra-mental features or properties of the things about which we think and speak; or (4) words, concepts and properties, but in different ways (Gracia). Scholars divide the work into three parts: the Prepredicamenta (chapters 1-4), the Predicamenta (chapters 5-9) and the Postpredicamenta (chapters 10-15). (The Greek title of Categories is ‘katēgoriai’ meaning ‘predicates’, translated into Latin as ‘praedicamenta’.) The Prepredicamenta presents distinctions between homonyms and synonyms (see above, section 2a), substances and accidents and universal and particular terms (for example, ‘whiteness’ and ‘white’) as well as a list of things said (praedicamenta) without combination. The Predicamenta discusses these ‘things said’, the Postpredicamenta taking up a variety of tangential topics. Medieval debate over the subject matter of the Categories focused on the nature of the things said without combination, viz., substances and the nine categories of accidents. During the thirteenth century, it was generally agreed that the Categories treats words, concepts and things. For the metaphysician, the text presents a classification of extramental things (with substance most properly termed ‘being’ and the things in the nine categories of accidents derivatively so called), whereas the logician studies the work as a treatise on concepts of second intention. (Concepts of second intention are concepts derived by reflection on first-intention concepts, which are of things in the world. The concept species is a second-intention concept.)

Each category (or highest genus) may be characterized as an ordered hierarchy of predicates (Ord. II, d. 3, p. 1, q. 4, n. 89). At the bottom of this hierarchy exists the individual substance or accident that is the ultimate locus of predication. Individuals are classed into species by means of differentiae (differences) that set species within the same category apart from one another (as, medievals believe, humans are set apart from other animals by virtue of the quality of rationality). Genera, in turn, can serve as species under still higher genera (animal, for instance, is a species of body – the animate type). The hierarchy culminates in a highest genus or category that is said of everything contained within it (as, for example, all things in the category of substance are termed ‘substances’). Things in different categories are primarily diverse, meaning that they are not classified in terms of any common predicate. Hence, the predicates that go into the definition of a particular type of thing within a particular category are particular to that category. Things in the same category, on the other hand, are different rather than diverse, meaning that (by virtue of belonging to the same category) they share in some predicate or predicates (as, for example, both humans and cats are sensitive, living, animate material substances) (Metaph. 10.3, 1054b13-32). As the predicates are category specific, univocity is not transcategorial; terms taken from one category apply only analogically to things in other categories, as, famously, when accidents (or inherent beings) receive the denomination ‘being’ by virtue of their relationship to the substances in which they inhere, of which ‘being’ is more properly said (Metaph. 4.2). (See above, section 2a.)

Recall that Scotus’s thesis regarding the univocity of religious language needs a concept of being broad enough to apply to everything, whence metaphysics as a natural theology studies the transcendental attributes that are coextensive with being as such (see above, section 1 and below, 6a). So characterized, Scotus’s project faces several interwoven difficulties. First, although metaphysicians of the Latin West had explicitly held that there exist transcendental attributes of being since the time of Philip the Chancellor (d. 1236 C.E.), they abided by the aforementioned restriction on univocity that it not be transcategorial and relied on analogy to secure an approximation of conceptual unity across the diverse genera (see above, section 2a and below, 5c). Hence, for Scotus’s system to get off the ground in an Aristotelian universe, he had to find a way to accommodate the transcategorial prohibition. The move to render being a super-genus or highest category over and above the ten highest genera and thereby secure univocity by denying that it is transcategorial would appear tempting, but it is ruled out, as the medievals agree with Aristotle that being cannot be a genus. Genera are predicated of the species and individuals that fall under them, but not of the differences that constitute these species (Topics 122b17-23). For instance, ‘animal’ is said of human beings but not of the rational quality that sets humans apart from other types of animals; for no quality that pertains to animal as such can serve to distinguish one type of animal from another. If, then, being were a genus, the differential qualities that constitute types of beings would not fall under the genus of being and hence would not exist. Differences do in fact exist and hence being is not a genus (Metaph. 3.3, 998b14-999a22).

Scotus avoids transcategorial predication without rendering being a genus by casting his univocity thesis as a semantic thesis:

It is plain, therefore, from what has been said that God and creatures are in reality wholly diverse, agreeing in no reality . . . and nevertheless they agree in one concept such that there may exist one concept common to God and creatures fashioned by an imperfect intellect (Lect. 1, d. 8, n. 129, Vat. XVII:46; see also Ord. I, d. 3, pt. 1, qq. 1-2, nn. 38-40, Vat. III.25-27; q. 3, n. 163, Vat. III.100-101, d. 8, n. 136, Vat. IV:221).

The type of being that figures into the univocity thesis is not a highest genus above both God and creatures. It is, rather, a mental construct arrived at through abstraction. Both God and creatures are beings, but they need not agree in anything real. Rather, they fall under a common concept that lacks any immediate referent as it is prescinded from the considerations of degree that make all the difference as regards its instantiation.

3. Contemporary Scholarship

For Scotus, univocal concepts must refer to both God and creatures without collapsing the metaphysical space that separates them. Accordingly, Scotus is careful to note that univocal concepts are, as Richard Cross (2001, 13) puts it, “vicious abstractions,” referring properly to neither God nor creatures. This being the case, it appears as if Scotus’s univocal concepts may leave the theologian empty-handed. As Scotus is working to make space for univocity absent real commonality, it is perhaps unsurprising that he appears as a protean figure in contemporary minority reports that criticize him as either idolatrous or apophatic. David Burrell thinks that Scotus does not appreciate the problematic nature of our conceptual access to mystery and N. Trakakis reads into Scotus an anthropic conception of God on account of which he accuses Scotus of idolatry. Catherine Pickstock, on the other hand, believes that by rendering the distinction between God and creatures one of degree, Scotus paradoxically makes the distinction over into one of kind, inasmuch as there would exist a nonnegotiable epistemic gulf between the two (1998, 2005). By way of contrast, Richard Cross (2001), Stephen Dumont (1992), Steven Marrone (1983), Jan Aertsen and Wouter Goris (2013), Peter King (2003) and Thomas Williams (2005) count among the scholars who represent the majority view that recognizes that Scotus’s univocity theory is a semantic theory that does not require that God and creatures share in any real, common trait. To the contrary, the claim that Scotus’s univocity thesis entails that there is some reality common to the two does not fully appreciate the importance of the distinction between semantics and ontology that Scotus is careful to draw:

Note how there can be a first intention [that is, a real concept] of a and b which is indifferent, and nothing of a single nature corresponds in reality, but two formal objects wholly diverse are understood in one first intention (Ord. I, d. 8, n. 136, Vat. IV:221) (See also above, section 2b.)

The univocity thesis concerns concepts, not things, and this distinction is crucial for Scotus. But this is not to suggest that Scotus thinks that we do not have any knowledge of God. We know God through univocal concepts of empirical origin:

Those things that are known of God are known through the species [that is, a mental grasp] of creatures . . . Creatures, which imprint proper species in the intellect are also able to imprint the species of the transcendentals which agree in common with them and God. And then the intellect by means of its proper power is able to use many species simultaneously for the purpose of conceiving simultaneously those of which these are species, e.g. the species good and the species highest and the species act for the purpose of conceiving some highest and most actual good (Ord. I, d. 3, pt. 1, n. 61, Vat. III:42).

God contains the perfection of every creature (Ibid., d. 8, pt. 1, q. 3, n. 116, Vat. IV:207-208). Univocal concepts are of God inasmuch as univocal concepts are in their genesis of creatures who imitate or represent God (Ibid., d. 3, pt. 1, qq. 1-2, n. 56, Vat. III:38-39; pt. 2, q. 2, q. un., n. 294, Vat. III:179). Hence univocal concepts are mere mental constructs only to the extent that they pertain properly to neither God nor creatures when entertained apart from relevant modal considerations, which considerations (as has been stressed) make all the difference. Scotus invites us to think of these univocal concepts along the lines of a concept of whiteness absent any particular degree of intensity (Ord. I, d. 8, q. 3, n. 138).

4. Scotus on our Natural Knowledge of God

Scotus offers various proofs that we possess concepts univocal to God and creatures. Perhaps the best-known argument is that from certain and doubtful concepts:

Every intellect certain of one concept and doubtful of others has the concept of which it is certain as other than the concept of which it is doubtful . . . But the intellect . . . can be certain of God that God is a being, doubting as to this being whether it is finite or infinite . . . Therefore, the concept of being as regards God is other than this [concept] and that [concept]. And therefore for its part it is neither and it is included in both. Therefore, it is univocal (Ord. I, d. 3, n. 27, Vat. III:18).

Scotus cites the debates amongst Presocratic philosophers over the nature of the first principle to show that the concept of being is common to the composite concepts infinite-being and finite-being, which overlap at the simple concept of being. This simple concept is common to the composite concepts such that in both its modalization does not alter its intensional content, that is, the common concept of being remains exactly the same concept of being in each case. A person who mistakenly thinks of fire as the uncaused first principle or constituent of all things (and therefore as infinite with respect to being) can be corrected but would not then cease to think of fire as a being. The intensional content of the concept of being is thus univocal for the concepts infinite-being and finite-being: “This certain concept, which is for its part neither of the doubtful ones is preserved in both of them” (Ibid., n. 29, Vat. III:19). Scotus’s point is that concepts said of God and creatures retain a core content univocal to both instances. Joining the concept of finitude or infinity to a concept does not alter its meaning but merely produces a new, composite concept.

We arrive at these concepts univocal to God and creatures through experience (see below, section 5b), but when these concepts are joined with the concept of infinity, they apply only to God. There are two types of concepts that are proper to God:

I say that it is possible to arrive at many concepts that are proper to God and that do not agree with creatures. Concepts of this type are the concepts of all of the perfections taken simply, in the highest degree. And the most perfect concept, through which as if by description we most perfectly know God, is by conceiving every perfection simply and in the highest degree. Nevertheless, a concept more perfect and yet simpler, available for us, is the concept of infinite being. This concept is simpler than the concept ‘good being’, ‘true being’ or concepts of other, similar things; because ‘infinite’ is not a quasi-attribute or property of being, or of that of which it is said. Rather it signals an intrinsic mode of that entity, such that when I say ‘infinite being’ I don’t have a concept that is like an accidental concept, composed out of the subject and property, but, rather, I have an essential concept of a subject in a certain grade of perfection, viz. infinity, just as ‘intense white’ doesn’t express the same things as an accidental concept like ‘visible white’, indeed the intensity expresses an intrinsic grade of whiteness in itself. And thus the simplicity of the concept ‘infinite being’ is evident (Ord. I, d. 3, pt. 1, qq. 1-2, n. 58, Vat. III.40).

Concepts that are proper to God describe only God. The first type of concept proper to God is a descriptive, cluster-concept composed of attributes and perfections conceived in the highest possible degree – infinite-goodness, infinite-wisdom, and so forth, bundled together as it were (Frank and Wolter 150-51). Another type of concept, yet more appropriate to the divine nature, is of God conceived simply as infinite being. This latter concept is superior to the cluster-concept for several reasons. First, the concept of infinite being only applies to God (Ibid., n. 60, Vat. III:41-42). Second, unlike the cluster-concept, infinite being does not explicitly comprise distinct traits. This respects the medieval insistence that God is metaphysically simple (see, for example, De primo principio 4). The essence of God cannot comprise aspects that stand in a relationship of potentiality with respect to one another. Otherwise, the existence of God would require some account as to why it is the way that it is and God would not be God, that is, the uncaused (or unaccounted for) first cause. Third, the concept of infinite being is superior to the cluster-concept because the transcendental attributes and perfections are coextensive with being as such; as infinite being, God has every perfection (see below, section 6a). Fourth, the distinction between infinite and finite being is one of degree and it is because the distinction between God and creatures is one of degree that Scotus avoids rendering being a genus over and above both God and creatures (see below, section 6b). (Nevertheless, we should not think of infinite being as a divisible quantity. Infinite being is, for Scotus, indivisible. See Cross 2001.) Finally, by Scotus’s estimation, the concept of God’s infinite being is the most fertile ground available to the natural theologian seeking to deduce various divine attributes.

In his De primo principio (A Treatise on God as First Principle), Scotus describes God’s infinite being as a “most fertile conclusion, which if it had been proved of you at the outset, would have made obvious so many of the conclusions we have mentioned so far” (4.47, trans. Wolter, 1966). Following a lengthy series of demonstrations that the divine essence is infinite, Scotus then concludes that (among other things):

Catholics can infer most of the perfections which philosophers knew of you . . . You are the first efficient cause, the ultimate end, supreme in perfection, transcending all things. You are uncaused in any way and therefore incapable of becoming or perishing; indeed it is simply impossible that you should not exist . . . You are therefore eternal . . . You live a most noble life . . . You are happy . . . You are the clear vision of yourself and the most joyful love . . . You . . . understand in a single act everything that can be known . . . You possess the power to freely and contingently will each thing that can be caused and by willing it through your volition to cause it to be. Most truly then you are of infinite power . . . You alone are simply perfect, not just a perfect angel, or a perfect body, but a perfect being . . . You are one God, than whom there is no other (Ibid., 4.84-87).

Just as our understanding of the goodness and wisdom that we ascribe to God originates in experience, so too does our notion of infinity. In his fifth Quodlibetal Question, Scotus walks us through the process of reflection by which we arrive at this concept. Aristotle defines the infinite as infinite with respect to quantity. No matter how many discrete quantities one takes away from it, an infinite amount remains (Physics 3.6, 207a7-9). Hence, Scotus notes that this infinity can never be in existence as a whole. We are then asked to imagine per impossibile that the entirety of this infinity should be present at once, hence “if this could be done we would have an actually infinite in quantity, because it would be as great in actuality as it was potentially” (5.6, trans. Alluntis and Wolter). God’s infinite perfection is conceived along the lines of this model of a quantitative infinity taken as a whole:

If we think of something among beings that is actually infinite in entity, we must think of it along the lines of the actual infinite quantity we imagined, namely as an infinite being that cannot be exceeded in entity by any other being. It will truly have the character of something whole and perfect. It will indeed be whole or complete (Ibid., n. 7).

God is then “infinite in perfection or power” (Ibid., n. 8).

In summary, we know God through the numerous concepts of various perfections and attributes by which creatures imitate and represent God. We know of God that God is an infinite being, and because God is an infinite being, God possesses these perfections in the highest possible degree. For this reason, of all of the things that we know of God, the most significant is that God is infinite being. It remains, then, to discuss the intensional or informational content of concepts univocal to God and creatures (section 5) with an eye toward why Scotus thinks we need these concepts (section 5a), how we acquire them (section 5b), why analogical concepts will not do in their place for the natural theologian (section 5c) and, finally, how Scotus uses concepts univocal to God and creatures to render metaphysics a natural theology (section 6a) without thereby “destroying all philosophy” (section 6b).

5. Univocity

Theological considerations are at the heart of Scotus’s univocity thesis. First, Scotus holds that theology is pointless absent any concepts univocal to God and creatures, as theologians would literally have no idea what they are talking about. (Section 5c, below, discusses why Scotus does not believe that the generally accepted account of God talk as analogical will do.) Second, theologians present certain conclusions as the product of sound reasoning and Scotus (naturally enough) holds that sound reasoning requires the univocity of concepts (Cross 2006). As regards this second point, Scotus’s description of univocal concepts draws attention to their role in demonstration:

I say that God is not only conceived in a concept analogous to the concept of a creature (namely a concept which is entirely different from a concept said of a creature), but also in some concept that is univocal to God and creatures. And so that there won’t be any contention about the term ‘univocal’, I call a concept univocal which is one such that its unity suffices for a contradiction when the concept is affirmed and denied of the same thing; likewise, it suffices for a syllogistic middle, so that the extreme [minor and major] terms joined in the middle that is one in this way are concluded to be joined to one another without the fallacy of equivocation (Ord. I, d. 2, qq. 1-2, n. 26, Vat. III:18).

a. Univocity and Natural Theology

We can think of Scotus’s univocity thesis as a thesis regarding how theological language has to work if it is to furnish the concepts that are needed to render theology a deductive science. The need for univocity is evident in even the simplest syllogism. Consider: ‘A loving parent cares for her child. God is as a loving parent and hence God cares for God’s creatures’. If the signification of the term ‘loving’ is not fixed across the demonstration but rather shifts from premise to premise, then what we know of parental love might not have any bearing or relevance when it comes to a proper understanding of the love of God. But then it would seem that natural theology is a dead-end practice. As Scotus says, we need a univocity sufficient to avoid the fallacy of equivocation.

Again, just as demonstration cannot proceed absent terms whose meanings are fixed, Scotus believes that without such terms we literally do not have any idea what we are saying when we talk about God. If some ideas do not pertain equally well to God and creatures, if the data of experience do not somehow map onto the divine essence, then we know nothing of God: the correct account (ratio) of any divine attribute or perfection need not have anything at all in common with a similar correct accounting of the attribute as it is manifest in creatures. As Scotus puts it, if things were really this bad, we would have no better reason to call God wise than to call God a rock (Ord. I, d. 2, qq. 1-2, n. 40, Vat. III:27).

Apart from univocity and analogy, Scotus had another option when it came to knowledge of God proffered in the negative theology of Rabbi Moses Maimonides (d. 1204 C.E.), who held that we know of God only what God is not (negative theology or strong apophaticism is also referred to as the way of remotion or the via negativa). On Scotus’s reckoning, even this supposed lack of knowledge presupposes some positive knowledge of God. Every denial entails an assertion and when we deny that God has some attribute, this is on the basis of positive knowledge that shows us that it is inconsistent to affirm this attribute of God. Likewise, and in keeping with Thomas Aquinas (ST Ia.13.2c.), Scotus notes that negative theology is incompatible with Christian faith: “We don’t fall intensely in love with negations” (Ord. I, d. 2, qq. 1-2, n. 10, Vat. III:5).

In sum, if we lack concepts univocal to God and creatures, Scotus believes that natural theology must fail on several counts. We could not construct sound proofs with God as the subject and all of our concepts of God would prove to be vacuous inasmuch as all knowledge is tied to experience and experience could not then serve to provide any correct account of God. Hence, Scotus charges that “All masters and theologians appear to use a concept common to God and creatures, though they deny this when they do it” (1 Lect. d. 3, n. 29).

b. Illumination Theory and Abstraction

Medieval illumination theory holds that God is somehow responsible for our having knowledge. God’s role in our acquisition of concepts is seen as more or less active, often depending on a thinker’s adherence to either a Platonic or an Aristotelian framework. During his middle period (c. 365-c. 347 B.C.E.), Plato (428/427-348/347 B.C.E.) states that various natural kinds, attributes (and perhaps even artefacts) acquire essential predicates by means of a type of vaguely described participation in unique, eternal, immutable, archetypical exemplars (termed ‘forms’ or ‘ideas’) that are more and less perfectly imitated by these various particulars. (See, for example, Republic 504e–518c and 596e–597a, Phaedo 100b–102a3, and Phaedrus 247c3–247e6. For the dating of these works in Plato’s middle-period, see Kraut.) Direct access to Plato’s writings in the middle ages was limited to a fragment of his Timaeus. Nevertheless, Plato’s thought was transmitted to medieval thinkers in a variety of ways, including the writings of Augustine (354-430 C.E.). By way of contrast, by the end of the twelfth century, the Latin West had access to more or less the entirety of the surviving writings of Aristotle that today comprise his corpus (prior to this, medievals had access only to the Categories and On Interpretation as well as Porphyry’s (d. 305 C.E.) Isagoge, a tremendously influential introduction to Aristotle’s logic). Whereas Plato grants ontological priority to the immaterial forms and insists that the best knowledge we have is of these archetypical templates, Aristotle’s Categories upends this picture by rendering everyday substances the primary locus of predication:

All the other things are either said of the primary substances as subjects or in them as subjects. So if the primary substances did not exist it would be impossible for any of the other things to exist (trans. Ackrill, 2b4-6).

For Aristotle, predicates apply only to individual substances; they do not correspond to hypostasized, otherworldly Platonic essences such as goodness and beauty. Substances are therefore prior “by nature (tē phusei)” and hence responsible for the existence of the accidents, for which to be is to be in another (Cat. 14b11–13). Mundane substances, not otherworldly forms, ground our knowledge. It was accordingly natural for Aristotelian and Augustinian accounts respectively to downplay and emphasize the need for illumination.

Augustine straightforwardly identifies Plato’s forms with the divine ideas (De diversis Quaestionibus octoginta tribus liber unus, q. 46, 1-2). And Augustine’s account of knowledge of God incorporates direct illumination. Augustine’s On The Trinity spells out the abstractive process whereby we approach knowledge of God’s goodness and details the requisite illumination in which the idea of God’s goodness is “impressed” on us:

[Reflect on] ‘this [particular] good’ and ‘that [particular] good’; [and then] take away ‘this’ and ‘that’, and see good itself if you can; so you will see God who is good not by another good, but is the good of every good . . . In all these good things . . . we would be unable to call one better than the other . . . if the idea of the good itself had not been impressed upon us, according to which we approve of something as good, and also prefer one good to another (8.3, quoted in Frank and Wolter, 138).

For his part, Scotus holds that were the human intellect so weak as to require an illumination to form concepts of God, this very weakness would likewise undercut our ability to receive these concepts (Ord. I, d. 3, pt. 1, q. 4, n. 225, Vat. III:136). Rather, Scotus will allow for a general form of illumination to the extent that God both produces objects in intelligible being and is also that in virtue of which these objects move us to understanding (Ord. I, d. 3, pars 1, q. 4, n. 268, Vat. III: 163-64). Accordingly, Scotus believes that we can form concepts proper to God and creatures through purely natural means, apart from any special activity on the part of God over and above God’s having put into place certain factors. Scotus spells out how we do this in a discussion that runs parallel to Augustine’s while avoiding any reference to special illumination in the acquisition of concepts univocal to God and creatures:

Every metaphysical inquiry about God proceeds in this fashion: the formal notion of something is considered; the imperfection associated with this notion in creatures is removed, and then, retaining the same formal notion, we ascribe to it the ultimate degree of perfection and then attribute it to God . . . Consequently, every inquiry regarding God is based upon the supposition that the intellect has the same univocal concept which it obtained from creatures (Ibid., qq. 1-2, n. 39, Vat. III:26).

c. Analogy and Univocity

Scotus believes that natural theology rests (tacitly or otherwise) on the assumption that experience furnishes concepts univocal to God and creatures. But is not Scotus overhasty in his univocity-or-nothing approach to knowledge of God (see above, section 5a)? After all, for Scotus’s contemporaries, analogy is good enough for the purposes of natural theology. Thomas Aquinas makes this point when he states that whereas we lack terms univocal to God and creatures, demonstration can nevertheless proceed by means of analogical terms:

No name is predicated univocally of God and of creatures. Neither, on the other hand, are names applied to God and creatures in a purely equivocal sense, as some have said. Because if that were so, it follows that from creatures nothing could be known or demonstrated about God at all; for the reasoning would always be exposed to the fallacy of equivocation. Such a view is against the philosophers, who proved many things about God, and also against what the Apostle says: “The invisible things of God are clearly seen being understood by the things that are made” (Romans 1:20). Therefore, it must be said that these names are said of God and creatures in an analogous sense (ST Ia.13.5c).

Medieval theories of analogy develop out of Aristotle’s Physics and Metaphysics, where he discusses the many meanings that we attach to the term ‘being’. In the latter work, Aristotle investigates the possibility of metaphysics as the universal science of being qua being, ranging from substances and their modes or accidents to the first unmoved mover (4.1, 6.1). But demonstration is not transcategorial, as diverse entities have nothing in common (Posterior Analytics 1.7; see also above, 2b). Hence, to function as a science that cuts across the categories, metaphysics uses what Aristotle terms ‘pros hen (toward one)’ equivocation or analogy, which conceives diverse entities under a concept that applies primarily to one and in a secondary or derivative sense to the other. Hence even accidents are termed ‘beings’ inasmuch as they derive their existence from substances, of which being is properly said (Metaph. 4.2). Aquinas uses Aristotle’s scheme to cast metaphysics as the study of creatures and God as their source, reliant on terms that signify in prior and posterior senses to supply the science with its universality (Wippel).

Separating the likes of Aquinas, on the one hand, and Scotus, on the other, were the Condemnations of 1277, drafted as a reaction to so-called Latin Averroist readings of Aristotle that developed out of the reception of the commentaries on Aristotle of the Muslim philosopher Averroes (d. 1198 C.E.). Latin Averroism suggested a possible disparity between truths of reason, on the one hand, and revelation, on the other. Though he was a strident and successful critic of this interpretation of Aristotle, some of Aquinas’s views were lumped in with those of the Averroists, leading to the condemnation of certain of Aquinas’s positions, for instance, that we can know of God only that God is, or exists. (In 1325, two years after Aquinas’s canonization, the Condemnations were repealed to the extent that they touched on his works.) Henry of Ghent (d. 1293), who had a hand in drafting the Condemnations, viewed Aquinas as too apophatic. Whereas Aquinas held that metaphysics studies God only indirectly as the cause of categorial beings, Henry construes metaphysics as the study of being taken absolutely, comprising both God and creatures (Dumont 1998b). Again, Henry holds that we have essential or quidditative knowledge of God (quidditative knowledge answers the question ‘What is it (Quid est)?’). Quidditative knowledge of God is of the divine attributes grasped in an imperfect or quasi-accidental manner (Dumont 1998a). Such knowledge of God and so broad a metaphysics requires concepts that are general enough to apply to God and creatures without suggesting that the two share in anything real, so as to avoid collapsing the metaphysical distance between them. As we shall see, Henry’s attempt to accommodate this demand will open the door to Scotus’s univocity thesis.

Henry seeks concepts sufficiently general to apply to God and creatures in a model of pseudo-concepts of being and the various perfections, which concepts initially strike us as common to God and creatures owing to the concepts’ vagueness. On reflection, these pseudo-concepts are exposed as each being the conflation of two concepts, one proper to God, the other to creatures. As the pseudo-concept in fact comprises utterly distinct concepts, its existence does not entail that God and creatures actually share in any real feature. Henry calls these distinct concepts ‘analogical’ with respect to one another as they are of traits that apply primarily to God and in a derivative sense to creatures – though Henry sometimes speaks of the vague pseudo-concept itself as an analogous concept (Summa, a. 21, q. 3). The analogous concept that pertains only to God is ‘negatively undetermined’ (not open to any further determination by means of some advening perfection), whereas its creaturely counterpart is ‘privatively undetermined’ (conceived apart from the determinations that are bound up with its instantiations in creatures). It is because in either case the concepts are of being and its attributes as undetermined (either negatively or privatively) that the concepts were initially conflated (see Dumont 1998a and 1998b, and Quodlibeta 13, q. 10; Summa a. 21, q. 2; a. 24, qq. 6-7).

Henry’s pseudo-concept that merely seems common to God and creatures is the progenitor of Scotus’s univocal concepts under which we conceive both. Scotus sees that if Henry’s account is correct, concepts of creatures tell us nothing of the creator and hence experience teaches us nothing of God. Hence, Scotus contends that on Henry’s account, an analogical concept of God is in fact “entirely different from a concept said of a creature” (Ord. I, d. 2, qq. 1-2, n. 26, Vat. III:18). Scotus therefore replaces the analogous pseudo-concept with the univocal concept and modalizes negative and privative indetermination into the degrees of intensity that characterize the instantiation of traits in God and creatures, respectively (Dumont, 1992). Scotus’s attack on analogy is then directed at Henry’s version of analogy, which supposes radically distinct concepts only mistakenly thought to pertain to God and creatures. As regards the traditional sense of analogy wherein terms apply primarily to God and in a secondary sense to creatures, Scotus would likely insist that if religious language does not preserve a univocal conceptual content common to both senses, it devolves into chance equivocity as discussed above in section 2a (Williams 2005 and Cross 2012).

6. Metaphysics as Natural Theology

Scotus takes up the notion of the univocal concept of being that renders metaphysics a natural theology in response to the question as to whether we have natural knowledge of God. Ultimately, Scotus will conclude that although we cannot naturally grasp the divine essence in its individuality as it is distinct from all things, we nevertheless can naturally acquire a concept whereby we conceive God essentially and quidditatively as the subject of inherence with respect to the divine attributes. Scotus distinguishes his theory from Henry’s on the grounds that the latter’s quidditative knowledge of God does not pertain directly to the divine essence but is rather “quasi accidental (quasi per accidens)” (Ord. I, d. 3, pt. 1, qq. 1-2, nn. 25, 56, Vat. III.16-17, 38-39). As regards the properties of the divine essence that we arrive at in metaphysics, for Scotus these remain identical with the divine essence and yet formally distinct from one another inasmuch as they may be considered without reference to one another. (Scotus recognizes a formal distinction between inseparable aspects (or formalities) of one and the same individual, for example, an individual’s rationality and animality, such that they may be considered apart from one another. In the case of the divine essence the formal distinction implies even less composition than in that of creatures, wherein various formal aspects united in an essence perfect one another, as, for example, the rational quality may perfect animal nature. See Hall 136, n. 38; Noone; King, n. 13; Ross and Bates n. 13; Alluntis and Wolter 505-09.)

Nevertheless, as our grasp of what we would attribute to God leaves off at the level of a univocal concept under which we conceive both God and creatures in a manner that is proper to neither, we do not know the divine essence in a proper and particular manner; our finite mind’s finite grasp of a concept univocal to God and creatures proves inadequate when we allow that the attribute thus conceived is constituent of the infinite divine essence (see above, section 4). Scotus’s caution on this point grows out of his understanding of the unity in diversity of the divine essence. An infinite entity must possess every perfection of being (Quodlibet 5.8-9) while remaining utterly simple (De primo principio 4.75). Moreover, God’s infinite being exceeds finite being beyond any relative measure or proportion (Quodlibet 5.9). Hence the distance between God and creatures is secure.

a. Metaphysics and the Transcendentals

Scotus conceives of metaphysics as the universal science of what he terms the transcendentals as such (Questions on the Metaphysics, prologue). The medieval theory of the transcendentals has its roots in Plato and Aristotle and was developed by Augustine, Boethius, Pseudo-Dionysius the Areopagite (late-fifth or early-sixth century C.E.) and Avicenna (d. 1037 C.E.). Philip the Chancellor codifies the theory in his Summa de bono, which asks how we speak of both God and creatures as good and proposes that goodness pertains to God and creatures (in respectively absolute and relative senses) inasmuch as goodness (and unity and truth) are transcendental attributes or properties of being as such.

Following Aristotle (Cat. 5), medieval thinkers recognize ten categories or highest genera of things that are: substance, on the one hand, and its various accidental modes (such as quantity, quality, relation, and so forth), on the other (see above, section 2b). The ten categories together comprise all things except God. Since the transcendentals are the attributes of being as such (that is, as conceptually prior to its division into finite (categorial) and infinite (divine) being), they therefore cut horizontally across the various categories and extend vertically to take in God and creatures. As regards unity, truth and goodness, these were thought to be coextensive properties of being. Apart from the coextensive properties of being, Scotus’s account of the transcendentals recognizes transcendental pure perfections and disjunctions. From Anselm’s (d. 1109 C.E.) Monologion, Scotus derives the notion of pure perfections as perfections that are absolutely and unqualifiedly better than whatever is incompatible with them. Hence it is better to be wise than not and if a dog cannot be wise it would be better for it were it not a dog but rather something that can attain wisdom (De primo principio 4.10). Transcendental disjunctions, on the other hand, are disjunctions whose extremes take in all things, for example, finite-infinite (Ord. I, d. 8, q. 3). Note that only the attributes of being are coextensive with all beings. The disjuncts are opposed to one another in the sense that they are mutually exclusive within one and the same individual, and the pure perfections do not characterize all entities (neither dogs nor instances of whiteness are wise). Hence, strictly speaking, the pure perfections and disjunctions are transcendental only inasmuch as they aren’t contained under any one particular genus and not because they characterize all things.

As noted, transcendentals pertain to being as such prior to its division into finite and infinite being. Yet Scotus does not hypostasize being as such. He does not maintain that being as such exists somehow independently of either God or creatures. Rather, all being is modalized being, infinite or finite being. Scotus’s talk of being considered in its indifference to finite and infinite modes refers to the univocal concept of being that pertains to both God and creatures in a manner that is not proper to either inasmuch as the univocal concept does not take into account the relevant modal characteristics that govern its various instantiations. But when we account for the relevant modal factors, this results in the production of new, complex concepts. As Scotus points out, we can be certain that God is a being, whereas we remain in doubt as to whether God is a finite or an infinite being and hence the complex concept of infinite being that is affirmed of God differs from both the simple, univocal concept of being, on the one hand, and that of creaturely, finite being, on the other (Ord. I, d. 3, pt. 1, qq. 1-2, n. 27, Vat. III:18) (see above, section 4).

Working out the implications of metaphysics as the science of the transcendentals as such, Scotus believes that the metaphysician is able to demonstrate that God exists and can ascribe to God various perfections and attributes. The proof of God’s existence relies on the principle that “as a general rule by positing the less noble extreme of some being, we can conclude that the nobler extreme is realized in some other being” (Ibid. d. 39, n. 13). Hence, Scotus’s strategy is to demonstrate God’s existence by means of transcendental disjunctions such as ‘necessary-contingent’:

If some being is contingent, then some being is necessary. For . . . it is not possible for the more imperfect extreme of the disjunction to be existentially predicated of being particularly taken, unless the more perfect extreme be existentially verified of some other being upon which it depends (Ibid.). (For the complete proof, with commentary, see Frank and Wolter, 40-107. Other versions of the proof are at Lect. 1, d. 2, q. 1, nn. 38–135; Reportatio 1, d. 2, q. 1; and De primo principio).

Moreover, the metaphysician’s proof is superior to the natural philosopher’s Aristotelian proofs of an unmoved mover, inasmuch as we know God more perfectly and immediately when we conceive God as necessary being rather than as first mover (as the former attribute is more intimately bound up with the divine essence). As regards perfections and attributes of the divine essence, the latter are known to belong to God inasmuch as they characterize all things, whereas the former are ascribed by means of a perfect-being theology that endorses the principle that pure perfections belong necessarily and in the highest degree to the highest nature (De primo principio, 4.3). Hence when the natural theologian has deduced that God is the highest being, the pure perfections are then known to apply to the divine essence, with our creaturely understanding of these perfections serving as the basis of our knowledge of God (Wolter, 1950).

b. Does Scotus “Destroy All Philosophy?”

Scotus’s natural theology rises or falls with the success or failure of the univocity thesis. Univocity is not supposed to be transcategorial; hence, Scotus’s contemporaries use analogy to predicate across the highest genera and of God and creatures (see above, sections 2a and 5c). On this scheme, Scotus’s claim that being (and with being its transcendental attributes) is univocal to God and creatures risks elevating being to a highest genus (see above, section 2b); how else can the concepts of being and its transcendental attributes be univocal across the categories on an Aristotelian worldview? Yet being cannot be a genus: genera are not said of their differences, and yet the differences that specify types of beings certainly do exist (see above, section 2b). Perhaps even worse, were being a genus over God and creatures, God and creatures would then agree in some reality, rendering God metaphysically complex (composed of that common reality along with a reality that would uniquely determine the divine essence) and of a kind with creatures. God would no longer be God.

Scotus recognizes that the thesis of the univocity of being and its transcendental attributes would appear to ask that being function as a super-genus above the ten highest genera and that his scheme therefore threatens to collapse the metaphysical space that separates God and creatures. Scotus’s solution is to use the concept of being as such (that is, as conceptually prior to its division into finite (categorial) and infinite (divine) being) as a stand-in for any such super-genus. Unlike such a super-genus, however, the concept of being as such is a mental abstraction that does not pertain to anything at all until the relevant modal considerations have been introduced and hence God and creatures needn’t agree in anything real in order to be conceived under the concepts of being as such and its transcendental attributes. As noted, however, Scotus suggests that these modal differences entail a difference of kind: “The infinite exceeds the finite in being beyond any relative measure or proportion that could be assigned” (Quodl. 5.9). Be that as it may, the univocity thesis does not concern what God is. The thesis is about how we think and talk about God and the conditions to which religious language must conform in order to advance sound arguments. Hence the univocity of concepts under which both God and creatures are conceived is compatible with the metaphysical gap between God and creatures. But it should be noted that the distance between God and creatures does not prevent our learning about God through experience. Like other medieval thinkers, Scotus holds that the attributes we ascribe to God belong primarily to God and in a secondary or derivative manner to creatures. Though the understanding of the attributes in question that we build up through experience is admittedly imperfect, it is nevertheless an understanding of God. Accordingly, Scotus’s univocity thesis conforms to the medieval consensus that inasmuch as concepts are of creatures that imperfectly imitate or represent God, they are of God imperfectly conceived (Ord. I, d. 3, pt. 1, qq. 1-2, n. 56, Vat III:38-39; pt. 2, q. 2, q. un., n. 294, Vat. III:179).

7. References and Further Reading

a. Primary Sources

  • Aquinas, Thomas. Summa Theologica. Translated by Fathers of the English Dominican Province.
  • Aristotle. The Complete Works of Aristotle: The Revised Oxford Translation. Edited by J. Barnes. 2 volumes. Bollingen Series. Princeton: Princeton University Press, 1984.
  • Duns Scotus, John. Opera omnia. Edited by C. Balić, et al. Vatican Scotistic Commission. Rome: Polyglot Press, 1950-.
  • Duns Scotus, John. Duns Scotus on Time and Existence: The Questions on Aristotle’s ‘De interpretatione’. Translated with introduction and commentary by Edward Buckner and Jack Zupko. Washington, D.C.: The Catholic University of America Press, 2014.
  • Duns Scotus, John. Duns Scotus, Metaphysician. Translated and edited with commentary by William A. Frank and Allan B. Wolter. West Lafayette, Indiana: Purdue University Press, 1995.
  • Duns Scotus, John. John Duns Scotus, Philosophical Writings: A Selection. Translated with introduction and notes by Allan Wolter. Foreword by Marilyn McCord Adams. Indianapolis: Hackett, 1987.
  • Duns Scotus, John. John Duns Scotus, God and Creatures: The Quodlibetal Questions. Translated with introduction, notes, and glossary by Felix Alluntis and Allan B. Wolter. Princeton, N.J: Princeton University Press, 1975.
  • Duns Scotus, John. John Duns Scotus, A Treatise on God as First Principle. Translated and edited with commentary by Allan B. Wolter. Chicago: Franciscan Herald, 1984.
  • Ghent, Henry. Quodlibeta Magistri Henrici Goethals a Gandavo Doctoris Solemnis. Paris, I. Badius, 1518; repr. in 2 vols., Louvain, Bibliothèque, SJ, 1961.
  • Ghent, Henry. Summa quaestionum ordinariarum. Paris 1520; repr. in 2 vols., St. Bonaventure, NY, Franciscan Institute, 1953.
  • Ockham, William. Opera Philosophica I: Summa Logicae. Edited by Philotheus Boehner, Gedeon Gál, and Stephen Brown. St. Bonaventure, N.Y.: Editiones Instituti Franciscani Universitatis S. Bonaventurae, 1974.

b. Secondary Sources

  • Burrell, David. “John Duns Scotus: The Univocity of Analogous Terms.” The Monist 49 (October 1965) 639-58.
  • Cross, Richard. Duns Scotus. Oxford, 1999.
  • Cross, Richard. “Where Angels Fear to Tread.” Antonianum 76 (2001): 7-41.
  • Cross, Richard. Duns Scotus on God. Ashgate, 2005.
  • Cross, Richard. “Univocity and Mystery.” In New Essays on Metaphysics as Scientia Transcendens. Edited by Roberto Hofmeister Pich. Fédération Internationale des Instituts d’Études Médiévales, 2007.
  • Cross, Richard. “Duns Scotus and Analogy: A Brief Note.” The Modern Schoolman 89:3/4 (2012): 147-54.
  • Dumont, Stephen D. “The Univocity of the Concept of Being in the Fourteenth Century: John Duns Scotus and William of Alnwick.” Mediaeval Studies 49 (1987): 1-31.
  • Dumont, Stephen D. “Transcendental Being: Scotus and Scotists.” Topoi 11 (Sept. 1992): 135-48.
  • Dumont, Stephen D. “Henry of Ghent and Duns Scotus.” Medieval Philosophy 3 (1998a): 291-328.
  • Dumont, Stephen D. “Scotus’s Doctrine of Univocity and the Medieval Tradition of Metaphysics.” In Was ist Philosophie im Mittelalter? Edited by Jan Aertsen and Andreas Speer. Walter de Gruyter, 1998b.
  • Goris, Wouter and Aertsen, Jan, “Medieval Theories of Transcendentals”, The Stanford Encyclopedia of Philosophy (Summer 2013 Edition), Edward N. Zalta (ed.), URL = http://plato.stanford.edu/archives/sum2013/entries/transcendentals-medieval/
  • Gracia, Jorge J. E. “Categories Vs. Genera: Suárez’s Difficult Balancing Act.” In Categories and What is Beyond. Proceedings of the Society for Medieval Logic and Metaphysics Volume 2. Cambridge Scholars Publishing, 2011: 7-18.
  • Hall, Alexander. Thomas Aquinas and John Duns Scotus: Natural Theology in the High Middle Ages. Bloomsbury, 2009.
  • Hall, Alexander. “Confused Univocity?” In Proceedings of the Society for Medieval Logic and Metaphysics 7 (2007): 18-31; reprinted in Medieval Metaphysics; or is It “Just Semantics”? Cambridge Scholars Publishing, 2011. Co-edited with Gyula Klima.
  • Ingham, Mary Beth. “Re-Situating Scotist Thought.” Modern Theology 21:4 (2005): 609-618.
  • King, Peter. “Scotus on Metaphysics.” In The Cambridge Companion to Duns Scotus. Edited by Thomas Williams, 15-68. Cambridge: Cambridge University Press, 2003.
  • Klima, Gyula. “Nominalist Semantics.” In The Cambridge History of Medieval Philosophy. Volume 1. Edited by Robert Pasnau and Christina Van Dyke, 159-172.
  • Kraut, Richard. The Cambridge Companion to Plato. Cambridge: Cambridge University Press, 1992.
  • Marrone, Steven. “The Notion of Univocity in Duns Scotus’s Early Works.” Franciscan Studies 43 (1983): 347-95.
  • Noone, Timothy. “Alnwick on the Origin, Nature and Function of the Formal Distinction.” In Franciscan Studies 53 (1993): 231-61.
  • Pickstock, Catherine. After Writing: On the Liturgical Consummation of Philosophy. Oxford: Blackwell Publishers, 1998.
  • Pickstock, Catherine. “Duns Scotus: His Historical and Contemporary Significance.” Modern Theology 21, no. 4 (2005): 543-574.
  • Pini, Giorgio. Categories and Logic in Duns Scotus: An Interpretation of Aristotle’s Categories in the Late Thirteenth Century. Studien und Texte zur Geistesgeschichte des Mittelalters, Bd. 77. Brill, 2002.
  • Pini, Giorgio. “Univocity in Scotus’s Quaestiones Super Metaphysicam: The Solution to a Riddle.” Medioevo 30 (2005a): 69-110.
  • Pini, Giorgio. “Scotus’s Realist Conception of the Categories: His Legacy to late Medieval Debates.” In Vivarium 43.1 (2005b): 63-110.
  • Pini, Giorgio. “Two Models of Thinking: Thomas Aquinas and John Duns Scotus on Occurrent Thoughts.” In Intentionality, Cognition, and Mental Representation in Medieval Philosophy, edited by Gyula Klima, 81-103. Medieval Philosophy: Texts and Studies. Fordham University Press, 2015.
  • Read, Stephen. “Concepts and Meaning in Medieval Philosophy.” In Intentionality, Cognition, and Mental Representation in Medieval Philosophy, edited by Gyula Klima, 9-28. Medieval Philosophy: Texts and Studies. Fordham University Press, 2015.
  • Ross, James and Todd Bates. “Duns Scotus on Natural Theology.” In The Cambridge Companion to Duns Scotus. Edited by Thomas Williams, 193-238. Cambridge: Cambridge University Press, 2003.
  • Trakakis, N. N. “Does Univocity Entail Idolatry?” Sophia 49 (2010): 535-555.
  • Williams, Thomas, ed. The Cambridge Companion to Duns Scotus. Cambridge: Cambridge University Press, 2003.
  • Williams, Thomas. “The Doctrine of Univocity is True and Salutary.” Modern Theology 21, no. 4 (2005): 575-585.
  • Wippel, John. “Metaphysics.” In The Cambridge Companion to Aquinas. Edited by Norman Kretzmann and Eleonore Stump, 85-127. Cambridge: Cambridge University Press, 1993.
  • Wolter, Allan B. Transcendentals and their Function in the Metaphysics of Duns Scotus. New York: St. Bonaventure, 1946.

 

Author Information

Alexander Hall
Email: AlexanderHall@clayton.edu
Clayton State University
U. S. A.

Arnold Geulincx (1624-1669)

Arnold (or Arnout) Geulincx was an early-modern Flemish philosopher who initially taught at Leuven (Louvain) University, but fled the Catholic Low Countries when he was fired there in 1658. He settled at Leiden, in the Protestant North, where he worked under the patronage of the Cartesian Calvinist theologian Abraham Heidanus (1597-1678), and tried to obtain a post at Leiden University. Geulincx was never to procure a steady position in his new surroundings, and ultimately died in poverty as a victim of the 1669 Leiden plague. On the basis of Descartes’ philosophy, he developed a range of philosophical ideas that sometimes closely resemble Spinoza’s, but always have a particular flavour of their own. His contributions in the fields of logic, metaphysics and ethics have earned him a place not only in the history of Dutch Cartesianism, but in Western intellectual history at large.

As a result of accusations that he had been a Spinozist in disguise, Geulincx’ name was almost erased from history after 1720, but nineteenth-century historians rehabilitated Geulincx for having been a forerunner of Immanuel Kant. Nowadays, Arnold Geulincx is primarily known as a representative of seventeenth-century “occasionalism”, and as an original thinker between Descartes and Spinoza. Despite a certain impact he made on his immediate Leiden pupils, such as the Dutch Cartesians Cornelis Bontekoe (c. 1644-1685) and Johannes Swartenhengst (1644-1711), and on the English philosopher Richard Burthogge (1638-1705), as well as on a number of enlightened members of the Dutch Calvinist clergy during the last quarter of the seventeenth century, Geulincx’ most significant influence in intellectual history to date has been on the novels and plays of Samuel Beckett (1906-1989), as well as, through Beckett, on late twentieth-century French philosophy.

Table of Contents

  1. Life
  2. Logic and Method
  3. Metaphysics
  4. Ethics
  5. Anti-Aristotelianism
  6. A Philosophy of Wonder
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Arnold Geulincx was born in the city of Antwerp, which, despite having lost its former glory as a hub of world trade and a centre of the arts, had regained new vigour as the home of Counter-Reformation culture in the Southern Netherlands. This new spirit was evidenced in the paintings of Peter-Paul Rubens, Jacob Jordaens and Anthonie van Dijck, as well as in the Baroque church of Saint Carolus Borromeus, a Jesuit monument consecrated in 1625. Geulincx’ father apparently did well as the city’s messenger to Brussels, since he bought a large house just around the corner from Saint Carolus Borromeus’ Church when Arnold was around thirteen years of age, and another, adjacent one, a year later. While Jan Geulincx, one of Arnold’s younger brothers, studied with Jacob Jordaens for some time, Arnold was destined for an academic career and left Antwerp to go to university in January 1640. In Leuven, he studied arts and philosophy at the College of the Lily, obtaining his licentiate on November 19, 1643, ranking second in a class of 159 students. After reading theology for some time, Geulincx was appointed junior professor in philosophy at Lily College in December 1646.

Not much is known about Geulincx’ early career, but it is reasonable to assume that he made a strong impression with his rhetorical skills, founded on the remarkable proficiency in Latin he had already exhibited during his Antwerp school days. In the autumn of 1649, Geulincx’ career prospects seemed secure enough for his parents to give up their life in Antwerp and join their son in Leuven, the area they originally came from. Three years later, Geulincx became senior professor and was asked to deliver a series of speeches during the end-of-the-year Saturnalia festivities. Protests against the nova philosophia at Leuven may have been prompted by Geulincx’ opening address on December 16, 1652, in which he aired Baconian ideas and outlined recommendations for changes to be made to the university curriculum. Initially, however, this did not in any way hinder a successful continuation of his academic career.

Disgrace and downfall came only in 1658, when Geulincx was dismissed, presumably on account of attempting to breach the rule of celibacy for university professors by planning to marry his cousin Susanna Strickers, a privilege that had been granted to his Lily tutor William Philippi (1600-1665) in 1630 only after mediation by the Brabant Council. Reportedly, the 1630 agreement had been made on the explicit condition that this would be the last time. A later eighteenth-century source mentions disputes with his colleagues and debts as reasons for Geulincx’ dismissal, but these have been impossible to trace. Since there is no evidence that the committee that sacked him had any problems with Geulincx personally, it may well have been the case that he simply had to choose either not to marry or to leave.

Religious considerations, however, may also have played a part. A letter of recommendation signed May 3, 1658 by the three Leiden theologians Abraham Heidanus, Johannes Coccejus (1603-1669) and Johannes Hoornbeek (1617-1666) not only indicates that Geulincx had turned his back on the Catholic faith after he had taken in St. Augustine’s theory of grace, but also that he had initially visited Holland of his own accord in January 1658, “under the pretext of another trip to this province” (Eekhof, 1919: 19). Upon his return to Leuven, Geulincx had found that a successor had been appointed in his place. Although, as the text tells us, he had already decided to give up his position at Leuven, he had not expected the hostile reaction he was met with, since the letter specifies that “he barely escaped a life sentence.” If it is indeed the case that Geulincx confronted his colleagues in January 1658 with the embarrassing fact that he had left Leuven prompted by the intention to convert to Protestantism, it is likely they treated his case with utmost efficiency and discretion. Such intentions would have come at a very untimely moment. Amidst condemnations of Jansenism issued by the Vatican, and declarations of political freedom made by the Brabant Council, there was extremely little room to maneuver for Leuven’s university professors. They may well have been happy to explain Geulincx’ dismissal, if at all, in terms of marriage plans rather than dogmatic preferences. Rumors about Geulincx’ debts, moreover, may have had their origin in the fact that, under these circumstances, Geulincx had to leave everything behind in a hurry, and flee to Leiden penniless.

In his new home town, Geulincx was to graduate in medicine on September 17, 1658, no doubt in order to be able to earn a living. He married Susanna on December 8. Rather than become a doctor, however, his ambition was to resume his career as a professor of philosophy. After a series of appointments and dismissals, Geulincx was finally appointed as junior lecturer in late 1662, with the help of Heidanus, first in logic and then in metaphysics. He was temporarily appointed as Professor extraordinarius in 1665, but was allowed to teach ethics only in February 1667. From June to November 1669, Geulincx was again newly appointed, now in order to teach rhetoric. He died in poverty in November 1669, having failed to pay any rent for the apartment he had shared with Susanna since October 1668. When Susanna died around the New Year, there was only some furniture left to compensate the couple’s creditors.

2. Logic and Method

With two works on the subject of logic, Geulincx had nevertheless started off his Leiden career in a positive mood. In the first of these works, his Logica suis fundamentis restituta (Logic Restored to its Foundations, 1662), Geulincx interprets negation mainly as propositional negation, that is, as acting on the whole of a proposition, not on terms. The other book, Methodus inveniendi argumenta (A Method for Finding Arguments, 1663), used set-theoretic relations to demonstrate logical principles. On account of this approach, the Dutch philosopher Gabriël Nuchelmans (1922-1996) would later describe Geulincx’ logic as a “containment theory of logic”, in which relations of containment illustrate how statements are implied by other statements. Containment may explain logical consequence, for instance, since the propositional content of a statement q may be implied by that of p, just as, according to Geulincx, every proposition p will entail any number of further statements implied by p. Interpreting the way in which subjects relate to predicates in terms of relations of containment as well, Geulincx considered subjects as the denumerable “parts” of conceptual “wholes”, and considered the connection between subjects and predicates to be made on the basis of the “relation in which they stand to one another within the hierarchical structure of a conceptual field” (Nuchelmans, 1988: 40).

Producing a modernised summary of Geulincx’ propositional logic in the 1939 issue of Erkenntnis, the Swiss logician Karl Dürr (1888-1970) portrayed Geulincx as an early representative of symbolic logic. Geulincx presented his logical principles in a purely conceptual form and evidently depended on earlier scholastic traditions, such as in his formulation of De Morgan’s laws, which reproduce the fifteenth-century account John Versor offered in his commentary on Peter of Spain. Yet, according to Dürr, Geulincx’ logic contained all the elements of a mathematical logic, complete with variables and logical constants, as well as other remarkable features, such as a Tarskian definition of truth.
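
For orientation, the two laws now named after De Morgan can be stated in modern symbolic notation as follows. The notation is not Geulincx’ own, since his formulation was purely conceptual, and the letters p and q are used here simply as placeholders for arbitrary propositions:

```latex
\begin{align*}
\neg (p \land q) &\equiv \neg p \lor \neg q \\
\neg (p \lor q)  &\equiv \neg p \land \neg q
\end{align*}
```

The first law says that denying a conjunction amounts to affirming that at least one of its members is false; the second says that denying a disjunction amounts to denying both of its members. Both laws treat negation as acting on a whole proposition.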

To weigh the sophistication of a seventeenth-century system of logic against its medieval forerunners, or to assess its significance for the development of later formal logic is, however, a complicated matter. In a later study, Dürr compared Geulincx’ achievement with similar works in logic, such as the Port-Royal Logic (1662), and works by Johannes Clauberg (1622-1665), Leibniz and Girolamo Saccheri (1667-1733). Dürr came to the conclusion that, especially in the area of propositional logic, Geulincx’ system was richer than that of most of his contemporaries, whilst in the field of term logic, his basic rules for the formal validity of syllogisms surpassed even those of Leibniz in elegance and precision (Dürr, 1965).

As a senior professor in Leuven, Geulincx had previously shown an interest in Baconian philosophy and had proposed to revise the university’s curriculum in such a way that natural philosophy might be studied as a separate field that also included logic and mathematics, as well as forms of experimentation. It is unknown whether Geulincx developed Cartesian views in Leuven as well, as did his Leuven colleague William van Gutschoven (c. 1618-1667) and his tutor William Philippi (c. 1600-1665) at some point. However this may be, it was only in Leiden that Geulincx began to develop a Cartesian line of argument in natural philosophy, metaphysics and ethics, and expounded views on God’s causal role in nature that would later be interpreted as “occasionalist”.

3. Metaphysics

The appeal to God’s causal activity would become a central feature of both Geulincx’ metaphysics and his ethics, but the way in which he justified and explained the need for a divine administration of the activities normally attributed to “secondary causes” (that is to say, to individual persons and things) differs markedly from the arguments seen in the works of medieval Islamic “occasionalists” and Cartesian contemporaries such as Louis de la Forge, Géraud de Cordemoy and Nicolas Malebranche. Rather than developing the theological view that God exercises full power over man’s causal and epistemological functions; or questioning the metaphysically problematic notion of an exchange of accidents between substances; or, finally, dismissing the possibility that purely corporeal bodies might have a power to move either themselves or other bodies, Geulincx developed his so-called “occasionalist” position on the basis of an interpretation that grounds the idea of causality on the inner experience of active involvement (Renz & Van Ruler, 2010). What may pass for causality in the strictest sense is revealed by what human beings are familiar with, and what they experience within themselves as their own activities: the conscious awareness of “doing” things. Geulincx thus turns the Cartesian focus on human awareness, with its potential for deliberate and conscious activity, into the bedrock of a metaphysics of causal activity. With the notion of activity being linked to states of mental awareness, causality itself becomes the privilege of conscious minds, and a phenomenon for which the subject “doing” it is uniquely responsible.

At the same time, the scope of human activity is greatly reduced on the basis of such a criterion. Since the Cogito, or human consciousness, realises that there are many thoughts (cogitationes) that do not depend on the subject having them, Geulincx very early on in his Metaphysica Vera drew the conclusion that, “[t]here is a knowing and willing being distinct from me” (Geulincx, 1892: 150). It is this being, God, who arouses in us, through his manipulation of matter, the thoughts for which, not knowing how they come about, we cannot claim responsibility ourselves. On the basis of this consideration, Geulincx came to formulate the maxim that has become known as the first axiom of his philosophy: Quod nescis quomodo fiat, id non facis, in other words: “What you do not know how to do, is not your action” (Geulincx, 1892: 150).

In the Metaphysica Vera, or True Metaphysics, first published posthumously in 1691, the focus on the various causal roles of God and man gives rise to a tripartition of the discipline into an Autologia, a philosophy of the Self; a Somatologia, or a metaphysics of the World; and, finally, a Theologia, on God. To include a discussion of the physical universe in an exposition on metaphysics is something that would have been uncharacteristic for Descartes, but it is a move towards a deeper, metaphysical, understanding of nature that Geulincx shares with Spinoza. In fact, although the Metaphysica Vera is an unfinished text that was never authorized and leaves many questions unanswered, it testifies to the way in which various ontological conceptualisations in Spinozism have their antecedents in Geulincx. One of these is the distinction of causal levels into substantial and modal spheres. A significant aspect of Geulincx’ understanding of physical reality is his duplication of the world into a world of “becoming” and a world of “being”— a distinction Geulincx relates to Plato. According to this view, all individual bodies, with their states of “presence” and “absence”, belong to the world of becoming. Based on the idea that a world of mere effects cannot be all there is, Geulincx’ Platonic interpretation of the Cartesian universe introduces the notion of a Body-as-such, in which these effects find their ontological foundation. Carefully avoiding any reintroduction of the Aristotelian terminology of “substance” and “accident”, Geulincx thereby reintroduces the idea of an ontological distinction between the enduring entities of Mind and Body on the one hand, and their varying “modal”, that is, spatio-temporal manifestations on the other. 
Formulated in Platonic terms in Geulincx and in Aristotelian terms in Spinoza, this quasi-scholastic strategy to distinguish substantial from accidental levels of being results in a metaphysical interpretation of reality in terms of a diversity of ontological spheres, an interpretation that goes well beyond Descartes, but that we find in both Geulincx’ Metaphysica Vera and Spinoza’s Principia Philosophiae Cartesianae, Short Treatise and Ethics (Van Ruler, 2009). In Geulincx, moreover, Descartes’ indistinct metaphysical categorizations, in which a single universal matter occurs next to a set of countless individual minds and a single God, are transformed into a strict metaphysical dualism according to which there are only two things: God, or Mind, on the one hand, and World, or Matter, on the other. By placing human minds in God, Geulincx also prefigured Spinoza in arguing that human minds, like human bodies, are parcels of a larger field, or “modes”.

4. Ethics

Parallels with Spinozistic ways of thinking equally occur in Geulincx’ treatment of the subject of ethics. In both authors, Descartes’ natural philosophy serves as a new basis for the neo-Stoic view that morality should primarily be seen as a way of mentally dealing with inevitable patterns of causality in nature and human social life. According to Geulincx, moreover, the application of reason to all areas of experience is the practical upshot of a mental attitude focused on a “love of God”. Contrary to Spinoza, Geulincx had no qualms about the idea that one is free to align oneself mentally with the necessary course of things, or not. To put it in Geulincx’ own words: whereas one always obeys God, one has the option whether or not to obey reason, and this is what constitutes the criterion of morality.

With his focus on reason, Geulincx conforms to a general tendency within Renaissance moral philosophy. At the same time, he interprets what is reasonable in his own peculiar way, introducing a new set of four cardinal virtues, namely diligence, obedience, justice and humility, in place of the old quadriga of temperance, fortitude, justice and prudence. These virtues are all aimed at reason. Accordingly, rather than being directed towards other human beings, what Geulincx prescribes as obedience is an obedience to reason, just as humility is a mental humility in the face of reason, diligence involves a diligent attention to reason, and justice is the acceptance of a just and reasonable mean.

Reason should always be followed, but in the context of such encouragements to mental subservience, the example with which Geulincx illustrates obedience is easily misread. Even the wretched life of a slave, Geulincx argues, may be lived in freedom, as long as the slave is able to direct his will to the call of reason and to endure even “an appalling and cruel slavery” by obeying orders not because it is the will of his master, but because it is his own (Geulincx, 1986: 82; see also 1893: 23; 2006: 24). Despite its awkward way of seemingly sanctioning slavery, this argument only carries to the extreme another conception predominant in both classical and Renaissance traditions of Western moral philosophy, and most straightforwardly expressed in (neo-)Stoic sources: the notion that mental freedom does not depend on the relative force of outward circumstances, but is brought about exclusively by an inner consent to the demands of reason.

In combination with his metaphysical view on the limitations of human causal activity, such a radical endorsement of intellectualist and indifferentist arguments would seem inevitably to lead to a moral position emphasising a passive or even submissive attitude. Geulincx, however, did not preach quietism. The complete text of his Ethics was published only posthumously in 1675, presumably by Bontekoe, under the title of Gnōthi seauton, or Know Thyself, but Geulincx had already issued a Dutch version of the first of its six “Treatises” as Van de Hooft-deuchden (“On the Cardinal Virtues”) in 1664. Far from teaching resignation, the book contains an exceptionally practical list of ethical maxims and reads like a self-help manual in popular psychology rather than a moral treatise in the traditional sense of the word. Geulincx may have taken over the concept of the “human condition” from the French moralist Pierre Charron, or from his Leuven professor in theology Libert Froidmont (1587-1653). What, according to Geulincx, is reasonable for a human being to do in the light of this condition is to abide by seven moral guidelines, or “obligations”: to accept death, to avoid suicide, to take care of one’s health and of that of one’s species, to learn a trade, to earn a living, to relax now and again, and never to curse one’s ancestry or day of birth.

With respect to all of these guidelines, Pierre Charron’s De la Sagesse (1601; revised edition 1603) may have provided Geulincx with a model for the kind of things a moral treatise should teach (De Vleeschauwer, 1974). Geulincx, however, explained his obligations on the basis of a quasi-Cartesian metaphysical groundwork that at first sight seems to undermine rather than to support them. Denying, like Spinoza, the possibility of any interaction between the body and the mind, Geulincx comes to the conclusion that the human being is only an onlooker, a “spectator” of the outside universe: “I am a mere spectator of a machine whose workings I can neither adjust nor readjust” (Geulincx, 1893: 33; 2006: 34). This would seem to make all human activity not only irrelevant, but downright impossible. Geulincx, however, argues that we should nevertheless be mindful to fulfil certain actions we know from experience God wishes us to perform. We have to search for food, for instance, in order to survive, and we should try to comply insofar as we are able with such evident commitments. Indeed, an attentiveness to the basic facts of life is what links the two aspects of what Geulincx presents as his ethics of “humility”. On the one hand, this is the “occasionalist” Inspection of Oneself that tells us we find ourselves in a situation we neither control nor really understand; and, on the other hand, the list of “Obligations” that mark the obvious tasks we have to fulfil, and thus comprise a Disregard of Oneself. We should always choose what we know to be best. The only thing we should not do, according to Geulincx, is to bother about the outcome of our wishes, all of which are ultimately up to God. Thus, in the end, it is only our intentions that matter. In a famous example, Geulincx argued that it is for God to decide whether or not one is killed by the dagger with which one pierces one’s heart.
How one’s volitions are matched by activities produced in the material sphere “outside” is necessarily beyond us.

Geulincx does not speculate on the question to what extent we may rely on God’s resolve. Since the way in which God links physical to mental states is unknown to us, it is unclear whether Geulincx himself expected God either to have established a permanent world order or to produce an incessant number of miracles. As German commentators argued in late nineteenth-century debates on the possible impact of Geulincx on Leibniz, the analogy of two independent but synchronised clocks that Geulincx introduced in order to explain the relation between body and mind seems to accentuate the Cartesian idea of a law-like regularity in nature. This is a position consistent with the emphasis laid on the notion of reason in Geulincx’ ethics. Yet where human volitions are in play, such as in Geulincx’ example of the dagger, or in his references to the phenomenon of paralysis, it would seem that God might have a more immediate role to play.

In the end, a solution to such metaphysical questions is not Geulincx’ primary concern in the context of ethics. As far as morality is concerned, it does not matter whether God makes a singular decision or whether he lets all physical conditions play their proper roles whenever one wishes to pierce one’s heart with a dagger. The moral point is that this should not have been one’s intention in the first place. In this sense, the example is not so much meant to elucidate a metaphysical viewpoint, as it is indicative of Geulincx’ preoccupation with questions of life and death, and with the idea that the realm of the moral is defined by the mental attitude one takes with respect to preserving the condition that one finds oneself in as a conscious being. This is also the way in which to read the ethical axiom that Geulincx introduced as a counterpart to his earlier metaphysical maxim. The slogan Ubi nihil vales, ibi nihil velis, “Wherein you have no power, therein you should not will” (Geulincx 1893: 164; 2006: 178), applies not so much to any specific activities, but rather to human existence (“the human condition”) as such.

Although it is very likely that Spinoza (whose friend Lodewijk Meyer studied with Geulincx) at least knew Geulincx by name, and although one may trace many parallels in their works, there is no convincing evidence that the two men either knew each other, or knew each other’s work (Van Ruler, 2006). Likewise, it is unknown to what extent Geulincx’ moral philosophy may have inspired Spinoza. Spinoza may have been thinking of Geulincx, for instance, when, in his own Ethics, he explicitly denied that humility is a virtue, since this was the single most important of the four cardinal virtues for Geulincx. Spinoza may, on the other hand, also have wished simply to make an unequivocal statement against the traditional glorification of humility in Christian theological contexts.

Contrary to Spinoza, Geulincx presented his own moral philosophy as a philosophy compatible with Christian views. Because it interpreted certain Christian themes in purely naturalistic ways, such as by taking the “devil” solely to stand for a mental propensity to persist in inconsiderate behaviour, Geulincx’ Christian philosophy was unorthodox, but it was also paradoxical in various respects. He considered his moral philosophy to be an ethics exclusively founded on reason. Still, God’s word, or so Geulincx argued, had worked for him like a microscope: once Scripture had revealed the truth, he was now able to decide questions of right and wrong without its help, that is, purely on the basis of reason. The obvious implication of this is that pagan philosophers could never have been able to find their way in matters of moral philosophy, and this is indeed what Geulincx concluded. True spiritual redemption was open only to philosophers acquainted with what Scripture had shown to be reasonable: the idea that one has no title to one’s life and that this insight should bear fruit in an attitude of humility. Geulincx might still include Platonic, Stoic and even Aristotelian concepts, and stick to classical forms of philosophical analysis in his ethics, but he dismissed all pagan philosophies for having been developed on the basis of inappropriate motivations. All pagans had urged “for the Land of Cockaigne”; they had craved pleasure rather than searching for God (Geulincx 1893: 52-54; 1966: 116-118). The pagans, in other words, had consciously aimed at achieving happiness, when all they should have been doing was to look for what is right. The difference between these two roads, Geulincx admits, is a very subtle one, for since reason and Christianity themselves lead to happiness, one has to be extremely careful to avoid the pitfall of self-centered motivations even as a Christian philosopher.

If, as Geulincx argues, one has to flee happiness in order to pursue it, it may seem tempting to try to flee happiness exactly for the reason of acquiring it. In that case, however, Geulincx argues, happiness “will not pursue you” (Geulincx, 1893: 58; 2006: 57). In other words, while one knows happiness will result from the fulfilment of a duty, one still needs to fulfil the duty without doing it with the aim of acquiring happiness, or it will not work. The notions of “Obligation” and “Law” may help to avert any psychological dilemmas here. Laws, according to Geulincx, never correspond to obvious forms of self-interest, or they would not be laws. If only we direct our mind “to refer nothing of what we do or do not do to our Happiness, but everything to our Obligation” and thus “pledge” ourselves “wholly to God” (Geulincx, 1893: 58 and 57, respectively; 2006: 57 and 56), there is no problem. Libertas will be the immediate, if paradoxical, effect of obedience; and happiness, Felicitas (or beatitude, Beatitudo, a word Geulincx uses for Felicitas only when explaining matters in the accepted scholastic terminology) will present itself automatically as the mental bonus for abiding by the way of virtue. Simply doing what God and reason demand, the wise man is able to disconnect his mind from sensory impressions, and to assent to what happens in God’s universe not according to what is most agreeable to him, but according to the way in which reason presents things as they are.

The complicated dialectics of receiving happiness in return for virtue caused Geulincx to touch upon theological questions as well. If Geulincx became a Jansenist in Leuven after having imbibed Augustine’s theory of grace, and a Calvinist later in Leiden, he must at some point have become aware that the whole idea of devising a Protestant moral philosophy was something inherently problematic. Theologically speaking, there could be no question of a Jansenist or Calvinist God distributing happiness in return for our effort. Geulincx was well aware of this, and therefore attempts to deny that he ever implied that God acts in reply to our achievement: “But mark: I did not say that the Humble first love God, and are then loved in return by God. Certainly not, I did not say this, and this should suffice” (Geulincx, 1893: 64; 2006: 63). Philosophically speaking, the rewards of virtue were nevertheless exactly this: God’s love in return for our love of God and reason. In line with a wider tendency in Dutch Cartesianism, Geulincx inevitably had to argue for a strict separation between philosophy and theology in order to save the practical relevance of his moral philosophy.

Besides classical, Christian and Cartesian themes, there may also have been biographical factors involved in shaping Geulincx’ ethics. The precarious living conditions of his Leiden years in particular seem to be reflected in his preoccupation with the insecurities of life and with the possibility of suicide, both of which topics are central to his ethics. Yet such interests may also have had their origin in a special talent for the experience of wonder in Geulincx, as well as an exceptionally subtle philosophical imagination.

5. Anti-Aristotelianism

It is in foreshadowing quasi-Kantian themes that Geulincx’ philosophical discernment appears most conspicuously. Essentially a criticism of Aristotelian ways of thinking, Geulincx’ Metaphysica ad mentem peripateticam, a book that was published only posthumously in 1691, argued that there was an illusory quality to thinking, aside from the illusiveness of sense perception. Not only was it true that our senses, as Descartes had argued, yield a subjective view of the world; according to Geulincx, our intellectual “ways of thinking” (modi cogitandi) distort our conception of reality just as much. Indeed, it is with intelligible species that we impose our ways of thinking on outside things similarly to the way in which we impose sensible species onto the world that do not apply to things as they are in themselves. Both ways, we “always attribute the phantasms (phasmata) of sense and intellect to things themselves”— even if “there is something divine in us that always tells us it is not so” (Geulincx, 1892: 301).

Once more giving a stricter format to Cartesian intuitions than Descartes himself would have done, and prefiguring Spinoza on both accounts, Geulincx distinguished four different kinds of knowledge and drew a sharp distinction between the realm of “imaginations” and the realm of “ideas”. Holding on to a classical notion of scientia that limits the notion of “idea” to the knowledge of the “essence” of a thing, Geulincx interpreted the gradual development of epistemological stages in Platonic rather than in Aristotelian terms and classified the respective levels of knowledge as (1) sense perception, (2) knowledge, or cognitio, (3) scientia, or knowledge with an account; and, finally, (4) the ultimate kind of scientia that is called sapientia or wisdom, which is available only to whoever is accountable for the thing known. Thus offering a seemingly Augustinian-inspired understanding of “ideas” as the kind of things in God’s mind that we must somehow have access to in order to intuit the essences of things, Geulincx in fact denied man any wisdom apart from the wisdom related to his own mental activities, such as our mental activities of love and hate, affirmation and negation and so forth, the reason for this being that to understand these and to will are, in the end, the only things one can actually “do”.

Wisdom accordingly presents itself in Geulincx mainly in a negative way; that is to say, in the form of a recognition that our intellectual capacities are extremely limited with respect to understanding things that occur outside the realm of consciousness. Although the mind knows that all things are either minds or bodies and that infinite mind and infinite extension (that is, God and Body) are ultimately all there is, our “modes of thinking”, in other words, our ways of apprehending reality, misrepresent things as they are in themselves by seeing them as separate “beings” that may function as the subject of predication. Yet we have to see them in this way, if we wish to say something about them.

Although there is an immediate Cartesian context as well as a Scotist terminological background to these arguments, and although, like Geulincx, authors such as Clauberg and Johannes de Raey (1622-1702) had also tried to come to terms with the indistinct manner in which Descartes had discussed general metaphysical concepts in the Principia Philosophiae (Aalderink, 2009), Geulincx’ position stands out for the way in which it emphasizes how the human intellect is liable to characterize the outside world in terms of forms of propositional content that portray whatever there is as being divided into objects possessing certain properties. As Geulincx himself remarks (Geulincx, 1892: 199), “few people seem to observe” that this logical mould introduces ontological classifications for which there is actually no basis in reality itself.

Geulincx thus came to criticise a philosophical viewpoint that had been almost universally shared since Aristotle, the idea, namely, that ontological concepts such as the concept of “substance” may function in parallel ways in metaphysics and logic. His criticism of this view (which is not merely a Peripatetic, but in fact a virtually universal human assumption) launched the epistemologically radical idea that the linguistic and logical ways in which our concepts function within our intellectual representations of the outside world, should actually be a warning against taking them seriously in metaphysical terms. According to Geulincx, logical and linguistic distinctions do not necessarily represent things as they are in themselves. Indeed, notions such as “being (ens), substance, accident, relation, subject, predicate, whole and part” only illustrate how we think about objects. As modes of thought we use these notions to express what we mean when we distinguish a thing from its activity or from our judgement of it. Our manner of understanding, however, should not be confused with the way things are structured and organised independently of our representations of them. Nor should we uncritically build philosophical systems on the categories and logical forms that help us to analyse what we experience.

Because of the way in which he gave prominence to, and ultimately dealt with, the question of the knowability of “things as they are in themselves”, Geulincx’ position has often been associated with the critical philosophy of Immanuel Kant. Ernst Cassirer (1874-1945), for instance, saw both Geulincx’ thesis of the unknowability of “things in themselves” (translated in German as Dinge an sich) and his view that all human understanding is dependent on “forms of thought” brought in by ourselves, as prefigurations of the Kantian position. Although he remained careful not to deny the differences between Geulincx and Kant, the Flemish Geulincx scholar (and former Nazi sympathiser in exile) Herman de Vleeschauwer (1899-1986) in 1957 agreed that if one defines “Criticism” as the theory according to which “we know things only by the medium of our forms of thought”, one could no longer “regard it as the personal discovery of Kant” (De Vleeschauwer, 1957: 63).

In general terms, Geulincx’ alertness to the possible incongruity between the logic of our thoughts and the structure of the outside world may indeed be compared to Kant’s. It may even be extended beyond Kant to serve as a comparison between the Flemish Cartesian’s criticisms of scholastic views and Wittgensteinian, as well as postmodern, censures of the metaphysical suggestion that logical forms reflect an ontological structure of things. At the same time, Geulincx stood closer to other seventeenth-century denunciations of Aristotelianism inspired by Descartes, such as John Locke’s. Exposing scholastic metaphysics as a logical scheme functional only within the domain of our daily interaction with macroscopic objects, Geulincx’ evaluation of Peripatetic metaphysics, although it is cast in a rather scholastic terminology itself, anticipates Locke’s view in so far as it confirms the idea that there is a purely nominal aspect to the Aristotelian manner of metaphysical categorisation.

And yet Geulincx’ Metaphysica ad mentem peripateticam creates a sense of epistemological alienation that goes far beyond Locke’s criticism of the notion of substance. If, as a direct consequence of Cartesian natural philosophy, Geulincx argued that scholastic types of analysis in metaphysics might be exposed as logico-linguistic frameworks only, this not only meant that there is a certain contingency to the “essences” derived from mere experience; it also meant that the logic of substance itself was mistaken, and that, accordingly, the search for “substantiality” was ill-conceived. Geulincx did, of course, accept the existence of a universal “Body”, but for him, this idea was not dependent on the vague conceivability of a substantial substrate to which one might attach accidental properties. For Geulincx, the notion of Body-as-such may simply be deduced from the fact that one finds many “thoughts” (cogitationes) in one’s conscious experience that do not depend on oneself. Accordingly, there is something out there, something orchestrated by God. This is the World itself, no less; but there is no sense in continuing, like Locke, to see this World as a substance with properties, or to lament the indistinctness of this “something”. With respect to substantiality, we should rather be aware that we are misled by our own intellect into searching for it in the everyday world of things. In the Principia philosophiae, Descartes himself had already argued against trying to conceive of substantial beings behind the forms of “extension” and “thought” that we find in nature. Geulincx drew the ultimate conclusion by arguing that the search for a universal “something” of which the property of being extended is an “accident”, arises from the mistaken belief that the world is structured along the lines of our “modes of thought”.

As a consequence, Geulincx does indeed come close to Kant in the sense that his emphasis on the unknowability of things is modified by the idea that the world as it is “in itself”, remains hidden to our observation and eludes our limited epistemological capabilities to grasp what is actually there. Still, Geulincx’ arguments are very different from Kant’s. According to the Flemish philosopher, our intellect imposes a grid on our experience on account of which we necessarily envision the external world as a world of “things”. Doing so, our metaphysical imagination follows the linguistic and logical habit of distinguishing substantives from adjectives in language and subjects from predicates in logic. The problem with scholastic metaphysics is that it draws ontological conclusions from such cognitive ways of dealing with reality. Just as we attribute our sense impressions to the outside world even though, at least at a certain level of mental development, we become aware that such attributions are incorrect, so too should we, with respect to our intellectual understanding of things, come to doubt the way in which we attribute our cogitationes to things in themselves.

According to Geulincx, there is hardly a way to avoid this, and we cling to the idea of distinguishing beings from properties with even more tenacity than we adhere to the idea of attributing mentally experienced qualities to external things in sense experience. Posing the question how we come to conclude that there is a real basis for distinguishing between subjects and predicates, Geulincx rejects the common scholastic ways of arguing for an actual relation of “inherence” between them. He does, however, offer an alternative ground for our habit of seeing things this way. Whenever we refer to things either as “beings” or as “properties”, it may be that we do so because of the relative stability of our various sense impressions: “The real cause (…) may be, that people see some things as more firm, stable and lasting, others as more fluid, fleeting and frail. Thus (…) light and darkness, colours and sounds and all similar things are regarded as more fluid than body or extension” (Geulincx, 1892: 305). What, in other words, modern psychology and evolutionary biology might consider to be innate propensities, Geulincx was tempted to explain on empiricist grounds. Repeatedly confirming that fluid impressions find their support in firmer ones, rather than that firm marks rest on fleeting signals, our senses will encourage our intellect to follow suit and conceptualise the world in terms of independent beings and their dependent properties.

Rather than to Locke, Kant, or Wittgenstein, Geulincx accordingly compares best to Geulincx himself. A similar interpretation of the way in which the human intellect conceptually rearranges sense experience is found only in the works of his pupil Richard Burthogge. According to Burthogge, the senses give us “external qualia, which reason interprets as predicable of substances or subjects”; a position that, in the terminology of analytical philosophy, has been interpreted as a form of “idealism” (Ayers, 2005: 195). As with so many other statements of this underestimated English philosopher, however, this particular view derives straight from Geulincx’ Metaphysica ad mentem peripateticam.

Again coming closer to Kant than to Locke, Geulincx developed his epistemological arguments vis-à-vis Aristotelianism not so much in order to make room for a new understanding of nature, but rather in order to heighten our philosophical awareness of the fact that we are fundamentally ignorant of what the world is like independently of our experience. As with Kant, moreover, there is a certain religious susceptibility at play in Geulincx’ philosophical concerns. Exhibiting a mental predisposition coloured by Augustinianism in all of his works, Geulincx would always keep wondering at the ineffable character of God’s universe and our position in it.

6. A Philosophy of Wonder

If Geulincx hardly compares to other philosophers in the Western tradition, others did take their inspiration from Geulincx. Having developed an interest in seventeenth-century philosophy during his assistantship at the École Normale Supérieure in Paris from 1928 to 1930, the Irish poet and novelist Samuel Beckett would take up a close study of Geulincx’ works (the Metaphysica Vera and Ethics in particular) at Trinity College Dublin, in the spring of 1936. As a direct result of this interest, Arnold Geulincx was to play a crucial role in Beckett’s Murphy (finished in June 1936 and published in 1938)—a book that presents its leading character preferably sitting naked in his London apartment, tied to a teakwood rocking chair. Implicit references to Spinoza and explicit references to Geulincx accompany the way in which Murphy’s inner experience is detailed, and is further explained in later chapters.

It has been well established how Geulincxian imagery, such as that of the cradle (which Geulincx used to explain the relationship between our will and God’s, and which Beckett turned into a rocking chair), the two synchronised clocks, and the passenger walking on the deck of a ship against the vessel’s direction (an image Geulincx himself may have derived from Justus Lipsius), was continually reused by Beckett well beyond Murphy; how Geulincxian expressions, such as “coming hither, acting here, departing hence”, turn into metaphorically rich elements of literary structure in Beckett; and how Geulincx’ overall theme of power and impotency would continue to resonate in Beckett’s plays, prose and cinematographic works. If it is true that “[what] chiefly endured for Beckett from Geulincx was his acceptance of ignorance as the basic human condition, his ethic of humility and his advocacy for ascetic withdrawal and rigorous self-examination” (Herren, 2013: 195), it is also clear why Geulincx might come to function as a replacement for Descartes in Beckett, and as “a philosopher who spoke to [Beckett] as no other had” (Cordingley, 2013: 49). The contrast between Geulincx and Descartes may also serve to accentuate that there was a Geulincxian conceptual background to what twentieth-century philosophers may have derived from Beckett’s plays. On account of the element of ineffability that Geulincx added to Cartesianism, it has been argued that the notion of an “absence of self-presence”, particularly in thinking and in authorship (a theme taken up by French philosophers such as Blanchot, Foucault and Derrida), found a Geulincxian inspiration in Beckett (Uhlmann, 2006: 113).

The great difference, however, between Geulincx on the one hand and twentieth-century French philosophers inspired by Beckett’s absurdist plays on the other, is that Geulincx, like Beckett himself for that matter, had no inclination to diminish the importance of subjective experience. Indeed, it is precisely in this respect that Geulincx’ “experiential” defence of occasionalist arguments was squarely at odds with Malebranche’s alternative notion of God’s pre-ordination of human minds. Later commentators have been surprised by such disparities within occasionalist philosophy (Nadler, 1999), or have even drawn the misguided conclusion that Geulincx was an inconsistent occasionalist (Terraillon, 1912; Rousset, 1999). In fact, rather than explaining away human mental activity on the grounds of theological or determinist dogma, Geulincx not only took the inner world of consciousness as his starting point in philosophy, but also saw it as a cause for wonder at the singularity of the human condition. If things prove themselves to be ineffable, it is to the human subject that they do so. Similarly, if outside things remain inscrutable, it is only of inner experience itself that our knowledge is genuine and absolute.

Samuel Beckett is believed to have broken away from making further dogmatic use of philosophy after his post-war realisation that “All I am is feeling” (Uhlmann, 2006: 72). His interest in Geulincx, however, did not suffer from this. If a mood of estrangement, coupled to a painstaking examination of the inner life, is what Beckett found familiar in Geulincx, it is significant that Beckett never studied the quasi-Kantian arguments from the Metaphysica ad mentem peripateticam, arguably Geulincx’ most radical philosophical text. Apart from some transcriptions taken from the Metaphysica vera, Beckett took his notes mainly from Geulincx’ Ethics. A familiarity of viewpoints must have been obvious to Beckett in these texts as well, which may add to our conviction that Beckett’s prolonged interest in Geulincx was based primarily on an affection that went beyond specific images or doctrines of philosophy.

There was obviously “something of a friendship across centuries” between Beckett and Geulincx (Tucker, 2012: 181), apparently motivated by the articulation of a shared experience that Beckett cherished in Geulincx, and that presumably involved a recognition of something very intimate and relatively rare, even if it had been expressed in such technical philosophical contexts as a religiously motivated metaphysics and a theory of ethics combining classical and Christian themes.

The ultimate secret to Geulincx’ appeal may be that his philosophical texts, despite their traditional setting, have a captivating strangeness to them, which is linked to the alienating topics they address. Whatever his philosophy may have done for Beckett’s artistic development, it is beyond doubt that, just as in Samuel Beckett’s case, Geulincx’ Baroque blend of Augustino-Cartesianism will continue to impress likeminded readers by its unique evocation of the timeless motif of human metaphysical ignorance, as well as by its humbling expression of amazement at the mystery of existence.

7. References and Further Reading

a. Primary Sources

  • Geulincx, Arnold, Opera philosophica, vol. 1, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1891.
    • Geulincx’ Orationes and the Logica restituta
  • Geulincx, Arnold, Opera philosophica, vol. 2, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1892.
    • The Methodus, as well as the metaphysical and physical works
  • Geulincx, Arnold, Opera philosophica, vol. 3, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1893.
    • Ethics, ethical disputations and Notes on Descartes
  • Geulincx, Arnold, Sämtliche Schriften in fünf Bänden, ed. H.J. de Vleeschauwer, Stuttgart-Bad Cannstatt: Frommann-Holzboog, 1965–1968.
    • A handy reprint (in 3 vols.) of the Opera Philosophica
  • Geulincx: Présentation, choix de textes et traduction, ed. Alain De Lattre, Philosophes de tous les temps vol. 69, Paris: Seghers, 1970.
    • A selection of Geulincx’ texts in French
  • Geulincx, Arnout, Van de hoofddeugden. De eerste tuchtverhandeling, ed. Cornelis Verhoeven, Baarn: Ambo, 1986.
    • The first part of the Ethics in a modern version of the Dutch original
  • Geulincx, Arnold, Metaphysics, ed. Martin Wilson, Wisbech: Christoffel Press, 1999.
    • First English edition of the Metaphysica vera
  • Geulincx, Arnold, Ethics, With Samuel Beckett’s Notes, ed. Han van Ruler, Anthony Uhlmann and Martin Wilson. Leiden and Boston: Brill, 2006.
    • The complete Ethics in English with a transcription of Beckett’s notes
  • Geulincx, Arnold, Éthique, ed. Hélène Bah-Ostrowiecki, Turnhout: Brepols, 2010.
    • The Ethics in a modern French edition

b. Secondary Sources

  • Aalderink, Mark, ‘Spinoza and Geulincx on the human condition, passions, and love’, Studia Spinozana vol. 15 / Wiep van Bunge (ed.), Spinoza and Dutch Cartesianism, Würzburg: Königshausen & Neumann, 2006, pp. 67-87.
    • On the Augustinian concept of love and its impact on Geulincx and Spinoza
  • Aalderink, Mark, Philosophy, Scientific Knowledge, and Concept Formation in Geulincx and Descartes, Utrecht: Zeno, 2010.
    • Published dissertation on the epistemological differences between Descartes and Geulincx
  • Armogathe, Jean-Robert, and Vincent Carraud, ‘The First Condemnation of Descartes’ Œuvres: Some Unpublished Documents from the Vatican Archives’, in: Daniel Garber and Steven Nadler (eds.), Oxford Studies in Early Modern Philosophy, vol. 1, Oxford: Clarendon, 2003, pp. 67-109.
    • Contains the only known reference to Geulincx’ marriage plans as a reason for his dismissal
  • Ayers, M.R., ‘Richard Burthogge and the Origins of Modern Conceptualism’, in: Tom Sorell and G.A.J. Rogers (eds.), Analytic Philosophy and History of Philosophy, Oxford: Clarendon, 2005, pp. 179-200.
    • On Geulincx’ most important pupil in epistemology
  • Cassirer, Ernst, Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit (Berlin, 1906-1923), ed. Dagmar Vogel in 2 vols., Hamburg: Meiner, 1999.
    • On Geulincx and Kant
  • Cooney, Brian, ‘Arnold Geulincx: A Cartesian Idealist’, Journal of the History of Philosophy, vol. 16 (1978), pp. 167-180.
    • English language introduction to Geulincx
  • Cordingley, Anthony, ‘École Normale Supérieure’, in: Anthony Uhlmann (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013, pp. 42-52.
    • On Samuel Beckett’s intellectual development during the late 1920s and early 1930s
  • Dürr, Karl, ‘Die mathematische Logik des Arnold Geulincx’, The Journal of Unified Science (Erkenntnis), vol. 8 (1939-40), pp. 361-8.
    • A translation of Geulincx’ logic in modern terms
  • Dürr, Karl, ‘Arnold Geulincx und die klassische Logik des 17. Jahrhunderts’, Studium Generale 18 (1965-8), pp. 520-541.
    • Geulincx’ logic in the context of other seventeenth-century sources in the field
  • Eekhof, A., ‘De wijsgeer Arnoldus Geulincx te Leuven en te Leiden’, in Nederlandsch Archief voor Kerkgeschiedenis, new series, vol. 15 (1919), pp. 1-24.
    • On the Letter of Recommendation of 3 May 1658
  • Herren, Graley, ‘Working on Film and Television’, in: Anthony Uhlmann (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013, pp. 192-202.
    • On Beckett’s psychology and Geulincx’ influence on his screenplays
  • Kossmann, E.F., ‘De laatste woning van Arnold Geulincx’, in Bijdragen voor Vaderlandsche Geschiedenis en Oudheidkunde 7-3, pp. 136-138.
    • On Geulincx’ last residence and debts
  • Land, J.P.N., ‘Arnold Geulincx te Leiden (1658-1669)’, in Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen, Afdeeling Letterkunde, 3rd series, vol. 3 (1887), pp. 277-327.
    • On Geulincx’ Leuven dismissal and Leiden career
  • Land, J.P.N., ‘Aanteekeningen betreffende het leven van Arnold Geulincx’, in Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen, Afdeeling Letterkunde, 3rd series, vol. 10 (1894), pp. 99-119.
    • On Geulincx’ life in Flanders
  • Land, J.P.N., Arnold Geulincx und seine Philosophie. The Hague: Martinus Nijhoff, 1895.
    • A Geulincx biography.
  • Lattre, A. de, L’occasionalisme d’Arnold Geulincx, Paris: Les Editions de Minuit, 1967.
    • Published dissertation on Geulincx’ philosophy
  • McCracken, J.D., Thinking and Valuing: An Introduction, Partly Historical, to the Study of the Philosophy of Value. London: Macmillan, 1950.
    • Interpretation of Descartes, Geulincx and Spinoza as a particular school of ethics
  • Monchamp, Georges, Histoire du Cartésianisme en Belgique, Bruxelles et St. Trond: F. Hayez, 1886.
    • An as yet unsurpassed history of Cartesianism in the Southern Netherlands
  • Nadler, Steven, ‘Knowledge, Volitional Agency and Causation in Malebranche and Geulincx’, British Journal for the History of Philosophy 7 (1999-2), pp. 263-274.
    • On similarities and differences between Malebranche and Geulincx
  • Nuchelmans, Gabriël, Geulincx’ Containment Theory of Logic, Amsterdam: Koninklijke Nederlandse Akademie van Wetenschappen / Noord-Hollandsche Uitgevers Maatschappij, 1988.
    • Detailed account of Geulincx’ use of set theory in logic
  • Paquot, Jean Noël, Memoires pour servir a l’histoire litteraire des dix-sept provinces des Pays-Bas, de la principauté de Liege, et de quelque contrées voisines, vol. 13, Louvain: De l’imprimerie academique, 1768.
    • Reference to Geulincx’ presumed Leuven quarrels and debts
  • Pfleiderer, Edmund, Leibniz und Geulincx: Mit besonderer Beziehung auf ihr beiderseitiges Uhrengleichniss, Tübingen: Tübinger Universitäts-Schriften, 1884.
    • Start of the controversy on the image of the synchronised clocks in Geulincx and Leibniz
  • Renz, Ursula, and Han van Ruler, ‘Okkasionalismus’, in: Hans Jörg Sandkühler (ed.), Enzyklopädie Philosophie, Hamburg: Felix Meiner, 2010, vol. 2, pp. 1843-1846.
    • On the diversity of occasionalisms
  • Rousset, Bernard, Geulincx entre Descartes et Spinoza, Paris: Vrin, 1999.
    • Posthumously published monograph on Geulincx
  • Ruler, Han van, ‘“Something, I know not what.” The Concept of Substance in Early Modern Thought’, in Lodi Nauta and Arjo Vanderjagt (eds.), Between Imagination and Demonstration. Essays in the History of Science and Philosophy Presented to John D. North, Leiden: Brill, 1999, pp. 365-93.
    • On Geulincx, Locke and the notion of individuality in scholastic and Cartesian thought
  • Ruler, Han van, ‘Geulincx, Arnold (1624-1669)’, in Wiep van Bunge, Henri Krop, Bart Leeuwenburgh, Han van Ruler, Paul Schuurman and Michiel Wielema (eds.), The Dictionary of Seventeenth and Eighteenth-Century Dutch Philosophers, in 2 vols, Bristol: Thoemmes, 2003, vol. 1, pp. 322-331.
    • Extended dictionary entry on Geulincx and his works
  • Ruler, Han van, ‘Geulincx and Spinoza: Books, Backgrounds and Biographies’, in Studia Spinozana 15 / Wiep van Bunge (ed.), Spinoza and Dutch Cartesianism. Würzburg: Königshausen & Neumann, 2006, pp. 89-106.
    • On whether Geulincx and Spinoza knew each other or each other’s work
  • Ruler, Han van, ‘Spinozas doppelter Dualismus’, transl. Andreas Fliedner, in: Deutsche Zeitschrift für Philosophie 57 (2009-3), pp. 399-417.
    • On parallel forms of dualism in Geulincx and Spinoza
  • Terraillon, Eugène, La morale de Geulincx dans ses rapports avec la philosophie de Descartes, Paris: Alcan, 1912.
    • Short work on Geulincx’ occasionalism
  • Thijssen-Schoute, C. Louise, Nederlands Cartesianisme, Amsterdam: Noord-Hollandsche Uitgevers Maatschappij, 1954; new ed. by Theo Verbeek, Utrecht: HES, 1989.
    • Source book on Dutch Cartesianism
  • Tucker, David, Samuel Beckett and Arnold Geulincx: Tracing ‘a literary fantasia’, London: Continuum, 2012.
    • Detailed study and interpretation of all of Beckett’s references to Geulincx
  • Uhlmann, Anthony, Samuel Beckett and the Philosophical Image, Cambridge: Cambridge U.P., 2006.
    • On Beckett’s use of philosophical themes and their literary and philosophical impact
  • Uhlmann, Anthony, Chris Conti and Andrea Curr (eds.), Arnold Geulincx Resource Site, funded by the Australian Research Council: www.geulincx.org
    • A website dedicated to Geulincx research by The Beckett and Geulincx Research Project
  • Uhlmann, Anthony (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013.
    • A volume of articles on Beckett’s intellectual biography
  • Vander Haeghen, Victor, Geulincx. Étude sur sa vie, sa philosophie et ses ouvrages, Diss. Liège, Gent: Vanderhaeghen, 1886.
    • Complete intellectual biography
  • Vanpaemel, Geert, Echo’s van een wetenschappelijke revolutie. De mechanistische natuurwetenschap aan de Leuvense Artesfaculteit, Brussel: KAWLSK, 1986.
    • On Leuven University’s curriculum and Geulincx’ proposals for change
  • Verbeek, Theo, ‘Geulincx, Arnold (1624-69)’, in Edward Craig (ed.), Routledge Encyclopedia of Philosophy, vol. 4, London: Routledge, 1998, pp. 59-61.
    • Concise account of Geulincx’ philosophy and its relation to Descartes
  • Vleeschauwer, Herman J. de, Three Centuries of Geulincx Research, Mededelings van die Universiteit van Suid-Afrika / Communications of the University of South Africa, Pretoria 1957.
    • Bibliographical outline of Geulincx-interpretations
  • Vleeschauwer, Herman J. de, ‘Ha Arnold Geulincx letto il « De la Sagesse » de Pierre Charron?’, Filosofia 25 (1974-2 and 1974-4), pp. 117-134 and 373-388.
    • On Charron’s De la Sagesse as a model for Geulincx’ Ethics.

 

Author Information

Han van Ruler
Email: vanruler@fwb.eur.nl
Erasmus University
The Netherlands

Neocolonialism

The term “neocolonialism” generally refers to the actions and effects of certain remnant features and agents of the colonial era in a given society. Post-colonial studies have shown extensively that, despite the achievement of independence, the influences of colonialism and its agents are still very much present in the lives of most former colonies. Practically every aspect of the ex-colonized society still harbors colonial influences. These influences, their agents, and their effects constitute the subject matter of neocolonialism.

Jean-Paul Sartre’s Colonialism and Neocolonialism (1964) contains the first recorded use of the term neocolonialism. The term has become an essential theme in African philosophy, most especially in African political philosophy. In the book, Sartre argued for the immediate disengagement of France’s grip upon its ex-colonies and for total emancipation from the continued influence of French policies on those colonies, particularly Algeria. However, the term was first officially used in Africa at one of the All African People’s Conferences (AAPC), a movement of political groups from African countries under colonial rule, which held conferences in the late 1950s and early 1960s in Accra, Ghana. In the AAPC’s “1961 Resolution on Neocolonialism,” the term neocolonialism was given its first official definition. It was described as the deliberate and continued survival of the colonial system in independent African states, by turning these states into victims of political, mental, economic, social, military and technical forms of domination carried out through indirect and subtle means that did not include direct violence. With the publication of Kwame Nkrumah’s Neo-colonialism: The Last Stage of Imperialism in 1965, the term neocolonialism finally came to the fore. Neocolonialism has since become a theme in African philosophy around which a body of literature has evolved, written and studied by scholars in sub-Saharan Africa and beyond. As a theme of African philosophy, neocolonialism requires critical reflection upon the present socio-economic and political state of Africa after independence from colonial rule and upon the continued existence of the influences of the ex-colonizers’ socio-economic and political ideologies in Africa.

Table of Contents

  1. Introduction
  2. History of Neocolonialism
  3. Neocolonialism: Related Concepts
    a. Colonialism
    b. Imperialism
    c. Decolonization
  4. Neocolonialism: The Last Stage of Imperialism
  5. The Myth of Neocolonialism
  6. Neocolonialism Today in Africa: The Era of Globalization
  7. Conclusion
  8. References and Further Reading

1. Introduction

Neocolonialism can be described as the subtle propagation of socio-economic and political activity by former colonial rulers aimed at reinforcing capitalism, neo-liberal globalization, and cultural subjugation of their former colonies. In a neocolonial state, the former colonial masters ensure that the newly independent colonies remain dependent on them for economic and political direction. The dependency and exploitation of the socio-economic and political lives of the now independent colonies are carried out for the economic, political, ideological, cultural, and military benefits of the colonial masters’ home states. This is usually carried out through indirect control of the economic and political practices of the newly independent states instead of through direct military control as was the case in the colonial era.

Conceptually, the idea of neocolonialism can be said to have developed from the writings of Karl Marx (1818-1883) in his influential critique of capitalism as a stage in the socio-economic development of human society. The continued relevance of Marxist socio-economic philosophy in contemporary times cannot be denied. The model of society as structured by an economic basis, legal and political superstructures, and a definite form of social consciousness that Marx presented both in Capital (1972) and in the Preface to the Critique of Political Economy (1977) remains important to socio-economic theory. Marx presents theories which explain a certain kind of evil in capitalism. Today, capitalism has produced multinational corporations that can assemble far more effective intelligence behind their often nefarious designs than any nation’s government can assemble to try to hold them at bay. As things now stand with the capitalist system, some of Marx’s prognostications appear to have shown real foresight. The world seems to continue to acquiesce to the vast control of economic and political resources by the wealthiest 1%. Arguably, Marx’s prognostications have been vindicated in more ways than they have been refuted.

Subsequent analysis by Sartre, in his critique of French economic policies on Algeria, was an attempt to combine his existentialist idea of human freedom with Marx’s economic philosophy in order to better establish his opposition to France’s economic colonization of Algeria. Proper coinage of the term neocolonialism in Africa, however, is attributed to Nkrumah, who used it in his 1963 preamble of the Organization of African Unity (OAU) Charter and later as the title of his 1965 book, Neocolonialism: The Last Stage of Imperialism.

In simple terms, neocolonialism is a class name for all policies, infrastructures and agents actively contributing to society which indirectly serve to grant continuity to the practices of the colonial era. The essence of neocolonialism is that while the state appears to be independent and to have total control over its dealings, it is in fact controlled by outside economic and political influences (Nkrumah 1965, 7). The loss of control of the machinery of the state to the neocolonialists forms the basis of Nkrumah’s discourse.

In his article “Philosophy and Post-Colonial Africa”, Tsenay Serequeberhan explicates the nature of neocolonialism in Africa in a manner that reveals how Europe propagates its policy of socio-economic and political dominance in post-colonial Africa. For Serequeberhan, neocolonialism in Africa is that which internally replicates in a disguised manner what was carried out during the colonial period. This disguised form constitutes the nature of the European neocolonial subjugation as it concerns the politics of economic, cultural, and scientific subordination of African states (Serequeberhan 1998, 13). With this, we can describe the general nature of neocolonialism as an imbalance in national power (political, economic, or military) which the dominant power uses lopsidedly to subtly compel the dominated society to do its bidding. The method and praxis of neocolonialism lie in enjoining leaders of the independent colonies to accept development aid and support, through which the imperial powers continue to penetrate and control their ex-colonies. Under the guise of development aid and support, and of technological and scientific assistance, the ex-colonial masters impose their hegemonic political and cultural control in the form of neocolonialism (Serequeberhan 1998, 13). In such a situation, the leaders of the seemingly independent African states become minions to the whims and caprices of the ex-colonial lords or their multinational corporations in terms of the management of the affairs of the new states. Prima facie, it would seem that the neocolonial state is free of the influence of imperialists, and it appears to be governed completely by its own indigenes. In truth, though, the state remains under its former colonial masters and their accomplices.
Being under the continued impression that the former colonialists are superior and more civilized, the leaders of the supposedly new independent states continue to practice and encourage the people to imbibe the ways and cultural practices, and more essentially the economic control, of the imperialists.

Within a neocolonial situation, therefore, the imperialists usually maintain their influence in as many sectors of the former colony as possible, making it less of an independent state and more of a neo-colony. To this end, in politics, economics, religion, and even education, the state looks up to its imperialists, rather than improving upon its own indigenous culture and practices. Through neocolonialism, the more technologically advanced nations ensure their involvement with low income nations, such that this relationship practically annihilates the potential for the development of the smaller states and contributes to the capital gain of the technologically advanced nations (Parenti 2011, 24).

In On the Postcolony, Achille Mbembe further examines the nature of neocolonialism in Africa and says that the underpinning theory on which neocolonialism rests consists of bald assertions with no tenable arguments to support it. Evidently, in his view, after colonialism ended in Africa, the West did not consider Africans capable of organizing themselves socially, economically and politically. To Mbembe, the reason for holding such ideas and advancing them is simply that the African is believed to be intellectually poor and reducible to the level of irrationality. In his words, the capacity for Africans to rationally organize themselves is “understood through a negative interpretation” (Mbembe 2001, 1). This interpretation presents the African as never possessing things and attributes that are properly part of human nature, or rather (even if reluctantly granted the status of the human) those things and attributes are generally of lesser value, little importance, and of poor quality (Mbembe 2001, 1). In other words, since Africans and other people who differ in race, language, and culture from the West are held not to possess the power, the rigour, the quality, and the intellectual analytical abilities that characterize Western philosophical and political traditions (Mbembe 2001, 2), it is then difficult to assume that they would have the rational capacity to organize themselves socially, economically and politically. In a rejoinder to this bald assertion and negative interpretation, Mbembe retorts that the West has always had insurmountable difficulties with accepting an African theory on the “experience of the Other”, or on the issue of the “I” of others, which the West seems to perceive as foreign to it. In other words, the typical Western tradition has always denied the existence of any “self” but its own.
It has always denied the idea of a common human nature, such that, “a humanity shared with others, long posed, and still poses, a problem for Western consciousness” (Mbembe 2001, 2).

Fundamentally, this denial is not peculiar to the period of neocolonialism alone. It has a history that dates back to the period of the trans-Atlantic slave trade and colonialism. In his book, The Invention of Africa, V. Y. Mudimbe asserts that there are three methods that are representative of the colonial structure in Africa: the domination of physical space, the reformation of natives’ minds, and the integration of local economic histories into the Western perspective. This structure constitutes the three complementary aspects of the colonial organization which embraces the physical, human, and spiritual elements of the colonizing experience (Mudimbe 1988, 2). This colonial structure is aimed at emphasizing a historicity that promotes discourses on African primitiveness, which is used in justifying why the continent needed to be conquered and colonized in the first place (Mudimbe 1988, 20). Citing what Ignacy Sachs calls “europeocentricism”, Mudimbe says this model of colonialism is to “dominate our thought and given its projection on the world scale by the expansion of capitalism….it marks contemporary culture imposing itself as a strongly conditioning model for some and forced deculturation for others” (Sachs 1971, 22). In all of this, according to Mudimbe, europeocentricism is anchored on the denial of the “Other” in European consciousness. It continues to be a denial in spite of the assertion of the “existence of the other” in Paul Ricoeur’s meditation on the irruption of the other: “when we discover that there are several cultures instead of just one…be it illusory or real, we are threatened with destruction by our own discovery. Suddenly, it becomes possible that there are just others, that we ourselves are an “other” among others” (Ricoeur 1965, 278). 
To this end, if one accepts Ricoeur’s assertion on the existence of cultural pluralism one would be able to affirm the foundation of Mudimbe’s submission in his The Idea of Africa that in each continent, for example Africa, “there are natural features, cultural characteristics, and, probably, values that contribute to its reality different from those of, say, Asia and Europe” (Mudimbe 1994, xv).

It is based on this distinct reality of each culture that William Abraham in The Mind of Africa examines the problems and challenges that face post-colonial Africa vis-à-vis the continent’s interaction with Europe. Abraham acknowledges the existence of neocolonialism in Africa, but proposes an integrative form of culture whereby certain positive aspects of Western culture may be integrated with African culture in order to forge a common bond (Abraham 1962, 83). Abraham emphasizes, however, that in spite of the ongoing social, economic and political change in Africa due to the impact of neocolonialism, Africa’s culture must be guarded from being eroded by Western influence and civilization, or what he refers to as “the externality of an outsider” (Abraham 1962, iv).

The above description of the nature of neocolonialism and its different dimensions briefly elaborates on the themes of subjugation and the apparent imposition of a hegemonic economic, political, and social order, mostly in the guise of trade relations or development aid grants by the imperialists. The attendant implication relates to how post-colonial African states have seemingly failed to apply themselves to the problems of self-maintenance.

2. History of Neocolonialism

From the late nineteenth century through to the latter half of the twentieth century, some European countries, such as Britain, France, Belgium, and Portugal, colonized a large number of African nations, setting up economic systems that allowed for extensive exploitation. Decades after World War II, these European nations granted political independence to their colonies in Africa, but still found a way to retain their economic influence and power over the former colonies. From the 1950s, when many African colonies began to gain independence, they soon realized that the actual liberation they had anticipated was illusory. So, in spite of the assumption of political leadership positions by Africans, Africans soon realized that the economic and political atmosphere was still under some form of control by the former colonial masters. By implication, post-colonial Africa continued to experience the domination of the Western-styled economic model that was prevalent during the period of colonialism. It does appear that the former colonial masters wanted to grant only political independence to their former colonies, and did not want them to be liberated from colonialism. This is why it is inferred that the situation which informs the ideological implementation of neocolonialism in Africa began immediately after the political independence of most African states.

In postcolonial Africa, events and situations have revealed how neocolonialism was nurtured from the moment independence was granted. The elements of neocolonial influence that are apparent in the interactions that continue to exist between former colonial masters and their former colonies attest to this assertion. For example, the point could be made with regard to the ongoing interactions between France and Francophone African countries such as Cameroon, Togo and Ivory Coast, as well as between Britain and Anglophone African countries such as Ghana, Nigeria and the Gambia.

In the case of Cameroon, particularly after the amalgamation of French Cameroon with Southern British Cameroon in 1961, the granting of political independence to Cameroon by France was dependent on certain negotiations on matters of defense, foreign policy, finance, and economy, as well as technical assistance. This resulted in the adoption and institutionalization of the French 5th Republic constitutional model, alongside French political, economic, monetary, and cultural dominance in the new Cameroon (Martin 1985, 192). Following the creation of the French Franc zone, which established the CFA franc as the general currency for all Francophone countries, the West African colonies became tied in a fixed parity of 50:1 to the French franc, automatically granting the French government control over all financial and budgetary activities (Goldsborough 1979, 72). France also continued its military presence in Cameroon after independence, establishing military and defense assistance agreements with the country (Martin 1985, 193). Furthermore, the French institutionalized linguistic and cultural links with all their former colonies, thereby creating the “La Francophonie” heading, which served as a platform for reinforcing the assimilation of the French language, culture and ideology (Martin 1985, 198).

Although Britain may have continued to maintain an indirect economic influence on its former colonies through multinational corporations, the direct effects of Britain’s neocolonial socio-political ideologies have diminished significantly over the years. However, the West in general maintains an indirect form of domination over all developing African countries through means such as loans from the World Bank or the International Monetary Fund (IMF). This form of neocolonialism operates through foreign aid or foreign direct investment on which strict or severe financial conditionalities are imposed. Such conditionality often renders the neocolonial state subservient to the economic and sometimes political will of the foreign donor.

Clearly, from this brief history of neocolonialism in Africa, one can see that colonialism itself had an epochal impact on the history of contemporary Africa. This is why the study of neocolonialism has become critical to the study of African history and politics. It is, however, more crucial to the field of African philosophy because of the need to reflect on the socio-political and moral impacts of neocolonialism on Africa.

3. Neocolonialism: Related Concepts

Included in the themes of African philosophy, especially African social and political philosophy, are the related concepts of neocolonialism, colonialism, imperialism and decolonization. This is rightly so because of the events and activities of the European expansionist agenda, especially in Africa’s modern history. Although these epochal events succeeded one another, the methods and praxis of colonialism, imperialism, and neocolonialism are only slightly different. Their common denominators include social, economic, political, and cultural subjugation of the colonized. However, the concept of decolonization differs essentially from the others. While the others are conceptually linked with exploitation and domination, decolonization is a means of liberation, which could take the form of a social, cultural, political or economic revolution.

a. Colonialism

Broadly construed, the term colonialism can be described as the deliberate imposition of the rules and policies of a nation on another nation. Its strategy is the forced placement of a nation over another that gives room for the opportunity to exploit the colonized nation in order to facilitate the economic development of the colonialist home state.

A definition by Ronald J. Horvath sees colonialism as a “form of domination – the control by individuals or groups over the territory and/or behavior of other individuals or groups” (Horvath 1972, 46). Clearly, colonialism is a tool for expansion and a form of exploitation on all fronts. This is why Robert Young’s view on colonialism is that it “involved an extraordinary range of different forms and practices carried out with respect to radically different cultures, over many centuries” (Young 2001, 17).

The idea behind colonialism basically is the conquest and rule over a country or region by another, allowing for the exploitation of the resources of the conquered for the profit of the conqueror. Colonialism is an instrumental process through which a state acquires and maintains colonies in another territory. The outcome of this, which is the colonial stage of society, alters mildly or altogether the economic, political, social and even intellectual structure of the conquered state.

Between the 1860s and 1900s, Africa as a whole was subjected to various forms of aggression from Europe, ranging from diplomatic pressures to military invasions until almost all African states were finally conquered and colonized. The process of colonization came to its complete stage with invasions of the political, economic and socio-cultural spheres of the African societies.

The first attempts at colonization occurred when the Europeans began to seek trade pursuits outside their own continent, and thus discovered that many other nations, particularly in Africa, had wealth in natural resources which had potentials for their own economic gain.

We can simply say that colonialism by nature involves a forced relationship between an indigenous majority and a minority of foreign invaders. Of course, its history can be traced to slavery, where indigenous people, particularly of Africa, were forcibly and violently taken as slaves to plantations in Europe and the Americas. Through slavery, Africa’s sons and daughters in large numbers were violently seized and taken to Europe as sellable commodities (Nwolize 2001, 25). However, as slavery was ending in the 1850s, Europe was preparing another round of violent visitation against Africa. This invasion took off in earnest after the Berlin conference of 1884 – 1885. Colonialism came with further violence. Vandalism, murder, torture, looting, rape, death, and destruction were the order of the day (Afisi 2009a, 62).

Certain basic assumptions seem to have informed the colonial construction of African savagery which was used to justify the nature of colonial warfare. Works of Enlightenment philosophers such as Georg Wilhelm Friedrich Hegel’s (1770-1831) Lectures on the Philosophy of World History (1975) and Immanuel Kant’s Anthropology from a Pragmatic Point of View (1798, 2006) essentially informed these assumptions. Hegel speculates about the continent of Africa and asserts that Africa proper “is enveloped in the dark mantle of Night”. To Hegel, “The peculiarly African character is difficult to comprehend, for the very reason that in reference to it, we must give up the principle which naturally accompanies all our ideas–the category of Universality” (1975, 174). Hegel here states that Africans lack the category of Universality and, also, situates the African at the level of irrationality. “The Negro,” Hegel writes, “exhibits the natural man in his completely wild and untamed state” (174). The African was, to Hegel, a complete moron who had no idea of decency and could not distinguish his right from left. Similarly, the racist proclivities of Kant lie in his denial of any intellectual endowments and rational abilities to non-white races.

In a further attempt to rationalize colonialism, Lucien Lévy-Bruhl (1985, 63) standardized the colonial discourse when he claimed rationality as a Western signature, thus assigning what he terms mystic or pre-logical thinking to non-Western peoples. These denigrating words refer in particular to the African. Such arguments were used to justify the colonialists’ actions and their reasons for invading and conquering the territories of the perceived Dark Continent. With this invasion, the entirety of the lives of the indigenous majority came to depend solely upon the powerful invaders. The fundamental decisions affecting the lives of the indigenes were made by the colonial masters. The colonialists gradually came to dominate the socio-economic and political spheres of the state and, finally, the minds of the people. The conquered were made to believe that they were inferior and that, as such, only the ways of the colonialists were worthy of being imbibed.

In his article, “Modern Western Philosophy and African Colonialism”, E. Chukwudi Eze queries the rationality that underlies the thoughts and assumptions that emanate from the European Enlightenment philosophers who promoted the ideals of individual freedom and the dignity of the human person on the one hand, and who, on the other hand, were associated with the thoughts and promotion of slavery and colonialism (Eze 1998, 217). For European Enlightenment philosophers, Africans were not in the same logical set as normal humans. Therefore, their advocacy for the ideals of humanity and democracy did not apply to Africans. This justified their arguments for the promotion of “imperial and colonial subjugation of non-European peoples” (Eze 1998, 218). This suggests that there is a distinction for the enlightenment philosophers between, in Cornel West’s words, “sterling rhetoric and lived reality” or, in Abiola Irele’s, between the “word and deed” (Eze 1998).

Refutations have been made against these assumptions, which suggest that Africa, in the words of Walter Rodney (1972), was developing at its own pace before the advent of colonialism. However, due to the debilitating effects of colonial rule, African scholars and political thinkers were faced with the serious challenges of socio-political and cultural reconstruction of Africa. The colonialists had imposed European beliefs and values on Africa. Thus, European languages, belief systems, and social, economic, and political systems replaced pre-colonial African ones. As a reaction to the effects of colonialism, there was the need to find an alternative ideology for decolonization. The reflective attitude and the thought process in the search for an ideology for decolonization resulted in the abstraction of different philosophical ideas and the development of theories in political philosophy. Consequently, what is known as African social and political philosophy started as a reaction to colonialism. This explains why colonialism is an important theme in African philosophy.

b. Imperialism

Imperialism dates further back in history and can be traced back to the Roman Empire. Imperialism can be described as an orientation which holds that a country can gain political or economic power over another through imposed sovereignty or more indirect mechanisms of control. Imperialism does not focus only on political dominance, but also on conquest and expansion. It is particularly focused on the acquisition of power by a state over another group of people. It is also described as a state policy, practice or advocacy of extending power and dominion, especially by direct territorial acquisition or by gaining political and economic control of other areas. As Michael Parenti describes it, Western European imperialism first took place against other Europeans, such as when Ireland became the first colony of what later became known as the British Empire (Parenti 2011, 11). However, those who have borne the brunt of the European, North American, and Japanese imperial powers have been states in Africa, Asia, and Latin America (Parenti 2011, 13).

An understanding of the basic modus operandi of imperialism suggests that foreign governments can govern a territory without significant settlement, quite unlike colonialism in which settlement is a key feature. Imperialism is merely an exercise of power over the conquered regions without immigration of any form.

In his book Decolonising the Mind, Ngugi wa Thiong’o explicates the nature of imperialism, particularly as it affects the culture and language of the African. Wa Thiong’o asserts that imperialism has absolute effects on the economic, political, military, cultural and psychological wellbeing of the people affected. He describes the effect of imperialism on Africa from two main perspectives. The first is the socio-economic and political effect of the imperialist tradition of “consolidated finance capital” (Wa Thiong’o 1986, 2). He maintains that the subjugation of Africa’s economic life is carried out through multinational corporations, and particularly through the way most African countries have been lured into accepting loans from the International Monetary Fund (IMF). Wa Thiong’o’s concern about the IMF is that the economic life of every worker and peasant in the countries that have taken the loans is mortgaged forever. This is because, as such countries continue to service the IMF loans, the organization is entrusted with the power to dictate the direction of the economic policies of those states. The same can be said of the imperialist domination of politics, where it is ensured that African states rely on Western models of politics, policing, judicial practice, and education.

Wa Thiong’o’s second perspective on the consequences of imperialism relates to what he calls the effect of the “cultural bomb” (Wa Thiong’o 1986, 3). According to him, imperialism uses a cultural bomb to isolate people and estrange them from their identity. This is done by annihilating a people’s belief in their heritage, their environment, their names and, above all, their language. Wa Thiong’o asserts that language remains the most essential vehicle through which the human soul can be held captive. The imperialists are fully aware of this and deliberately use “language as a means of spiritual subjugation” (Wa Thiong’o 1986, 9). So in Wa Thiong’o’s submission, this cultural and psychological form of imperialism remains the biggest weapon, one that undermines the value of the human person and erodes the dignity of the people’s identity. This form of imperialism has the tendency to make people embrace the imperialists’ alien culture, language, and way of life, and be far removed from their indigenous heritage and identity.

The study of the dignity of the African in every form is central to African philosophy. The need to embrace the value of Africa’s cultural heritage that is devoid of any form of imperialist subjugation is essential for the promulgation of African philosophy. It is in the light of this that, as a theme of African philosophy, the study of imperialism remains crucial to understanding its methods and its effects on the socio-economic, political, and cultural life of the African.

c. Decolonization

As a theme in African philosophy, the term decolonization connotes an ideology for true emancipation in post-colonial Africa. We talk of emancipation from cultural, economic, political, psychological forms of colonialism. African philosophers have consistently been concerned with the issue of liberation of the mind, spirit and body, as well as the emancipation of the African from all elements and influences of colonialism. It is as a result of these concerns that the study of decolonization is essential to the project of African philosophy.

The term decolonization can be described as the abolishment of colonialism and the enthronement of a people/nation’s powers over its own territories. It is typically referred to not merely as independence from colonialism but a total liberation from the influences and powers of imperial neocolonialism. It is a situation in which a new state acts under its own volition, free from the direct control of foreign actors. Decolonization refers to the ability or willingness of the previously colonized nation to become free from imperial rule in order to control its own domestic and international affairs. It is also the mechanism and/or ability of a people to be liberated from cultural and psychological domination of foreign influences.

In Africa, for instance, many theoretical assumptions informed the need to decolonize. In principle, we can trace the spark for an ideology of decolonization to the rise of communism in the former Soviet Union. The teachings of Marx, Frederick Engels (1820-1895), and Vladimir Lenin (1870-1924) against the exploitation of the masses remain the backdrop of this decolonization. The influence of the teachings of Frantz Fanon on decolonization also cannot be gainsaid. These teachings and influences seem to have convinced the early African political thinkers of the need to radically decolonize and end the influences of neocolonialism. Some post-independence African thinkers, such as Leopold Sedar Senghor of Senegal, Sekou Toure of Guinea, Julius Nyerere of Tanzania, Obafemi Awolowo of Nigeria, and Kwame Nkrumah of Ghana, among others, were faced with the serious challenges of socio-economic, political, and cultural reconstruction of the postcolonial African states. They were faced with the task of liberating Africa from the imposition of neocolonial European values, languages, belief systems, and social, economic, and political systems, which seemed to have replaced the pre-colonial African ones. Consequently, the principle of individualism, believed to have been a European signature, seemed to have replaced the African cultural context of brotherhood, which suggests a welfare system of communalism, collectivism, and egalitarianism; hence the need to search for an ideology for decolonization (Afisi 2009b, 33).

As noted above, among the writers on decolonization in Africa, Fanon was one of the prominent figures. In fact, his writings are notably extensive on the process and methods of decolonization and of true liberation. Central to Fanon is the idea that only through decolonization can there be true liberation. Fanon’s strong advocacy of decolonization results from his commitment to the preservation of individual human dignity.

Fanon’s works, Black Skin, White Masks (1952) and The Wretched of the Earth (1962), contain radical critiques of French colonization. He views colonialism as a forcible control of another state, with the word ‘force’ as key. Fanon accuses the colonizers of using force to exploit raw materials and labour from colonized countries. To justify their actions, however, the colonial masters proclaimed that the natives were savages and that European culture was the most ideal for adoption.

Fanon claims that the colonial situation is by definition a violent one. He condemns the violence inflicted on the colonized by the colonizer. However, he distinguishes three categories of violence: physical, structural, and psychological. Physical violence implies the somatic injury inflicted on human beings, the most radical manifestation of which is the killing of an individual. Structural violence reflects the fact of exploitation and its necessary institutional form in the colonial situation. Psychological violence is the injury or harm done to the human psyche (Fanon 1952, 39). This third category includes indoctrination of various kinds and threats which tend to diminish the victims’ mental potentialities. As a way out of all of these, Fanon advocates a reprisal use of violence against the settlers to enable the colonized to regain their self-respect. To him, since the colonial situation is itself a violent one, the colonized masses can only achieve liberation through a replicated form of violence. His submission is that for liberation to be total, accurate, and objectively achieved, it has to be accompanied by violence (Fanon 1962, 102).

In Fanon, decolonization requires violence on the part of the colonized. Violence plays a critical role in the decolonization struggle. The colonized must see violence in decolonization as that which leads not to retrogression, but liberation. Fanon sees decolonization as implementation of the concept of ‘the last shall be the first’. It is a psycho-social process, a historical process that changes the order of the world. Decolonization involves a struggle for the mental elevation of the colonized African people (Fanon 1962, 116). So, from all of this, Fanon contends that Africa is in need of true liberation which can only result from decolonization. In his submission, resisting a colonial power using only politics cannot be effective; violence is the best way to attain decolonization.

Arguing from a similar position, Kwasi Wiredu in his book, Conceptual Decolonization in African Philosophy contends that colonialism has not only affected Africa’s political society but also its mental reasoning. He advocates the need for Africans to go through a process of mental decolonization. Wiredu recommends the process of decolonization from two conceptual analyses: first, ‘avoiding through a critical conceptual self-awareness the unexamined assimilation in our thought… of the conceptual frameworks embedded in foreign philosophical tradition’, and second ‘exploiting as much as judicious the resources of our own indigenous conceptual schemes in our philosophical meditations…’ (Wiredu 1998, 117). For Wiredu, the most important function of post-colonial philosophy is what he refers to as “conceptual decolonization”. This simply implies “divesting African philosophical thinking of all undue influences emanating from our colonial past” (Wiredu 1998). Wiredu sees decolonization as a necessary tool for developing an authentic African philosophy that is devoid of any neo-positivist influences.

Wa Thiong’o believes that decolonization can only take place in Africa when the “cultural bomb” is defused. This process begins when “writing” is done in the various indigenous African languages. Such writings, which would enhance the renaissance of African cultures, must also carry with them the spirit and content of anti-imperialist struggles. This would ultimately help in liberating the mind of the people from foreign control (Wa Thiong’o 1986, 29). For total decolonization to occur, Wa Thiong’o enjoins writers in African languages to form a revolutionary vanguard in the struggle to decolonize the mind of Africans from imperialism.

4. Neocolonialism: The Last Stage of Imperialism

After the independence of most African nations, Africans soon began to notice that their countries were being subjected to a new form of colonialism, waged by their former colonialists and some other developed nations. It is pertinent to mention that even though neocolonialism is a subtle propagation of the socio-economic and sometimes political activities of former colonial overlords in their ex-colonies, documented evidence has shown that a country that was never colonized can also become a neocolonial state. Countries such as Liberia and Ethiopia that never experienced colonialism in the classical sense have become neocolonial states by dint of their reliance on international finance capital, courtesy of their fragile economic structures (Attah 2013, 71). It is on this basis that neocolonialism can be said to be a new form of colonial exploitation and control of the newly independent states of Africa, and of other African states with fragile economies.

Nkrumah views neocolonialism as a new form of subjugation of the economic, social, cultural, and political life of the African. His postulation is that European imperialism in Africa has passed through several stages, from slavery to colonization and subsequently to neocolonialism, the last stage of the imperialist process of subjugation and exploitation. Nkrumah’s (1965) classic, Neo-colonialism: The Last Stage of Imperialism, is an analysis of neocolonialism in relation to imperialism. The book emphasizes the need to recognize that colonialism had yet to be abolished in Africa. Rather, it had evolved into what he calls neocolonialism. Nkrumah reveals the methods that the West used in its shift in tactics from colonialism to neocolonialism. In his words: “without a qualm it dispenses with its flags, and claims that it is ‘giving’ independence to its former subjects, to be followed by ‘aid’ for their development. Under cover of such phrases however, it devises innumerable ways to accomplish objectives formerly achieved by naked colonialism” (Nkrumah 1965). This explains the condition under which a nation is continually enslaved by the fetters of neocolonialism: it is independent in theory and has the outward trappings of international sovereignty, yet it is actually directed politically and economically from the outside.

Nkrumah contends that neocolonialism is usually exercised through economic or monetary means. As part of the methods of control in a neocolonial state, the imperialist power and control over the state is gained through contributions to the cost of running the state, promotion of civil servants into positions that allow them to dictate and wield power, and through monetary control of foreign exchange by the imposition of a banking system that favors the imperial system.

Nkrumah further explains that neocolonialism results in the exploitation of different sectors of the nation, using different forms and methods: “[t]he result of neocolonialism is that foreign capital is used for the exploitation rather than for the development of the less developed parts of the world. Investment under neocolonialism increases rather than decreases the gap between the rich and the poor countries of the world” (Nkrumah 1965).

On the link between neocolonialism and imperialism, Nkrumah writes that neocolonialism is the worst and most heightened form of imperialism. For those who practice it, it ensures power without responsibility; for those who suffer it, it means unchecked exploitation. He explains that neocolonialist exploitation is implemented in the political, religious, ideological, economic, and cultural spheres of society. He further provides details of the infiltration and manipulation of organized labour by agencies of the West in African countries. He discusses how the mass media is used as an instrument of neocolonialism in the following statement: “[w]hile Hollywood takes care of fiction, the enormous monopoly press, together with the outflow of slick, clever, expensive magazines attends to what it chooses to call ‘news’” (Nkrumah 1965). Religion too, according to Nkrumah, is distorted and used to support the cause of neocolonialism.

Nkrumah projects, however, that as dangerous as neocolonialism is to the future of Africa, it will eventually, like colonialism, be defeated by the unity of all those who are being oppressed and exploited. He prescribes unity and awareness amongst all Africans.

Buttressing the above submission, Noah Echa Attah in his paper, “The historical conjuncture of neo-colonialism and underdevelopment in Nigeria” (2013), traces the root of underdevelopment in Africa, particularly in Nigeria, to the effects of neocolonialism. In his assertion, African countries have never been truly independent since the end of colonialism because the idea of partnering with the ex-colonialists has continued to guide state economic policies. Foreign firms have continued to dominate the business sectors of the economy, such that relatively few, but large and integrated, foreign firms, otherwise called multinational corporations, have made themselves indispensable to the growth or otherwise of the economy. Local industries in Africa are extensions of metropolitan firms, such that the raw materials the industries need have a very high import content of over 90% from the capitalist economies (Attah 2013, 76). Thus, the continued dependence of industrial investments in Africa on capital-intensive technology is strictly aimed at further developing the metropolitan economies.

Attah explicates how Western neocolonialists have collaborated with local bourgeoisie in Africa to perpetuate the exploitation of the people and state economies in Africa. According to him, most of the local bourgeoisie collaborators are not committed to national interest and development, and their aim is to ensure the continued reproduction of foreign domination of the African economic space. The local bourgeoisie are bereft of ideas capable of engendering growth and development. The objective of foreign capital, therefore, is to continue to co-opt the weak and nascent local bourgeoisie into its operations. “The co-optation of the local bourgeoisie into the network of foreign capital condemned the former to the position of ‘comprador’” (Attah 2013, 77).

Drawing on the above exposition, Attah also asserts that neocolonialism is a new form of imperial rule characterized by the domination of foreign capital. His claim is that instead of real independence, what Africa has is pseudo-independence with the trappings of the illusion of freedom. To him, neocolonialism in Africa is made possible by the roles and actions of the local bourgeoisie in collusion with foreign capital. He is concerned that the different African economies have become willing tools in the hands of the West because of their fragility. In many cases, the African states have inadvertently authorized the dependency of African economies on foreign capital, which gives neocolonialism the legitimacy it needs. Neocolonialism, Attah submits, leads to underdevelopment where the local bourgeoisie and foreign capital are interested in the economy for personal accumulation rather than for the national development of the neocolonial state.

5. The Myth of Neocolonialism

Thomas Molnar (1965) in his paper, “Neocolonialism in Africa?”, asserts that African nations continue to depend on the Western industrial nations for economic aid, loans, investment, market, and other technical assistance because they require this dependency for their development. He acknowledges that the colonial regime in Africa left Africa in destitution, not only materially but also in terms of education and technical training. Molnar also affirms that nobody will deny that the colonialist period sanctioned abuses and exploitation on Africa (Molnar 1965, 177).

However, in spite of the end of colonialism in Africa, Molnar is concerned that African economies have not functioned properly without foreign aid and investment. He claims that the economic presence of the West is imperative for the future progress of Africa’s socio-economic and political stability. The assumption is that only fruitful economic arrangements with Western industrialized countries may guarantee Africa’s future (Molnar 1965, 182).

Further, Molnar asserts that decolonization in Africa took place in a hurried, haphazard way. Africans were unready and immature for economic and political independence at the time they achieved it. As a result of this situation, the West is under obligation in post-colonial Africa to keep up its aid, not as a tribute paid for the past colonial situation but as one half of a two-way process of cooperation (Molnar 1965, 183). For its development, Africa needs the West. Since the newly independent African countries will continue to be economically dependent on the West, neocolonialism is not a negative term. In fact, “‘neo-colonialism’ is the only way of getting Africa to the take-off stage” (Molnar 1965, 183).

In a reaction to Molnar’s glorification of neocolonialism in Africa, Tunde Obadina’s article, “The Myth of Neocolonialism” (2000), a critical analysis of the colonial situation in Africa and of the myth surrounding indigenous growth and development in post-colonial Africa, takes issue with the ‘apologist’ claim about the positive influences of colonialism. According to Obadina, these apologists contend that despite the exploitation of resources perpetrated by the colonialists, their overall influence on African society in terms of reducing the economic gap between Africa and the West is positive. The argument here is that colonialism improved the living conditions of Africans, providing necessary tools for civilization such as formal education, modern medicine, and enlightenment, and shaping political organization. However, Obadina notes that in spite of these apologetic claims, Africa is still today considered a continent in economic and political crisis. The apologists, for their part, are usually quick to point out that the failure of Africa’s development is due to the conscious or unconscious refusal to adopt the legacies of the colonialists. These apologists even go to the extent of saying that Africa is in such a state because its nations gained independence a lot sooner than was necessary for them.

Citing D. K. Fieldhouse as one of the apologists, Obadina mentions that Fieldhouse is of the view that it would be difficult to imagine what would have become of African countries had colonial rule not come. Fieldhouse had contended that pre-colonial Africa by itself lacked the capacity and the social and economic organization to transform itself into modern states that would result in the establishment of advanced economies. According to Fieldhouse, African states today would be a direct replica of what they were in the primitive days if they had not encountered European culture and civilization.

Fieldhouse’s Eurocentric orientations result from the contention that Africans are by nature irrational, incompetent, and unable to produce anything useful. These orientations have gone to great extents to undermine Africa’s indigenous culture, tradition, religion and even philosophy. Even Gene Blocker opines that ‘the more philosophical African philosophy becomes, the less African it is in content and the more African it becomes, the less philosophical it is in content’ (Blocker 1987, 2).

In contrast to these apologetic arguments, Obadina points out that many nationalist African scholars have criticized the apologists’ positions on the basis that Africa would definitely have developed in a unique way, different from the European system. Obadina asserts that colonialism bore nothing but negative effects on Africa. According to him, Africans lived more ‘enriched’ lives before colonialism, and would have continued in that manner had they not been colonized. Colonial rule has left the continent in a more dilapidated state; it has compromised the nations’ capabilities to develop. Obadina cites Walter Rodney’s assessments of how colonialism only succeeded in making Africa underdeveloped and, worse still, dependent on Western nations. The colonial juxtaposing of people from different cultures, which ignored the already established borders and redrew Africa’s map, created and still results today in various degrees of ethnic conflict. Furthermore, colonialism undermined pre-colonial political systems which used to be effective for Africans and imposed foreign political concepts, including multi-party democracy. This, according to Rodney and many other critics, has left Africa in serious social and political crises. Obadina takes Nigeria as an example, which, because of its great population and natural resources, had qualities that seem to be leading eventually to its destruction. According to Obadina, the party politics introduced by the colonialists was the major cause of ethnic conflicts in Africa.

Obadina acknowledges the difficulty in providing an objective analysis of the impact of colonialism in Africa. Despite this, he avers that colonialism in Africa may have some positives. However, what cannot be denied is the fact that it was something imposed, which had no regard for the existing structures already in place. Furthermore, colonial rule was not an idea geared towards the development of the colonized states in any way, but something established solely for the benefit of the colonial states.

Furthermore, Obadina forthrightly asserts that African nations are to be blamed for their continued reliance on their former colonial lords for economic and political direction. This neocolonial situation poses serious danger to the evolution of indigenous-based economic growth, and at the same time has adverse effects on political stability. It has, according to him, hampered the growth of movements geared towards change. He believes that African nations, after independence, should have shut the door against imports and exports from the West and sought to develop themselves using their own resources, not dependent on foreign corporations. This would have, he says, improved Africa’s infrastructural levels of economic and political growth. In Obadina’s view, had African nations pursued this independent economic agenda, they would have survived, as Cuba did. Obadina opines that the traditional agenda that came with colonialism was the false ‘idea of progress’. With this idea as the fundamental gospel, Africans were made to believe that their living conditions could be positively altered. This, among other things, smoothed the colonialists’ way into the continent, since material improvement has, after all, always been people’s desire. It created in Africans the desire for Western civilization; but the West failed to hand over to Africans the tools for realizing such civilization.

Africa in the early 21st century is a neocolonial continent, according to Obadina. Africa continues to face the problem of dealing with the overbearing presence of Western civilization. In the quest for modernization, the focus is mostly on the Western world and there is little or no focus on the urgent need for internal changes in this same quest. Although colonial rule in Africa ended only near the end of the twentieth century, Obadina submits that African nations at the beginning of the 21st century have the responsibility to develop themselves by making changes in their internal structures using indigenous knowledge, while at the same time learning all they can from the influence of the Western world and putting these lessons to use for their own benefit.

In all of these, African philosophers have continued to interrogate the idea of neocolonialism and its effects on Africa’s development. The outcomes of such interrogations continue to form content that needs to be taught and studied within the project of African philosophy.

6. Neocolonialism Today in Africa: The Era of Globalization

The heavy dependence on foreign aid and the apparent activities of the multinational corporations in Africa reveal that Africa at the beginning of the 21st century is still in a neocolonial stage of development. The activities of the corporations in Africa, particularly those from Europe and America, reveal nothing short of economic exploitation and cultural domination. Early 21st century Africa is witnessing neocolonialism on different fronts, from the influences of transnational corporations from Europe and America to a new imperial China, to which many African governments now seem beholden. The multinational corporations, and more recently Chinese interests in Africa through Chinese companies, appear to exist more for the benefit of the home economies of the neocolonialists than to infuse local African economies with cash to stimulate growth and increase local capacity.

In the Africa of the early 21st century, some scholars, such as Ali Mazrui, have opined that the new form of neocolonialism is globalization. Much in the way that neocolonialism has been variously described, Mazrui describes how globalization “allows itself to be a handmaiden to ruthless capitalism, increases the danger of warfare by remote control, deepens the divide between the haves and have-nots, and accelerates damage to our environment” (Mazrui 2002, 59). This negative perspective on globalization, particularly as it relates to extreme capitalism, essentially corroborates the assertion by Michael Maduagwu that “globalization is only the latest stage of European economic and cultural domination of the rest of the world which started with colonialism, went through imperialism and has now arrived at the globalization stage” (Maduagwu 1999, 65).

Looking at globalization in this way, Oseni Afisi also consigns it to the corridor of neocolonialism and cultural subjugation. Globalization becomes the imposition of a particular culture and value system upon other nations with the direct intent of exploitation. What this indicates is that globalization is indeed the engine room for the propagation of neocolonialism and new imperialism on African soil. While colonialism has ended, the reality on the ground in Africa in the years immediately after it is that political independence in many African states has not culminated in the much desired economic and cultural freedom (Afisi 2011, 5).

Afisi further opines that the negative impact of globalization in Africa rests primarily on the erosion of Africa’s cultural heritage. Upon this heritage hinges the political, economic, social, and educational subjugation of the continent. The forcible integration of Africa into globalization through slavery and colonialism has led to the problem of personal identity and cultural dilemma for the African. Africa has had to be dependent upon Europe and America and, more recently, upon China for its development and, one might add, the development of its identity and culture.

By contrast, Olufemi Taiwo, in Africa Must Be Modern: A Manifesto, takes a radical position that bears on Africa’s consideration of the inherent benefits of globalization. Taiwo uncompromisingly defends globalization and suggests that its benefits must be harnessed by Africans. Taiwo berates the level of hostility that Africa has shown towards modernity, stating the regrettable impact of such hostility on the economic, social, and political development of the continent (Taiwo 2014). To Taiwo, Africa must address the challenges of modernity and globalization by embracing them instead of being hostile to them. As Taiwo further posits, Africa needs to fully engage with and derive benefits from globalization and its attendant capitalist democracy (Taiwo 2014).

In a similar vein, D. A. Masolo, in African Philosophy in Search of Identity, remarks that the needs and experiences of Africans today are conditioned by their peculiar cultural circumstances. The nature of these cultural circumstances is that the African of the post-colonial period has embraced modernity, science, and technology as part of his/her culture. These Africans understand the world around them by being open minded and ridding themselves of any traction that may mold their thinking. This is the nature of African identity after colonialism, an identity which will aid in the intellectual construction of a modern African philosophy (Masolo 1994, 251).

7. Conclusion

As a theme of African philosophy, the term neocolonialism came into widespread use, particularly in reference to Africa, as soon as the process of decolonization began on the continent. Its use spread when Africans realized that even after independence their countries were still being subjected to a new form of colonialism. The challenges that neocolonialism poses to Africa seem to be related to the socio-economic, cultural, and political development of the people and states of the continent. These challenges have, however, been credited with both positive and negative impacts on the continent.

On the whole, this article is an exposition of the theme of neocolonialism within the project of African philosophy. The introduction is a cursory look at the term “neocolonialism” with a view to clarifying the basic concept. The history of neocolonialism is a historical analysis of the beginning of neocolonialism in Africa. This analysis reveals how the idea of neocolonialism was nurtured before independence was granted to most African states. No doubt, the term neocolonialism is closely related to some other concepts. This explains why the terms colonialism, imperialism, decolonization, and globalization are essential to better understanding neocolonialism. The negatives and some positives of colonialism and its offshoot, neocolonialism, in Africa are also discussed in this article. The discussion reveals that neocolonial elements may continue to be an integral part of Africa’s socio-economic, cultural, and political existence. However, some of the social and political philosophical questions which may continue to preoccupy the minds of Africans include: Can neocolonialism be abolished from Africa? Can the positives of neocolonialism outweigh the negatives, or vice versa, in terms of the impacts on the African economy? Will Africa ever be truly decolonized?

8. References and Further Reading

  • Abraham, William. The Mind of Africa. Chicago: University of Chicago Press, 1962.
    • A discourse on neocolonialism and integrative form of culture in Africa.
  • Afisi, Oseni Taiwo. “Tracing Contemporary Africa’s Conflict Situation to Colonialism: A Breakdown of Communication among Natives”. Philosophical Papers and Review Vol.1 (4): 59-66 (2009a).
    • A historico-philosophical analysis of colonialism as the root of tribal conflicts in Africa.
  • Afisi, Oseni Taiwo. “Human Nature in Marxism-Leninism and African Socialism”, Thought and Practice: A Journal of the Philosophical Association of Kenya, New Series. Vol. 1(2): 25-40 (2009b).
    • A comparative analysis of the nature of man in the philosophical ideologies of Marxism and African philosophy.
  • Afisi, Oseni Taiwo. “Globalization and Value System”. LUMINA. Vol. 22 (2): 1-12 (2011).
    • A discourse on the nature of globalization: its negatives and benefits on Africa’s value system.
  • Attah, Noah Echa. “The historical conjuncture of neo-colonialism and underdevelopment in Nigeria”. Journal of African Studies and Development. Vol.5 (5): 70-79 (2013).
    • A historical analysis of the effect of colonialism in Africa.
  • Blocker, Gene. “African Philosophy”. African Philosophical Inquiry. Vol.1 (2): 1-12 (1987).
    • A critical discussion on the idea and content of African Philosophy.
  • Bruhl, Lucien Levy. How Natives Think. Princeton, N.J.: Princeton University Press, 1985.
    • A discussion on the distinction between the mindset of the primitive and the mindset of the civilized human beings.
  • Eze, E. Chukwudi. “Modern Western Philosophy and African Colonialism”. E. Chukwudi Eze (ed.) African Philosophy: An Anthology. Massachusetts: Blackwell Publishers Ltd, 1998.
    • A discourse on the contradictory nature of European Enlightenment period and its promotion of slavery and colonialism at the same time.
  • Fanon, Frantz. Black Skin, White Masks. London: MacGibbon, 1952.
    • An analysis of the psychology of the colonial situation.
  • Fanon, Frantz. The Wretched of the Earth. New York: Grove Press, 1962.
    • A critical analysis of colonialism, cultural and political decolonization.
  • Goldsborough, J. “Dateline Paris: Africa’s Policeman”. Foreign Policy 33. (1979).
    • An analysis of the French African economic policy.
  • Hegel, G.W.F. Lectures on the Philosophy of World History. Trans. H. B. Nisbet. Cambridge: Cambridge University Press, 1975.
    • Hegel’s philosophical analysis of world history.
  • Horvath J. Ronald. “A Definition of Colonialism”. Current Anthropology 13 (1): 45-57 (1972).
    • A general discussion on definition and classification of colonialism.
  • Kant, Immanuel. Anthropology from a Pragmatic Point of View. Robert B. Louden, ed. Introduction by Manfred Kuehn. Cambridge: Cambridge University Press, 2006.
    • A collection of Kant’s lectures on Anthropology and the nature of man.
  • Lenin, Vladimir. Imperialism: The Highest Stage of Capitalism. Moscow: Progress Publishers, 1916.
    • An exposition of the nature and the process of the imperialist’s financial capital in generating greater profits from their colonies.
  • Maduagwu, O. Michael. “Globalization and its challenges to National Cultures and Values: A Perspective from Sub-Saharan Africa” being a paper presented at the International Roundtable on the challenges of Globalization, University of Munich, 18-19 March. (1999).
    • A presentation of globalization as a tool of European economic and cultural domination of the rest of the world.
  • Martin, Guy. “The Historical, Economic and Political Bases of France’s African Policy”. The Journal of Modern African Studies 23 (2): 189-208 (1985).
    • An analysis of France’s continued influence and power on its former African colonies.
  • Marx, Karl. Capital: The Process of Capitalist Production. Trans. Fowkes. Knopf Doubleday. 1972.
    • A critical analysis of the economic law of capitalist mode of production.
  • Marx, Karl. Economy, Class and Social Revolution. London: Nelson Publishers, 1977.
    • Marx’s selected writings on capitalism and the process of social revolution.
  • Masolo, D.A. African Philosophy in Search of Identity. Bloomington: Indiana University Press, 1994.
    • A discussion on the debate of what constitutes the study of African philosophy.
  • Mazrui Ali. A. “Nkrumanizm and the Triple Heritage in the Shadow of Globalisation” being a paper presented at the Aggrey-Fraser-Guggisberg Memorial Lectures, University of Ghana, Legon, Accra, (2002).
    • A presentation of the effects of globalization on Africa.
  • Mbembe, Achille. On the Postcolony. Berkeley: University of California Press, 2001.
    • A discourse on the nature of neocolonialism and its negative impact in Africa.
  • Molnar, Thomas. “Neocolonialism in Africa?” Modern Age. Spring (1965).
    • An analysis of nature of neocolonial economy in Africa and its positive impacts.
  • Mudimbe V.Y. The Invention of Africa: Gnosis, Philosophy and the Order of Knowledge. Bloomington and Indianapolis: Indiana University Press, 1988.
    • A discourse on the interplay of Western colonialism in Africa, and its denial of the existence of “the other” in European consciousness.
  • Mudimbe V.Y. The Idea of Africa. Bloomington: Indiana University Press, 1994.
    • A discourse on the distinct cultural values that constitute the African reality.
  • Ngugi, wa Thiong’o. Decolonising the Mind: The Politics of Language in African Literature. London: James Currey, 1986.
    • A discourse on cultural imperialism and on written African languages as a vehicle for Africa’s decolonization.
  • Nkrumah, Kwame. Neo-Colonialism: The Highest Stage of Imperialism. London: Heinemann, 1965.
    • An analysis of the nature of neocolonial economy and its relationship with Imperialism.
  • Nwolize, OBC. “The Fate of Women, Children and the Aged in Contemporary Africa’s Conflict Theatres”, Paper delivered at the Public Annual lecture of the National Association of Political Science Students, University of Ibadan, (2001).
    • A presentation of the effects of colonialism and conflict situations in contemporary Africa.
  • Obadina, Tunde. “The myth of Neo-colonialism” in Africa Economic Analysis, 2000.
    • A critical analysis of colonialism and neocolonialism in Africa.
  • Parenti, Michael. The Face of Imperialism. New York: Paradigm Publishers, 2011.
    • An exposition of the role of multinational corporations in the imperialist conquests.
  • Sartre, Jean-Paul. Colonialism and Neocolonialism, translated by Steve Brewer, Azzedine Haddour, Terry McWilliams; Paris: Routledge, 1964.
    • A critical analysis of French colonial policies on Africa, especially in Algeria.
  • Serequeberhan, Tsenay. “Philosophy and Post-Colonial Africa” in E. Chukwudi Eze (ed) African Philosophy: An Anthology. Massachusetts: Blackwell Publishers Ltd, 1998.
    • A discussion on the nature of neocolonialism in Africa.
  • Taiwo, Olufemi. Africa Must Be Modern: A Manifesto. Indianapolis: Indiana University Press, 2014.
    • A discourse on globalization and modernity in Africa.
  • Young, Robert. Postcolonialism: An Historical Introduction. Oxford: Blackwell, 2001.
    • A discussion on the historical and theoretical origins of postcolonial theory.
  • Wiredu, Kwasi. Conceptual Decolonization in African Philosophy: 4 Essays. Ibadan: Hope Publications, 1998.
    • A discourse on decolonization and development of an authentic African philosophy.

 

Author Information

Oseni Taiwo Afisi
Email: oseni.afisi@lasu.edu.ng
Lagos State University
Nigeria

Artistic Medium

Artistic medium is an art critical concept that first arose in 18th century European discourse about art. Medium analysis has historically attempted to identify that out of which works of art and, more generally, art forms are created, in order to better articulate norms or standards by which works of art and art forms can be evaluated. Since the 19th century, medium analysis has emerged in two different forms of critical and theoretical discourse about art. Within traditional art forms, such as painting and sculpture, modernist artists and critics began to interrogate art forms and the history of their possibilities in order to discover the necessary conditions for instances of those forms. This modernist interest in medium aimed to strip away unnecessary traditional artistic conventions in order to identify that which is essential to the form. Within newly emergent forms of popular art, such as movies, comics, and video games, artists and critics have attempted to articulate the ways the norms of these forms of popular art arise from new material and technological modes of creating and interacting with reproducible images.

The possibilities for an art form, whether traditional or newly emergent, can only be discovered by artists in acts of artistic creation. For this reason, the relation between art forms and their media develops and changes as the art forms continue to be discovered and reimagined by artists.

Table of Contents

  1. Introduction
  2. The Challenge of Medium Skepticism
    1. Carroll’s Medium Skepticism
    2. The Need for the Concept of Artistic Medium
  3. Theorizing About Art Forms Before the Emergence of the Concept of Artistic Medium
    1. Aristotle
    2. Aristotle and Horace as Models for Theorizing Art
    3. Music
  4. Gotthold Lessing and the Problem of Art in the 18th Century
    1. Art in the 18th Century
    2. Lessing on Painting and Poetry
    3. Herder and Hegel
  5. The Invention of Photography and the Discovery of Its Artistic Possibilities
    1. The Etymology of the Term “Artistic Medium”
    2. The Challenge of Photography
    3. Accounting for Photography’s Artistic Possibilities
  6. Modernism as the Discovery of Medium
    1. The Emergence of Modernism
    2. Modernism and 20th Century Music
    3. Fried on the Value of Modernism
    4. Postmodernism
  7. New Forms of Popular Art in the 20th Century
    1. Movies
    2. Comics
    3. Video Games
  8. Conclusion
  9. References and Further Reading

1. Introduction

Artistic medium is a term that is used by artists and art critics to refer to that out of which a work of art or, more generally, a particular art form, is made. There are, generally speaking, two related ways of using artistic medium in critical or artistic discourse. On the one hand, we often talk about an artistic medium by reference to the material out of which a work of art is made. Works of art in museums or galleries will often have the medium listed along with the title and the artist’s name on the display card. A painting might have “oil on canvas” or “watercolor” listed along with the artist’s name and the work’s title; a sculpture might have “marble,” “steel,” or “papier-mâché” listed in the same way. On the other hand, we also talk about medium to refer to the way a work of art organizes its audience’s experience in space and time. An actor might talk about the differences in performing on television and on film as performing in two different artistic media. Or a critic might describe television as a “writer’s medium” and movies as a “director’s medium.” Sometimes there may be no interesting differences regarding the material out of which the work was made; for this way of using medium, the crucial differences have to do with the spatiotemporal organization of the audience’s experience of the work of art.

Much of the critical and theoretical interest in the concept of artistic medium stems from a belief that analyzing the material conditions that underlie a particular art form allows us to articulate its norms and standards. Often critics and theorists who make use of the concept of artistic medium do so in order to connect an analysis of an art form’s material basis and conditions with some claim about what artistic norms or standards are proper to the art form. Because the connection between a description of a medium, an art form’s material basis, and the artistic experiences appropriate to that medium is a matter of some controversy, clarification of the philosophical insights and confusions associated with the concept of artistic medium must start not by arriving at its comprehensive definition, but rather by noting the characteristic forms of reasoning in which the concept is used.

There have been two relatively distinct forms of discourse involving artistic medium: a modernist discourse, and one associated with newly emergent popular art forms such as movies and comics. The uses of artistic medium in these discursive traditions have shared important similarities, especially a reliance on the concept to identify what is distinctive about a particular art form and an interest in grounding the norms governing a particular art form in the form’s material basis. But there are important differences as well. Modernist uses of the concept appeal to artistic medium as a way of justifying avant-garde approaches to traditional art forms by making clear how contemporary experimental instances of a form are genuine instances of that form because they inherit the tradition in question by purifying it of all that is inessential and accidental. Proponents of newly emergent popular art forms, on the other hand, are interested in articulating what is unique about the new forms in order to locate their possibilities in distinction from traditional or older forms and to demonstrate how their best instances are works of art.

In recent years, some analytic philosophers of art have suggested that the concept of artistic medium is necessarily a confused one and should be abandoned in favor of other art-critical concepts such as style or genre. For example, Noël Carroll, in his theorization of film in the 1980s and 90s, suggested abandoning medium as a critically inert and confused category. More recently, Carroll has found critical uses for the concept of artistic medium, especially in the analysis of exemplary instances of avant-garde film. Though Carroll does now recognize legitimate applications of the concept of artistic medium in film criticism and theory, it is nonetheless worthwhile to take seriously his initial radical skeptical challenge to critical and theoretical uses of the concept of artistic medium. Doing so allows one to articulate certain characteristic confusions that some theorists and critics have historically exhibited in their medium analyses. But, equally, it allows for the opportunity to clarify what, historically, has characterized the richest and most insightful critical and theoretical uses of artistic medium. As we shall see, these kinds of confusions are apt to arise when the theorist or critic does not remember that artistic medium is an art critical concept. As an art critical concept, what a medium for an art form is can only be known through artists discovering its possibilities in the creation of works within the form.

In general, confusions arise in using artistic medium when theorists and critics do not treat the concept as a critical one, but instead picture a medium as something that could be identified prior to and independently of any particular artistic uses to which it is put. In Art as Experience (1934), John Dewey attempts to combat this possibility for confusion by distinguishing between an artistic medium and raw material. When we identify some collection of matter prior to and independent of any particular artistic context, then we have identified some raw material, which may, it is true, be put to use for various artistic ends. But we cannot specify what artistic possibilities are available to artists by identifying and analyzing that material. Rather, when a given vehicle is taken up and explored within a particular artistic problematic or tradition, artists discover it as an artistic medium. It is thus through the work of artists that the artistic possibilities of an artistic medium can be discovered, and not by analyzing the material in isolation. In this sense artistic medium essentially is a critical concept. What is possible within a particular medium is discovered by artists as they attempt to explore a particular artistic problematic or inherit a particular artistic tradition. For this reason, what the medium of an art form is, as Theodor Adorno insists in Philosophy of New Music (1948), is a historical question. There is no fixed, ahistorical answer to the question, “What are the material conditions for painting, or music, or any particular art form?”

In order to clarify the nature of the concept artistic medium, this article takes two different, although closely related, lines of approach. This article will first clarify the roles artistic medium can rightfully play within critical and theoretical discourses by responding to the challenge of medium skepticism, which takes the concept to be necessarily confused. Then, it will outline the history of artistic medium’s emergence by describing the forms of critical reasoning in which the concept has been characteristically used. In so doing, the article will articulate why the concept has been so important for the development of new forms of popular art and for avant-garde and modernist experimentation, and why the concept has been vulnerable to characteristic confusions.

The first section of this article will engage with the challenge of medium skepticism. Medium skepticism, a position recently prominent in the philosophy of art, holds that artistic medium gives rise to a set of characteristic confusions because the concept is both essentializing and one that grounds its reasoning in a priori reflection upon the nature of the material basis of an art form. As we shall see, those two theoretical temptations are not inherent in the concept but are dangers only given a certain picture of how we determine what the medium is.

Then, employing Adorno’s thought that our understanding of what a medium is must be located in the history of the development of its art form, the article describes the emergence of the concept of artistic medium and the history of its critical and theoretical uses in the development of modern arts. First, there is a brief account of how philosophers and critics in the ancient world and the European tradition theorized artistic possibilities relative to a given art form prior to the emergence of artistic medium as a critical and theoretical category: namely, by identifying an art form and its norms and standards by specifying its proper experience. Then, the two sites of emergence for the concept of artistic medium are described: first, in the 18th century, in the critical work of Gotthold Lessing and, most importantly, his reflections on the differences between painting and poetry; second, most decisively, in the 19th century, in response to the invention of photography, its potential as a new art form, and its relation to painting. This complex historical field within which the concept of artistic medium emerged allows us to locate the centrality of the concept in 20th-century artistic discourses and also the philosophical confusions associated with it.

2. The Challenge of Medium Skepticism

a. Carroll’s Medium Skepticism

In the late 1980s and early 1990s, Noël Carroll issued what we may call the challenge of medium skepticism; he argued that medium analysis of film is necessarily confused and that film’s medium can be identified, but that it has no artistically normative implications. Carroll’s interest in the concept of artistic medium originated from within film theory, but his claims about medium analysis being an essentializing discourse, and his recommendations that theorists and critics abandon the concept of artistic medium in favor of other art theoretic concepts like genre and style, apply to the use of medium as an art theoretic concept in general. More recently, Carroll has moved away from his medium skepticism and has acknowledged uses for artistic medium, especially in describing and evaluating avant-garde or structural film. Nonetheless, evaluating the challenge of medium skepticism is valuable in order to clarify how critics and theorists use artistic medium in characteristically confused ways and how the concept can be used in ways that avoid those confusions. In responding to the challenge of medium skepticism, we articulate the value of artistic medium.

At the root of Carroll’s concern is his contention that medium analysis ultimately depends, either implicitly or explicitly, on an illicit judgment from the nature of material conditions underlying the work of art to a set of norms and prescriptions meant to govern an art form. Carroll contends that medium analysis must slip into a theory of medium specificity in which each art form has a single medium and each medium has a distinctive feature that does and should characterize artistic creation within the form: the medium’s distinctive feature or power provides the aim the art form and its practitioners should pursue. Carroll’s rejection of medium specificity, which he sees as the inescapable heart of medium analysis, consists of two related objections: first, that medium analysis necessarily essentializes by identifying an art form with a single medium and a medium with a unique characteristic; and second, that medium analysis is structured around a priori reflection on the nature of the medium, yielding normative prescriptions about which artistic experiences are appropriate that in fact merely reflect one’s theoretical biases or idiosyncratic tastes.

b. The Need for the Concept of Artistic Medium

It is true that much medium analysis essentializes, but critical or theoretical discourse involving medium is not necessarily essentializing. It may seem that the modernist tradition, for example, supports Carroll’s contention that use of artistic medium is necessarily essentializing, since exploration of an art form by means of its media often meant stripping away all that could be removed in order to discover what was essential to the art form. However, the modernist question of what is essential to the art form is itself a particular, historically-located question about artistic medium, and whatever answers modernist artists generated need not be taken as definitive of the timeless and unchanging essence of some particular art form. In fact, rejection of essentializing claims is revealed as necessary for sound medium analysis, since there is no independent grasp on what counts as an artistic medium outside of the context provided by a particular artistic problem or concern. This can be somewhat obscured for us because of the importance of modernism for our understanding of artistic medium as a concept; it is characteristic of modernist artists to take an art form itself as a question or problem to be explored. The modernist question of what, for example, constitutes the conditions of painting is part of an artistic project that takes as its starting point the history of painting and looks to inherit that tradition by stripping away all that is inessential to painting. But modernist artists and critics identify and explore shape or surface or color as they arise as problems or conditions for painting at a particular historical moment, not because of some timeless understanding they have of the nature of the media as such.

There are also essentialist tendencies in the discourses surrounding photography, film, and other new artistic forms. These essentialist claims often arise because the medium analysis starts from the problem of the new technological bases for these art forms. In this tradition of medium analysis, theorists and critics start with reflection on the nature of, say, photography as a new technological, productive process and draw aesthetic prescriptions or standards from the ontological structure of photographic experiences. Rudolf Arnheim, a film theorist prominent in the 1930s and after, offers a perspicuous example of a commitment to medium specificity in his theorization of film. Arnheim identifies the characteristic differences between a black and white photographic moving image and reality, and prescribes their accentuation as the basis for film art. Arnheim’s theoretical commitments to the purity of film as a medium led him to reject the development of color film and sound film as detracting from the artistic possibilities of silent black and white movies. As an example of medium analysis gone wrong, Arnheim’s restrictive prescriptions exemplify the reasons for Carroll’s rejection of medium specificity theories, for Arnheim’s commitment to the purity of silent film and rejection of the possibilities of sound movies reflects his own theoretical views about the unique characteristics of film itself, but voiced as an essentialist understanding of film. Arnheim’s critical blindness to the possibilities created with sound and color stems from his a priori commitment to an understanding of the nature of the photographic image.

But if Arnheim’s use of the concept artistic medium is subject to the confusions of medium specificity that Carroll warns of, other paradigmatic instances of medium analysis for emergent popular art forms do not fall prey to them. The critic and philosopher Walter Benjamin, in his “Little History of Photography” (1931), for example, articulates unique characteristics of photography that make possible artistic expression. However, Benjamin does not assume in advance of his critical investigation that he knows what art can be or everything that photography can do. Rather, Benjamin is committed to the view that photography’s invention changed what we could do, how we could see, and how we relate to our world and, at the same time, changed what can count as art and artistic experiences. Benjamin starts not with an essentializing, a priori analysis of the nature of photography in itself and for any possible use, but with a critically honed appreciation for the photographer Eugène Atget’s work and how that work discovers and explores certain characteristic possibilities for photography. Benjamin aims to identify the artistic possibilities within emergent artistic practices, what he identifies in his essay “The Work of Art in the Age of Its Technological Reproducibility” as art’s developmental tendencies. He does not attempt to offer an analysis of the medium independent of its emerging artistic uses and prescribe what those uses should be. Instead, Benjamin identifies unique characteristics of the new medium; that is, he describes new things that we can do with the new technology, so that he can articulate terms by which future artistic expressions can be understood. This is a critical judgment, to be evaluated in light of past and future instances of photography and photographic arts.

Carroll, then, worries about certain forms of theoretical confusion and critical blindness that can arise around commitments to medium specificity. Sometimes critics and theorists concerned with emergent popular art forms, like Arnheim, do succumb to the temptation towards medium specificity and prescribe some range of appropriate artistic experiences based on a supposed a priori, essentialist understanding of the nature of the technological basis for the art form. But the best critics and theorists engaged in medium analysis are not attempting to prescribe to artists which experiences are proper to the medium; rather, they are critically evaluating works of art in order to offer an analysis of why the art works on its audience in the ways that it does, and to articulate new possibilities that change what art can be.

Within the modernist tradition of medium analysis, most artists, critics, and theorists do not begin with an a priori, essentialist analysis of the medium in order to identify how it should be used in particular art forms. Instead, it is characteristic of the modernist tradition that the art form itself is taken up by artists as a problem or a question. Such artists, theorists, and critics do not draw out artistic prescriptions based on an understanding of the medium in isolation from any particular use. Rather, questions of shape and color are explored so as to find ways in which shape and color are conditions of possibility for painting. As we shall see, some postmodernist theorists and critics, such as Rosalind Krauss, share a version of Carroll’s worry that interest in exploring a medium demonstrates a commitment to some form of medium specificity: they object to the role of medium exploration in modernist arts, believing that modernist artists continually rediscover the same few automatisms or forms of repetition that they then explore as if they have, through the originality of their creation, taken on the art form itself.

It is common in critical discussions of art to use art form and medium more or less interchangeably and to talk about, for example, “the medium of painting.” It is worth distinguishing between an art form as a particular form of experience and a medium as the material conditions that underlie that form of experience and make it possible. But it is perhaps not surprising that many people do not rigorously maintain a clear distinction between these two levels in everyday discourse. More importantly, widespread talk about the medium of an art form need not indicate an implicit commitment to there being a specific material uniquely associated with each particular art form. Instead, we may think about talking about the medium of an art form as indicating a level of analysis. Furthermore, what is picked out by some claim about medium is a contextual question relative to the history of the art form and the particular artistic problematic the work in question explores. If talking about the medium of an art form can indicate a level of analysis rather than a commitment to the medium specificity thesis, then we are open to thinking about the medium as a shifting collection of automatisms or forms of productive repetition that can evolve through the history of the art form.

Though Carroll worries that confusions can arise in using the concept of artistic medium, he acknowledges in his later work that critics and theorists have found productive critical and theoretical uses of the concept. In general, medium analysis that is grounded in exploring how characteristic experiences constituting a work of art or an art form are structured and achieved does not fall prey to the dangers of medium specificity that worried Carroll. For medium analysis pursued in this manner is critical, articulating the automatisms that underlie ongoing artistic practices. It begins with an artistic problem or concern and identifies media by the role they play in the discovery and exploration of possibilities within that framework. It does not start with reflection on the nature of the medium itself in order to deduce what characteristic properties or features are appropriate to exploit for artistic ends. Theorists and critics are most likely to fall victim to medium specificity confusions when they picture the medium of an art form as some type of raw material that can be grasped independently of and prior to its uses within an art form. Instead, our grasp on what a medium is and can be only arises within the context established by an artistic problem or tradition, to be discovered by artists and articulated by critics.

Furthermore, Carroll warns against attempting to identify an art form with a single artistic medium. There is no reason to think that there is some single material condition that constitutes the possibilities within an entire art form. Instead, within art forms, there are different media, different forms of repetition or automatisms, which have significance in structuring characteristic instances of the form. This means that, for example, film itself, considered as a technological innovation, may not always provide the most productive level of unity for medium analysis. Rather, we can ask: What are the various media at work in popular movies, in documentaries, in cartoons, and in particular film arts? How do those automatisms allow for the discovery of the artistic possibilities within particular forms of film art? Approaching medium analysis in this way allows one to locate the concept of artistic medium as a critical tool for understanding modern art experiences in which art is taken to have its distinct forms of experience that have their own norms and standards without necessarily committing to any essentialist assumptions.

3. Theorizing About Art Forms Before the Emergence of the Concept of Artistic Medium

Prior to the 18th century, theorists of art articulated the norms and standards characteristic of a given art form without any reference to a medium or the material conditions that underlie the work of art. Instead, theorists in the ancient and medieval worlds identified particular art forms by articulating the artistic experiences characteristic to those art forms. In so doing, they then could develop an account of what was and was not appropriate within an art form, given the experience at which the art form necessarily aimed.

a. Aristotle

Aristotle’s account of tragedy in his Poetics is one of the earliest examples of this way of theorizing about an art form and is a paradigmatic instance of it. Aristotle develops his account of tragedy as an art form by identifying the characteristic experience—catharsis—instances of the art form provide for audiences. He is able to develop a number of claims about how a tragedy should be structured in order to achieve this characteristic experience. Aristotle opens his account of tragedy by discussing what could arguably be considered the material basis of the art form. He identifies rhythm, melody, and verse as the means by which the effects fundamental to tragedy are achieved. However, he immediately notes that other art forms also utilize those same means and argues that what is specific to tragedy as an art form cannot be identified by analysis of the means by which instances of the art form are achieved. Instead, he turns his attention to the features of tragedy that are specific to the art form by analyzing the experience that structures the form.

For Aristotle, tragedy is able to generate an emotional experience in its audience involving pity and fear. His name for this experience is catharsis. The precise nature of this Aristotelian catharsis has been and continues to be a matter of great debate: it has been taken to be an experience of emotional discharge, of emotional purification, and of moral education, to name just a few interpretations. Fortunately, we do not need to determine what exactly Aristotle took catharsis to be in order to note the general shape of his reasoning about tragedy as an art form.

That tragedy aims for catharsis, a specific emotional experience for those who watch the drama, defines the nature of tragedy as an art form for Aristotle and organizes his analysis of how tragedies are structured. Most importantly, for Aristotle, tragedy can best achieve an experience of catharsis in its audience because it, unlike history or epic poetry, has a dramatic form. The events of a tragedy unfold as the audience watches; the audience apprehends events as they unfold, constituting the unity of the plot as a single action. The audience of a history or epic poem, on the other hand, learns of many actions and events, and their interrelation and history, by means of a narrative rather than dramatic form. Because the audience members for a tragedy witness the events of the drama as they play out in front of them, they are able to understand how those events, on the one hand, have an inexorable logic and, on the other, arise from choices that the protagonist makes that could have been otherwise. It is tragedy’s dramatic form that allows the audience to experience the choices made by the protagonist as both highly contingent, in that other options or other paths are always available, and necessary, in that the protagonist is such that the central action of the tragedy is characteristic of him. This tension between the contingency and the inevitability of the events depicted in a tragedy arises because the audience witnesses the protagonist make a choice and come to live with, or be destroyed by, its consequences. There is, therefore, an intimate connection between the dramatic structure of tragedy as an art form and the emotional cathartic experience available only to tragedy’s audiences.

Tragedy’s characteristic experience of emotional catharsis for the audience accounts for a number of features of the art form. Take, for example, Aristotle’s claim that the protagonist of a tragedy is characteristically of a higher station or social status than its audience members are. On his view, audience members who recognize the protagonist as their social better are in position to achieve the appropriate emotional catharsis because the events that unfold are understood to be the result of the protagonist’s choices and not merely to be the result of intractable or unfortunate circumstances; the heights from which the protagonist falls clarify the consequences of his action. Whether or not we agree with Aristotle about the reasons he offers for the fact that Greek tragedies characteristically centered on royal figures or even demigods, we can see that he is identifying the art form’s characteristic experience in order to explain why the art form has the features it does. Other features of tragedy that Aristotle accounts for include the extent to which the central action is grounded within basic familial structures and tensions and the role of the protagonist’s ignorance in the completion of the action.

Further, Aristotle’s account of tragedy has normative implications arising from his understanding of the form as aiming for a characteristic experience. In articulating the characteristic experience for the audience at which the art form aims, Aristotle identifies and explains why certain features, such as the high social status of its protagonist and the protagonist’s ignorance, aid in producing catharsis. Thus, his account of tragedy and how it is structured outlines a set of norms and standards that arise from the aim of achieving the art form’s characteristic experience.

b. Aristotle and Horace as Models for Theorizing Art

Aristotle’s analysis of tragedy as structured by the aim of achieving a characteristic experience in its audience is paradigmatic of artistic analysis and theory in the ancient Greek and Roman world and then again in Europe up through the 18th century. Horace’s Ars Poetica, for example, identifies poetry by its characteristic experience, namely, an experience of apprehending unity of action. Beginning in the Renaissance, Europeans began more extensive theorizing about art and art forms and they relied on Horace and, to a lesser extent, Aristotle, in order to justify a variety of accounts of art forms and the characteristic experiences that establish their norms and standards. For example, literary theorists such as Lodovico Castelvetro and Julius Caesar Scaliger, writing in 16th century Italy, both defended poetry as an imitative art (following Horatian principles) and argued for the legitimacy of literature produced in popular, vernacular languages. Such arguments were soon adopted across Europe by theorists such as Joachim du Bellay and Philip Sidney, who developed largely Horatian-style defenses of vernacular poetry as an imitative art. In the 17th century, Aristotle’s Poetics achieved a kind of prominence as a model for theorizing the norms of art forms; especially in France, dramatists such as Pierre Corneille and literary figures associated with the Académie française took Aristotle to argue for the unity of action, of place, and of time as fundamental to dramatic structure.

Although it is certainly possible to identify appeals to medium within this tradition of literary criticism inspired by Aristotle and Horace, most obviously in the arguments made in favor of works written in vernacular language as a legitimate means of artistic expression, by and large this Horatian tradition of poetics centers around analysis of the art form’s norms and standards by reference to an artistic aim, characteristically imitation, which structures the particular type of artistic experience.

c. Music

By contrast, theorizing about music, both in the ancient world and in the European tradition prior to the 18th century, did make appeal to what we might think of as the medium of music in articulating the norms underlying musical practices. However, what was identified as music’s medium itself changed over time. This is not a weakness of these theoretical accounts but rather an illustration of Adorno’s contention, noted above, that our understanding of the nature of the medium of some art form is itself a product of the history of that form: there is no fixed, ahistorical characterization of music’s medium because what musical experiences are is not fixed but continually discovered in composing and playing music.

The earliest theories about the nature of music within the Western tradition held that music is the expression of a set of natural ratios that are equally expressed macroscopically in the movement of the celestial spheres and microscopically with the harmonies of the human soul. Pythagoras and his followers placed mathematical and musical knowledge at the center of their studies and influenced Plato and other ancient philosophers. One prominent example of this ancient tradition of theorizing music as an expression of natural harmonies is Boethius, a Roman statesman and Neoplatonist philosopher from the 6th century C.E. In his De institutione musica, Boethius distinguishes between three types of music: music of the spheres, music of the human spirit, and instrumental music. Boethius’ account of music as the expression of the natural macro and microscopic harmonies of the universe was important through the European Middle Ages.

As the musical possibilities within the European tradition developed from monophony to polyphony in the late Middle Ages and Renaissance and then, during the Baroque era, to more complex contrapuntal forms of composition, theorizing about the nature of music also changed. Much medieval theorizing was directly inspired by Boethius and oriented around practical instructional concerns for composers and musicians. Indeed, much of the history of Western musical theory is bound up with theories of tuning. By the 16th century, a new theoretical category, temperament, had emerged in the work of theorists such as Gioseffo Zarlino, making possible accounts better suited to describe the innovations in polyphony and counterpoint composition. Another strand of European music theory that emerged during and after the Renaissance drew on concepts from ancient rhetorical theories in order to describe the space of musical possibilities. By the end of the 18th century, such rhetorical analysis had largely been supplanted by a new type of analysis of musical forms, such as the sonata, in the work of Heinrich Koch and others. These theoretical developments around the nature of musical experience were responsive to the evolving nature and increasing complexity of music composition and performance in the 18th and 19th centuries.

The preceding is not meant to be a comprehensive overview of theorizing about art and art forms in the ancient and European traditions prior to the 18th century. Importantly, theorizing about artistic medium did not have the central place in theorizing about art forms more generally that it came to occupy beginning in the 18th and 19th centuries. As art established itself as a relatively autonomous region of experience, the concept of medium emerged as a critical means of understanding distinct types of artistic experience. Thus while we can, in retrospect, identify candidates for music’s medium, there is something anachronistic about thinking of them as examples of theorizing about artistic medium, since theorists like Boethius, for example, did not categorize music as one type of artistic experience among others. As Europeans began to conceive of art as a distinct region of experience, the Aristotelian and Horatian model for articulating the norms of an art form by reflection on its overall aim was fundamentally modified with the introduction of sustained consideration of the medium as the means for achieving a particular type of artistic experience.

4. Gotthold Lessing and the Problem of Art in the 18th Century

With his Laocoön: An Essay on the Limits of Painting and Poetry, published in 1766, Gotthold Ephraim Lessing is often identified as the first theorist or critic to engage in medium analysis. In that essay, he articulates the standards by which painting and poetry should depict bodies in action through an analysis of the spatiotemporal conditions under which the art forms are experienced. Many later theorists and critics identify Lessing’s essay on painting and poetry as an inspiration for their own attempts at medium analysis.

a. Art in the 18th Century

Before examining the particulars of Lessing’s own analysis of painting and poetry, it will be helpful to note something of the wider context of art and art criticism in 18th century Europe. By the mid-18th century in Europe, art had become its own distinct realm of experience, the culmination of a long and complex process in which artistic creation gradually decoupled from religious expression. It is a reflection of art’s new status as a distinct form of experience that the 18th century in Europe saw the emergence of art history and art criticism as intellectual practices and that aesthetic experience became an important topic for philosophers. Johann Winckelmann, for instance, developed the first comprehensive account of ancient art, distinguishing between Greek, Greco-Roman, and Roman art, and explicitly took up the Greeks in particular as a model for contemporary artists. Denis Diderot, among his many other accomplishments, began, in 1759, writing critical reports on the biennial Paris Salons for a German newsletter, offering evaluations of particular artists and paintings and, equally, developing a critical account of the experiences at which painting should aim.

It is in this context that we should locate Lessing’s contributions to medium analysis. In writing his Laocoön essay, Lessing shared with Winckelmann and Diderot (all three writing more or less concurrently in the 1750s and 1760s) an awareness of art as a distinct form of experience and thus as posing its own particular questions and distinct problems. Lessing, like Winckelmann, held that art, inasmuch as it was distinct from religious experience, should take beauty as its ultimate aim. Further, the ancient Greek, Hellenic, and Roman artists provide the best model for contemporary artists in large part because, being pre-Christian, their work reflects an unadulterated focus on artistic beauty for beauty’s sake. Later Christian artists were, on this view, required to maintain a sort of double allegiance to the demands of beauty and the teachings of the Church, to the detriment of their work artistically.

Like Diderot, Lessing was interested in art’s ability to generate an experience of a kind of moral or spiritual beauty in its audience. Both Diderot and Lessing believed that painting, for example, can show moments of beauty that are not exclusively visual in nature by encouraging audiences to imagine moral and spiritual possibilities that we do not ordinarily encounter or recognize in our everyday lives. The aesthetic aim means that painters should choose a revelatory moment within the action depicted that offers the chance to think through the nature of that action.

b. Lessing on Painting and Poetry

Lessing’s work of medium analysis in the Laocoön essay begins with an art historical question: did the Laocoön Group, a statue excavated in Rome in 1506 and currently on display at the Vatican, precede or come after Virgil’s account of Laocoön and his death in the Aeneid? In the Aeneid, Virgil recounts the story of Laocoön, a Trojan priest, who warns against bringing the Greek offering of a giant horse statue into Troy. Snakes sent by the gods kill Laocoön and his sons; the Trojans interpret this as a sign from the gods that Laocoön should not be heeded and bring the offering into the city, ensuring their ultimate doom. The question Lessing sets out to answer, whether the sculptors inspired the poet or vice versa, serves as a jumping-off point for Lessing’s broader interest in establishing the different norms and standards that govern painting and poetry.

Lessing characterizes painting and poetry in quite abstract and capacious terms. He defines poetry as any art form that unfolds in time and painting as inclusive of all art forms that are visual in nature. Distinctions between the different materials out of which works of art are made (between marble and oil paint on canvas, say) are not relevant for Lessing’s analysis: he abstracts away from what will later be thought of by some critics and theorists as distinct artistic mediums in order to characterize painting and poetry in terms of the spatiotemporal experience of the audience in apprehending the work. This stands in contrast with later analysis of artistic medium, which often centers on the particular matter out of which works of art are made.

In fact, though often (rightly) credited as the first critic to offer an analysis of artistic medium, Lessing himself does not describe painting and poetry as artistic mediums. It is worth noting that there was no widespread appeal to a concept of artistic medium until the middle of the 19th century, when a term that had its home in scientific contexts was extended into artistic contexts. So, although Lessing is correctly credited with the first developed medium analysis, he describes painting and poetry as different methods for achieving a particular artistic experience.

Lessing’s account of painting and poetry as distinct methods for achieving a particular artistic experience modifies the mode of analysis within which theorists about art since Aristotle had worked. That Aristotle-inspired mode of analysis developed an account of the norms and standards governing an art form by identifying the experience characteristic of the art form and generating an account of the features of the forms in light of their contribution to the overall experience aimed at. Similarly, Lessing’s analysis of painting and poetry takes as its starting point a particular artistic aim, namely, the audience’s imaginative apprehension of bodies in action as beautiful, and distinguishes between two different methods for achieving that experience. Unlike Aristotle, however, Lessing has in mind a general type of experience that is not tied to any single art form and that can be achieved by two different methods. Lessing is able to offer an account of the different artistic norms governing painting and poetry because he takes them up as different means by which a kind of artistic experience can be achieved, where each means is constituted by a distinct spatiotemporal structure.

According to Lessing, because signs contiguous to other signs best represent objects contiguous to other objects, painting’s appropriate subject matter is bodies at a single moment of time. Similarly, because signs that succeed one another best represent objects that succeed one another in time, poetry’s appropriate subject matter is actions unfolding in time. Lessing argues that the material conditions of the method determine what is appropriate to that art form:

If it is true that in its imitations painting uses completely different means or signs than does poetry, namely figures and colors in space rather than articulated sounds in time, and if these signs must indisputably bear a suitable relation to the thing signified, then signs existing in space can express only objects whose wholes or parts coexist, while signs that follow one another can express only objects whose wholes or parts are consecutive.

Objects or parts of objects which exist in space are called bodies. Accordingly, bodies with their visible properties are the true subjects of painting.

Objects or parts of objects which follow one another are called actions. Accordingly, actions are the true subject of poetry. (78)

Some subsequent critics and theorists have read Lessing here as offering two distinct tasks for painting and poetry, depicting bodies and depicting action. But Lessing considers these to be two different methods for achieving a single effect; namely, getting the audience to imagine bodies in action. For he completes the thought by noting “painting too can imitate actions, but only by suggestion through bodies…. Poetry also depicts bodies, but only by suggestion through action.” (78) Because Lessing is interested in painting and poetry inasmuch as they are different methods for encouraging audiences to experience beauty through imagining bodies in action, he immediately goes on to outline the norms that govern these different methods. Poets should construct their descriptions of actions by referencing each body participating in the overall action in terms of a single characteristic as it makes its contributions to the action. That allows the audience to imagine each body’s role in the overall action and so not be distracted by unnecessary detail or description. Painters should, according to Lessing, choose to depict the single moment of an overall action that encourages the audience to best imagine the action and one that offers the audience particular insight into what is at stake in the action.

Homer and the sculptors who created the Laocoön Group are exemplary artists in Lessing’s view because they grasp the norms at work in their respective art forms. Homer’s mastery in part stems from his ability to offer descriptions of scenes that are centered around and grow out of an action, as when he describes Agamemnon’s armor and regalia as he is in the act of donning it. According to Lessing, Homer’s usual practice is not to linger on description for its own sake but only to describe objects in the midst of action and only in terms of a single distinct characteristic, so as to encourage the audience’s imagination: for example, he evokes black-prowed ships skimming over a wine-dark sea. Likewise, the sculptors of the Laocoön Group are exemplary inasmuch as they have chosen a moment before the snakes have crushed and broken Laocoön and he has started to scream. Instead, they show his resistance to his suffering, the way in which he is enduring the pain and suffering that will inevitably overwhelm him. In this way, the audience is able to imagine both the enormity of his suffering and the spiritual beauty of his resistance in the face of immense suffering.

Lessing’s analysis of painting and poetry, therefore, identifies them as distinct methods for exploring a shared artistic problematic and considers them insofar as they constitute the aimed-for artistic experience. He is able to articulate a set of critical norms and standards for the art forms by reflecting on their different underlying spatiotemporal conditions. Lessing’s analysis establishes the dependence of the norms of painting and poetry on the spatiotemporal conditions of the experiences of those works of art. This dependence is the result of his choice to begin not with the material conditions of the art forms, but by locating those art forms as participating in a particular artistic aim, that is, the demand that painting and poetry encourage their audiences’ imaginative apprehension of the beauty possible for bodies in action. In turn this aim determines painting and poetry as methods and gives Lessing’s normative recommendations the force they have.

c. Herder and Hegel

Lessing’s theorization of artistic medium proved influential as questions of art and aesthetic experience moved to the center of philosophical thought at the end of the 18th and the beginning of the 19th century. Johann Herder, for example, in his Sculpture: Some Observations on Form and Shape from Pygmalion’s Creative Dream (1778) extends and complicates Lessing’s approach to artistic medium by denying that painting and sculpture could, as Lessing held, be understood in the same terms because they both offer a single moment of an action up to the audience for contemplation. Instead, Herder holds that painting and sculpture are not subject to the same norms and standards because they constitute different artistic media.

Most decisively, the concept of artistic medium plays a central role in the thought of Georg Wilhelm Friedrich Hegel, arguably the most influential philosopher of the early 19th century. On Hegel’s view, the norms and ideals that structure human interactions develop out of and so are made explicit in the history of human political development, intellectual development, and artistic development. Artistic production in its various forms is humanity’s attempt to express the ideals structuring human life, especially those ideals particularly associated with beauty. But if this is the case, then the question arises: Why are there different art forms, given that they all are structured around a shared, if at times inchoate, desire? Hegel’s answer to this question, most developed in his Lectures on Aesthetics (published posthumously in 1835), is that different artistic media serve as the basis for and so give rise to the possible forms of expression within particular art forms. In Hegel’s work, we have perhaps the clearest example of the consequences of conceiving of art as a distinct form of human experience, separate from religion and politics, for example. In thinking of art as a general field of experience within which we can distinguish distinct art forms, the concept of artistic medium is critical in allowing Hegel to maintain the unity of art in general while still distinguishing clearly between particular art forms and the norms and standards that govern them.

5. The Invention of Photography and the Discovery of Its Artistic Possibilities

a. The Etymology of the Term “Artistic Medium”

As noted above, Lessing does not describe his analysis of painting and poetry as an analysis of artistic medium. He talks about painting and poetry as different methods for achieving an experience. Indeed, it was not until the middle of the 19th century, almost 90 years after Lessing’s Laocoön essay, that medium began to be used in artistic contexts, referring to the material conditions out of which works of art are made. The Oxford English Dictionary notes that the earliest use of medium in an artistic context, signifying the raw material out of which a work of art is made, is from 1861. This new use of medium in an artistic context grew out of an earlier use that describes the substance (such as oil or water) that painters mix with pigment to create paint. We still speak of oil paint as a distinct medium that differs from watercolor; this use is an extension from an earlier one that identifies oil and water as media in which pigments are mixed.

b. The Challenge of Photography

While the first uses of medium in artistic contexts often referenced the material out of which paint and then paintings were made, the timing of this development is likely connected to a radical problem that gripped the art world of the 19th century; namely, the emergence of technologies that reproduced images: first lithography, and then photography, which reproduces images of our world now past. In the 1820s, Nicéphore Niépce developed the first photoetching process; in the 1830s, a number of inventors, most prominently Niépce’s partner Louis Daguerre and William Fox Talbot, worked independently on developing photographic processes that were capable of capturing images mechanically with much shorter exposures. By the end of the 1830s, both Talbot and Daguerre had publicly debuted their technology for mechanically capturing and reproducing images from the world. The invention of photography was widely felt as a challenge to the received understanding of what could be art and what artistic experiences were proper to painting specifically. But the debates surrounding photography and painting in the 19th century largely centered on whether or in what ways photography could serve as the means of artistic expression.

The new photographic technology was, in many cases, quickly distinguished from processes by which works of art were produced and dismissed as being incapable of producing art. The most prominent argument made against the possibility that photography could be art was based on the mechanical nature of the photographic process. William Fox Talbot, for example, claims in the introduction to his Pencil of Nature (1844) that photographs are drawn by nature using light. Talbot’s view that photography was the production of natural images by mechanical means alone, without the intervention of any human artistry, was widely shared in the 19th century and taken to be grounds for concluding that photography could not ultimately be an artistic medium. If the photograph is made by the interaction of natural processes of light and chemicals, then it cannot be a work of art, any more than a tree or a sunset could be.

The 19th century debates about the possibility of photographic art seem, from our 21st century vantage point, hopelessly misguided. But this is largely because we are the recipients of an understanding of what can count as art that has been altered by the development of reproductive technologies like photography and film. Throughout most of the 19th century, it seemed obvious to a large number of critics that the mechanical nature of photography excluded it straightforwardly from consideration as art. Charles Baudelaire, for example, in his Salon of 1859, worries that the public is starting to confuse photography with art by mistakenly taking a mechanical means for image reproduction as capable of inspiring imagination. Photographers, on this view, were mere technicians, only capable of reproducing natural images by exploiting the laws of nature. Because there was no human creativity at work in producing the photographic images, those images could not be art and the photographers were not artists.

Those who argued that photography could be art generally took two lines of response. On the one hand, many held that, while photography was ultimately a mechanical process, it could be artistic inasmuch as it is able to mimic painting and the artistic experiences of which painting is capable. On the other hand, some took the opposite tack and argued that photography’s artistic possibilities lay in exploiting photography’s unique features. On the first line of response, photography was artistic to the extent that it was able to look like painting or otherwise reproduce it. On the second, photography was seen as artistic to the extent that it distinguished itself from painting’s artistic possibilities by taking advantage of the features that only it possessed.

Photographers attempted to imitate painting in several ways. One of the earliest uses to which photography was put was the reproduction of paintings in order to disseminate widely what would otherwise require a pilgrimage to see. Further, some early photographers took photographs that were essentially reproducing the subject matter of earlier paintings by staging scenes reminiscent of those paintings. These genre photographs are the primary objects of Baudelaire’s denunciation of photography as essentially lacking in human creativity. Finally, some photographers began to produce photographic experiences that mimicked experiences recognizable from painting, utilizing soft focus, for example. Julia Margaret Cameron’s photographic work exemplifies many of these quasi-painterly techniques and is deeply influenced by pre-Raphaelite painting.

On the other hand, many photographers and critics, beginning with the earliest instances of photography, emphasized aspects of photographic production that were taken to be unique to it and therefore unlike painting. In announcing Daguerre’s invention to the French Academy of Sciences in 1839, François Arago emphasized the precision of the photographic image and its ability to allow its audience to see aspects of their world that they would otherwise be unable to see. During the 19th century, photographic practices evolved to capture moments and aspects of the world that are fleeting. Eugène Atget, for example, developed his photographic productive practice around capturing aspects of Parisian life that were disappearing as the city continually modernized, memorializing scenes from a form of life that was fading away. Walter Benjamin, looking back at the development of 19th and early 20th century photography from the vantage point of the 1930s, articulates photography’s unique features in terms of its ability to allow us to become conscious of what otherwise passes before our eyes unrecognized. On Benjamin’s view, photography’s ability to reveal what he terms the optical unconscious constitutes its distinctive power and its value as an artistic medium.

c. Accounting for Photography’s Artistic Possibilities

The problem posed by photography and its relation to art generally and painting specifically led to an approach towards questions of artistic medium distinct from the approach pioneered by Lessing. Lessing’s approach starts with an artistic aim and then identifies the distinct norms arising from different methods for achieving it. Photography instead presented itself as a problem: the question is not how best to achieve a particular artistic experience with a mode of expression or set of material conditions, but instead, to what extent, if any, can this new mode of expression be artistic? The emergence of this new technology raised a pressing question: Can it serve as the basis for artistic creation and, if so, what aspects or features of it are most appropriate for creating art? This approach to medium analysis begins by identifying a particular medium and its unique features or characteristics and determining what artistic experiences artists using the medium should pursue.

Both the early defenders of and objectors to photography’s artistic value follow the same basic template for analyzing medium. First, the critic identifies the medium (the photographic technology that constitutes a new mode of expression) and determines its unique features. Then, the theorist or critic evaluates the ways in which those unique features can generate artistic experiences. There are two ways this evaluation can happen. One can reflect on the nature of the medium and the unique features of its productive process and try to deduce an absolute a priori claim about its artistic possibilities independently of critical examination of the works produced; this approach generates confusions or, at best, tendentious critical prescriptions. Alternatively, one can engage in a critical investigation of work that utilizes the unique features of the medium in order to articulate how new artistic possibilities are being generated. Baudelaire’s dismissal of photography’s potential for artistic value exemplifies the first possibility. Benjamin’s critical examination of Atget’s work exemplifies the second.

6. Modernism as the Discovery of Medium

a. The Emergence of Modernism

As modernism transformed almost all traditional art forms more or less simultaneously during the first half of the 20th century, artistic medium became one of the crucial art critical concepts not just for theorists and critics but for artists as well. For modernist artists, inheriting traditional art forms meant querying the conditions of possibility underlying the art form in order to determine, through discovery and exploration, the necessary conditions for contemporary instances of the art form. For this reason, modernist arts often seemed to critics and some artists to be exercises in shedding, as some things taken to be essential to the form earlier in the tradition are discovered to be mere conventions and thus no longer conditions for contemporary instances of the art form.

There are too many important modernist artists across a wide variety of traditional art forms to give a comprehensive survey of them here. However, it is worth identifying a few of them in order to emphasize the modernist concerns that were deeply shared by artists across a broad swath of different art forms. In dance, for example, Isadora Duncan, the American dancer, rejected the inherited tradition of ballet techniques and thought of her own practice as the exploration of dance’s medium, the human body in its freedom of movement and gesture. In his “Ornament and Crime” (1913), Adolf Loos, an Austrian architect, critic, and theorist, argued against unnecessary architectural ornamentation in ways that heralded the modernist emphasis on purity of form and design. The modernist architectural commitment to form following function in design and the general dictum that buildings should be “machines for living” culminated in the work of architects such as Le Corbusier, Walter Gropius, and Ludwig Mies van der Rohe. In literature, the work of Gertrude Stein, James Joyce, and Franz Kafka, to name only a few, all differently exemplify modernist commitments. In Joyce’s work, for example, we can see a broad modernist development from an exploration of the history of forms of literary expression in Ulysses (1922) to an obsessive examination of the expressive possibilities of language itself in his final book, Finnegans Wake (1939).

b. Modernism and 20th Century Music

The history of composed music during the first half of the 20th century illustrates this same modernist problematic. By the 1920s, a number of composers began to explore new and unconventional forms of composition, including serial and atonal composition. Among the most prominent of these modernist composers was Arnold Schoenberg, who developed twelve-tone technique or row composition. Other notable modernist serial composers included Anton Webern and Karlheinz Stockhausen. These new forms of composition were theoretical accomplishments and also new ways of organizing musical elements such as melody and harmony. Not only did these developments in composition provide new systems of musical organization, but they readjusted audiences’ understanding of older forms of composition now thought of as, for example, limited to tonal relations in contrast with modernist atonality. As Theodor Adorno observes, Schoenberg’s twelve-tone technique, for example, stands in contrast with 19th century composers who manipulate and transform the repetition of certain basic musical relations. In Schoenberg’s compositions, however, there is no room for repetition. Rather, the composer moves through a series of distinct relations between pitches. Usually these rows of related pitches do not get repeated but explored once, and then a new row is generated. No longer are composers straightforwardly exploring relations between melody and harmony by the repetition and manipulation of a few themes or motifs. Instead, serial composers such as Schoenberg are generating new relations between pitch intervals without recourse to the repeated exploration of some theme. Thus, in the modernist exploration of music’s possibilities, the medium of music itself undergoes radical developments. In other words, modernist composers no longer took for granted compositional techniques or assumptions that had for prior generations seemed obvious or unproblematic. Instead, modernist composers aimed to generate an entire system of composition and so too a theoretical articulation of the constraints and rules by which their particular system of composition operates. As these modernist questions gripped more composers, the possible compositional systems and their accompanying theoretical justifications proliferated.

c. Fried on the Value of Modernism

Why modernism should have taken hold in a number of traditional art forms more or less simultaneously in the early part of the 20th century remains an important question, one that cannot be answered directly in this article. We will simply note that it did happen and that although many artists and critics embraced the modernist moment within traditional art forms as the promise of clarifying what was truly necessary for those arts, the modernist moment also clearly marked a kind of crisis for those traditional art forms, in which that which had previously been accepted as the possible basis for serious work within the form no longer satisfied artists or audiences.

The logic of modernism is important for understanding the concept of artistic medium because the exploration of the art form’s medium in its purity was central to it. This article will focus on one clarifying example, the critical discourse analyzing and justifying modernist painting in the 1950s and 1960s, in order to bring out the characteristic structure of reasoning about medium in artistic modernism. Three critics in particular, Clement Greenberg, writing in the 1940s and 1950s, and Stanley Cavell and Michael Fried, writing in the 1960s, championed the modernist project in American painting and sculpture; their work offers a perspicuous example of the logic of modernism as an exploration of artistic medium. Greenberg’s “Modernist Painting,” in particular, is an early statement of critical purpose, justifying the modernist project of medium exploration for its own sake. Greenberg saw the modernist project as akin to the Kantian commitment to critical philosophy: like Kant, modernist artists, on Greenberg’s view, engage in a project of criticism, reflecting on the nature of the form in its purity by discovering and articulating its limits. Cavell, in his “A Matter of Meaning It” (1969), identifies modernism as the realization of an art form’s artistic media through the discovery of its contemporary conditions of possibility. The work of the modernist artist, according to Cavell, is to find the criteria for an instance of an art form in the act of inheriting that form.

The dominance of this modernist problematic was challenged in the 1960s as minimalist or conceptual art on the one hand and pop art on the other developed alternative artistic possibilities to be explored. These alternative artistic programs competed with modernist painting by rejecting painting and sculpture altogether as forms for artistic expression. Instead, the aim was to cultivate forms of experience in ways not bound by painting’s forms, its problematics, and its media. For example, pop art was interested in exploring the image and contemporary experiences of images as such, rather than posing the image as a problem situated merely within the history of painting.

This confrontation between modernism on the one hand and pop art, minimalism, or conceptual art on the other was felt as a crisis involving the very existence of painting and sculpture as art forms by a number of artists and critics. Michael Fried, in “Art and Objecthood” (1967), offers perhaps the strongest critical polemic on behalf of modernist painting and sculpture. Fried identifies recent developments in painting as responding to a conflict between minimalists and modernists about how shape should function as an artistic medium:

What is at stake in this conflict is whether the paintings or objects in question are experienced as paintings or objects, and what decides their identity as painting is their confronting of the demand that they hold as shapes. Otherwise they are experienced as nothing more than objects. This can be summed up by saying that modernist painting has come to find it imperative that it defeat or suspend its own objecthood, and that the crucial factor in this undertaking is shape, but shape that must belong to painting—it must be pictorial, not, or not merely, literal. (151)

Fried here identifies the minimalist project as taking what he terms a literal approach to shape, for example, in which shape on its own is apparently explored for its artistic possibilities. By contrast, for Fried the modernist project takes the art form itself as an artistic problematic or a contemporary question and the medium exploration is in service of the exploration of that problematic: What now are the conditions of painting?

Fried’s objection in condemning conceptual artists and minimalists as literalists is that exploration of the medium as such loses connection with what is possible within traditional art forms like painting and sculpture; namely, an aesthetic experience. Painting and sculpture aim at the production in their audiences of a shared moment of judgment, a moment of judgment that audiences together take pleasure in extending and contemplating. The literalists, on the other hand, construct for their audiences experiences that cannot be shared in a single moment of judgment but are necessarily individual explorations of objects within a space over some duration. Fried thinks conceptual and minimal artists offer audiences theatricalized experiences, unfolding for each individual in time without the possibility of a shared moment of aesthetic judgment. For Fried, it is not possible to arrive at the unity of an aesthetic experience simply by the exploration of material conditions in themselves, cut loose from any artistic problematic or aim. In contrast, modernist artists committed to the traditional art forms are interested in discovering the material conditions for experiences that demand aesthetic judgment. The modernist worry, articulated by Fried and Cavell, is that the possibility for authentic experiences of art is lost when the questions of artistic medium no longer arise in relation to an existing art form and its traditions. Substituting theatricalized experiences for serious artistic experiences will mean that people no longer have experiences that are both aesthetic and ascetic. Grounding one’s explorations within the history of an art form in order to offer a contemporary instance of the art form calls for an appropriately serious response on the part of the beholder, a response that demands self-work on the part of the beholder in ways that enrich both the experience and the beholder. In contrast to aesthetic literalism, the modernist project thus involves the cultivation of aesthetic judgment; through contemplation, better understanding of the relation between the present instance of the form and the history of the form is achieved.

d. Postmodernism

For those artists and critics committed to modernist art, the task at hand was the survival of traditional art forms through a radical exploration of what is most essential to a particular form. In so doing, the modernist artist aims to continue the art form by making an original contribution to the tradition and creating work that discovers artistic possibilities on behalf of the art form. But to those artists and critics who emerged in the wake of the modernist moment, this stance of the heroic artist revealing possibilities for an art form through creating new instances of the form came to seem inappropriate and a bit self-aggrandizing. Postmodern critics and artists in the 1970s and after developed new approaches to the history of traditional art forms. Rosalind Krauss in “The Originality of the Avant-Garde” (1981) argues that modernists and avant-garde artists imagine that they make themselves the new origin of the art form as they continuously discover its essential conditions. Such modernist artists continually rediscover a few prominent automatisms, forms of repetition, as if they were the essence of painting and their discovery were an act of artistic originality. Krauss argues that rather than discover the essential material conditions of the art, modernist artists returned again and again to a fundamental form of repetition activated throughout the history of painting; namely, the grid. Avant-garde and modernist artists from this point of view do nothing but treat the various forms of repetition and automatisms that constitute the history of the art form as original individual discoveries of the grid and its possibilities for painting.

Postmodern art is characterized by a change in relation to an art form’s tradition. Rather than attempt to investigate the necessary material conditions for contemporary expression within the art form, the postmodernist attempts to otherwise activate a tradition’s discarded conventions through exaggeration, juxtaposition, and unabashed repetition. Modernism’s approach to tradition is to strip away everything conventional and inessential in order to discover the fundamental conditions of the art form. Postmodernism instead approaches an art form’s tradition as a collection of automatisms to be explored and activated again through conscious repetition. Rather than discarding all that is unnecessary, the postmodern artist juxtaposes or exaggerates disparate conventions and so hopes to rediscover possibilities within forgotten automatisms that modernism would have discarded. From the point of view of postmodernism, Fried’s attachment to the experience of working on oneself in order to better behold modernist works of art looks like a self-aggrandizing response analogous to the modernist artist’s “genius” in laying bare the conditions of painting as such. On the other hand, if Benjamin is correct that the invention of lithography and subsequent technologies that produce and reproduce images irrevocably accelerated European art’s transition from forms of experience centered around cult value to forms of experience centered around exhibition value, then modernist commitments to the possibilities of aesthetic-ascetic experience in inheriting traditional art forms may be seen as late attempts to sustain the possibilities of art with cult value.

Modernism coheres around the concept of artistic medium, for the aims of modernism are to discover the possibilities and thus the limits, the strengths, the tensions, the contradictions within an art form and its history by discovering how the form’s material conditions can be transformed into new and newly definitive instances of the form. Because postmodern artists return to the history of the form to discover its discarded conventions and automatisms rather than discarding them, they no longer think of media in terms of the essence of an art form. But although medium is not central to postmodern art, it is nonetheless still useful in critically evaluating works of art. Postmodernism emerged in the wake of modernism; the break with the history of traditional art forms that constituted the modernist moment was a break from conventions that no longer provided conviction for artistic expression. Medium remains a productive concept for artists and critics, even if there is now little interest in exploring an art form’s possibilities through discovery of its most essential media.

7. New Forms of Popular Art in the 20th Century

The 20th century saw the emergence of a succession of new forms of popular art, including movies, comics, and video games. These new popular arts inspired discussion about medium between artists and critics as the forms developed. Especially early in the lives of new popular art forms, questions of medium and medium analysis seem pressing to both artists and critics. This is because new art forms grow by borrowing artistic problems and aims from related earlier forms and by exploring a different material basis that makes new forms of artistic expression possible and in which artistic questions and interests can be pursued, critiqued, or otherwise engaged.

a. Movies

Movies and film criticism are an exemplary instance of a new form of popular art generating elaborate and often productive discourses about medium. Much of the early history of film criticism and film theory is marked out by exploration of a number of questions related to film as a medium. As film theory began to establish itself as an academic field of interest in the early 1970s, interest shifted away from questions of medium. But early in the development of the movies as an art form, film’s potential for artistic expression was a prominent critical conversation between theorists and artists.

In the 1920s, Soviet filmmakers and critics such as Sergei Eisenstein, Vsevolod Pudovkin, and Dziga Vertov were engaged in a critical discourse about film’s potential for popular art in writing and with their movies. Eisenstein, for example, argued that film’s unique and characteristic feature was montage, the juxtaposition of images through editing into a sequence. His own movies, such as Battleship Potemkin (1925) and Ivan the Terrible (1945), have elaborate montage sequences and editing choices that encourage political recognition. Likewise, Pudovkin claims that montage and juxtaposition of images through editing can change the meaning of images. For both, the emphasis on montage as a unique and characteristic feature of film being central to its possible artistic experiences stems from their interest in the ways in which juxtaposing images can generate both abstract judgments and strong emotional responses. Dziga Vertov’s exploration of the photographic and mechanical basis of the film image was central to his artistic and political project of discovering new ways of allowing his audiences to understand the world around them. He emphasized film’s ability to reveal minute features and gestures, otherwise unseen or unnoticed, so that the audience is able to recognize them as characteristic of the overall environment. One exemplary instance of Vertov’s exploration of the nature of film as a medium of expression is his experimental movie, Man with a Movie Camera (1929).

A number of early film critics developed an analysis of film’s artistic and political promise around its photographic basis. In “The Work of Art in the Age of Its Technological Reproducibility” (1939), Walter Benjamin emphasized photography’s ability to make visible minute aspects and gestures so as to display the character of people and environments. Similarly, popular movies offer the opportunity to develop new habits of perception that allow audiences to recognize fraught, meaningful gestures. Walter Benjamin’s medium analysis is exemplary, for Benjamin explicitly asks of all reproductive technologies that have developed after lithography not, “What features of these material conditions are unique and thus capable of artistic experiences that take advantage of those features?” but rather, “How does the existence of these new reproductive technologies change what art can be?” Benjamin’s form of medium analysis is historically and critically grounded in successful instances of emergent forms of popular art.

Rudolf Arnheim, also writing film criticism in the 1920s and 1930s, offered an analysis of film’s potential as an artistic medium. Unlike Benjamin, who emphasizes film’s ability to reveal the optical unconscious, Arnheim identifies the ways in which the film image differs from everyday images and derives from those features the norms that should serve as the basis for film art. Arnheim’s critical blinders and commitment to an idea about purity of medium led him to argue against the possibility of film art that includes sound because film and sound are distinct media and should not be mixed.

Another early theorist committed to a medium analysis of film and photography is Siegfried Kracauer. Kracauer’s critical articulation of film’s artistic possibilities stems, like Benjamin’s, from the unique capabilities of photography and film, and so he identifies and encourages film’s documentary and democratizing impulses.

André Bazin, a midcentury French film critic who championed the Italian neorealists and cultivated a generation of French film critics and filmmakers, developed an analysis of film’s photographic basis. Bazin emphasizes photography’s ability to satisfy absolutely the desire to preserve the world as the basis for a critical understanding of film and its possibilities. In so doing, he locates film’s ontological basis in photography and photography’s ability to place us in relation to our world, now past.

This tradition of exploring the meaning of film’s ontological status as photographic includes Stanley Cavell’s work, especially his The World Viewed: Reflections on the Ontology of Film (1971). Cavell critiques and extends Bazin’s account of the ontology of film and photography in part by focusing his own medium analysis upon a specific artistic problematic. The World Viewed offers a medium analysis not of film as such, but of popular Hollywood movies. On Cavell’s analysis of the medium of movies, popular movies prior to the 1960s explored the possibilities and tensions within a problematic of modern action that emerged in the 19th century concerning the possibilities for urbane, stylish, and productive action; by the late 1960s, however, a new problematic concerning the contemporary possibilities for action simpliciter was emerging. The World Viewed was written in observance of this transition within popular movies and draws on the conceptual tools of medium analysis in order to register the fact of this transformation.

If the beginning of the 1970s saw the emergence of a new problematic for popular movies to discover and explore, it also saw the establishment of film theory as an academic discipline. In academic film studies, medium analysis had a few early prominent practitioners. Leo Braudy’s The World in a Frame: What We See in Films (1976), for example, offers an analysis of film’s artistic possibilities by distinguishing between the ways in which movie worlds are both closed off from and open to and interpenetrate with our world. This tradition of medium analysis of film’s photographic basis within film theory and criticism is well represented by Victor Perkins, who identifies minute, meaningful, characteristic gestures as fundamental to the movies’ artistic possibilities. Perkins’ commitment to the fundamental role that artistic medium has within his critical practice points to an intimate nexus of considerations of medium and artistic experience within the creation of movies and artistic practices more generally.

However, soon after film theory and criticism found an academic home within film studies in the 1970s, theorists and critics moved away from sustained medium analysis of film or the film arts. Instead, academics developed alternative interpretative frameworks, prominently Lacanian, feminist, and Marxist ones, that displaced the prominence of medium analysis within film theory. In analytic philosophy, as philosophy and film established itself as a domain of inquiry, instances of medium analysis gave way primarily to cognitive science approaches to theorizing film and film experiences. Medium analysis depends on the unity of the aesthetic experience to which the medium in question is able to contribute. A cognitive science approach to the effects possible in certain modes of filmmaking need not concern itself with the unity of aesthetic experience.

In the 1990s, Noël Carroll, a leading advocate of the cognitive-science approach in analytic philosophy and film, argued that medium was necessarily a confused category and should be eschewed by philosophers interested in theorizing movies and other film arts. In the mid-2000s, Carroll adjusted his view and acknowledged uses for the concept of medium, especially in describing the practices of certain experimental or avant-garde film artists. Regardless, from the inception of film theory until the mid-1970s, medium analysis was one of the prominent forms that it took.

b. Comics

Comics, as they have developed as an art form, have also developed critical and theoretic discourses that participate in some form of medium analysis. Much of the most prominent medium analysis has been by artists adopting a critical and theoretic stance with respect to their own artistic practices. Prominent instances of this medium analysis of comics by comics artists include Will Eisner’s Comics and Sequential Art (1985) and Scott McCloud’s Understanding Comics (1993). Both Eisner and McCloud offer paradigmatic instances of medium analysis, in that both are theorizing the particular ways in which comics, as an art form, are able to achieve forms of aesthetic unity in relating image and action.

c. Video Games

Video games and 21st century gaming offer another instance of an emergent popular art form that has inspired early practitioners, critics, and theorists to engage in medium analysis. Much of the academic discourse analyzing gaming grows out of film studies and necessitates some medium consideration as terms and interpretative frameworks are applied in new contexts or, alternatively, theorists attempt to distinguish clearly between experiences that are proper to movies and other narrative visual forms and experiences that are proper to games and gaming. Medium analysis has been an important aspect of developing theoretic and critical discourses about gaming in which game creators and theorists are in conversation.

8. Conclusion

Currently within academia, medium analysis is largely pursued in media studies and disciplines exploring the emergence of new media. In philosophy, medium analysis has recently been utilized in numerous ways within the philosophy of gaming and video games. Given the ways in which screens and screen technology continue to interpenetrate contemporary reality, we can anticipate further recourse to medium analysis in theorizing these new forms of experience. Even if the collapse of interest in modernist projects in the arts has moved contemporary theorizing about art away from medium as a central concept, academic theorists of new media and new popular arts still participate in a discourse of medium analysis.

Artistic medium continues to be a productive critical concept as well for working artists and critics interested in articulating the means by which an artistic experience is structured and organized. That theorists and critics attempting to theorize medium should run into characteristic confusions in defining and theorizing medium stems from the picture they share of medium as an object. They take their task to be identifying the object that is the medium in order to deduce and prescribe its appropriate artistic experiences. But working critics and artists are less likely to think of artistic medium as an object to be studied for its own sake. For such critics and artists, thinking about medium is thinking about how something functions in creating a particular effect or in structuring a particular form of experience. What photography, for example, can be is to be discovered by artists as they pursue particular projects or lines of exploration. In this sense, artistic medium is a critical concept; we can only say what media are constitutive of an art form by critically examining instances of the form. This way of approaching medium analysis, as necessarily a critical pursuit, conceives of artistic media as the capacities for organizing and structuring the audience’s experience as the means of exploring and discovering the possibilities and tensions within an artistic problematic. These capacities for organizing artistic experience are forms of repetition or automatisms that have significance as the means by which a form of artistic experience is structured.

Medium analysis emerged with, and has developed in response to, modern art. As critics and theorists began to argue for art and the aesthetic as a distinct form of experience in the 18th century, independent of its former subservience to religion and able to dedicate itself to aiming only at beauty, medium analysis developed. First developed in Lessing’s work, medium analysis is a critical tool for understanding the norms constituting a capacity for structuring an artistic experience. The value of artistic medium in theoretical and critical discourse is realized when medium is approached not as some raw material to be investigated in advance of its possible artistic uses but as a means by which artists discover and explore possibilities within a particular artistic problematic. A medium’s possibilities are discovered by artists in the creation of new instances of an art form, and by audiences and critics in the experience of particular instances of the art form. Theorists can avoid confusion by remembering that medium is an essentially critical concept, in that what is possible within a medium is discovered by artists as they continue to create and explore.

9. References and Further Reading

  • Adorno, T. W. (2006). Philosophy of new music. R. Hullot-Kentor (Ed.). Minneapolis: University of Minnesota Press.
    • First published in 1947, Adorno here analyzes the work of Schoenberg and Stravinsky as exemplary of the new possibilities in 20th century music and identifies the medium of music as a historical phenomenon.
  • Adorno, T. W. (2014). Current of music. Cambridge, United Kingdom: Polity.
    • This is a collection of Adorno’s work on radio, much of it unpublished during his lifetime, which analyzes how radio determines possibilities for experiencing music.
  • Aristotle. (2014). Poetics. In J. Barnes (Ed.). Complete works of Aristotle: The revised Oxford translation (Vol. 2). Princeton, NJ: Princeton University Press.
    • This is the standard English translation of Aristotle’s analysis of tragedy as an art form.
  • Arnheim, R. (1957). Film as art. Berkeley: University of California Press.
    • Arnheim argues that film’s artistic potential is best realized by taking advantage of the features unique to the medium.
  • Atget, E., & Abbott, B. (1964). The world of Atget. New York, NY: Horizon.
    • This is a collection of the work of Eugène Atget, whose photographs of Paris streets, according to Walter Benjamin, exemplify artistic possibilities for photography as an artistic medium.
  • Baudelaire, C. (1980). “The modern public and photography.” In A. Trachtenberg (Ed.), Classic essays on photography (pp. 83–90). Stony Creek, CT: Leete’s Island Books.
    • In this essay, Baudelaire argues that photography cannot be an artistic medium because it does not engage the imagination appropriately.
  • Bazin, A. (1968). “The ontology of the photographic image.” In H. Gray (Trans.), What is cinema? (Vol. 1, pp. 9–16). Berkeley: University of California Press.
    • Bazin holds that photography satisfies once and for all the desire to preserve reality and thus opens up new artistic possibilities.
  • Benjamin, W. (1999). Little history of photography. In M. W. Jennings, H. Eiland, and G. Smith (Eds.), Selected writings (Vol. 2, Part 2, pp. 507–530). Cambridge, MA: Harvard University Press.
    • Benjamin describes the history of 19th and early 20th century photographic theories and practices.
  • Benjamin, W. (2003). The work of art in the age of its technological reproducibility: Third version. In H. Eiland and M. W. Jennings (Eds.), Selected writings (Vol. 4, pp. 251–283). Cambridge, MA: Harvard University Press.
    • In this seminal essay, Benjamin identifies the transition to technological reproducibility as a fundamental shift in the nature of art and argues that film constitutes a new mode of perception.
  • Blasius, L. (2006). Mapping the terrain. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 27–45). Cambridge, United Kingdom: Cambridge University Press.
    • This is a helpful overview of the history of Western music theories.
  • Bower, C. (2006). The transmission of ancient music theory into the Middle Ages. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 136–167). Cambridge, United Kingdom: Cambridge University Press.
    • This essay describes the reception of ancient music theory during the European Middle Ages.
  • Braudy, L. (2002). The world in a frame: What we see in films. Chicago, IL: University of Chicago Press.
    • Originally published in 1976, Braudy describes the artistic possibilities particular to popular movies.
  • Burnham, S. (2006). Form. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 880–906). Cambridge, United Kingdom: Cambridge University Press.
    • This essay gives an overview of the emergence of musical form as a central theoretical category in the 18th and 19th centuries.
  • Carroll, N. (1985). The specificity of media in the arts. Journal of Aesthetic Education, 19(4), 5–20.
    • This essay is the earliest instance of Carroll’s critique of medium specificity.
  • Carroll, N. (1996). Theorizing the moving image. Cambridge, United Kingdom: Cambridge University Press.
    • In the opening chapters of this book, Carroll offers his most developed criticism of medium specific theories and his most sustained skepticism about the coherence of the concept of artistic medium.
  • Carroll, N. (2006). Philosophizing through the moving image: The case of serene velocity. The Journal of Aesthetics and Art Criticism, 64(1), 173–185.
    • In this essay, Carroll acknowledges the need for the concept of artistic medium in describing the experience of structural films such as Serene Velocity.
  • Cavell, S. (1969). A matter of meaning it. In Must we mean what we say?: A book of essays (pp. 213–237). Cambridge, United Kingdom: Cambridge University Press.
    • In this early essay, Cavell articulates an understanding of artistic medium as something discovered and explored by artists as they create.
  • Cavell, S. (1979). The world viewed: Reflections on the ontology of film. Cambridge, MA: Harvard University Press.
    • In this seminal work in the philosophy of film, Cavell articulates the medium of film as a succession of automatic world projections.
  • Cavell, S. (1981). Pursuits of happiness: The Hollywood comedy of remarriage. Cambridge, MA: Harvard University Press.
    • Cavell develops an account of movie genre as artistic medium by articulating Hollywood remarriage comedies as a distinct genre.
  • Cavell, S. (1996). Contesting tears: The Hollywood melodrama of the unknown woman. Chicago, IL: University of Chicago Press.
    • Cavell returns to the concept of genre as medium by developing an account of a companion genre, the melodrama of the unknown woman.
  • Cavell, S. (2014). The fact of television. In Themes out of school: Effects and causes (pp. 235–281). Chicago, IL: University of Chicago Press.
    • Cavell develops an account of television as medium by contrast with his account of film in The World Viewed.
  • Christensen, T. (2006a). Introduction. In The Cambridge history of Western music theory (pp. 1–26). Cambridge, United Kingdom: Cambridge University Press.
    • This is a helpful précis of the history of Western music theory.
  • Christensen, T. (Ed.). (2006b). The Cambridge history of Western music theory. Cambridge, United Kingdom: Cambridge University Press.
    • This collection offers in-depth essays on various crucial aspects of the history of Western music theory.
  • Cox, J., & Ford, C. (2003). Julia Margaret Cameron: The complete photographs. Los Angeles, CA: Getty.
    • This is a comprehensive collection of Julia Margaret Cameron’s photography.
  • Curtis, W. J. (1996). Modern architecture since 1900. London, United Kingdom: Phaidon.
    • This is a good overview of architecture in the 20th century, including modernist architecture.
  • de Font-Reaulx, D. (2013). Painting and photography: (1839–1914). Paris, France: Flammarion.
    • This describes the complicated relations and lines of influence between painting and photography in the 19th and early 20th centuries.
  • Dewey, J. (2005). Art as experience. New York, NY: Penguin.
    • In his classic text on art as a form of experience, Dewey distinguishes between an artistic medium and the raw material out of which works of art are made.
  • Diderot, D. (1995). Diderot on art, volume 1. The salon of 1765 and notes on painting. J. Goodman (Ed.). New Haven, CT: Yale University Press.
    • This collection contains much of Diderot’s critical writing on painting.
  • Duncan, I. (2013). My life (revised and updated). New York, NY: Liveright.
    • Duncan’s autobiography also includes her reflections on her dance practices and commitments.
  • Eisenstein, S. (2014). Film form: Essays in film theory. Chicago, IL: Houghton Mifflin Harcourt.
    • This is a collection of Eisenstein’s essays on film technique and theory.
  • Eisner, W. (2008). Comics and sequential art: Principles and practices from the legendary cartoonist (Will Eisner instructional books). New York, NY: W. W. Norton.
    • Eisner offers an account of the principles underlying his approach to comics based on his decades as a comic artist.
  • Frascina, F., Harrison, C., & Paul, D. (Eds.). (1982). Modern art and modernism: a critical anthology. London, United Kingdom: Sage.
    • This is a good collection of critical writings about modernist arts.
  • Fried, M. (1980). Absorption and theatricality: Painting and beholder in the age of Diderot. Berkeley: University of California Press.
    • Fried offers an account of the artistic problematic articulated by Diderot and explored in 18th century painting.
  • Fried, M. (1998a). Art and objecthood. In Art and objecthood: Essays and reviews (pp. 148–172). Chicago, IL: University of Chicago Press.
    • Fried’s essay argues that the turn from modernism to minimalism and conceptual art in the 1960s depends on a misunderstanding of what exploring an artistic medium can be.
  • Fried, M. (1998b). Manet’s Modernism: Or, the face of painting in the 1860s. Chicago, IL: University of Chicago Press.
    • In this book, Fried argues that Manet’s exploration of the artistic problematic he inherited from 18th and 19th century French painting constituted a form of modernism.
  • Fried, M. (2008). Why photography matters as art as never before. New Haven, CT: Yale University Press.
    • Fried argues that the cross-pollination of painting and photography has led to the most important artistic developments of the late 20th and early 21st centuries.
  • Greenberg, C. (1982). Modernist painting. In F. Frascina, C. Harrison, & D. Paul (Eds.), Modern art and modernism: A critical anthology (pp. 5–10). London, United Kingdom: Sage.
    • Greenberg’s essay argues that modernist painting approaches the medium of painting on the model of Kantian criticism, in order to discover it in its purity.
  • Greenberg, C. (1984a). Modernist sculpture, its pictorial past. In Art and culture: Critical essays (pp. 158–163). Boston, MA: Beacon.
    • In this essay, Greenberg outlines the ways in which modernist sculpture distinguishes itself from painting.
  • Greenberg, C. (1984b). Art and culture: Critical essays. Boston, MA: Beacon.
    • This is a collection of many of Greenberg’s essays on the project of modernism in painting and sculpture.
  • Halliwell, S. (1998). Aristotle’s Poetics. Chicago, IL: University of Chicago Press.
    • This is an insightful critical analysis of Aristotle’s Poetics.
  • Harrison, C., Wood, P., & Gaiger, J. (Eds.). (1998). Art in theory, 1815–1900: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of 19th century art theory.
  • Harrison, C., Wood, P., & Gaiger, J. (Eds.). (2000). Art in theory 1648–1815: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of Western art theories prior to the 19th century.
  • Harrison, C., & Wood, P. (Eds.). (2003). Art in theory, 1900–2000: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of 20th century art theories.
  • Hegel, G. W. F. (1975a). Hegel’s aesthetics: Lectures on fine art (Vol. 1). (T. M. Knox, Trans.). Oxford, United Kingdom: Oxford University Press.
    • Hegel’s lectures on art begin with reflections on the idea of art and its relation to thought.
  • Hegel, G. W. F. (1975b). Hegel’s aesthetics: Lectures on fine art (Vol. 2). (T. M. Knox, Trans.). Oxford, United Kingdom: Oxford University Press.
    • Hegel’s lectures on art conclude with reflections on the differentiation and development of particular art forms.
  • Herder, J. G., & Gaiger, J. (2002). Sculpture: Some observations on shape and form from Pygmalion’s creative dream. Chicago, IL: University of Chicago Press.
    • Herder’s reflections on the nature of sculpture develop a critical response to Lessing’s account of painting as inclusive of sculpture.
  • Horace. (1989). Ars Poetica. In N. Rudd (Ed.). Horace: Epistles book II and Ars Poetica (Vol. 2). Cambridge, United Kingdom: Cambridge University Press.
    • Horace describes the form of poetic experience as structured by imitation.
  • Joyce, J. (1986). Ulysses. New York, NY: Vintage Books.
    • Joyce’s classic work of modernism, first published in 1922, explores the history of literature and its forms.
  • Joyce, J. (1999). Finnegans wake. New York, NY: Penguin.
    • Joyce’s final work explores the nature of language and its expressive possibilities.
  • Kracauer, S. (1997). Theory of film: The redemption of physical reality. Princeton, NJ: Princeton University Press.
    • Kracauer’s account of film identifies the medium’s central feature to be its ability to connect us with reality.
  • Krauss, R. (1986a). The originality of the avant-garde. In The originality of the avant-garde and other modernist myths (pp. 151–170). Cambridge, MA: MIT Press.
    • Krauss’ essay calls into question the cult of originality underlying the critical reception of modernist art and articulates the theoretical framework for postmodernist responses.
  • Krauss, R. (1986b). The originality of the avant-garde and other modernist myths. Cambridge, MA: MIT Press.
    • This is a collection of Krauss’ critical essays arguing for postmodern artistic possibilities.
  • Lear, J. (1992). Katharsis. In Essays on Aristotle’s Poetics (pp. 315–340). Princeton, NJ: Princeton University Press.
    • Lear’s essay offers an insightful interpretation of the role of catharsis in Aristotle’s account of tragedy.
  • Lessing, G. E. (1962). Hamburg dramaturgy. (H. Zimmern, Trans.). Mineola, NY: Dover.
    • This volume collects Lessing’s theatrical criticism and contains his reflections on Aristotle’s account of tragedy.
  • Lessing, G. E. (1984). Laocoön: An essay on the limits of painting and poetry. (E. A. McCormick, Trans.). Baltimore, MD: Johns Hopkins University Press.
    • Lessing’s essay is arguably the classic work of medium analysis; in it, he distinguishes between painting and poetry as two different methods for imagining action.
  • Levenson, M. (Ed.). (2011). The Cambridge companion to modernism. Cambridge, United Kingdom: Cambridge University Press.
    • This is a collection of critical essays reflecting on modernist art.
  • Loos, A. (1998). Ornament and crime. In A. Opel (Ed.), Ornament and crime: Selected essays (pp. 167–177). Riverside, CA: Ariadne.
    • Loos’ essay is a polemic for architectural modernism and against unnecessary ornamentation.
  • McCloud, S. (1994). Understanding comics: The invisible art. New York, NY: William Morrow.
    • McCloud’s book is the articulation of the nature of the medium of comics by a contemporary comic artist.
  • Medium, n. and adj. 2016. OED Online. Web.
    • This article tracks the etymology and evolution of the term “artistic medium.”
  • Perkins, V. F. (1990). Must we say what they mean? Film criticism and interpretation. Movie, 34(5), 1–6.
    • In this essay, Perkins defends the critical value of the concept of medium and identifies the ability to capture minute but meaningful gestures at the heart of the medium of the movies.
  • Perkins, V. F. (1993). Film as film: Understanding and judging movies. Boston, MA: Da Capo.
    • Perkins’ book articulates normative standards specific to movies and draws on the concept of medium to do so.
  • Perron, B., & Wolf, M. J. (Eds.). (2009). The video game theory reader. New York, NY: Routledge.
    • This collection of recent essays of video game theory includes a number of essays that use the concept of medium in order to articulate the experiences specific to video game play.
  • Pudovkin, V. I. (2013). Film technique and film acting: The cinema writings of V. I. Pudovkin. Redditch, United Kingdom: Read Books.
    • This collection of Pudovkin’s writing on film theory includes many reflections on what is particular to the medium of film.
  • Rasch, R. (2006). Tuning and temperament. In T. Christensen (Ed.), The Cambridge history of Western music theory. Cambridge, United Kingdom: Cambridge University Press.
    • This essay gives a helpful account of the emergence of temperament as a music theoretic category in the European music tradition.
  • Rorty, A. (1992). The psychology of Aristotelian tragedy. In A. Rorty (Ed.), Essays on Aristotle’s Poetics (pp. 1–22). Princeton, NJ: Princeton University Press.
    • This essay describes Aristotle’s views on the audience’s experience of tragedy.
  • Talbot, W. H. F. (1969). The pencil of nature. Boston, MA: Da Capo.
    • This essay is a reflection on the nature of photography by one of its inventors.
  • Trachtenberg, A. (1980). Classic essays on photography. Stony Creek, CT: Leete’s Island Books.
    • This collection contains a number of important essays on photography and its possibilities from the 19th century.
  • Vertov, D. (1984). Kino-eye: The writings of Dziga Vertov. Berkeley: University of California Press.
    • Vertov’s writings reflect on the radical possibilities for contemporary perception offered by film.
  • Vertov, D. (1998). Man with a movie camera [Motion Picture]. United States: Image Entertainment.
    • Released in 1929, Vertov’s experimental film explores the range of documentary possibilities for film.
  • Wack, D. (2013). Medium and the end of myths: Transformation of the imagination in The world viewed. Conversations: The Journal of Cavellian Studies, 1, 39–58.
    • This essay describes the transformation in the medium of movies that Cavell identifies in The World Viewed.
  • Wack, D. (2014). How movies do philosophy. Film and Philosophy, 18.
    • This essay argues that movies, documentaries, structural films, cartoons, and so on all constitute distinct artistic mediums and identifies the medium of the movies as structured around the apprehension of action.
  • Wellbery, D. E. (1984). Lessing’s Laocoön: Semiotics and aesthetics in the age of reason. Cambridge, United Kingdom: Cambridge University Press.
    • Wellbery’s book is an insightful and sustained interpretation of Lessing’s Laocoön essay.
  • Winckelmann, J. J., & Potts, A. (2006). History of the art of antiquity. Los Angeles, CA: Getty.
    • Winckelmann’s book on ancient art was widely influential in the 18th century and definitive for the development of art history as an intellectual discipline. 

 

Author Information

Daniel Wack
Email: dwack@knox.edu
Knox College
U. S. A.

Political Revolution

Revolutions are commonly understood as instances of fundamental socio-political transformation. Since “the age of revolutions” in the late 18th century, political philosophers and theorists have developed approaches aimed at defining what forms of change can count as revolutionary (as opposed to, for example, reformist types of change) as well as determining if and under what conditions such change can be justified by normative arguments (for example, with recourse to human rights). Although the term has its origins in the fields of astrology and astronomy, “revolution” has witnessed a gradual politicization since the 17th century. Over the course of significant semantic shifts that often mirrored concrete political events and experiences, the aspect of regularity, originally central to the meaning of the term, was lost: Whereas in the studies of, for example, Nicolaus Copernicus, “revolution” expressed the invariable movements of the heavenly bodies and, thus, the repetitive character of change, the term in its political usage particularly stresses the moments of irregularity, unpredictability, and uniqueness.

In light of the marked heterogeneity of the ways in which thinkers such as Thomas Paine (1737-1809), J.A.N. de Condorcet (1743-1794), Immanuel Kant (1724-1804), G.W.F. Hegel (1770-1831), Mikhail Bakunin (1814-1876), Karl Marx (1818-1883), Hannah Arendt (1906-1975), and Michel Foucault (1926-1984) reflect on the possibilities and conditions of radically transforming political and social structures, this article concentrates on a set of key questions confronted by all these theories of revolution. Most notably, these questions pertain to the problems of the new, of violence, of freedom, of the revolutionary subject, the revolutionary object or target, and of the temporal and spatial extension of revolution. In covering these problems in turn, it is the goal of this article to outline substantial arguments, analyses, and aporias that shape modern and contemporary debates and, thereby, to indicate important conceptual and normative issues concerning revolution.

This article is divided into three main sections. The first section briefly reconstructs the history of the concept “revolution.” The second section gives an overview of the most important strands of politico-philosophical thought on revolution. The third section examines paradigmatic positions developed by theorists with respect to the central problems mentioned above. As the majority of thinkers who address revolution do not elaborate comprehensive theories and as there is comparatively little thematic secondary literature on the subject, this part proposes a framework for individually situating and systematically relating the differing approaches.

Table of Contents

  1. History of the Concept
  2. Three Traditions of Thought
    1. The Democratic Tradition
    2. The Communist Tradition
    3. The Anarchist Tradition
  3. Concepts of Revolution
    1. The Question of Novelty
    2. The Question of Violence
    3. The Question of Freedom
    4. The Question of the Revolutionary Subject
    5. The Question of the Revolutionary Object
    6. The Question of the Extension of Revolution
  4. Conclusion
  5. References and Further Reading

1. History of the Concept

In preparation for presentation of the different philosophical approaches to revolution in the following article, this section is concerned with providing a concise outline of the history of the concept. In so far as “revolution” is employed to describe political transformation, conceptual historians understand its origins to be genuinely modern. Critically informed by the experience of the revolutions in England, America, and France, the term in common usage designates the epitome of political change, that is, change not only in laws, policies, or government but in the established order that is both profound and durable. Earlier conceptions of political change are missing the notions of a people’s autonomous ability to act or of its right to emancipation. Further, the absence of two structural preconditions explains why revolution in the sense of fundamental politico-social transformation is not conceived prior to modernity. On the historical level, it is the formation of the “strong” state that is conducive to a political imagination of radical liberation from state oppression and the subsequent founding of an essentially different order. The extent of the Hobbesian type of the state’s disciplining power and the impossibility of direct political participation thus lay the ground for revolutionary projects. On the conceptual level, the supersession of cyclical conceptions of history as advocated by Aristotle, Polybius, Cicero, or Machiavelli by linear models of thought allows for the idea of irreversible progress in politics and society. In the course of this shift in historical thinking revolution is eventually looked upon as a catalyzing, even enabling factor of progress. 
Since history is no longer understood as dependent on forces beyond human control (such as, for example, divine providence), human agency comes to be regarded as the decisive factor in shaping its course (compare Koselleck, 1984 and 2004; for arguments that revolution, both as a concept and a phenomenon, does have pre-modern origins, compare Rosenstock-Huessy, 1993 [1938]; Berman, 1985).

The history of political thought largely attests to the assessment that the idea of revolution as structural, justifiable change is unknown prior to modernity. Aristotle’s reflections on political change (metabolé tes politeías) in books III and IV of Politics show that the alterations he takes into consideration do not amount to the complete breakdown of an existing order, its organizing hierarchy, and its principles of inclusion/exclusion. Despite certain arguable similarities to modern concepts (for instance, with respect to the element of violence), conceptual predecessors of “revolution” such as stasis and kinesis in the Greek tradition or seditio, secessio, and tumultus in the Roman tradition have strong negative connotations. In ancient and medieval political thought, they are primarily related to anarchy and civil war. Even in the works of an early-modern thinker like Machiavelli the idea of an absolute hiatus, a fundamental rupture on the continuum of politics is not developed fully. Although he is occupied with political change, key concepts related to the topic (most importantly, rinovazione, mutazione, and alterazione) are overridden by the conviction that all shifts as to forms of constitutions ultimately do not break out of a cycle of historical recurrence. In short, the notion of a world-shaping human “power to interrupt” and “to begin” (compare Merleau-Ponty, 2005 [1945]) and the corresponding “pathos of novelty” (compare Arendt, 2006 [1963]) remain alien to pre-modern thought.

In the 17th and 18th centuries, the discovery of revolution as a relevant political category is reflected and supported by political and moral philosophy. John Locke, in his Second Treatise on Civil Government (1689), develops an influential defense of the right of resistance, rebellion, and even revolution. Going beyond Thomas Hobbes’s considerations on a subject’s right to defend herself against the sovereign if her life is under threat, his social contract theory presents this protective right against state coercion and oppression as a necessary political concretization of the individuals’ inalienable natural right to “life, liberty, and estate.” Jean-Jacques Rousseau, in the Discourse on the Origin of Inequality (1755) and the Social Contract (1762), aims at exposing the morally degenerate, politically illegitimate state of the Ancien Régime and proposing a liberal, egalitarian political and legal constitution to replace it. According to Rousseau, the “general will” ousts the particular will of the monarch as the guideline in politics, thereby implying that the people attain autonomy, sovereignty, and, thus, the status of full political subjectivity. Locke’s and Rousseau’s considerations thus importantly add to a revaluation of acts of protest and insurrection: Such acts can no longer be dismissed as the work of political offenders or public enemies as was the case prior to the undermining of the “political theology” of absolutism and feudalism, which was largely based on the doctrine of divine right (compare Kantorowicz, 1997 [1957]; Walzer, 1992). Instead, thanks to the political thought of the Enlightenment in general and to Lockean and Rousseauian social contract and natural rights theory in particular, such acts can now be interpreted as an exercise of rationally and morally justifiable political self-determination. 
Although neither Locke nor Rousseau presents an elaborated theory of revolution, they develop positions that are inherently critical of any political order that is not built on the principles of consent and trust and, thus, potentially revolutionary. Their reflections on legitimate governance and on citizens’ rights go beyond earlier discussions of justified resistance to monarchs (such as the 1579 Vindiciae contra Tyrannos, published under the pseudonym Stephen Junius Brutus), which rely on expertocratic leadership as opposed to political self-determination of the people. Their works thus prepare the ground for the two main ideas of the revolutionary age: “natural” human rights and national sovereignty (compare Habermas, 1990; Menke/Raimondi, 2011).

Resulting from a plethora of intellectual and material factors, the distinctly modern understanding of “revolution” takes shape on the eve of the historical revolutions of the late 18th century: It is both a “combat term” (R. Koselleck) in political praxis and an “essentially contested concept” (W.B. Gallie) in political theory. It is in the works of thinkers like Condorcet, Kant, or Marx that this contest is henceforth held and that the specific political and philosophical meaning of the term is spelled out, albeit in widely differing ways.

2. Three Traditions of Thought

Before turning to a detailed examination of important conceptual and normative issues concerning revolution, this section aims at giving an overview of three dominant lines of thought on revolution. Given the considerable discontinuities and breaks within each of these strands on the one hand and the numerous overlaps and interchanges between them on the other, the lines of thought presented here have to be understood as ideal types. Although it is likely that there are alternative perspectives, very few theories of revolution resist classification into one of these strands.

a. The Democratic Tradition

A primarily democratic strand of theory is influenced by the works of Locke, takes shape in Thomas Jefferson’s and J.A.N. de Condorcet’s thinking, and is further developed in Kant’s reflections on gradual, yet profound transformation. Throughout the 19th and 20th centuries, it is continued selectively in the late writings of Friedrich Engels or in Hannah Arendt’s and Jürgen Habermas’s considerations of the subject. This strand is characterized by a strong emphasis on non-violent, legal means and on politico-legal liberty and equality as the essential aims of revolution. Its representatives understand revolution as a continuing project or task that cannot reach a point of completion and satisfaction. Correspondingly, these thinkers, for the most part, reject notions of instantaneous rupture and absolute novelty, whereby they undermine rigid distinctions between revolutionary and reformist change. Key elements of this tradition resonate in the work of a contemporary thinker like Etienne Balibar. He suggests an understanding of revolution as a progressive power that operates from within the democratic system. Instead of aiming at the radical overthrow of this system, democratic citizens assume the role of the revolutionary subject by advocating constant additions to and revisions of the existing order and its institutions (for example, an extension of what Arendt calls “the right to have rights” to non-citizens, increased possibilities for political participation, or a more consistent adherence to human rights), allowing for its continued legitimacy (compare Balibar, 2014).

b. The Communist Tradition

A primarily communist line of revolutionary theory begins with the works of Rousseau. This line is elaborated decisively in the thinking of Karl Marx and Friedrich Engels. Significant modifications notwithstanding, it is continued in the writings of Vladimir Lenin and Jean-Paul Sartre during the 20th century. The majority of its representatives share the belief in the possibility of revolutions being finalized and completed. Although they offer different suggestions as to justifiable forms and degrees of violence, they further share the idea that violence, in general, can function as an acceptable means of revolution. They also agree that the realization of material liberty and equality (as opposed to merely “formal,” that is, legal liberty and equality) in the social sphere is its main goal. As this sphere includes apolitical institutions such as the market, substantial revolutionary transformation cannot satisfy itself with abstract political principles but needs to affect the concrete conditions in which a society exists (for example, the relations of production). In addition, the notion of solidarity is central to these thinkers’ vision of revolutionary action and of a post-revolutionary society that is realized through these actions. Key elements of this strand of revolutionary thought shape the works of contemporary theorists such as Alain Badiou and Slavoj Zizek. Interpreting existing democratic orders as regimes of radical immanence, it is evident to them that genuine transcendence (a “communism to come”) has to manifest itself as a supersession of this order. To overcome the inherently bourgeois structures and discourses of power that are ceaselessly reproduced by late-capitalist democracies, radical disruptions are needed. Taking the form of acts of “terror” or “subtraction,” such disruptions express the “eternal truths” of the suffering of the masses (compare Badiou, 2012; Zizek, 2012).

c. The Anarchist Tradition

An anarchist tradition of revolutionary theory has its sources in 19th-century America (Josiah Warren), France (Pierre-Joseph Proudhon), and in the thought of the Russian theorists Mikhail Bakunin and Peter Kropotkin. This tradition is later taken up in the works of, for example, Emma Goldman, Rosa Luxemburg, and Paul Goodman. Although these thinkers differ considerably in their assessment of revolutionary violence, they converge as to the crucial emancipatory aim of revolution: As any form of institutionalized authority is considered incompatible with human autonomy, their vision is the creation of a society independent of “imperial institutions” in the economic, social, and political realms. Consequently, they do not content themselves with a redistribution of political power, however radical, within the framework of the state, but aim at its abolition instead. David Graeber, in his contemporary reformulation of anarchism, describes the way in which the envisaged revolutionary abolition of vertical structures is linked to the emergence of new forms of horizontal relations, that is, of communal existence. These forms are no longer organized by the logic of dominance and of cost/benefit; instead, they are shaped by the principles of mutual aid and free cooperation, which are not guided by instrumental rationality (compare Graeber, 2004).

3. Concepts of Revolution

The following section discusses central questions addressed in the works of theorists from these main strands: The questions of novelty, violence, freedom, the revolutionary subject, the revolutionary object or target, and the extension of revolution. As it is neither possible to comprehensively discuss relevant concepts of revolution proposed by political philosophers and theorists nor to comprehensively include thematic considerations of the theorists presented here, this section contents itself with highlighting certain crucial features. Since this article is concerned with concepts of revolution as developed by political philosophers and theorists, important historical (compare Furet/Ozouf, 1989; Hobsbawm, 1996 [1962]; Palmer, 2014 [1959]), sociological (compare Skocpol, 1979), and political science (compare DeFronzo, 2011) studies that primarily concentrate on the phenomenon of revolution, its empirical forms and causes, are not taken into account. Further, a number of theoretical explorations of revolution are also not taken into consideration. This applies to the works of partisans of revolution such as Georges Sorel or Georg Lukács as well as to the works of critics of revolution such as Edmund Burke, Jeremy Bentham, Joseph de Maistre, or Carl Schmitt.

The exclusive focus on the six questions mentioned above is justified by the fact that they constantly appear in the theoretical debates regarding revolution as criteria in determining (a) if and under what conditions political change can be considered as revolutionary and (b) if and under what conditions such revolutionary change can be considered as legitimate. Despite the differing historical settings as well as the differing political and philosophical commitments of the individual thinkers, these questions thus constitute the common themes that connect their heterogeneous approaches to revolution. For each of these questions, the intent is to display the extremes of the spectrum on which important theorists of revolution operate and to indicate paradigmatic stances they take on this spectrum. It is with the help of this analytical framework that the various approaches to revolution since its intellectual discovery can be individually situated and systematically related to one another: The original revolutionary experience in the context of the American and French Revolutions as reflected in the writings of Jefferson, Paine, Sieyès, and Condorcet; its reception in German Idealism; the further development of revolutionary thought in different versions of Marxism; its application to the problem of colonialism in the 20th century; and, finally, contemporary debates about the relevance and meaning of revolution informed, among other things, by the crises of late capitalism and representative democracy.

a. The Question of Novelty

The question of novelty pertains to the degree of revolutionary transformation and to the mode in which such transformation is achieved. Whereas some theorists of revolution argue that the post-revolutionary state needs to be absolutely new and different in comparison to the pre-revolutionary state, others hold that revolution is conceivable as a realization of relative novelty. Although some theorists argue that transformation needs to take place in a historically disruptive or discontinuous fashion in order to be revolutionary in character, others hold that effective revolutionary change can unfold in a continuous or stepwise manner.

For Thomas Paine, there can be no doubt that the American revolutionary struggle for independence from colonial rule, understood as a practical application of enlightenment thought, amounts to a radical break in history. According to his remarks, the liberation of the colonies from monarchical government must be seen as the unique and irreversible establishment of a fundamentally new political order. Employing nature as a timeless criterion for revolution, he describes monarchy not only as an anachronistic, unjustifiable “absurdity” but as a grave violation of natural law. In Paine’s view, its supersession by consent-based, liberal, and egalitarian republicanism is therefore tantamount to “begin[ning] the world over again” (Paine, 2000: 44). In contrast to Paine’s considerations that often oscillate between conceptual analyses and calls to revolutionary action (and, thus, indicate the difficulty inherent to addressing the subject of revolution in an objective, non-partisan manner), his contemporary Condorcet suggests an understanding of revolution that is not informed by a comparatively strong concept of novelty. Condorcet’s understanding becomes particularly apparent in his stance towards the trial and execution of Louis XVI. Rejecting the extra-legalism advocated by, among others, Robespierre and Saint-Just, he develops a theoretical position that argues for the compatibility of profound change of the political system and historical continuity: For him, the largely unprecedented challenge of bringing the king to court can only be met by taking recourse to elements of previous politico-legal systems. What is more, it is precisely such elements that—under the condition that they are not just imitated, but innovatively rearranged—make the necessary “regulation” of revolutionary dynamics possible and, thus, guarantee revolutionary progress (compare Condorcet, 2012; Walzer, 1992). 
Instead of interpreting novelty in terms of the political creation of a “new world” without historical parallel, the new, here, is comprehended in terms of a reconfiguration of constitutive parts of the old, that is, of the pre-revolutionary world.

As represented here by Paine and Condorcet, the axis of the new, crucial for conceptually grasping revolution, runs between the extremes of absolute and relative rupture or inception. The ends of this spectrum are reflected in numerous later theories of revolution. For instance, Friedrich Engels (1820-1895), in late works such as, for example, his introduction to the reprint of Marx’s The Class Struggles in France, describes revolutionary struggle as ongoing and procedural in character. For Engels, this struggle cannot be detached from existing political, legal, and economic conditions, meaning that radical revolutionary breaks or leaps are inconceivable. As his moderate understanding of the new allows for minor modifications of the state of affairs to be labeled as revolutionary, it is inclined to tie revolution closely to reform. This propensity is reflected in his programmatic idea of a re-appropriation of universal suffrage, which turns it from a means of bourgeois dominance into an ultimately revolutionary means of proletarian liberation. As opposed to Engels’s approach to the question of the new, Walter Benjamin (1892-1940), in On the Concept of History, propounds an understanding of revolution as a state of exception in which the continuum of history is “burst open.” According to his “messianic” concept of novelty, revolutions are unforeseeable, kairological events that suspend the regular, chronological order of time: They constitute a leap into an epoch that is incommensurable with what has previously existed.

Immanuel Kant, in his thoughts on revolution, attempts to avoid similarly one-sided answers to the question of the new. Rather, his complex considerations on progressive transformation aim at undermining the dichotomy between either emphatic or deflationary notions of the new by closely associating “complete change” or “complete revolution” (völlige Umwälzung) and “thorough reform” (gründliche Reform) (compare Kant, 2006c [1795/96]). Yet, Kant’s remarks on the subject of political or politico-moral change—scattered over writings such as What is Enlightenment?, Toward Perpetual Peace, The Metaphysics of Morals, and The Contest of the Faculties—seem marked by a tension between a reformist bias and revolutionary tendencies. Whereas the former is expressed in his privileging of enlightened monarchs such as Frederick II of Prussia as the agents of change or in his explicit criticism of the French Revolution on the grounds of excessive use of violence, the latter becomes apparent in his comments on the “enthusiasm” with which contemporary Europeans observe the revolutionary events in France or in his reflections on the radical switch from “despotism” to “republicanism,” that is, from the old absolutist order to a new order of freedom and morality. Kant evidently considers the difference between the two types of order to be tremendous: An order responsible for the heteronomous subjugation of the individual by the ruler is overcome by an order primarily characterized by the proliferation of individual autonomy and political participation as well as the decrease of armed conflict and war. Kant appears to resolve what presents itself as a tension between differing, even incompatible concepts of the new by taking into account the specific temporal constitution of profound political change: For him, such change is grasped adequately only as a process that is mediated in multiple ways, but not as a sudden gestalt switch. 
Rejecting the sharp, static dichotomy between relative and absolute novelty (and, with it, the dichotomy between reform and revolution) and integrating the two instead, Kant shows that there is no necessary interdependence between the suddenness and the depth of political change. He thus does not accept the common assumption among theorists of revolution and active revolutionaries alike that only abrupt, immediate transformation can count as profound and progressive in a relevant sense. Although republican states, according to Kant, are fundamentally different from despotic states, whose principles are superseded entirely, the emancipatory transition from heteronomy to autonomy is achieved stepwise. Kant’s idea of “complete change” reflects his teleological understanding of history as an imperfect, yet steady development “from worse to better” as expounded in his considerations on the conditions of the possibility of progress in Idea for a Universal History from a Cosmopolitan Perspective and Conjectural Beginning of Human History; it crystallizes in concepts such as “gradualness” (Allmählichkeit) and “approximation” (Annäherung) used by Kant to illustrate his notion of progressive transformation. It follows that, with Kant, the new cannot possibly be conceived in theologically charged terms of the miracle or the “event.” Yet, the terminal phase of this gradual, indeterminate transition, for him, does mark the inception of a genuinely new age in the history of humanity, which is not only “an age of ‘enlightenment’ but ‘an enlightened age’” (compare Kant, 2006a [1784]). Politically, the latter manifests itself in consent-based republican systems essentially guided by the humanity formulation of the Categorical Imperative and, thus, in a “political body the likes of which the earlier world has never known” (Kant, 2006b [1784]: 14).

Within the theoretical debates, further problems arise that are immediately tied to the question of revolutionary novelty. For instance, several theorists of revolution do not merely reflect upon the new in terms of its degree and its mode. Instead, they also investigate its sources: The new is conceived as a result made possible by acts of re-appropriation (as expressed, for example, in Jefferson’s recourse to classical antiquity), by acts of reconfiguration (as expressed, for example, in Condorcet’s approach to assembling individual elements of various previous and present legal systems), or by acts of creation (as expressed, for example, in Bakunin’s idea of creative destruction by revolutionary “bandits”).

b. The Question of Violence

The question of violence pertains to legitimate means of revolutionary transformation. While some thinkers of revolution approve of violence as an essential vehicle for bringing about radical change and assert its creative capacities, others advocate its unreserved exclusion from the realm of progressive politics and make recourse to right and law instead. Again, numerous intermediate positions between the extremes of permissive and prohibitive attitudes toward violence can be found in which theorists try to identify specific conditions under which the use of violence is legitimate (for example, if violence contributes to a measurable increase in freedom) or to determine specific forms of violence that are justifiable (for example, violence against property). In addition, this section focuses on prevalent strategies for justifying revolutionary violence with recourse to, among others, utilitarian and politico-theological arguments.

Anarchist theorist and activist Mikhail Bakunin, in his thoughts on radical socio-political transformation, stresses the creative power of humans in general and the creative potential of violence in particular. For him, revolution begins with the forcible destruction of the old (statist) order, which prepares the “fertile” ground for a fundamentally new (non-statist) order (compare Bakunin, 1990 [1873]). Even though Bakunin declares the institutions that constitute the political and economic centers of power to be the primary target of acts of revolutionary “bandits,” he holds that such violence can also legitimately affect the persons who are present at these centers. In order to justify the use of revolutionary violence Bakunin argues for an understanding of such violence as reactive and necessary: Confronted with the repressive violence of the state, its police and military units, partisans of the “social revolution” must resort to violence. In his view, such violence is justified both as an act of self-defense and as a means of a progressive politics that transcends a deeply unjust status quo in which autonomy is made impossible by the existence and the authority of the state. Thus, for Bakunin, violence is not merely an extreme alternative in case non-violent (for example, legal) vehicles of transformation fail. Instead, it is an inherent factor of revolution. In his comments on revolution, provoked by the experience of the Iranian Revolution, Michel Foucault agrees with this assessment insofar as he considers manifestations of violence an important motor of transformative politics (compare Foucault, 2005 [1978-79]). Based on irreconcilable concepts of the political and further fueled by resentment, intolerance, and hatred, a quasi-Schmittian fighting position between “friends” and “enemies” of the revolution, that is, between the supporters of the “saint” (Ayatollah Khomeini) and the “king” (Shah Reza Pahlevi) emerges. 
This fighting position, for Foucault, is to be seen as an inevitable element of radical change. Despite his constative judgment that violent conflict essentially enables revolutionary dynamics, he does not present an elaborate justification of revolutionary violence.

Contrary to Bakunin and Foucault, Kant understands violence as neither a necessary nor a justifiable element of revolution. His remarks reveal a pronounced reservation resulting from empirical observations of the cruelties committed in the course of the revolution in France (compare Kant, 1991 [1798]). What is more, his rejection of the idea that violence could be considered a legitimate means of progress is a matter of principle. His position becomes particularly manifest in his reflections on the trial against Louis XVI as presented in the Doctrine of Right (compare Kant, 1996 [1797]). From the standpoint of his practical philosophy, there can be no doubt that the execution of the previous monarch is not acceptable. For Kant, this form of legally regulated and sanctioned regicide differs from historically well-known simple regicide, that is, the killing of a king on impulse or motivated by political power strategies: For in the trial, the established political principle of the inviolable nature of sovereign power is undermined and ultimately replaced by the principle of violence. Since the prosecution, in trying and finally executing the former king of France, does not appeal to a singular, exceptional situation but, instead, lends general juridical character to it, violent revolutionary insurrection against the sovereign is turned into a principle or Grundsatz of politics. To understand the right to violent resistance and revolution as a political principle (as is the case in the trial), for Kant, anticipates the Great Terror of 1793/94 (for a similar critique of the trial and execution of Louis XVI, compare Camus, 1991 [1951]). More importantly, it passes off the violent protest against sovereign governments as generally permissible and problematically normalizes it.
As a consequence of the legalization of permanent insurrection, the consolidation of political (and, with it, moral) order is considerably complicated while civil disorder and war, in Kant’s view the key impediment to politico-moral progress, become the rule. A comparably unambiguous rejection of violence as an instrument of revolution can be found in Arendt’s On Revolution where she describes violence as a “limit” of the realm of the political: For her, the revolutionary praxis of violence (as exercised in the revolutions in France and Russia) as well as theoretical justifications of revolutionary violence (as given by, for example, Bakunin) are inherently anti-political.

Condorcet is one of the thinkers who neither understands violence as an integral part of revolution and gives carte blanche to its use nor completely rules out that it can serve as a justifiable means in processes of radical transformation. His intermediate position crystallizes in his considerations on the trial against Louis XVI: Representing the standpoint of the Girondins, he argues that the charges against the former king (or, rather, the “citizen Louis Capet”) cannot be based on “enmity” as suggested by Jacobins like Robespierre and Saint-Just, but have to refer to “treason” instead. The binary logic of the Jacobins according to which any monarch has to either rule or die and their corresponding attempt to apply the laws of war in the trial against the king are thus curbed. The position suggested by Condorcet allows for an at least tentative maintenance of the rule of law and of the validity of principles of justice. Like any other laws and measures, revolutionary laws and measures as developed in the course of the trial are subject to the rules of justice (compare Condorcet, 2012). In stark contrast to the Jacobins’ enthusiasm for unrestricted, extralegal, and decisionist self-authorization, what is emphasized here is the necessity of revolutionary self-restraint. According to Condorcet, the exceptional, unprecedented situation of the revolutionary trial has to be modeled on the ideal of due process of law if it is to remain distinguishable from mere revolutionary terror. Thus, revolutionary violence as it manifests itself in the eventual execution of the former king is not categorically rejected. However, it can only be considered as justified if it is legally channeled and, as a result, compatible with certain demands of justice. 
Insisting on the significance of revolutionary justice (however imperfect in its practical realization) in the exercise of legally qualified violent acts, Condorcet avoids the common opposition of either violence or law as the decisive tools of transformation. On the one hand, this treatment of the representatives of the old system, in not suspending the law, sets an example for the new order and for the way in which it interprets law and justice. It thus contributes to the transformation of revolutionary violence into legitimate authority. On the other hand, this treatment of the collapsed regime contributes to facilitating the peaceful co-existence of partisans and opponents of the revolution in a post-revolutionary society: Instead of declaring the former king to be a “moral monster” to be immediately “annihilated” and instead of declaring war against supporters of the monarchy and all other “enemies of freedom” as suggested by Robespierre and Saint-Just, Condorcet’s insistence on legal equality aims at finding peaceful trading zones and common ground between the factions so that previous political opponents can be repositioned as potential future partners.

Intermediate positions between the extremes of approval and rejection of violence as an instrument of revolution are also developed by Walter Benjamin, Herbert Marcuse, and, more recently, by Slavoj Zizek. With regard to the question of justification, these thinkers propose alternatives to Condorcet’s idea of legalized and, thus, legitimate revolutionary violence. Benjamin, taking recourse to political theology, interprets and justifies revolutionary movements as inner-worldly manifestations of unmediated “divine violence” that overcomes the oppressive “mythical violence” exercised by the state. With respect to the content and effect of “divine violence,” Benjamin’s remarks remain sketchy. On the one hand, the notion can be taken to imply the use of force against representatives of the state’s “mythical” authority; on the other hand, it can be interpreted as resulting in a fundamental transformation of the law which becomes critical of itself by recognizing and counter-balancing its inherent violent potential. At any rate, revolutionary movements, for Benjamin, represent a form of justice that incommensurably exceeds the existing legal order. If they are successful, they cathartically suspend the “serfdom” and “barbarity” characteristic of human history and realize the possibility of the fundamentally new (compare Benjamin, 1999 [1921]). Marcuse (1898-1979), in contrast, proposes a quasi-utilitarian justification of revolutionary violence. In Ethics and Revolution (1964) he argues that only a “brutal calculus” can determine whether a specific revolutionary project is legitimate. The suggested calculus amounts to a cost-benefit analysis of the probable number of victims on the one hand and the probable gains in human progress on the other (in terms of, for example, tolerance or human rights). 
For Marcuse, the historical events in England, America, and France prove the dialectical character of revolutionary violence, that is, the fact that violent conflict can contribute decisively to substantial economic and social, political and moral improvements. However, he insists that such violence is justifiable only if its use (a) is directly and recognizably tied to specific moral goals and (b) ceases at the earliest possible stage of the revolutionary process. Zizek (b. 1949) attributes a central role to violence as an instrument to break out of the absolutely immanent “deadlock” represented by the current order of liberal democracy and market economy. His reflections concentrate on the revolutionary capacities of passive forms of violence, which he presents as particularly justifiable. Most importantly, he suggests a “Bartlebian politics” of refusal and withdrawal that undermines the discursive power of the dominant system. Such a politics, which has an expressive, communicative function, rejects the prevailing “hegemonic” language and counters the existing system’s power to name with subversive silence. For Zizek, political forms of direct non-action, guided by Bartleby’s maxim of “I would prefer not to,” allow for a first negative step in the revolutionary process by creating a “vacuum” of effective power which, in a second step, can be filled with positive content. Thus, in arguing that, in the present circumstances, “doing nothing is the most violent thing to do” (an idea that also informs the traditions of strikes, pickets, and silent vigils), he designates radical non-action as a justifiable mode of revolutionary violence (compare Zizek, 2008).

Debates within and around contemporary movements with fundamentally transformative social and political agendas attest to the continued significance of violence, of its permissibility and justifiability, as the central normative problem in the context of revolution. Supporters of the Occupy movement deny the legitimacy of physical violence and, in particular, of physical violence directed against persons, as a means of revolutionary change. Instead, they largely subscribe to a “Bartlebian” revolutionary politics of non-violent violence, that is, a politics of subversive silence and, respectively, creative re-naming. The adherence to this kind of inactive, discursive violence was expressed performatively during the 2013 Gezi Park protests in Istanbul. Whereas the “standing man” actions enacted a “bodily politics” of obstruction (compare Butler, 2015) and an attitude of refusal through silence and passivity, the derogatory term çapulcu (looter, marauder) used by government officials to discredit the protesters was creatively appropriated by them and re-interpreted as an honorific title. In Egypt, supporters of the Arab Spring movement took recourse to certain strands within the Islamic legal tradition when considering the question of violence. It was not only in terms of human rights and democratic governance but also in terms of the Islamic law of rebellion and of war that the question of violence was discussed. Although the positions of the main legal schools of thought differ considerably in their assessment of the question, there is a pronounced tendency to attempt to avoid or, at least, limit violence in internal conflicts and to consider it justifiable only if all other means of bringing about change have been exhausted (compare El Fadl, 2006; Al Dawoody, 2011).

c. The Question of Freedom

The question of freedom pertains to the primary objective of revolutionary transformation. Here, the spectrum established by theorists of revolution spans between the poles of freedom as liberation from oppression (that is, negative revolutionary freedom) and of freedom as the foundation and realization of a new political order (that is, positive revolutionary freedom).

Post-colonial theorist Frantz Fanon (1925-1961), in his reflections on revolutionary change, primarily concentrates on the aspect of liberation. For Fanon, whose work attests to the de-Europeanization of revolution during the 20th century, decolonization is to be understood as a process of “rehabilitation” of the suppressed that importantly implies a justifiably violent moment of radical riddance of the structural cornerstones of political, social, economic, and cultural domination and exploitation. Revolutionary liberation thus leads to the creation of a “tabula rasa,” which is the precondition for the subsequent development of a new institutional order and, what is more, the emergence of “sovereign” forms of post-colonial subjectivity (compare Fanon, 1967 [1961]). A comparable focus on revolutionary freedom as freedom from oppression characterizes the thinking of critical theorist Herbert Marcuse. For him, breaking free from the existing order is the essential element of a revolution. He argues that in light of the extent to which an inherently “repressive” socio-political order, the order of late capitalism, clearly dominates, strategies of resisting and undermining have to be considered before anything else. As Marcuse makes clear in Ethics and Revolution, such strategies of liberation do not only include forms of passive resistance as indicated in the concept of “the great refusal” but also the use of violence. Both can serve as a means to unsettle the systemic “paralysis” or blockage of human needs and potentials in industrialized Western societies. Consequently, Marcuse’s understanding of freedom is shaped by the idea of emancipation from a system of extreme immanentism that produces entirely controlled, uniform, “one-dimensional” humans. 
In spite of the emphasis on revolutionary freedom as liberation from prevailing modes of materialistic existence and instrumentally rational thought, he also points to a more positive notion of freedom: With explicit recourse to the thought of Jean-Paul Sartre, he discusses the necessity of “projects” that allow for forms of free (for example, artistic) action to be released (compare Marcuse, 1991 [1964]).

As opposed to Fanon and Marcuse, Hannah Arendt holds that the content of revolutionary freedom is “participation in public affairs,” that is, the positive freedom to act politically. In historical terms, this kind of freedom is exemplified for Arendt in the American Revolution where the foundation of a new political constitution—a republican constitution which codifies participatory citizenship—is itself achieved by participatory, autonomous “speech and action.” Although Arendt admits that an element of negative freedom is integral to thorough transformation, she is unequivocal in qualifying the “desire for liberation” as an insufficient objective of revolution if the latter is to be genuinely “political” as opposed to merely “social.” Employing the term “political” in a normative rather than a descriptive way, she appropriates the Aristotelian distinction between “political” and “despotic” forms of constitutional order and transposes it to the problem of revolutionary disorder. Consequently, in Arendt’s view, not every revolution can automatically be considered political. Instead, processes of profound, sustainable transformation have to meet certain conditions if they are to be labeled as political. The delineation Arendt suggests is essentially based on two criteria: For her, a revolution is apolitical or even anti-political if (a) what she calls “the social question” is its essential driving force and if (b) violence plays a central role in bringing about a new order (compare Arendt, 2006 [1963]). Similarly, Thomas Jefferson (1743-1826), himself a central intellectual and political figure of the American Revolution, insists on the importance of positive aspects of revolutionary freedom. 
This becomes apparent when he directly relates the idea of an “empire for liberty” to the notion of “self-government.” It is underlined in his remarks on resistance and rebellion: Despite their potential legitimacy and their “refreshing” effects on the “tree of liberty,” such attempts to be free from forms of “despotism” and “tyranny” remain insufficient in that they fail to found an alternative order that reliably rests on a constitution conducive to the realization of “life, liberty, and the pursuit of happiness” (compare Jefferson, 2004).

Karl Marx endeavors to relativize the opposition between either negative or positive freedom as definitive of revolutionary freedom. For him, revolution has to be conceived as a temporal process spanning over different stages. Thereby, an element of liberation plays a crucial role at the beginning of radical change insofar as it contributes to the liquefaction of an existing, oppressive system (such as the system of bourgeois, capitalist “class rule”). However, Marx’s theory of revolution expounds that this deconstructive element needs to be complemented by a reconstructive element once, in the later stages of the revolutionary process, the solidification of its transformative dynamics, that is, the formation of a new system becomes the essential task. The final paragraph of the 1848 Communist Manifesto paradigmatically reveals Marx’s (and Engels’s) understanding of revolutionary freedom as necessarily encompassing both negative and positive moments: The communist revolution casts off the “chains” as well as it “wins” a new, classless “world.” According to Marx, such a world makes possible the exercise of “real freedom” in the positive sense of individual “self-realization” that is embedded in a community and manifested in labor. As already stated in On the Jewish Question (1843/44), freedom thus understood differs from the bourgeois conception of freedom which is based on a “monadic” view of humans who only relate to each other in terms of competition. Marx argues that under the guise of this strictly individualist and merely formal kind of freedom, it is exclusively capital, not humans that can be considered as free. Thus, it is the idea of commitment, grounded in the communal and practical orientation of his notion of “self-realization,” which, against the background of his criticism of capitalist society, characterizes Marx’s concept of post-revolutionary freedom. 
In his understanding, the indeterminacy or openness of this concept as regards content guarantees that the spontaneity constitutive of freedom is not prefigured and, thereby, inhibited or even suppressed: For Marx, it is evident that the precise results of authentically free human action and interaction cannot be predicted. Thus, the significance of his vision of a future free society, in which the difference between oppressors and oppressed is overcome, is underlined in his deliberate refusal to further specify its shape.

d. The Question of the Revolutionary Subject

The question of the revolutionary subject pertains to the primary agent of radical transformation. Here, the spectrum ranges from history unfolding largely independent of man’s decisions and actions on the one end to autonomous, history-shaping man on the other. In the latter case, the agent can take a variety of forms ranging from exceptional individuals to a transnational “multitude,” from a distinct avant-garde to an amorphous crowd.

G.W.F. Hegel’s concept of revolution is thoroughly determined by his concept of history. Radicalizing Kant’s teleological conception, Hegel understands history as a rational process in which the “idea of freedom” successively realizes itself. According to his macro-perspective, this progressive development, the self-actualization of objective “spirit,” unfolds based on the principle of dialectics. It becomes manifest in the “oriental” civilizations of China, India, and Persia, in ancient Greece, in the Roman Empire, and, finally, in the “Germanic” age of reformation and enlightenment which supersedes the “dark night” of the Middle Ages, Renaissance, and the era of feudalism (compare Hegel, 1991 [1832-45]). From this it follows that the revolutions in the United States and France or the 1791 slave uprising in Haiti on which Hegel comments have to be interpreted as indicative of the current stage of development of the idea of freedom. As a consequence, revolutions, for Hegel, cannot be “made” by humans as autonomous agents. Rather, they mark epochal transitions in the “necessary” progression of history, which finds expression in the thoughts and deeds of humans. Hegel’s remarks on the French Revolution reveal that revolutionary achievements (most importantly, man’s historically unparalleled attempt to govern reality through ideas) and revolutionary failures (most importantly, the “abstract,” “subjective,” and, thus, deficient understanding of freedom which leads to the Terreur) are to be seen primarily as reflections of the imperfect level reached by “spirit” thus far (compare Hegel, 1977 [1807]).

In opposition to Hegel’s accentuation of the progressive dynamics inherent to history, a wide range of theorists emphasize the principal role of human action with regard to the question of revolutionary subjectivity. However, these thinkers suggest various concretizations of man as the driving force of profound transformation: Bakunin emphasizes the world-changing potential of individual “bandits” (compare Bakunin, 1990 [1873]); Lenin points to a revolutionary avant-garde of limited size (compare Lenin, 1987 [1902]); Foucault attributes this role to the entirety of a people united by an experience of “political spirituality” (compare Foucault, 2005 [1978-79]); Fanon understands revolutionary subjectivity to be actualized by the “wretched” victims of colonialism (compare Fanon, 1967 [1961]; Sartre, 1967); Marcuse sees the heterogeneous group of the marginalized and “hopeless” both within and without Western societies as the key agent of revolution (compare Marcuse, 1991 [1964]); finally, contemporary theorists like Michael Hardt and Antonio Negri present a global “multitude” as the only political unit capable of realizing a revolution against the system of late capitalism (compare Hardt/Negri, 2004; Negri, 2011).

In Marx’s thought, the dichotomy between the idea that revolution is the effect of history’s independent development and the idea that revolution is the immediate product of human action is put into question. On the one hand, Marx’s position is strongly influenced by Hegelian philosophy: Despite modifying Hegel’s dialectics materialistically, he reiterates the thought of an internal logic to history (for Marx, the logic of “class struggle”) on the basis of which all processes of transformation can be explained as “necessary.” Yet, on the other hand, a specific social class is needed to concretely carry out such processes. In the historical context of the 19th century this social class is the “proletariat,” which is presented as the decisive factor of revolutionary change (compare Marx/Engels, 2012 [1848]). Thus, although Marx and Engels hold that revolution cannot be “made” by human will and action alone, it cannot become manifest without human will and action. With respect to the problem of the revolutionary subject, a similar interplay between history’s inaccessible movement and self-determined human agency is described by theorists concerned with the kairos, that is, the right moment or timing for radical change. Rousseau, for instance, argues that specific historical constellations (“crises”) are necessary for humans (here, a people) to successfully initiate revolutions (compare Rousseau, 2012 [1762]). For Jefferson, such constellations (“precious occasions” beyond human planning and control) are the precondition for successfully consolidating the progress thus far achieved by bringing the revolutionary dynamics to a halt before it escalates into continuing violence and irreversible political and social decomposition (compare Jefferson, 2010). In both cases, human will and action are autonomous. Yet, according to Rousseau and Jefferson, revolutionary subjectivity is strongly affected and limited by what historical situations grant or deny respectively.

Further questions arise once theorists have identified man as the subject to actively make revolution. For instance, it is to be determined whether the revolutionary subject’s capacity to act in a world-transforming way is the result of minute “organization” as argued by Lenin for example, or whether it emerges “spontaneously” as, for example, Kropotkin claims. Another debate in this context concerns the driving motivational forces behind revolutionary subjectivity. Here, some theorists emphasize material, that is, social or economic factors, while others understand immaterial, that is, intellectual or spiritual factors, to be decisive. This tension between “being” and “consciousness” is reflected in the controversy between Jean-Paul Sartre and Maurice Merleau-Ponty: Whereas the former understands the revolutionary subject’s actions as caused by a concrete material “situation” of oppression (compare Sartre, 1955 [1946]), the latter insists that such actions constitute a form of “significance” (“Sinn-gebung”), that is, a form of freely creating meaning through revolutionary projects, which is irreducible to materialist causality (compare Merleau-Ponty, 2005 [1945]). Finally, the positions diverge with respect to the attitudes that are considered particularly conducive to effective individual or collective revolutionary action. Foucault, based on his observations of the overthrow of the Shah, underlines the influence of the “profane register” of indignation, resentment, even hatred that crucially fuels the revolutionary movement in Iran (compare Foucault, 2005 [1978-79]). Pointing to the deeply transformative political projects of Mahatma Gandhi, Martin Luther King, and Nelson Mandela, Martha Nussbaum attributes their success to an attitude that overcomes negative, destructive emotions and is committed to “non-anger” instead (compare Nussbaum, 2013). 
In her view, this mental commitment to non-anger is more decisive for revolutionary justice and for post-revolutionary reconciliation between former opponents than the practical commitment to non-violence.

e. The Question of the Revolutionary Object

The question of the revolutionary object pertains to the primary target of revolutionary change. Two predominant strands can be distinguished: While some theorists hold that revolutions should primarily aim at converting the attitudes, convictions, belief systems and world-views of individuals, others argue that the material, institutional frameworks within which humans act and interact constitute the main object or site of revolutionary change. Once more, a variety of positions can be found in between these extremes. Such positions hold both dimensions not only to be necessary conditions of radical change but also to mutually affect each other.

Fanon is one of the thinkers who argue that revolution cannot be limited to a remaking of the external world, that is, to the establishment of a different political, economic, social, and cultural order. Instead, full transformation is only achieved by an internal process of “creation” in which the carriers of the revolution, individually as well as collectively, re-humanize themselves in their struggle for liberation from systemically de-humanizing colonial rule. According to Fanon’s politico-psychological theory of revolution, the inner sphere of attitudes towards oneself, one’s community, and one’s former oppressors is the essential locus of revolutionary change: It is there that a radical transformation of the revolutionaries’ status occurs which turns them from an “animalized” and “objectified,” anonymous and disposable mass into “sovereign” subjects capable not only of self-determination but also of self-respect (compare Fanon, 1967 [1961]).

In contrast, the anarchist theorists Mikhail Bakunin and Peter Kropotkin (1842-1921) point to the institutional conditions as the main target of the “social revolution” they advocate. In their understanding, it is above all the institution of the state that has to be destroyed if freedom, morality, and solidarity are to be realized among humans: Being a source of “artificial” authority, any state, independent of its specific form, makes the unrestricted, free flourishing of men impossible (compare Bakunin, 2009 [1871]). Therefore, conquering freedom in its totality is tantamount to establishing an order that abolishes every political or religious institution that exercises authority. Such a society organizes itself according to the principles of decentralization, social diversity, and horizontal interconnectedness, which allow for harmony and happiness on both the subjective and inter-subjective level (compare Kropotkin, 2008 [1892]). This line of thought, which emphasizes the primacy of institutional transformation, is also represented by Kant. Far from suggesting the abolition of the state, however, Kant marks the essential institutions of the state—its politico-legal constitution and system of law—as the decisive lever to unhinge despotism and promote progress with respect to freedom, rationality, and morality in a process of “complete revolution.” In his view, a program of political pedagogy that aims at directly transforming the way in which humans understand themselves and the world is not only empirically unreliable, but also categorically insufficient. What is needed instead is a progressive shift as to systemic conditions that make it possible for a “spirit of freedom” to unfold successively. It is conditions founded on principles of right that will eventually lead to a fuller realization of the individuals’ moral and rational potential (compare Kant, 1991 [1798]; 1996 [1797]).

Insisting on the comprehensive character of revolution, Rousseau, when thinking about its adequate object or target, attempts to avoid comparable predeterminations. He argues that the modus operandi both of individual humans (that is, their ways of thinking, feeling, and acting) and of political institutions (that is, their ways of being structured and of acting upon citizens) have to be tackled for thorough transformation to occur. Consequently, if “moral” and “civil” liberty and equality are to be realized, it takes the contribution of education, as elaborated in Emile, as well as of institutional restructuring, as elaborated in the Social Contract: According to Rousseau, both the individual and the framework of politico-legal institutions constitute necessary targets of revolutionary change. Rousseau’s considerations thus underline the interdependence of both transformative dimensions.

f. The Question of the Extension of Revolution

This question pertains to (a) the temporality or, more narrowly, the duration and (b) the expansion of revolutionary transformation. Theorists differ considerably as to whether such transformation has to be conceived as momentary, procedural, or permanent; they also disagree about whether revolutions are to be understood as local, national, international, or global instances of profound, lasting politico-social change.

On the basis of his “messianic” conception of time and history that rejects the conventional understanding of time as “empty” (that is, as continuous and homogeneous), Benjamin interprets revolution as a “shock” that kairologically disrupts the prevailing chronological and, with it, social and political order. For him, revolution thus constitutes a momentary event that makes a switch from a state of historical normalcy to a state of historical exception possible. This switch is as radical as it is sudden: “Every second” has the potential to serve as the gate through which “the messiah” can enter to fundamentally transform the world (compare Benjamin, 2009 [1940/42]). As opposed to Benjamin, thinkers like Hegel or Antonio Gramsci (1891-1937) understand revolution as a process that extends over time before it leads to substantial, intelligible change, that is, to new political, legal, and economic, cultural, linguistic, and aesthetic principles being implemented and effectively taking root. Although Hegel describes the French Revolution as a “glorious dawn,” it is evident for him that the political events of the late 1780s and early 1790s are belated, derivative effects of a long-lasting historical epoch of revolution that encompasses the ages of the reformation and the enlightenment (compare Hegel, 1991 [1832-45]). Discussing revolution in more narrowly political terms, Gramsci describes its realization as a tedious “war of position” against “hegemonic” power structures: It is only by means of persistently working their way through numerous struggles with the opponents of revolution over time that its carriers can hope to supersede an established order (compare Gramsci, 1992 [1929-35]). Similarly, Marx and Engels put emphasis on the aspect of duration.
Modeling their understanding of revolution on the Israelites’ exodus from Egypt (compare Walzer, 1985), they attribute great significance to the interim period that lies between the status quo at the time of the failed revolutions of 1848 and the future actualization of a classless society. Given the considerable distance between the initial and the terminal point of revolution, they propose a notion and practical program of “permanent revolution” that links immature democratic revolution to mature “proletarian” revolution. In modernizing and democratizing this idea, Étienne Balibar (b. 1942) expounds an understanding of revolution as a continuous, open-ended task. According to his view, revolution cannot hope for a final stage of satisfaction and completion (compare Balibar, 2014). Instead, it means an ongoing exercise in responsible citizenship and in “democratizing democracy.” This exercise allows for an ever increasing inclusion of groups and individuals who, heretofore, have been denied the ability to “take part,” that is, for their unrestricted recognition as full subjects of “equaliberty.” This hybrid term indicates the two main trajectories of modern emancipatory politics: on one side, the Lockean liberal and individualist strand and, on the other, the Rousseauian socialist and collectivist strand, which Balibar takes to be interdependent and co-constitutive elements of democratic revolution.

Other thinkers discuss revolution primarily in terms of its spatial extension. Contemporary anarchist theorist David Graeber (b. 1961) argues that revolutionary projects can be pursued by the creation of “autonomous spaces” on a local scale. Within such spheres, alternatives to dominant forms of coexistence and interaction, of politics and economy can be practiced whereby the existing order is unmasked as contingent. What is more, in drawing on exemplary practices from other epochs and cultures, the contours of an order devoid of institutions such as the state or capitalism and of repressive convictions such as racism and misogyny are “pre-figured.” For Graeber, the narrow spatial limits of these alternative micro-worlds characterized by autonomy, mutual aid, and direct democracy do not negatively affect their subversive, transformative capacities (compare Graeber, 2004). Whereas thinkers such as Sieyès and Foucault see the nation state as the adequate space for revolution to occur (compare Sieyès, 2003 [1789]; Foucault, 2005 [1978-79]), others claim that this is too limited a scope for radical transformation to have profound and lasting impact. For instance, Lenin, not unlike Sartre in his “revolutionary humanism” (compare Sartre, 1955 [1946]), follows Marx in emphasizing the transnational implications of revolution even if its scope, especially in its early phases, has to be national for reasons of mere practicability. According to Lenin, emancipatory projects carried out by a “revolutionary people” send shockwaves across neighboring as well as distant countries. Thus, it is evident to him that the Russian Revolution ultimately represents the “interests of world socialism,” which outweigh mere national interests (compare Lenin, 1987 [1902]; 1978 [1917]).
This position takes up the universalism inherent in the American and French Revolutions, which finds its expression in pronounced references to the “rights of man” in the writings and speeches of Paine or Mirabeau as well as in the essential political documents of the revolutionary period: the 1776 Declaration of Independence and the 1789 Declaration of the Rights of Man and of the Citizen.

4. Conclusion

Even when the plurality of manners in which “revolution” is used in the domains of technology and science, culture and art, is left aside and when the term is applied in the domain of politics only, the heterogeneity and contested nature of understandings remain considerable. In spite of the wide range of specific approaches, arguments, and agendas characteristic of the individual theories of political revolution, they can be situated within one multifaceted, yet unified intellectual space: From the theoretical enablers and “inventors” of revolution like Rousseau, Paine, or Kant to contemporary thinkers of revolution like Balibar or Graeber, their theories have been confronted with a number of central problems and questions which open up, shape, and sustain this space. It is primarily in terms of these central questions that they have attempted to conceptually grasp revolution. Six of these questions have been outlined in the above sections: (1) the question of revolutionary novelty, which is discussed on a spectrum between the extremes of absolute and relative notions of rupture and beginning; (2) the question of revolutionary violence and its legitimacy discussed on the spectrum between unqualified approval and unreserved exclusion as a means of revolution; (3) the question of revolutionary freedom discussed on the spectrum between negative (liberation) and positive (foundation) concepts of freedom as the aim of revolution; (4) the question of the revolutionary subject discussed on the spectrum between individual doers on the one end and a global “multitude” on the other; (5) the question of the revolutionary object or target discussed on the spectrum between political, social institutions and individual, subjective attitudes, convictions, and beliefs; and (6) the question of the temporal and spatial extension of revolution discussed on the spectrum between momentary and local on the one end, permanent and global on the other.
Despite their pronounced heterogeneity and their attempts to periodically redefine revolution, it is with respect to these key questions that the theories presented here share family resemblances to one another.

Determining whether political change can be considered revolutionary constitutes the conceptual issue at the core of these theories. In particular, they aim at circumscribing revolution in regard to related, yet distinct concepts such as revolt, rebellion, and reform, whereby the questions of the new, of liberty, and of the legitimacy of violence serve as the most relevant criteria for demarcation. The first two criteria play a central role in the distinction between revolution on the one hand, revolt and rebellion on the other. As a consequence of the underlying main goal of casting off an unjust, oppressive regime, both revolt and rebellion are based on limited notions of novelty and liberty. Thus, in comparison to revolutionary change, the specific kind of change they aspire to is more marginal in its scope. However, once revolution is not conceived as momentary but as procedural (as is the case in Kant’s or Marx’s considerations), drawing such a clear conceptual line seems less feasible: If revolution is understood as a temporal sequence that encompasses multiple stages, an initial “revolting” or “rebellious” phase is conceivable, for which the aspect of durable foundation of a new order is secondary. For the differentiation of revolution and reform, the criteria of novelty and violence are central. Whereas the criterion of violence reliably allows for a demarcation, temporalized understandings of revolution entail the blurring of a seemingly obvious difference with respect to the aspect of novelty: Here, a concluding “reformist” phase of revolution is thinkable in which the configuration of an institutional order or the establishment of a common ground with former “enemies of the revolution” takes precedence. Accordingly, when Kropotkin links revolution and revolt or when Kant explicitly associates revolution with reform, the relatedness of these concepts, not to mention of the phenomena themselves, is reflected.
In light of these resemblances, attempts at a precise conceptual critique of revolution, which distinguishes it sharply from revolt, rebellion, or reform, remain heuristic in character.

Determining if and under what conditions revolutionary action and, especially, revolutionary violence are morally justified constitutes the normative issue at the core of theories of revolution. Although revolution represents the most radical expression of dissent and protest, the determination of its legitimacy reveals points of contact with debates on less extreme forms of a politics of resistance and transformation such as, for example, civil disobedience (compare Rawls, 1999). Despite the differences as to, inter alia, the scope of the envisaged transformation, their legitimacy essentially depends on the underlying cause and motivation. Revolutionary action and, with it, at least temporary political disorder, can only be considered legitimate if it aims at overcoming continued violations of the basic rights of specific groups or entire nations by the regime in power that are both severe and systematic. While conflict between ruling powers and revolutionary movements typically takes place within the context of a state, broader issues independent of the policies of a specific state can also be invoked as a justified cause to engage in radically transformative politics. The Occupy movement and its appeal to the inequalities brought about by the current global economic system is a case in point. Within and beyond the context of the state, the intention to right the wrongs—that is, the injustices as to dignity, liberty, and equality—committed by a regime and secured by unjust political, legal, social, or economic institutions is the primary precondition for a revolutionary project’s justifiability.

Furthermore, the (il)legitimacy of revolutionary politics is determined by the heavily disputed question of the permissibility of revolutionary violence. In relation to this question, the focus is not on the just cause, the right reason and intention of such a politics, but on the conduct in the course of its realization. The dispute pertains to different dimensions: It concerns the general issue whether violence can be considered a politically and, more importantly, morally justifiable means of revolution, in other words, whether, based on strategic or principled considerations, its use can be justified at all. In addition, it concerns more specific issues such as its justifiable form (for example, violence against property), scope (for example, violence limited to early stages of the revolutionary process), and status (for example, violence as a last resort once all peaceful alternatives have failed). Here, the discussion on revolution resembles theoretical debates on just war (Arendt, 2006 [1963]; Walzer, 2006 [1977]). For instance, much like in the case of the ius in bello, attempts to formulate essential criteria of acceptable revolutionary conduct aim at ensuring the proportionality of the use of violence, at discriminating between legitimate and illegitimate targets, and at prohibiting hostile acts which are “vile in themselves” (compare Kant, 2006c [1795/96]). Besides the perspectives of cause (in analogy to the terminology of just war theory: ius ad revolutionem) and conduct (ius in revolutione), there is a third critical perspective, in terms of which the legitimacy of revolutionary action and violence is determined. This perspective focuses on the ius post revolutionem, that is, on the final stage of a revolution, and assesses its capacity to terminate the state of exception in order to transition into a new and stable political order. 
Thereby, the stability of such a reconstitution is largely predicated on reconciliation with and inclusion of former adversaries. It is mainly thanks to the criteria of cause, conduct, and reconstitution that revolutionary violence becomes distinguishable from the violence used by criminals and, especially, terrorists. However, largely on the basis of formative historical experiences of excessive revolutionary violence (of revolutions not only harming their enemies, but also “devouring their children”) as well as of Gandhi’s or Mandela’s successful transformative projects, non-violent revolutionary action generally has a greater claim to justification.

A further relevant issue with regard to just revolution theory pertains to the self-authorization of revolutionary movements, which raises the questions of whom such movements speak for and whose interests they represent. This issue crystallizes in revolutionary declarations that often appeal to “the people” (compare Habermas, 1990; Derrida, 2002). In this case, the legitimacy of a revolutionary project depends, among other things, on whether the revolutionaries’ political power and the sovereignty of the regime they establish are based on force or on discourse, that is, on oppression or persuasion of the majority.

To conclude, this article provides a sample of the rich theoretical discourse surrounding the contested concept of revolution. While the positions developed within the three dominant schools of thought (democratic, communist, and anarchist) are strongly shaped by broader commitments to the underlying political philosophies and often indebted to other debates (for example, on war), this discourse has distinctive features due to the specificity of its object of investigation and the controversial exchange of views between the different traditions. Given both its breadth and its unsettledness, there are significant conceptual and normative issues for philosophers to address. It is not only in light of the often problematic history of revolutions that it is expedient to theoretically “provide yardsticks and measurements” (Hannah Arendt); a thorough analysis and critical assessment of transformative concepts, agendas, and strategies is also required because of the contemporary re-emergence of movements with revolutionary aspirations from the Zapatistas to the Arabellion, Occupy, or the Indignados.

5. References and Further Reading

  • Arendt, H., 2006, On Revolution [1963], New York: Penguin.
  • Badiou, A., 2012, The Rebirth of History, trans. G. Elliott, London/New York: Verso.
  • Balibar, É., 2014, Equaliberty: Political Essays, trans. J. Ingram, Durham: Duke University Press.
  • Bakunin, M., 2009, God and the State [1871], New York: Cosimo.
  • Bakunin, M., 1990, Statism and Anarchy [1873], trans. & ed. M.S. Shatz, Cambridge: Cambridge University Press.
  • Benjamin, W., 2009, On the Concept of History [1940/42], New York: Classic Books America.
  • Benjamin, W., 1999, Zur Kritik der Gewalt [1921], in Walter Benjamin Gesammelte Schriften, vol. II.1, eds. R. Tiedemann & H. Schweppenhäuser, Frankfurt: Suhrkamp, 179–204.
  • Berman, H., 1985, Law and Revolution: The Formation of the Western Legal Tradition, Cambridge, MA: Harvard University Press.
  • Butler, J., 2015, Notes Toward a Performative Theory of Assembly, Cambridge, MA: Harvard University Press.
  • Camus, A., 1991, The Rebel: An Essay on Man in Revolt [1951], trans. A. Bower, New York: Vintage Books.
  • Condorcet, J.A.N. de, 2012, Political Writings, eds. S. Lukes & N. Urbinati, Cambridge, UK/New York: Cambridge University Press.
  • Dawoody, M., 2011, The Islamic Law of War: Justifications and Regulations, London: Palgrave Macmillan.
  • DeFronzo, J., 2011, Revolution and Revolutionary Movements, Boulder: Westview Press.
  • Derrida, J., 2002, “Declarations of Independence”, in Negotiations: Interventions and Interviews 1971-2001, ed. & trans. E. Rottenberg, Stanford: Stanford University Press, 46–54.
  • Engels, F., 1969, Germany: Revolution and Counter-Revolution [1851/52], with the collaboration of Karl Marx, ed. E. Marx, London: Lawrence & Wishart.
  • Fadl, K., 2006, Rebellion and Violence in Islamic Law, Cambridge: Cambridge University Press.
  • Fanon, F., 1967, The Wretched of the Earth [1961], trans. C. Farrington, Harmondsworth: Penguin.
  • Foucault, M., 2005, “Writings on the Iranian Revolution” [1978-79], in Foucault and the Iranian Revolution: Gender and the Seductions of Islamism, eds. J. Afary & K.B. Anderson, Chicago: University of Chicago Press, 179–277.
  • Furet, F., and M. Ozouf (eds.), 1989, A Critical Dictionary of the French Revolution, Cambridge, MA: Belknap Press.
  • Graeber, D., 2004, Fragments of an Anarchist Anthropology, Chicago: Prickly Paradigm Press.
  • Gramsci, A., 1992–, Prison Notebooks [1929-35], New York: Columbia University Press.
  • Habermas, J., 1990, “Naturrecht und Revolution”, in Theorie und Praxis, Frankfurt: Suhrkamp, 89–127.
  • Hardt, M., and A. Negri, 2004, Multitude: War and Democracy in the Age of Empire, New York: Penguin Press.
  • Hegel, G.W.F., 1977, Phenomenology of Spirit [1807], trans. A.V. Miller, Oxford: Clarendon Press.
  • Hegel, G.W.F., 1991, The Philosophy of History [1832-45], trans. J. Sibree, Buffalo, NY: Prometheus Books.
  • Hobsbawm, E., 1996, The Age of Revolution: Europe 1789-1848 [1962], New York: Vintage Books.
  • Jefferson, Th., 2004–, The Papers of Thomas Jefferson: Retirement Series, ed. J.J. Looney, Princeton: Princeton University Press.
  • Jefferson, Th., 2010, The Selected Writings of Thomas Jefferson: Authoritative Texts, Contexts, Criticism, ed. W. Franklin, New York: W. W. Norton & Co.
  • Kant, I., 2006a, “An Answer to the Question: What is Enlightenment?” [1784] in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 17–23.
  • Kant, I., 2006b, “Idea for a Universal History from a Cosmopolitan Perspective” [1784], in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 3–16.
  • Kant, I., 1991, “The Contest of Faculties” [1798], in Kant: Political Writings, ed. H. Reiss, Cambridge: Cambridge University Press, 176–190.
  • Kant, I., 1996, The Metaphysics of Morals [1797], trans. & ed. M. Gregor, Cambridge/New York: Cambridge University Press.
  • Kant, I., 2006c, “Toward Perpetual Peace: A Philosophical Sketch” [1795/96], in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 67–109.
  • Kantorowicz, E., 1997, The King’s Two Bodies: A Study in Medieval Political Theology [1957], Princeton: Princeton University Press.
  • Koselleck, R., 2004, Futures Past: On the Semantics of Historical Time, trans. K. Tribe, New York: Columbia University Press.
  • Koselleck, R. et al., 1984, “Revolution (Rebellion, Aufruhr, Bürgerkrieg)”, in Geschichtliche Grundbegriffe. Historisches Lexikon zur politisch-sozialen Sprache in Deutschland, volume 5, eds. O. Brunner, W. Conze, & R. Koselleck, Stuttgart: Klett-Cotta, 653–788.
  • Kropotkin, P., 2008, The Conquest of Bread [1892], Oakland, CA: AK Press.
  • Kropotkin, P., 1972, Mutual Aid: A Factor of Evolution [1902], ed. P. Avrich, New York: New York University Press.
  • Lenin, V.I., 1987, Essential Works of Lenin: What is to be done? and other writings, ed. H.M. Christman, New York: Dover Publications.
  • Lenin, V.I., 1978, State and Revolution: Marxist Teaching about the Theory of the State and the Tasks of the Proletariat in the Revolution [1917], Westport: Greenwood Press.
  • Locke, J., 1986, Second Treatise on Civil Government [1689], Amherst, NY: Prometheus Books.
  • Marcuse, H., 1984, “Ethik und Revolution” [1964], in Herbert Marcuse Schriften, volume 8, Frankfurt: Suhrkamp, 100–114.
  • Marcuse, H., 1991, One-dimensional Man: Studies in the Ideology of Advanced Industrial Society [1964], Boston: Beacon Press.
  • Marx, K., 2001a, Capital: A critique of Political Economy. Vol. I, Book One, The Process of Production of Capital [1867], trans. S. Moore & E. Aveling, London: Electric Book Co.
  • Marx, K., 2001b, The Class Struggles in France [1850], London: Electric Book Co.
  • Marx, K., 2012, The Communist Manifesto [1848], with Friedrich Engels, London: Verso.
  • Menke, Ch., and F. Raimondi (eds.), 2011, Die Revolution der Menschenrechte. Grundlegende Texte zu einem neuen Begriff des Politischen, Berlin: Suhrkamp.
  • Merleau-Ponty, M., 2005, Phenomenology of Perception [1945], London/New York: Routledge.
  • Nail, T., 2012, Returning to Revolution: Deleuze, Guattari, and Zapatismo, Edinburgh: Edinburgh University Press.
  • Nussbaum, M., 2013, Political Emotions: Why Love Matters for Justice, Cambridge, MA: Harvard University Press.
  • Paine, Th., 2012, Common Sense [1776], ed. R. Beeman, New York: Penguin.
  • Paine, Th., 2000, Political Writings, ed. B. Kuklick, Cambridge/New York: Cambridge University Press.
  • Paine, Th., 1992, Rights of Man [1791], ed. G. Claeys, Indianapolis: Hackett Pub. Co.
  • Palmer, R., 2014, The Age of Democratic Revolution: A Political History of Europe and America 1760-1800 [1959], Princeton: Princeton University Press.
  • Rawls, J., 1999, “The Justification of Civil Disobedience”, in John Rawls: Collected Papers, S. Freeman (ed.), Cambridge, MA: Harvard University Press, 176–189.
  • Rosenstock-Huessy, E., 1993, Out of Revolution: Autobiography of Western Man [1938], Providence: Berg Publishers.
  • Rousseau, J.-J., 1992, Discourse on the Origin of Inequality [1755], trans. D.A. Cress, Indianapolis: Hackett Pub. Co.
  • Rousseau, J.J., 2012, Of the Social Contract and other Political Writings [1762], ed. C. Bertram, trans. Q. Hoare, London: Penguin.
  • Sartre, J.-P., 1962, “Materialism and Revolution” [1946], in Literary and Philosophical Essays, trans. A. Michelson, New York: Criterion Books, 185–239.
  • Sartre, J.-P., 1967, “Preface”, in The Wretched of the Earth, F. Fanon, Harmondsworth: Penguin.
  • Sieyès, E.J., 2003, Political Writings: Including the Debate between Sieyès and Tom Paine, with a Translation of What is the Third Estate?, ed. M. Sonenscher, Indianapolis: Hackett Pub. Co.
  • Skocpol, T., 1979, States and Social Revolutions, Cambridge/New York: Cambridge University Press.
  • Walzer, M., 1985, Exodus and Revolution, New York: Basic Books.
  • Walzer, M., 2006, Just and Unjust Wars: A Moral Argument with Historical Illustrations [1977], New York: Basic Books.
  • Walzer, M., 1992, Regicide and Revolution: Speeches at the Trial of Louis XVI [1972], New York/Oxford: Columbia University Press.
  • Žižek, S., 2012, The Year of Dreaming Dangerously, London/Brooklyn, NY: Verso.
  • Žižek, S., 2008, Violence: Six Sideways Reflections, New York: Picador.

Author Information

Florian Grosser
Email: florian.grosser@unisg.ch
University of St. Gallen
Switzerland

Everettian Interpretations of Quantum Mechanics

Between the 1920s and the 1950s, the mathematical results of quantum mechanics were interpreted according to what is often referred to as “the standard interpretation” or the “Copenhagen interpretation.” This interpretation is known as the “collapse interpretation” because it supposes that an observer external to a system causes the system, upon observation, to collapse from a quantum mechanical state to a state in which the elements of the system appear to have a determinate value for the property measured. Although this interpretation is largely successful at explaining our experiences of the world, it fails in that it gives rise to what has become known as the measurement problem, a problem described in section 2.

In addition to this problem, there is another problem tied to the role of the observer. In the 1950s, Hugh Everett III (1930-1982) was considering quantum mechanics as it might apply to the entirety of the universe. Surely if quantum mechanics were true on the local level of laboratories and experiments taken as closed systems, it would also be true for the entire universe taken as a closed system. The problem with this approach is that there is no external observer available at the scale of the entire universe to cause a collapse of the quantum state, a state that the laws of quantum mechanics say the universe would be in, were it unobserved. Thus, Everett suggested that we abandon the notion of an observer-caused collapse and we consider all quantum states to be always non-collapsed.

Everett published one short paper in 1957—his doctoral dissertation (Everett 1957a, 1957b)—and after that, he left academia. He later published the longer, original version of his dissertation at the request of Bryce DeWitt and Neill Graham (Everett 1973). In his dissertation, Everett develops the mathematical theory that is the foundation of Everettian quantum mechanics [EQM]; but many people have believed the theory itself needs interpretation. Although Everett was not interested in the philosophical implications of his work, there has been great interest among philosophers in trying to interpret what EQM implies about the metaphysical structure of the world.

This article surveys the various ways philosophers have attempted to interpret Everett. To begin, the standard interpretation, as well as its attendant problems, is discussed briefly. Following that, the bare theory, the single and many minds theories, and versions of a many worlds theory are discussed. The article closes by discussing two relational interpretations of Everett.

Table of Contents

  1. Preamble
  2. The Standard Interpretation
  3. Everettian Interpretations
    1. Everett Plus Nothing (The Bare Theory)
    2. Everett Plus Minds
    3. Everett Plus Worlds (DeWitt’s Splitting Worlds)
      1. Problems with the Notion of Splitting
      2. The Preferred Basis Problem
  4. The Evolution of Many Worlds Interpretations
    1. The Problem of Probability and Graham’s Attempt at Adding a Measure
    2. The Oxford MWI
    3. Objections to the Oxford MWI
  5. Relational Interpretations of Everettian Quantum Mechanics
    1. Simon Saunders’ Relational Interpretation
    2. The Relative Facts Interpretation
  6. Conclusion
  7. References and Further Reading

1. Preamble

Before beginning to survey the various ways philosophers have attempted to interpret Everett, we must address the question of whether or not there even are rival interpretations of Everett. Some of the most influential physicists and philosophers working on EQM have either taken it as fact (DeWitt 1970) or explicitly argued (Deutsch 2010; Wallace 2012) that there is only one “interpretation” of EQM: some version of the many worlds interpretation [MWI].

Bryce DeWitt (1923-2004) took it to be the case that the only way one can “interpret” Everett was through a many worlds theory. He wrote, “The mathematical formalism of the quantum theory is capable of yielding its own interpretation” (DeWitt 1970: 160). And that formalism “forces us to believe in the reality of all the simultaneous worlds represented in the superposition described by equation [(2), below], in each of which the measurement has yielded a different outcome” (DeWitt 1970: 161).

David Deutsch (1953- ) and David Wallace (1976- ) have argued that there are no rival “interpretations” of Everett: “Other ‘interpretations’ . . . are really alternative physical theories . . .” (Wallace 2012: 382, Wallace’s emphasis). They see the “Everett interpretation” to be “just quantum mechanics itself, read literally, straightforwardly—naively, if you will—as a direct description of the physical world, just like any other microphysical theory” (Wallace 2012: 2). Deutsch writes:

. . . insisting that parallel universes are ‘only an interpretation’ and not a – what? a scientifically established fact or something (as if there were such a thing) – has the same logic as those stickers that they paste in some American biology textbooks, saying that evolution is ‘only a theory’, by which they mean precisely that it’s just an ‘interpretation’. Or, in terms of the analogy that Everett used in his famous exchange of letters with Bryce DeWitt, it’s like claiming that the motion of the Earth about its axis is only an ‘interpretation’ that we place on our observations of the sky (2010: 543, Deutsch’s emphasis).

Wallace writes:

The ‘Everett interpretation of quantum mechanics’ is just quantum mechanics itself, ‘interpreted’ the same way we have always interpreted scientific theories in the past: as modelling the world. Someone might be right or wrong about the Everett interpretation – they might be right or wrong about whether it succeeds in explaining the experimental results of quantum mechanics, or in describing our world of macroscopically definite objects, or even in making sense – but there cannot be multiple logically possible Everett interpretations any more than there are multiple logically possible interpretations of molecular biology or classical electrodynamics (2012: 38, Wallace’s emphasis).

The arguments Deutsch and Wallace provide may be persuasive to some readers. But the purpose of the current article is to survey what has historically been done by philosophers attempting to draw metaphysical pictures from Everett’s pure wave mechanics. Wallace is explicit about the fact that he is not attempting to do historical exegesis of Everett’s views (2012: 2), and whether Everett would be sympathetic to or supportive of an MWI is an open question. (See Barrett 2010, Barrett and Byrne 2012, and Bevers 2011.) So whether or not a version of the MWI is the correct interpretation of Everett, or even the only interpretation of Everett, is a question that can be adjudicated in other venues. Our purpose here is to consider the ways in which philosophers have attempted to interpret Everett’s pure wave mechanics, and so, after one final preliminary note, it is to this that we shall turn.

There is one other debate that ought to be considered before embarking on our project. And that is the debate over the appropriate way to explain the results of quantum mechanical experiments. Everett’s proposal for pure wave mechanics is but one way physicists explain what seem to be counterintuitive outcomes of quantum mechanical experiments. Other ways include Bohmian mechanics (de Broglie 1928, Bohm 1952) and GRW (Ghirardi, Rimini & Weber 1986). Whether or not the unitary dynamics proposed by Everett are the correct laws for describing the world is a question that is far from decided. But as this article is concerned with the question of Everett interpretation, a full rehearsal of this debate goes beyond its scope. There is no assumption made here about what the correct theory of the world is; rather there is only a historical discussion of the way people have interpreted Everett.

For more on interpretations of quantum mechanics, see “Interpretations of Quantum Mechanics” in this encyclopedia and also (Lewis 2016).

2. The Standard Interpretation

In Schrödinger’s cat thought experiment (Schrödinger, 1935), there is a cat locked inside a box along with a glass vial of cyanide; a hammer set to potentially break the vial; and a Geiger counter inside of which there is a sample of a radioactive substance small enough that there is a 50% chance of one of the atoms decaying in the course of one hour and a 50% chance that none will. If an atom decays, the Geiger counter will click, which causes the hammer to fall, the vial to break, the cyanide gas to be released and the cat to die. If an atom does not decay, the cat remains alive. If the inside of the box is not observed during this hour, Schrödinger took the formalism of quantum mechanics to imply that the cat would be in a superposition of being alive and dead.

Superpositions are states of systems that are represented mathematically by a weighted sum of the possible values for the property in question. Each summand will represent one of the possible values for the property and will be accompanied by a complex number coefficient. When that coefficient is multiplied by its complex conjugate (in other words, when its norm is squared), the result is what the standard interpretation takes to be the probability of the system collapsing into that value for the property. So in our cat example, there will be two terms in the superposition of the state of the system that includes the cat, each with a coefficient of 1/√2 which, when its norm is squared, gives us a ½ probability of the system collapsing into the state that includes the cat being alive and a ½ probability of its collapsing into the state of the system that includes the cat being dead.
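The rule just described can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original discussion: the function name `born_probabilities` and the list representation of a state are illustrative assumptions.

```python
import math

def born_probabilities(amplitudes):
    """Return the squared norm |c|^2 of each complex amplitude c.

    Hypothetical helper: a state is assumed to be a list of coefficients,
    one per possible value of the measured property.
    """
    probs = [(c * c.conjugate()).real for c in map(complex, amplitudes)]
    total = sum(probs)
    if not math.isclose(total, 1.0, rel_tol=1e-9):
        raise ValueError(f"state is not normalised: total probability {total}")
    return probs

# Cat example: coefficient 1/sqrt(2) on "alive" and on "dead".
cat_state = [1 / math.sqrt(2), 1 / math.sqrt(2)]
cat_probs = born_probabilities(cat_state)

# A complex phase on a coefficient does not change its squared norm.
phased_probs = born_probabilities([1j / math.sqrt(2), -1 / math.sqrt(2)])
```

Multiplying a coefficient by its conjugate already yields the squared norm, so no further squaring is needed; both example states give one half for each outcome, up to floating-point rounding.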

We never seem to observe cats (or any other macroscopic objects) as being in superpositions. The standard interpretation assumes that when an observer interacts with a system, that observation causes a collapse of the superposition, and the objects in the system take on definite values for the property being measured. So when the box is opened and the observer looks into it, the system randomly and instantaneously collapses into either cat alive or cat dead with a 50% probability of finding either.

Now we can transition from cats to electrons. Electrons have a property called “spin” (an intrinsic angular momentum) that can take a definite value of either spin up or spin down along the x, y or z axis. These values are mutually exclusive in the sense that if an electron has a definite value for one of the properties, it does not have a definite value for any of the others. It has been experimentally determined that when electrons have a definite spin property along one axis, they are in a superposition of having spin up and spin down along both of the other axes. So, for example, when an electron is determinately x-spin up, it is in a superposition of being y-spin up and y-spin down, and it is in a superposition of being z-spin up and z-spin down.
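To make the relationship between the three axes concrete, here is a small sketch using the textbook two-component representation of spin states. The names `X_UP`, `Y_UP`, `inner` and `probability` are illustrative assumptions, as is the choice to write every state in the z basis.

```python
import math

S = 1 / math.sqrt(2)

# Each state is (amplitude for z-spin up, amplitude for z-spin down).
Z_UP, Z_DOWN = (1, 0), (0, 1)
X_UP = (S, S)
Y_UP, Y_DOWN = (S, S * 1j), (S, -S * 1j)

def inner(phi, psi):
    """Inner product <phi|psi>, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def probability(outcome, state):
    """Squared norm of the overlap between `outcome` and `state`."""
    return abs(inner(outcome, state)) ** 2

p_z_up = probability(Z_UP, X_UP)      # x-spin up measured along z
p_y_up = probability(Y_UP, X_UP)      # x-spin up measured along y
p_x_up = probability(X_UP, X_UP)      # x-spin up measured along x
```

Under these assumptions an x-spin up electron gives one half for each outcome along z and along y, but probability one for x-spin up, which matches the claim that a definite value along one axis goes with superpositions along the other two.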

The standard interpretation tells us that when we observe electrons that are in such superpositions, they instantaneously and randomly collapse from the superposition they were in to one of the definite properties that make up that superposition. So when we take an electron that is x-spin up, for example, and measure its z-spin, the standard interpretation tells us that it collapses from being in a superposition of being z-spin up and z-spin down into either being z-spin up or being z-spin down. In the standard interpretation, this collapse explains the determinate measurement records that we get in experiments with quantum mechanical systems. The standard interpretation also tells us that if we do not observe quantum particles, then that collapse will not happen and they will remain in their superpositions.

The difference in these empirical results is captured in two of the laws that are part of the theory of quantum mechanics:

  1. When no measurement, or other observation, is made of a system, then that system evolves in a deterministic and linear fashion.
  2. When a measurement, or other observation, is made of a system, then that system instantaneously and non-deterministically collapses into a definite value for the property being measured.

In the standard interpretation, the second law accounts for the fact that when we measure any property of an object, it has a definite value.

These two laws are not compatible, and there is no clear explanation of when one is to be used instead of the other. In other words, there is no explanation of what constitutes an act of observation in the standard interpretation. In addition to this, there is the problem that if we take measurement devices to be physical systems like any other, then the first law says that the quantum system that makes up the measuring device will evolve deterministically, but the second law says that it will take on a definite value with a certain probability. In other words, it would have to follow both a deterministic law and a law governed by chance. This is not logically possible. This, in short, is the quantum measurement problem. It is part of what has driven the search for different interpretations of quantum mechanics.

Another part of what drove the search for a new interpretation of quantum mechanics was an interest in being able to apply quantum mechanics to the entire universe. This is what, at least in part, led Hugh Everett III to suggest that instead of having two mutually incompatible laws for the description of the evolution of states, we drop the law that is used when systems are observed (Everett 1957a, 1957b). The implication of this is that while the standard interpretation suggests that when a measurement is made, the superposition of a quantum particle collapses and the particle has a determinate measurable (and measured) property, the so-called “Everett interpretation” claims that there is no such collapse.  All the theories that have sprung from Everett’s “pure wave mechanics” have come to be known as “no-collapse” theories since they propose that there is no collapse of the superpositions.

One difficulty with no-collapse theories is making sense of how it is that we seem to have determinate measurement records for quantum particles even though those particles do not have a determinate value for the property measured, since they never collapse out of their superpositions. Another is the question of probability in a universe in which everything happens. Various interpretations of Everett have answered these issues differently. It is to a discussion of these various interpretations that we now turn.

3. Everettian Interpretations

a. Everett Plus Nothing (The Bare Theory)

Everett’s pure wave mechanics suggests that there is generally no determinate fact about the everyday properties of the objects in our world, since the equations that are supposed to describe such properties are such that they describe superpositions of those properties. Rather, Everett takes there to be only “relative states” and thus “relative properties” of quantum systems.

To see what he means by “relative states” and “relative properties,” consider the following. When we want to learn the value of a property for some system, we measure for that property.  But Everett treats measuring devices just as he would any other system with which the object system interacts, and so the measuring device will become correlated with the system that it is measuring.  In order to learn anything about one subsystem, even the reading on a measuring device, one must make reference to the complement of the subsystem:

As a result of the [measurement] interaction the state of the measuring apparatus is no longer capable of independent definition. It can be defined only relative to the state of the object system. In other words, there exists only a correlation between the two states of the two systems. It seems as if nothing can ever be settled by such a measurement . . . There is no longer any independent system state or observer state, although the two have become correlated in a one-one manner (Everett 1957b: 144, 146).

Everett explains “correlation” this way: “If one makes the statement that two variables, X and Y, are correlated, what is basically meant is that one learns something about one variable when he is told the value of the other” (Everett 2012: 61; this definition also shows up in Everett 1973: 17). Even in this case, it only “seems” as if nothing can be settled because of this correlation between an object system and a measuring device.  In fact, one can settle matters with the use of relative states:

. . . a constituent subsystem cannot be said to be in any single well-defined state, independently of the remainder of the composite system.  To any arbitrarily chosen state for one subsystem there will correspond a unique relative state for the remainder of the composite system.  This relative state will usually depend upon the choice of state for the first subsystem.  Thus the state of one subsystem does not have an independent existence, but is fixed only by the state of the remaining subsystem.  In other words, the states occupied by the subsystems are not independent, but correlated.  Such correlations between systems arise whenever systems interact (Everett 1957b: 142; Everett’s emphasis).

Consider again what happens when we measure the z-spin of an x-spin up electron. Before the measurement interaction, the state of the system consisting of the electron, e, and the measuring device, m, can be expressed in this way:

(1)        $|m\rangle_{\text{ready}}\,\frac{1}{\sqrt{2}}\left(|{\uparrow_z}\rangle_e + |{\downarrow_z}\rangle_e\right)$

Where “$|m\rangle_{\text{ready}}$” is what we use to express that the measuring device is ready to make a measurement, “$|{\uparrow_z}\rangle_e$” expresses that the electron is z-spin up and “$|{\downarrow_z}\rangle_e$” expresses that it is z-spin down. When we measure the z-spin, the measuring device interacts with the system it is measuring and becomes a part of the system. After that interaction, the state of the system that consists of the measuring device and the electron can be expressed in this way:

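(2)        $\frac{1}{\sqrt{2}}\left(|{\uparrow_z}\rangle_e\,|\text{“}{\uparrow}\text{”}_z\rangle_m + |{\downarrow_z}\rangle_e\,|\text{“}{\downarrow}\text{”}_z\rangle_m\right)$

Where “$|\text{“}{\uparrow}\text{”}_z\rangle_m$” expresses that the measuring device has recorded z-spin up and “$|\text{“}{\downarrow}\text{”}_z\rangle_m$” expresses that the measuring device has recorded z-spin down.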
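The step from state (1) to state (2) can be sketched as a linear map that never discards a term. This is only an illustration under stated assumptions: the dictionary representation, the labels, and the function name `measure` are all hypothetical and are not drawn from Everett.

```python
import math

def measure(state):
    """Collapse-free measurement interaction applied term by term.

    `state` maps (electron, device) labels to amplitudes. A ready device
    copies the electron's z-spin into its record; every amplitude is kept.
    """
    out = {}
    for (electron, device), amp in state.items():
        if device != "ready":
            raise ValueError("device must start in the ready state")
        key = (electron, f"recorded {electron}")
        out[key] = out.get(key, 0) + amp
    return out

s = 1 / math.sqrt(2)
before = {("up", "ready"): s, ("down", "ready"): s}   # analogue of state (1)
after = measure(before)                                 # analogue of state (2)
```

Because the map acts on each term separately, the result still has two terms with the original coefficients: the device's record is correlated with the electron's spin in each term, and no single record is singled out.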

In this state, the measuring device is in a superposition of reading z-spin up and z-spin down. Everett writes, “one can . . . look upon the total wave function . . . as a superposition of pairs of subsystem states, each element of which has a definite q value . . . for each of which the apparatus has recorded a definite value . . .” (Everett 1957a: 58, 59; Everett’s emphasis). So the way we get an explanation of our determinate measurement records is by understanding that they are records of relative states of a system.

Using the concept of relative states, we can say two things: (1) relative to the electron’s being z-spin up, the measuring device recorded “z-spin up” as the value of the electron’s z-spin; (2) relative to the electron’s being z-spin down, the measuring device recorded “z-spin down” as the value of the electron’s z-spin. This explains our determinate measurement results because if we ask a reliable observer whether he got a determinate measurement result for an experiment, he will always say “yes,” even when his measurement result is a superposition of outcomes and not one outcome determinately (Albert 1992, Barrett 1999). Here is why:

Let us say that there is a reliable observer who is about to measure the z-spin of an x-spin up electron. By “reliable” we mean that when we ask him whether he has a measurement record, if he has one then he will answer that he does; if he does not have one, he will answer that he does not. Let us also say that our observer is truthful.  He will always truthfully report what he has as a measurement record—if he recorded “z-spin down,” then he will answer “z-spin down” when asked what he recorded. Recall that x-spin up electrons are in a superposition of being z-spin up and z-spin down. So when we describe the state of the electron mathematically there will be one summand that describes the electron as z-spin up and one that describes it as z-spin down.

According to Everett’s pure wave mechanics, when our observer makes a measurement of the electron he does not cause a collapse, but instead becomes correlated with the electron. What this means is that where once we had a system that consisted of just an electron, there is now a system that consists of the electron and the observer. The mathematical equation that describes the state of the new system has one summand in which the electron is z-spin up and the observer measured “z-spin up” and another in which the electron is z-spin down and the observer measured “z-spin down.” In both summands our observer got a determinate measurement record, so in both, if we ask him whether he got a determinate record, he will say “yes.” If, as in this case, all summands share a property (in this case the property of our observer saying “yes” when asked if he got a determinate measurement record), then that property is determinate.
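The criterion used in this argument, that a property is determinate when every summand shares it, can be written down directly. The sketch below is an illustration only; `determinate_value`, the tuple labels and the two example properties are assumptions introduced here.

```python
import math

def determinate_value(state, prop):
    """Return prop's value if all terms with nonzero amplitude agree, else None."""
    values = {prop(term) for term, amp in state.items() if abs(amp) > 1e-12}
    return values.pop() if len(values) == 1 else None

s = 1 / math.sqrt(2)
# Terms are (electron z-spin, observer's record) after the measurement.
post_measurement = {("up", "up"): s, ("down", "down"): s}

# "Did you get a determinate record?" is answered "yes" in both terms.
says_yes = determinate_value(post_measurement, lambda term: term[1] in ("up", "down"))

# "Is your record z-spin up?" differs between the terms.
record_is_up = determinate_value(post_measurement, lambda term: term[1] == "up")
```

On this sketch `says_yes` comes out as `True` while `record_is_up` comes out as `None`: the report of having a determinate record is itself determinate, even though no particular record is.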

This is strange because he did not in fact get a determinate measurement record; he instead recorded a superposition of two outcomes.  After our observer measures an x-spin up electron’s z-spin, he will not have determinately gotten either “z-spin up” or “z-spin down” as his record. Rather he will have determinately gotten “z-spin up or z-spin down,” since his state will have become correlated with the state of the electron due to his interaction with it through measurement. Everett believed he had explained determinate experience through the use of relative states (Everett 1957b: 146; Everett 1973: 63, 68–70, 98–9). That he did not succeed is largely agreed upon in the community of Everettians.

This sparse interpretation of Everett, adding no metaphysics or special assumptions to the theory, has come to be known as the “bare theory.” One might say that the bare theory predicts disjunctive outcomes, since the observer will report that she got “either z-spin up or z-spin down,” with no determinate classical outcome and without being in a state where she would determinately report that she got “z-spin up” or determinately report that she got “z-spin down” (Barrett 1999). So, if the problem is to explain how we end up with determinate measurement results, the bare theory does not provide us with that explanation. Something must be added to Everett’s account.

For more on the bare theory see Albert and Loewer 1988, Albert 1992 and Barrett 1999. That Everett was uninterested in the philosophical implications of his work has been argued by Barrett 2010, Barrett and Byrne 2012 and Bevers 2011, though Deutsch 2010 and Wallace 2012 differ with this conclusion.

b. Everett Plus Minds

One suggestion about what to add to Everett’s account is to suppose that every time we are faced with an entangled state such as the state in which our observer found himself in the last section, we conclude that he got a determinate result from a particular perspective. To see what motivates this, consider what happens when an observer goes from being ready to read the result on a measuring device:

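(3) $|o\rangle_{\text{ready}}\,\frac{1}{\sqrt{2}}\left(|{\uparrow_z}\rangle_e\,|\text{“}{\uparrow}\text{”}_z\rangle_m + |{\downarrow_z}\rangle_e\,|\text{“}{\downarrow}\text{”}_z\rangle_m\right)$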

to having read the device:

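(4) $\frac{1}{\sqrt{2}}\left(|{\uparrow_z}\rangle_e\,|\text{“}{\uparrow}\text{”}_z\rangle_m\,|\text{“}{\uparrow}\text{”}_z\rangle_o + |{\downarrow_z}\rangle_e\,|\text{“}{\downarrow}\text{”}_z\rangle_m\,|\text{“}{\downarrow}\text{”}_z\rangle_o\right)$.

Here we use “$|\text{“}{\uparrow}\text{”}_z\rangle_o$” to represent the state of the observer having read “z-spin up” off the measuring device’s pointer. If the observer forms beliefs about the z-spin of the electron based on the results of the experiment (as it seems reasonable to presume), then the state of the system that now contains the observer will be the following: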

(5) $\frac{1}{\sqrt{2}}\left(|{\uparrow_z}\rangle_e\,|\text{“}{\uparrow}\text{”}_z\rangle_m\,|\text{“}{\uparrow}\text{”}_z\rangle_o\,|\text{BEL“}{\uparrow}\text{”}_z\rangle_o + |{\downarrow_z}\rangle_e\,|\text{“}{\downarrow}\text{”}_z\rangle_m\,|\text{“}{\downarrow}\text{”}_z\rangle_o\,|\text{BEL“}{\downarrow}\text{”}_z\rangle_o\right)$

Here “$|\text{BEL“}{\uparrow}\text{”}_z\rangle_o$” expresses that the observer believes that the electron is z-spin up. This state of the system implies that our observer is in a superposition of belief states (Albert and Loewer, 1988: 197). But our observer doesn’t feel like he is in a superposition of belief states. He feels like he has a definite result for the z-spin measurement of the electron. David Albert and Barry Loewer set out to produce an interpretation of Everett’s pure wave mechanics that “explains how it is that we always ‘see’ (mistakenly so . . .) macroscopic objects as not being in superpositions and never experience ourselves as in superpositions” (Albert and Loewer, 1988: 203). They propose that it is a function of the evolution of one’s mental state that explains one’s experiences (Albert and Loewer, 1988; Albert 1992).

Albert and Loewer begin by supposing that mental states supervene on particular brain states and are accurately accessible to introspection (204). They take it that the state of our observer believing that he read “z-spin up” is identical with a physical brain state that is distinct from the physical brain state that is associated with our observer believing he read “z-spin down” (204). Albert and Loewer do not explain this supervening relation; they merely state it as a given fact. But, they argue, this fact is inconsistent with taking our observer to be reliable and with his holding the belief that there was a determinate measurement record obtained when he measured the z-spin of an x-spin up electron. This is because, in the state expressed by (5), our observer will answer “yes” when asked, “Does the electron have a determinate z-spin?” since it does have a determinate z-spin in each term. But our observer does not believe that the electron is z-spin up, nor does he believe that it is z-spin down, since his brain is not in the state that corresponds to either of those beliefs; his brain is in a superposition of states (Albert and Loewer, 1988: 204). So Albert and Loewer give up the connection between brain states and belief states. They accept a “modest . . . non-physicalism” (205).

The first way they explain this is with what they call the single mind view. This view adds to quantum theory the principle that the evolution of an observer’s mental states is probabilistic. In the state expressed by (3), our observer starts out with no beliefs about the z-spin of the electron. But when he measures the z-spin, the probability is 50% that he ends up believing it is z-spin up and 50% that he ends up believing it is z-spin down. The association of belief states with physical states is dictated by probability and is determined by the quantum evolution of the system that consists of him, the electron and the measuring device (Albert and Loewer, 1988: 205–206). Thus, mental states are never in superpositions, even if physical brain states are.
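The probabilistic rule for the single mind can be sketched as a weighted draw over the terms of the physical state. This is an illustration, not Albert and Loewer’s formal dynamics: the function name `sample_belief`, the belief labels and the fixed random seed are assumptions made for the example.

```python
import math
import random

def sample_belief(state, rng):
    """Pick one belief for the single mind, weighted by squared amplitude norms.

    The physical state itself is left untouched; only the mind gets one belief.
    """
    beliefs = list(state)
    weights = [abs(amp) ** 2 for amp in state.values()]
    return rng.choices(beliefs, weights=weights, k=1)[0]

s = 1 / math.sqrt(2)
brain_state = {"believes z-spin up": s, "believes z-spin down": s}

rng = random.Random(0)  # seeded only so that the sketch is repeatable
samples = [sample_belief(brain_state, rng) for _ in range(10_000)]
fraction_up = samples.count("believes z-spin up") / len(samples)
```

Each draw returns exactly one definite belief, so the mind is never in a superposition, while `brain_state` keeps both terms. Over many repetitions `fraction_up` should come out close to one half.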

Albert and Loewer immediately recognized certain problems with this view (1988: 206). The dualism this implies is particularly problematic since it implies that mental states do not supervene on brain states, or any physical states in general, since “one cannot tell from the state of a brain what its single mind believes” (206). Additionally, in the superposition that describes the state of the system in question, all the terms but one will represent “mindless brains” (often referred to as “mindless hulks”), and which one represents a mind is impossible to determine from the quantum formalism or from experiment. Jeffrey Barrett describes this problem nicely (1999). If two observers are measuring the z-spin of our x-spin up electron, then the system of the observers, the measuring devices and the electron will evolve from this state

(6) |o1>ready|o2>ready 1/√2(|↑z>e |“↑”z>m + |↓z>e|“↓”z>m)

to this one

(7) 1/√2(|↑z>e |“↑”z>m |“↑”z>o1|“↑”z>o2 + |↓z>e|“↓”z>m|“↓”z>o1|“↓”z>o2).

There is nothing in the dynamics proposed by Albert and Loewer that prevents the mental states of each observer from being associated with the same term or from being associated with different terms. But there is nothing that tells us which is the case. In each term of (7) there is a physical brain state for each observer, but with only one is there a mental state for each observer. Thus, it is possible that observer 1 is associated with the first term of (7) and observer 2 is associated with the second, but that neither of them realizes it. As such, observer 1 will (correctly) remember having gotten “z-spin up” as the result of her measurement but (incorrectly) remember that her friend got the same result. The same will hold true, mutatis mutandis, for observer 2. There is no way to determine whether or not one is speaking to a mindless hulk rather than a set of physical states with which there is associated a mental state (Barrett, 1999: 189–190).
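The worry can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each observer's mind is assigned to a term of (7) independently of the other's, with probability given by the norm squared of the coefficient; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Norm squared of the coefficient on each term of state (7).
term_weights = {"up": Fraction(1, 2), "down": Fraction(1, 2)}

# Assumption for this sketch: each observer's mind is assigned to a term
# independently of the other's, with probability equal to the term's weight.
joint = {
    (mind_1, mind_2): term_weights[mind_1] * term_weights[mind_2]
    for mind_1, mind_2 in product(term_weights, repeat=2)
}

# Probability that the two minds end up associated with different terms.
prob_minds_disagree = sum(p for (m1, m2), p in joint.items() if m1 != m2)
print(prob_minds_disagree)  # 1/2
```

Under that independence assumption, half the time each observer is talking to what is, relative to her own term, a mindless hulk, and nothing in either observer's records would reveal it.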

Barrett also points out that the single mind view predicts that when an observer repeats a measurement she may get a different result from what she first got and falsely remember having gotten a first result that matches her second (1999: 187–188). So this observer is in a position where she cannot trust that what she remembers is what actually happened. Here is why. Consider again the state represented in (4). If our observer were to repeat her measurement of the z-spin of the electron, then in each term the second result would be identical to the first. The state would then be

(8) 1/√2(|↑z>e |“↑”z“↑”z>m |“↑”z“↑”z>o + |↓z>e|“↓”z“↓”z>m|“↓”z“↓”z>o).

Even if our observer ended up in the state in which she read “z-spin up” for her first measurement, there is still a 50/50 chance that her mental state will be associated with each term in (8). Thus, there is a 50% chance that her mental state will evolve in such a way that she (correctly) believes that the results of her two measurements agreed, but she will (incorrectly) believe that she got “z-spin down” each time.

To avoid some of these problems, Albert and Loewer propose what they call the many minds view. This view takes it that every observer is associated with an infinite set of minds. This immediately solves the mindless brains problem since there are minds associated with each term in the state of a system that includes an observer. It also solves the problem of dualism in the sense that “mental states are determined by or supervenient on brain (or brain + environment) states” (206), though the minds are still non-physical in that they are not subject to the rules of quantum mechanics for their evolution. This is a benefit of the many minds view since mental states will never (in accordance with our experience) be in a superposition. But this benefit leads to an additional difficulty in that it retains a certain element of dualism. The only mental state that does any supervening on an observer’s physical state is what might be called the “global” mental state. This is the state that is associated with the entire set of minds. It is the only thing that evolves according to quantum mechanical dynamics. Importantly, what might be called the “local” state, the state to which an observer has direct introspective access, does not so supervene. If it did, it would also evolve according to the dynamics, but Albert and Loewer take it as a benefit to the many minds view that it does not. (For more on this objection see Barrett, 1999: 194–196 and Lockwood, 1996: 174.)

Determinate experience is just one part of the dilemma of Everett interpretation. The other is probability. The many minds view takes the norm squared of the coefficients on the terms to be interpreted as giving the proportion of minds associated with each term in the state of the system. In this view, probabilities “refer . . . to sequences of states of individual minds” (208). In (4) our observer is in a brain state that has a mind associated with it that has no belief about the z-spin of the electron, but she can predict what the probability is of ending up with a mind that believes she observed z-spin up or z-spin down; she can predict what the sequence of her mental states will be, according to the norm squared of the coefficients. Knowing that (3) evolves into (4) after a measurement, and accepting the reasonable claim that beliefs are formed when the observer looks at the measuring device, our observer knows that half her minds will believe that she measured the electron as being z-spin up and half believe she measured it as z-spin down.
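Stated as a formula, and only as a sketch (the symbols c_i and μ_i are introduced here for illustration and are not Albert and Loewer's notation), the proportion of minds associated with the i-th term is the normalized norm squared of its coefficient:

```latex
\[
  \lvert\psi\rangle = \sum_i c_i \,\lvert \mathrm{term}_i \rangle ,
  \qquad
  \mu_i = \frac{\lvert c_i \rvert^{2}}{\sum_j \lvert c_j \rvert^{2}} .
\]
\[
  \text{For (4):}\quad
  c_{\uparrow} = c_{\downarrow} = \tfrac{1}{\sqrt{2}}
  \;\Longrightarrow\;
  \mu_{\uparrow} = \mu_{\downarrow} = \tfrac{1}{2} .
\]
```

Because the set of minds is infinite, μ_i is best read as a measure on that set rather than as a ratio of counts.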

It is debatable whether Albert and Loewer’s many minds view succeeds in answering the questions of determinate experience and probability that come with any interpretation of Everett and whether the dualism that is inherent in their view is a price worth paying. Michael Lockwood (1996) believed that it was not a price worth paying and so worked to develop a competing many minds view that did not suffer this problem.

Lockwood’s version of many minds argues that “associated with a sentient being at any given time, there is a multiplicity of distinct conscious points of view . . .  it is these conscious points of view or ‘minds’ . . . that are to be conceived as literally dividing or differentiating over time” (1996: 170). For him there is a sense in which our observer can regard herself as having just one mind. He calls this her “multimind” or “Mind,” and it consists of all the minds (lower-case “m”) that are described in the terms of the state of the system of which the observer is a part (1996: 177). Each mind has a “maximal experience” that describes its complete state of consciousness, but it should not be identified with a state of the Mind of the observer. Lockwood takes it that there is “complete supervenience of the mental on the physical” and so he avoids the dualism that plagues Albert and Loewer’s version (1996: 184). To do this, though, he agrees with Albert and Loewer that one must give up “the assumption that . . . there is a uniquely correct way of linking earlier and later maximal experiences of the same Mind together to form persisting minds . . .” but he does not take that to be fatal to his project (1996: 183–184). For Lockwood, each mind makes up one subset of the Mind and “each stands in an equal relation of succession to the given . . . maximal experience . . . with which we started” (Lockwood, 1996: 183). To go back to our example, there is a mind associated with each term in (4), and these go to make up the Mind of the observer. Each of these minds has equal claim to be the successor to the mind in (3) that was in the ready state. So it is unclear who the observer in (3) should expect to be once the measurement is made.

Diachronic personal identity leads to the problem of how to interpret probabilities from this viewpoint. Lockwood believes that he has a meaningful interpretation of probability as “a naturally preferred measure on sets of simultaneous maximal experiences” (1996: 182). He wants to analogize this to duration so that in analogy to “this pain lasted twice as long as the last one” he can say something like, “this pain is, superpositionally speaking, twice as extensive as that” (182; Lockwood’s emphasis). The extensiveness of an experience is a function of its “temporal ‘length’ and of superpositional ‘width’” in the sense that it has a higher coefficient associated with its term (182). It is largely agreed that Lockwood does not have a sufficient explanation for probability. Loewer (1996) and Barrett (1999) both argue that Lockwood’s conception of probability is insufficient for explaining how we are to take probabilities to be guides to rational decision making and prediction. Lockwood cannot explain how the norm squared of the coefficient ought to guide an observer’s predictions about what her experience will be in the future since there is no way to track her over time. In our example above, when we ask what the probability will be of our observer in (3) seeing “z-spin up” upon measurement, we have to say it will be 1. The same is true for “z-spin down,” since in some term there is a mind that goes to make up the Mind of the observer that observes each measurement result.

Barrett (1999) argues that aside from this problem, Lockwood has a “very unusual notion of probability in mind” (209). He wants to “introduce a (sic) entirely new notion of probability,” and one that Barrett finds puzzling and insufficient for explaining an observer’s determinate experiences (209–210).

There are several other interpretations of Everett that are in the same vein as the single mind and many minds views of Albert and Loewer and Lockwood, but none of them is quite as fully worked out as those presented here. The interested reader may want to look at Zeh 1981, Squires 1987, Donald 1990, Squires 1991, Stapp 1991, Squires 1993, Donald 1993, Donald 1995, and Page 1995.

c. Everett Plus Worlds (DeWitt’s Splitting Worlds)

What is arguably the most common interpretation of Everett’s formulation is what Bryce DeWitt called the many-worlds interpretation [MWI] (DeWitt 1968 and Wheeler 1998: 269–70). In his 1967 lecture on what he calls the “Everett-Wheeler interpretation” of quantum mechanics, DeWitt takes Everett’s claim that:

. . . with each succeeding observation (or interaction), the observer state “branches” into a number of different outcomes of the measurement . . . for the object-system state.  All branches exist simultaneously in the superposition after any given sequence of observations (Everett 1957a: 25–6; Everett 1957b: 146).

to imply that we are forced:

. . . to believe in the ‘reality’ of all the simultaneous ‘worlds’ represented in the superposition [in which we find the universe after a measurement interaction] . . . in each of which the measurement has yielded a different outcome (DeWitt 1968: 326).

The first published version of Everett’s dissertation began to gain popularity when it was reprinted in The Many Worlds Interpretation of Quantum Mechanics (DeWitt and Graham 1973).  In his papers in this volume, DeWitt explicitly refers to the “reality of all simultaneous worlds” that he believes are implied by his reading of Everett and to the “reality composed of many worlds” that he takes Everett’s formalism to have taught us is true about our universe (DeWitt 1970: 161, DeWitt 1971: 182). So from this point on, Everett’s interpretation has most commonly been known as the “many worlds interpretation” of quantum mechanics.

The branching that occurs in DeWitt’s MWI can be interpreted in several different ways. One possibility is DeWitt’s way, which suggests that:

[o]ur universe must be viewed as constantly splitting into a stupendous number of branches, all resulting from the measurementlike interactions between its myriads of components . . . every quantum transition taking place on every star, in every galaxy, in every remote corner of the universe is splitting our local world on earth into myriads of copies of itself (DeWitt 1970: 178).

DeWitt takes a strong realist position with regard to the worlds that result from the splitting of branches. He takes each branch to be “a possible universe-as-we-actually-see-it” (DeWitt 1970: 163) and believes that in spite of the fact that “all branches must be regarded as equally real” (DeWitt 1970: 178; see also Everett 1957b: note added in proof), we inhabit only one of the worlds that go to make up reality and we have no access to other worlds (DeWitt 1970: 182).

In order to see how this fares with regard to the question of determinate measurement results, let us make a distinction between a local I, IL, and a global I, IG. IL is the self who experiences determinate measurement results when conducting quantum mechanical experiments; IL sees only one outcome of every experiment. IL is who we generally consider ourselves to be. So when DeWitt says that we inhabit only one world of the many possible worlds, we can take him to be referring to weL as a collection of ILs. We then can understand him to believe that there will be an IL for each world that is created in the split. In contrast, there is always only one IG. We can think of IG as the self, were it to exist, which has access to all the worlds. IG can be thought of as someone outside the theory who can see the branching structure of the world and who knows every outcome of quantum mechanical experiments, someone who has a “god’s-eye view” of the universe as a whole. We as ILs do not seem to have access to the experiences of IG. Thus, the only thing that needs to be explained in terms of determinate measurement results is the perspective of IL since this is the only perspective we seem to have, and according to Everett, it is the only perspective that is of any importance to an observer (Everett 1973: 99).

Everett argues that relative states provide a way to understand an observerL’s determinate measurement record. The only place where the indeterminacy shows up is in the global perspective of a state; the local, relative perspective will always be determinate. Everett argues that an observer will never have access to the global state and therefore will always and only have determinate measurement results relative to his state (Everett 1973).

DeWitt’s branching worlds follow along the same line. For him, there are many different branches of the universe and each branch splits every time there is an observation or correlation made. On each branch is an observerL who will split every time the branch does. Each observerL gets a determinate record of whatever outcome occurs on his branch, because that is exactly what happened on that branch.

i. Problems with the Notion of Splitting

However, objections have been raised to DeWitt’s proposal that we take each branch to be “a possible universe-as-we-actually-see-it.” The main objection is that the idea that we in some way split or branch seems absurd given our experience in the world (Everett 1957c: 2; DeWitt 1971: 179; DeWitt and Graham 1973: 161; Xavier 1962: Tuesday Morning, p. 20).

Additionally, one might object to what turns out to be an infinite (possibly uncountable) number of worlds (Healey 1984; Saunders 1997). Some early many-world interpreters have taken Everett to be implying such a profligacy of worlds when he writes: “[a]ll branches exist simultaneously in the superposition after any given sequence of observations” (Everett 1957b: 146), and so “[f]rom the viewpoint of the theory all elements of a superposition (all ‘branches’) are ‘actual’, none any more ‘real’ than the rest” (Everett 1957b: note added in proof; Everett’s emphasis).

There appear to be two distinct issues raised.  The first is that a multitude of worlds, always splitting, seems to defy our sense that we are not splitting.  The second is that it seems to run counter to intuition to propose that there are multiple copies of the world, and of people, that constitute the universe.  Let us see how these objections are addressed one at a time as doing so will shed light on Everett’s thoughts about the relative state formulation.

The first objection, that we do not feel the splitting, is addressed by Everett both in his response to DeWitt’s letter in 1957 (Everett 1957c) and ultimately in the short dissertation (Everett 1957b).  In the letter to DeWitt he writes:

I must confess that I do not see this “branching process” as the “vast contradiction” that you do.  The theory is in full accord with our experience (at least insofar as ordinary quantum mechanics is).  It is in full accord just because it is possible to show that no observer would ever be aware of any “branching,” which is alien to our experience as you point out.

The whole issue of the “transition from the possible to the actual” is taken care of in a very simple way – there is no such transition, nor is such a transition necessary for the theory to be in accord with our experience.

From the viewpoint of the theory, all elements of a superposition (all “branches”) are “actual,” none any more “real” than another. It is completely unnecessary to suppose that after an observation somehow one element of the final superposition is selected to be awarded with a mysterious quality called “reality” and the others condemned to oblivion – they won’t cause any trouble anyway because all the separate elements of the superposition (“branches”) individually obey the wave equation with complete indifference to the presence or absence (“actuality” or not) of any other elements.

This is only to say that the theory manages to avoid the difficulty of the “transition from possible to actual” – and I consider this to be not a weakness, but rather a great strength of the theory.  The theory is isomorphic with experience when one takes the trouble to see what the theory itself says our experience will be.  Little more can be asked of it without exposing a naked philosophic prejudice of one kind or another (Everett 1957c: 3; Everett’s emphasis).

In the short dissertation this becomes:

Arguments that the world picture presented by this theory is contradicted by experience, because we are unaware of any branching process, are like the criticism of the Copernican theory that the mobility of the earth as a real physical fact is incompatible with the common sense interpretation of nature because we feel no such motion.  In both cases the argument fails when it is shown that the theory itself predicts that our experience will be what it in fact is.  (In the Copernican case the addition of Newtonian physics was required to be able to show that the earth’s inhabitants would be unaware of any motion of the earth.) (Everett 1957b: note added in proof).

Everett had no problem with an observerL not feeling the split because his theory predicts that the splitting of worlds is not something of which IL could be aware.  An observerL will never have access to the global state of a system and so will never be able to observe anything but the state of his branch.  It requires a perspectiveG to observe any branching event.  Since there can never be any observation of a branching event, there can never be a physical record (memory) of the event.

The second objection comes from those who want the most economical metaphysics possible. Such philosophers might argue that to have an infinite number of worlds after a split is metaphysically extravagant. While some theorists who first encounter this idea initially balk at it, ultimately most do not find anything terribly objectionable in it. In fact, most MWI theorists embrace the notion of a multitude of existing worlds. If the MWI as proposed by DeWitt solves the quantum measurement problem, then this metaphysical extravagance may be worth the cost.

For more on the story of the evolution from the long to the short thesis, and on Everett’s life, see Byrne 2010.

ii. The Preferred Basis Problem

DeWitt’s MWI fails to provide us with any explanation of when worlds split.  Unfortunately, saying when they do is just as difficult as saying when a collapse occurs, which is one way to understand the quantum measurement problem. To determine when a world splits, we would first need to know in which basis we should write the universal state. If we knew this, then we would know that a split has occurred because a new term would show up in the global state when it is written in the preferred basis.  The choice of basis also determines which properties in the universe are determinate—namely, those that are represented by vectors that are on the axes of the basis—and which worlds exist after a measurement interaction. DeWitt does not provide any way to choose a particular basis, and any way that we might suggest in the context of his MWI would be blatantly ad hoc.  This problem has come to be known as the preferred basis problem.

Even though DeWitt’s MWI seems to provide us with an explanation of why we get determinate measurement records, that explanation assumes that there is a basis in which the universal state has been written that guarantees that those measurement records are in fact determinate.  In order to be able to assert the determinateness of records in a particular branch, the basis that we choose ought to be one in which those records are determinate.  But the basis in which the pointer on our measuring device has a determinate position and the basis in which our mental states are determinate are not necessarily the same.  It is not clear that we can choose a basis in which they will both be determinate, not to mention all the other things that we want to have as determinate properties in order to be conscious beings capable of successfully completing a quantum mechanical experiment.

Without a preferred basis in which to write the universal state, one is free to choose any basis one likes with the result being that in different bases there will be different decompositions of the universal state vector.  If each term in the expansion of the universal state is taken to represent a different possible world, or a different description of a world, then with each choice of basis there will be different terms and so different worlds. Thus, without a preferred basis, there is no fact of the matter as to when splits occur, no fact of the matter as to which properties in the universe are determinate and no fact of the matter as to which worlds go to make up the universe; rather, these are all a function of the choice of basis.
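A one-electron toy case shows how the number of terms depends on the basis chosen. The sketch below is an illustration under simplifying assumptions (a single spin rather than the universal state, and hypothetical helper names); it expands the same x-spin up state in the z-spin basis and in the x-spin basis and counts the nonzero terms in each expansion.

```python
import math

def inner(u, v):
    """Inner product <u|v> of two vectors given as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def decompose(state, basis):
    """Return the coefficient of `state` on each labelled basis vector."""
    return {label: inner(vector, state) for label, vector in basis.items()}

def count_terms(coefficients, tol=1e-12):
    """Number of terms whose coefficient is not (numerically) zero."""
    return sum(1 for c in coefficients.values() if abs(c) > tol)

s = 1 / math.sqrt(2)
z_basis = {"z-up": [1, 0], "z-down": [0, 1]}
x_basis = {"x-up": [s, s], "x-down": [s, -s]}

x_spin_up = [s, s]  # the x-spin up electron, given by its z-basis components

in_z = decompose(x_spin_up, z_basis)
in_x = decompose(x_spin_up, x_basis)
print(count_terms(in_z), count_terms(in_x))  # 2 1
```

If each term were read as a world, the same state would contain two worlds in one basis and one in the other, which is the sense in which the worlds are a function of the choice of basis.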

So we are left without answers to several questions: What worlds are there?  Which properties are determinate?  When do worlds split?  Solving this is as difficult as solving the original quantum measurement problem and in fact is a version of it (Barrett 1999: 176).  Thus, no progress has been made toward solving the measurement problem, insofar as this is a problem that could be solved by saying when collapses or splits occur.  So the cost of DeWitt’s MWI is the reintroduction of the measurement problem, something Everett’s pure wave mechanics was developed to avoid.  Fortunately for those sympathetic to Everett’s ideas, quantum mechanics without the collapse postulate has evolved in the decades since DeWitt’s writing.  In what follows we will see how the concept of “many worlds” has been developed within the interpretation of Everett’s physics.

4. The Evolution of Many Worlds Interpretations

a. The Problem of Probability and Graham’s Attempt at Adding a Measure

Something that the standard interpretation of quantum mechanics provides that is missing in pure wave mechanics is a way to account for probabilistic claims. In any MWI of Everett’s relative state formulation, there is in some sense a different world in which every possible outcome of an experiment occurs; every world is real, and therefore every outcome occurs.  We lose the ability to say, “Event e happens with probability p” where p is less than 1. Everett writes, “In order to establish quantitative results, we must put some sort of measure (weighting) on the elements of the final superposition” (Everett 1957b: 147).

In the standard interpretation, this is done with the use of the Born rule (Born 1926). The Born rule is what physicists use to assign probabilities to the outcomes of quantum mechanical experiments. For each term in the state of the system, written in some basis, there is associated with it a complex number, the amplitude of that region of the wave function. The Born rule says to take that number and square its norm in order to get the probability of that term being the outcome of a measurement. But in EQM, a deterministic theory in which everything happens, the Born rule does not prima facie seem to be applicable.
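The rule just described is short enough to write out. The following is a minimal sketch, with a hypothetical function name and an example state chosen only for illustration; it squares the norm of each amplitude and normalizes the results.

```python
def born_probabilities(amplitudes):
    """Map each term's label to the norm squared of its amplitude.

    The weights are divided by their sum so that an unnormalized state
    still yields probabilities that add up to 1.
    """
    weights = {label: abs(a) ** 2 for label, a in amplitudes.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("the zero vector is not a physical state")
    return {label: w / total for label, w in weights.items()}

# Hypothetical state with unequal amplitudes, one of them imaginary.
example = {"z-spin up": 0.6, "z-spin down": 0.8j}
probabilities = born_probabilities(example)
print(probabilities)
```

For this example the probabilities come out at approximately 0.36 and 0.64. The phase of an amplitude plays no role, since only its norm is squared.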

To be able to derive probability from within his theory, Everett first needed to define what he meant by a “typical element of the final superposition” since “typical” presupposes some notion of probability. Presumably he meant an element in which the predictions of quantum mechanics are borne out. But if he determined what is “typical” by counting up all the branches, taking there to be one world for each term in the expansion of the universal state when it is written in the preferred basis, and calling a world “typical” when it displays the same results as most of the others, then there is a problem because in a large majority of the worlds, by this measure, the quantum statistics will not even be close to true (Graham 1973: 235).

Neill Graham was the first to suggest that there was something missing from Everett’s derivation of a probability measure. He writes that the worlds that display the proper relative frequency are “in a numerical minority” and “any attempt to show that the probability interpretation holds in the majority of the resulting Everett worlds is doomed to failure” (Graham 1973: 236). (See also Barrett 1999: 168–70 for a very clear explanation of why the statistics fail to work for any coefficient other than 1/√2.) But Everett does not suggest that we use a counting measure; he believes that he has shown that the “only choice of measure” is the square amplitude measure (Everett 1957b: 147). A typical world is then one for which the value of the square amplitude measure is high. But Everett’s use of this measure is not a solution to the original problem. The original problem was to derive probability from a deterministic theory in which everything happens. To solve this one needs to add the assumption that after a measurement one should expect to find oneself in a “typical” world, that typicality being determined by the norm squared of the coefficient on the term that describes that world; the higher the value, the more “typical” the world. But this is akin to claiming that the Born rule holds. (For more on this see Barrett 1999: 168–173, Wallace 2007 and 2012.)
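The gap between counting branches and weighting them can be checked numerically. The sketch below is an illustration, not Graham's or Barrett's own calculation. It assumes n measurements on identically prepared electrons, one branch per sequence of results, and a norm-squared weight of `p_up` on the “up” term at each measurement; the function name and the tolerance window are hypothetical.

```python
from math import comb

def typical_branch_measures(n, p_up, tolerance):
    """Compare two ways of weighing branches after n spin measurements.

    A branch is a sequence of n results; it counts as "typical" here if its
    relative frequency of "up" lies within `tolerance` of p_up.
    Returns (fraction of branches by count, total norm-squared weight).
    """
    by_count = 0
    by_weight = 0.0
    for k in range(n + 1):  # k = number of "up" results in the branch
        if abs(k - p_up * n) <= tolerance * n + 1e-9:
            branches = comb(n, k)  # branches with exactly k "up" results
            by_count += branches
            by_weight += branches * p_up ** k * (1 - p_up) ** (n - k)
    return by_count / 2 ** n, by_weight

count_fraction, born_weight = typical_branch_measures(n=100, p_up=0.9, tolerance=0.05)
print(count_fraction, born_weight)
```

With these inputs the branches showing roughly the right statistics are a vanishing fraction by count, yet they carry most of the norm-squared weight. When `p_up` is 0.5 the two measures coincide, which matches the remark that counting works only when the coefficient is 1/√2.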

Early 21st century MWI theorists take a very different view of the multiplicity of the world, thereby solving some of the problems inherent in earlier MWIs. This view is in line with what David Wallace has proposed, namely that the multiple branches of the universe that arise from the mathematical theory of quantum mechanics are emergent (Wallace 2012). Because most of those who have developed and are proponents of Wallace’s emergent branching universe view are or have been located around Oxford, we will call this view the “Oxford MWI.”

b. The Oxford MWI

The main difference between Wallace’s emergent branching universe view (the Oxford MWI) and the MWIs that came before is that prior MWI theorists, in large part, understand the wave function to be a real entity, leading to a real multiplicity of worlds at the fundamental level of the theory. (For more on the question of realism regarding the wave function, see Ney and Albert 2013.) Wallace, on the other hand, takes these worlds to be emergent from the underlying microphysical description of the universe. They are no less real, but they are structural facts that are instantiated within EQM (Wallace 2012: Chapter 2). Wallace explains how he conceptualizes these structures with what he calls “Dennett’s criterion”: “A macro-object is a pattern, and the existence of a pattern as a real thing depends on the usefulness – in particular, the explanatory power and predictive reliability – of theories which admit that pattern in their ontology” (Wallace 2010a: 58, Wallace 2012: 50).

Wallace asks us to consider a tiger and its hunting patterns. We can describe both in terms of electrons and atoms, but in that description we do not see the patterns that emerge when we consider them at the macroscopic level. The tiger atoms and the swirl of atoms and energy that make up the hunting patterns are real, objective parts of the microphysical system, but they are not practically useful for predicting how tigers behave in the wild. Zoology cannot be reduced to physics. Rather, physics instantiates zoology. Wallace illustrates the instantiation relation with the example of the relation between quantum mechanics and a classical conception of the solar system. Classical mechanics is instantiated by quantum mechanics and is applicable to the solar system because some of the solar system’s properties “approximately instantiate a classical-mechanical dynamical system” (Wallace 2012: 56).

Applying these considerations to EQM, one can say that the microphysical description of the universe contains descriptions of states of affairs that are structured like the macroscopic objects we encounter in the world. When two of these states are superposed with one another, the quantum state of the system instantiates two different macroscopic systems at once. Thus, one can say that two different worlds emerge from the microphysical-level description that instantiates them. Wallace writes that “there are entities whose existence is entailed by the theory which deserve the name ‘worlds’” (Wallace 2010a: 68); thus we should take these worlds to be real entities.

Wallace takes decoherence to be the only consideration that one should use to help determine how the universe ought to be carved up. A system is said to “decohere” when it becomes correlated with something in its environment and thereby destroys the interference effects that would otherwise have been present if the system were in a pure state, the state in which one finds an entangled system. Decoherence theorists take the destruction of the interference effects to be all that is required to explain determinate experience. Because such correlations are ubiquitous and radically swift, and because it is, in practice, impossible to isolate a macroscopic system from its environment sufficiently to prevent such correlations (even the best isolated system, if it is above absolute zero, will radiate heat and therefore interact with its environment), these correlations will produce results that seem to indicate that a collapse has occurred and not that the system is still in a superposition. The property that decoherence picks out is very close to position when one is considering an experiment that requires us to see the position of a pointer on a measuring device (as pointing either “up” or “down,” for example). (For more on decoherence see the original formulations of it in Zeh 1970, Zeh 1973, Zurek 1981, and Zurek 1982. In Zurek 1991, Zeh 1995, Giulini et al. 1996, Butterfield 2001, Zurek 2002 and Schlosshauer 2007 the reader can find very accessible introductions to and discussions of decoherence.)
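A standard textbook toy model, offered here as a sketch and not as Wallace's own presentation, makes the loss of interference visible. A two-state system in the state a|0> + b|1> becomes correlated with environment states, giving a|0>|env_0> + b|1>|env_1>. The interference terms are the off-diagonal entries of the system's reduced density matrix, and they are scaled by the overlap of the two environment states. The function name and the two-dimensional environment are assumptions of the example.

```python
import math

def reduced_density_matrix(a, b, env_0, env_1):
    """Reduced state of the system for a|0>|env_0> + b|1>|env_1>.

    The environment is traced out. env_0 and env_1 are normalized
    environment states given as lists of numbers.
    """
    # Overlap <env_1|env_0> of the two environment states.
    overlap = sum(x.conjugate() * y for x, y in zip(env_1, env_0))
    return [
        [abs(a) ** 2, a * b.conjugate() * overlap],
        [b * a.conjugate() * overlap.conjugate(), abs(b) ** 2],
    ]

s = 1 / math.sqrt(2)

# Before any correlation: the environment is in the same state either way.
coherent = reduced_density_matrix(s, s, [1, 0], [1, 0])

# After correlation: the environment states are orthogonal records.
decohered = reduced_density_matrix(s, s, [1, 0], [0, 1])

print(abs(coherent[0][1]), abs(decohered[0][1]))
```

The diagonal entries, which give the weights of the two terms, are the same in both cases; only the off-diagonal interference terms drop from about 0.5 to 0 once the environment states become orthogonal.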

Given that there is no preferred way to carve up the universe, aside from decoherence considerations, and no particular “most-fine-grained” way to describe the quantum structure of the universe, there is also no fact of the matter about how many branches there are, but Wallace does not see this as a problem. Rather he sees it as misguided to even ask the question “How many worlds are there?” much as it is misguided to ask “How many experiences did you have yesterday?” (Wallace 2010a: 67–8, Wallace 2012: 99–102, 120).

Recall that part of the job of Everett interpretation is to explain how it is that we get determinate measurement records when we do quantum mechanics experiments. Wallace argues that “the emergence of a classical world from quantum mechanics is to be understood in terms of the emergence from the theory of certain sorts of structures and patterns” (2003: 5), and it is these structures that are in superpositions, not the emergent macroscopic objects. To see what he means here, it is worth quoting him at some length:

To see in a different way how the ideas of Sections 4-5 resolve the problem of macroscopic indefiniteness, consider the following sketch of the problem.

  1. After the experiment, there is a linear superposition of a live cat and a dead cat.
  2. Therefore, after the experiment the cat is in a linear superposition of being alive and being dead.
  3. Therefore, the macroscopic state of the cat is indefinite.
  4. This is either meaningless or refuted by experiment.

But (1) does not imply (2). The belief that it does is based upon an oversimplified view of the quantum formalism, in which there is a Hilbert space of cat states such that any vector in the space is a possible state of the cat. This is superficially plausible in view of the way that we treat microscopic subsystems: an electron or proton, for instance, is certainly understood this way, and any superposition of electron states is another electron state. But any state of a cat is actually a member of a Hilbert space containing states representing all possible macroscopic objects made out of the cat’s subatomic constituents. Because of Dennett’s criterion, this includes states which describe

a live cat;
a dead cat;
a dead dog;
this paper . . .

We can say (if we want, and within nonrelativistic quantum mechanics) that the particles which used to make up the cat are now jointly in a linear superposition of being a live cat and being a dead cat. But cats themselves are not the sort of things which can be in superpositions. Cats are by definition “patterns which behave like cats”, and there are definitely two such patterns in the superposition.

The point can be made more generally:

It makes sense to consider a superposition of patterns, but it is just meaningless to speak of a given pattern as being in a superposition (Wallace 2003: 12; Wallace’s emphases).

In the Oxford MWI, decoherence gives rise to the branching structures that make up the various worlds. And it is these structures that give rise to the patterns from which macroscopic objects emerge. Decoherence also causes the interference between branches to “wash out,” and so systems appear to have determinate values in the decoherence basis. (For more on the Oxford MWI see Albert 2010, Kent 2010, Maudlin 2010, Price 2010, Vaidman 2014 and Bacciagaluppi and Ismael 2015.)

There has been quite a bit of work on developing a theory of probability in the context of the Oxford MWI (Saunders 1995, Saunders 1998, Vaidman 1998, Wallace 2002, Wallace September 2003, Saunders 2005, Wallace 2006, Greaves January 2007, Wallace 2007, Greaves and Myrvold 2010, Saunders 2010, Wallace 2010b, Tappenden 2011, Vaidman 2012, Wallace 2012, Wilson 2013). The work generally focuses on two different concerns in probability theory: explaining how one can recover uncertainty from a deterministic world, and explaining how we can make sense of the fact that we seem to be able to take branch weights to be related to probability, as the Born rule suggests that we can in a standard collapse interpretation.

Simon Saunders presents a tripartite view of what role chance plays in standard single world views of probability:

(i) Chance is measured by statistics, and perhaps, among observable quantities, only statistics, but only with high chance. (ii) Chance is quantitatively linked to subjective degrees of belief, or credences: all else being equal, one who believes that the chance of E is p will set his credence in E equal to p (the so-called “principal principle”). (iii) Chance involves uncertainty; chance events, prior to their occurrence, are uncertain (Saunders 2010: 181).

Linking chance to statistics (as in (i)) was originally argued by Everett (1973), as we have seen above. But while this might explain certain aspects of probability, it does not explain why we ought to take branch weights to be probability. The arguments for (ii) and (iii) aim to answer this question.

Wallace and others have argued that we can make sense of probability in a theory in which all possible outcomes occur (what has been called the “incoherence problem” in Greaves 2004, Wallace 2005, Wallace 2006, Baker 2007, Lewis 2007, and Saunders and Wallace 2008a; and what has been alternatively termed “Subjective Certainty” by Greaves 2004, Baker 2007, Greaves 2007, Lewis 2007, and Greaves and Myrvold 2010). Wallace’s justification for the claim that we can make sense of probability in a deterministic universe is the emergent structure that we have discussed above. It is only at the fundamental level that EQM is deterministic; at the emergent level it is not (Wallace 2012: 115). Once this has been established, two other problems remain to be solved, what Wallace calls the “practical problem” and the “epistemic problem” (Wallace 2012: 158). The first of these is how to justify allowing branch weights to play the role in decision making that ordinary probability plays in a classical context ((ii), above). The second asks how we justify taking branch weights to play the role of probability in showing that the results of experiments support quantum mechanics. Albert (2010) explains this general concern clearly when he writes, “Why (for example) should it come as a surprise, on a picture like [EQM], to see what we would ordinarily consider a low-probability string of experimental results? Why should such a result cast any doubt on the truth of this theory (as it does, in fact, cast doubt on quantum mechanics)” (Albert 2010: 356)?

The general principle of rationality on which Wallace’s argument is founded is David Lewis’ Principal Principle (Lewis 1980). In Wallace’s terminology: “For any real number x, a rational agent’s personal probability of an event E conditional on the objective probability of E being x, and on any other background information, is also x” (Wallace 2012: 140). Wallace’s goal is to show that there is an “Everett-specific derivation of the Principle” and,

to prove, rigorously and from general principles of rationality, that a rational agent, believing that (Everett-interpreted) quantum mechanics correctly gives the structure and dynamics of the world and that the quantum state of his branch is |ψ⟩, will act for all intents and purposes as if he ascribed probabilities in accordance with the Born Rule, as applied to |ψ⟩ (Wallace 2012: 150, 159).

To provide this rigorous proof, Wallace builds on an argument of David Deutsch’s (1997, 1999). He uses Deutsch’s results and furthers them to argue that branch weight non-circularly serves as objective probability. As the details of both Deutsch’s and Wallace’s arguments are quite technical, I leave it to the interested reader to investigate more fully. (More on the Deutsch-Wallace derivation of probability and refinements to their work can be found in Wallace 2002, Wallace September 2003, Greaves 2004, Saunders 2005, Greaves January 2007, Greaves March 2007, Wallace 2007, Greaves and Myrvold 2010.)
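The Principal Principle quoted above, and the Everettian analogue that Wallace aims to derive, can be written schematically as follows. This is only an illustrative sketch: the symbols Cr (credence), Ch (chance), B (background information), w (branch weight) and the projector \Pi_E are notation assumed here, not Wallace’s own formalism.

```latex
% Principal Principle (schematic): credence in E, conditional on the
% objective probability of E being x and on background information B, is x.
\[
  \mathrm{Cr}\bigl(E \mid \mathrm{Ch}(E) = x \,\wedge\, B\bigr) = x
\]

% Everettian analogue (schematic): the branch weight of E, computed from the
% branch state |\psi> by the Born rule, plays the role of Ch(E).
% \Pi_E is an assumed projector onto the branches in which E occurs.
\[
  w(E) = \langle \psi \,|\, \Pi_E \,|\, \psi \rangle ,
  \qquad
  \mathrm{Cr}\bigl(E \mid w(E) = x \,\wedge\, B\bigr) = x
\]
```

On this way of putting it, the burden of the Deutsch-Wallace argument is to justify the second line without already assuming that w(E) is a probability.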

The arguments for (iii), that chance involves uncertainty, are the focus of quite a number of papers. Saunders believes that to have a fully worked out view of probability, one must explain uncertainty in EQM (2010: 189–90).  A view called Subjective Uncertainty [SU] is described by Saunders (1998). There he argues that there is uncertainty in even the deterministic physics of Everettian quantum mechanics. He asks us to consider a pre-measurement observer at time t1. Call her she1. When she1 measures the x-spin of a z-spin particle, this results in a branching of her world and she1 ends up with two successors at time t2: “she2” who sees “up” as her measurement result, and “she2” who sees “down” as her measurement result. The question is, “Who should she1 expect to become?”

There seem to be three possibilities: She can expect to become one of them, both of them or neither of them (Saunders 1998: 383). Saunders claims that it is nonsense to suggest that she1 should expect to become neither of them. In the emergent branching view, branching events are ubiquitous and yet we have the experience of continuing to move through time. She would not expect to become both of them because we do not have the experience of seeing two measurement results of our experiments. So the only remaining option is that she1 should expect to become one of her successors, but she does not know which one. This is the basis for the SU attitude about uncertainty. (For more on the link between uncertainty and probability in EQM see Ismael 2003, Greaves 2004, Wallace 2006, Baker 2007, Greaves January 2007, Greaves March 2007, Lewis 2007, and Wallace 2007.)
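The branching in Saunders’ example can be written out explicitly under two assumptions that the text does not state: that the particle starts out z-spin up, and that the measurement is ideal. The kets labelling the observer are illustrative notation only (the \text command assumes the amsmath package).

```latex
% A z-spin-up state expressed in the x-spin basis.
\[
  |{\uparrow_z}\rangle
  = \tfrac{1}{\sqrt{2}}\bigl(|{\uparrow_x}\rangle + |{\downarrow_x}\rangle\bigr)
\]

% Ideal x-spin measurement by the pre-measurement observer she_1 (time t_1 to t_2).
\[
  |{\uparrow_z}\rangle\,|\text{she}_1\rangle
  \;\longrightarrow\;
  \tfrac{1}{\sqrt{2}}\,|{\uparrow_x}\rangle\,|\text{she}_2 \text{ sees up}\rangle
  + \tfrac{1}{\sqrt{2}}\,|{\downarrow_x}\rangle\,|\text{she}_2 \text{ sees down}\rangle
\]

% Branch weights: the squared modulus of each coefficient.
\[
  \bigl|\tfrac{1}{\sqrt{2}}\bigr|^{2} = \tfrac{1}{2}
  \quad\text{for each successor.}
\]
```

Both terms are present after the measurement, which is why the question of which successor she1 should expect to become arises at all.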

Saunders links uncertainty to questions of personal identity in EQM (2010). If an agent, Alice, does not know what branch she is on, that can account for a type of uncertainty in EQM. This implies that some of the questions of uncertainty, and therefore probability, will depend upon answers to the question of diachronic personal identity in the context of EQM. (For more on self-locating uncertainty and diachronic identity in EQM see Vaidman 1998, Wallace 2005, Lewis 2007, Lewis 2008, Saunders and Wallace 2008a, 2008b, Tappenden 2008, Wallace 2012, Conroy 2016, and Sebens and Carroll 2016.)

c. Objections to the Oxford MWI

There are of course objections to the Oxford MWI. The first set to consider are those that focus on the use of decoherence to solve the preferred basis problem—a problem that must be solved to explain determinate measurement records and probability.

Recall that a system is said to decohere when it becomes correlated with something in its environment and thereby destroys the interference effects that would otherwise have been present if the system were in a pure state. The coefficients on each term of the state of the system, when one traces over the environment (that is, when one essentially ignores the effects that are caused by only the environment) approximately match the probabilities one obtains from the standard collapse formulation. But even though decoherence theorists believe they have solved the problem of accounting for what seem to be determinate measurement records (Zeh 1997), such interactions do not produce determinate results (Barrett 1999). The interaction between, say, a pointer on a measuring device and its environment does not produce a determinate position for the pointer. Decoherence destroys the interference effects, but it does not produce determinate measurement records. It appears as if a collapse has occurred, but all we really end up with is a more complex entangled superposition (Albert 1992, Barrett 1999).
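The tracing-over step described above can be illustrated with a small numerical sketch. It uses a toy model that is assumed here for illustration and is not drawn from the cited literature: a two-state system becomes correlated with n environment particles, each of which ends up in one of two states whose overlap is a real number between 0 and 1. The function name and parameters are hypothetical.

```python
import math


def reduced_density_matrix(alpha, beta, overlap, n_env):
    """Return the 2x2 reduced density matrix of a two-state system.

    Assumed toy model: the global state is
        alpha |0>|e0>^n + beta |1>|e1>^n,
    where <e0|e1> = overlap (a real number in [0, 1]) for each of the
    n_env environment particles. Tracing over the environment leaves the
    diagonal terms untouched and multiplies the off-diagonal
    (interference) terms by overlap ** n_env.
    """
    suppression = overlap ** n_env
    rho00 = abs(alpha) ** 2  # Born-rule weight of |0>
    rho11 = abs(beta) ** 2   # Born-rule weight of |1>
    rho01 = alpha * complex(beta).conjugate() * suppression
    rho10 = rho01.conjugate()
    return [[rho00, rho01], [rho10, rho11]]


# Equal superposition; each environment particle only partly
# distinguishes the two system states (overlap 0.9).
amp = 1 / math.sqrt(2)
for n in (0, 1, 10, 100):
    rho = reduced_density_matrix(amp, amp, 0.9, n)
    print(n, rho[0][0], rho[1][1], abs(rho[0][1]))
```

In this sketch the diagonal entries stay at the Born-rule weights while the interference terms shrink rapidly as n grows. Nothing in the calculation selects one of the two diagonal terms, and the global state remains an entangled superposition throughout, which is the point of the objection.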

Related to this worry is the fact that using decoherence considerations to choose a preferred basis does not clarify exactly which basis is chosen. In every interaction the property that decoheres most quickly and completely will always be one that is close to position, but it will not always be the same property each time (Barrett 1999: 242–4). This is not going to be troubling to an MWI theorist like Wallace, however, as he does not take there to be a particular fine-grained way to carve up the universe, provided that it is done in a way that preserves the emergence of macroscopic entities from the underlying structure (Wallace 2012).

An additional problem with taking a property very close to position to be the preferred basis in which to write the state of the system is that this is adding an additional principle to quantum mechanics, one that says that whatever decoheres most quickly and completely is the preferred observable chosen to be determinate (Barrett 1999). This does not concern most Oxford MWI theorists, though, as they do not take themselves to be doing Everett exegesis.

A more pressing problem that faces the MWI theorist who relies on decoherence is the question of circularity. David Baker (2007) and Ruth Kastner (2014) have both argued that the use of decoherence begs the question regarding the derivation of probability because the use of decoherence presupposes a concept of objective probability. As Baker puts it, “proofs of decoherence depend implicitly upon the truth of the Born rule. Without first justifying the probability rule, Everettians cannot establish the existence of a preferred basis or the division of the wave function into branches. But without a preferred basis or a specification of branches, there can be no assignment of probabilities to measurement outcomes” (Baker 2007: 3). Kastner puts her concern this way: “the goal of decoherence is to obtain vanishing of the off-diagonal terms, which corresponds to the vanishing of interference and the selection of the observable R as the one with respect to which the universe purportedly ‘splits’ in an Everettian account . . . the vanishing of the off-diagonal terms is crucially dependent on an assumption that makes the derivation circular” (Kastner 2014: 57). Both Baker and Kastner are pointing to the fact that in order to ignore the off-diagonal terms, the crucial step in decoherence that provides the preferred basis, one must already have a conception of probability.

The best that Oxford MWI theorists can do, according to Baker, is to show that because the off-diagonal terms are incredibly close to zero, the observer can ignore them as part of her decision making (she ought not care much about them). If, however, that is justified by saying that it is because those terms are very unlikely to occur, then one is bringing in an illegitimate notion of probability (Baker 2007: 19ff). Kastner points to a different problem, the problem of the arbitrariness of the division between system, measuring device and environment (Kastner 2014: 57). It is crucial that one be able to ignore the effects of the environment, but if the division is arbitrary, then there is no non-circular way of isolating only the environment.

There are at least two other objections raised against the Oxford MWI that are in this vein, Barnum et al 2000 and Hemmo and Pitowsky 2007. The former points to a hidden assumption in Deutsch’s proof, one that “is not just a minor addition to Deutsch’s list of assumptions, but rather a major conceptual shift. The assumption is akin to applying Laplace’s Principle of Insufficient Reason to a set of indistinguishable alternatives, an application that requires acknowledging a priori that amplitudes are related to probabilities” (Barnum et al 2000: 1180). Hemmo and Pitowsky criticize the use of the Everettian Principal Principle, essential to Wallace’s derivation of the Born rule, on the grounds that it is incoherent to claim that observers should treat a term with a measure of zero as if it had a probability of zero. They also argue that there is no reason to believe (that is not question-begging) that rational observers would be justified in believing the statistical predictions of quantum mechanics if they also believed the Oxford MWI (Hemmo and Pitowsky 2007: 334). Some of these objections were addressed by Wallace and others in the intervening years. But other criticisms have also been raised. Most of these criticize the use of decision theory to guide the derivations of probability in EQM.

David Albert (2010) argues that the entire program undertaken by Deutsch and Wallace (and others supporting them) misses the point. He writes, “the question out of which this entire program arises, seems like the wrong question. The questions to which this program is addressed are questions of what we would do if we believe that the fission hypothesis were correct. But the question at issue here is precisely whether to believe that the fission hypothesis is correct” (359)! The “fission hypothesis” is the hypothesis that the Schrödinger equation is the complete story of the evolution of the world and that each branch that results from a branching event has an observer with an actual experience. He goes on to say, “The decision-theoretic program seems to act as if what primarily and in the first instance stands in need of being explained about the world is why we bet the way we do” (359). And this is not what Albert believes needs to be explained, but rather, “What we need is an account of our actual empirical experiences of frequencies” (360).

Peter Lewis (2010) has argued that the decision-theoretic considerations that guide both Deutsch’s and Wallace’s arguments are not sufficient to show that the only rational guide to an observer’s decision-making procedures is the Born rule. Lewis argues that there is a gap in Deutsch’s proof by showing that there are other rules that an observer can follow and still be consistent with the rationality constraints assumed by Deutsch. Lewis acknowledges that Wallace has filled the gaps in Deutsch’s argument by adding a new axiom of rationality, but that unlike Deutsch’s axioms, Wallace’s addition is “not an innocuous and general axiom of rationality . . . [rather] it is a substantive claim about decision-making in the specific context of Everettian quantum mechanics, and so requires a substantive justification” (Lewis 2010: 21). Lewis does not believe that the justification that Wallace provides is sufficient. Alastair Wilson (2013, 2015) has worked to counter this criticism of Lewis’ by proposing new principles that tie the physics of EQM to modal metaphysics, thereby helping to provide the justification for some of Wallace’s most contentious claims.

Adrian Kent (2010) argues that Wallace’s attempt to axiomatize rational decision making in EQM, using decision theory as its model, is incoherent, and that in fact “Wallace’s axioms are not constitutive of rationality either in Everettian quantum theory or in theories in which branchings and branch weights are precisely defined” (307). Huw Price (2010) argues that because in the Oxford MWI view there are multiple future successors to an observer that she ought to care about, it is irrelevant to the observer (as far as rationality requirements go) which future branch she will subjectively occupy (378–79). Following this and further detailed argument, Price concludes that “there seems little prospect that a Deutschian decision rule can be a constraint of rationality, in a manner analogous to the classical case” (389).

The Oxford MWI has had a great deal of influence on many philosophers, and so work on the question of probability has not stopped. For more work on the question of probability see Dizadji-Bahmani 2013, Dawid and Thébault 2014, Wilson 2015, Jansson 2016, and Sebens and Carroll 2016.

5. Relational Interpretations of Everettian Quantum Mechanics

Given the centrality of Everett’s notion of relative states, it seems important to consider those interpretations that also highlight their importance. In section 3a we considered the bare theory interpretation of Everett. The two interpretations considered in this section are Simon Saunders’ relational interpretation and the relative facts interpretation.

a. Simon Saunders’ Relational Interpretation

Simon Saunders developed an interpretation of Everett in which values for systems can only be defined relative to a point of view, or a context of interest or relevance.  He proposes that just as we understand facts about tense to be relations, we should also understand facts about the properties of systems to be relations (Saunders 1993, 1995, 1996a, 1996b, 1997, 1998).

Let us assume that what appears to be the case is in fact the case, and that we are tracking the truth about the world when we make statements about how the world appears to us.  Then, there must be something about the world that makes true the proposition

 (9)  My coffee cup is on my desk.

This is what is known as the truthmaker principle.  Saunders has proposed that we understand the truthmaker for (9) analogously to the way B-theorists about time understand the truthmaker for a proposition like

(10)  My coffee cup was at home.

A B-theorist takes the truthmaker for (10) to be a fundamentally relational fact. For the B-theorist a statement like (10) can be reduced to a statement like

(11) The event of my coffee cup’s being at home is earlier than now.

For a B-theorist, relations such as the one in (11) are permanent dyadic relations that order positions in time. Other B-series relations include simultaneous with, earlier than, and later than. (For more on the B-theory of time, see McTaggart 1908, Maudlin 2002 and Markosian 2008 and the article on time in this encyclopedia.)

Saunders draws an analogy between what he considers the nature of the truthmaker for a proposition about the property of an object and what a B-theorist considers the nature of the truthmaker for a proposition about the property of an event like

(12) Event e is past.

In both cases the truthmaker is some fundamentally relative fact (Saunders 1995).

So now suppose that there are two non-concurrent events, E and E’. Then the following two statements, while both true, appear to be contradictory:

(13) Event E is now.

(14) Event E’ is now.

However, if we introduce two other events, W and W’, that are not identical and not concurrent, then we can resolve the apparent contradiction by instead saying:

(15) Event E is now relative to event W.

(16) Event E’ is now relative to event W’.

And these two statements are not contradictory.

Saunders suggests that we extend this analogy to the consideration of truthmakers for propositions like (9).  If we do, then a seeming contradiction in pure wave mechanics has a solution that is analogous to the solution to the apparent contradiction in tense metaphysics (Saunders 1995).

In a relational interpretation of EQM, most every physical system_G will typically have most every possible relative value for a property, just as every event has all of the qualities of past, present and future.  So, it is true to say:

 (17) X has value x.

and

(18) X has value x’.

even if x ≠ x’ and the two are mutually exclusive.  But if so, then (17) and (18) are contradictory.

However, if we now introduce two parameters, Y and Y’, that can take values and are such that Y ≠ Y’, we can restate (17) and (18) as:

(19) X has value x relative to Y having value y.

(20) X has value x’ relative to Y’ having value y’.

It is clear that (19) and (20) are not contradictory.

In this view, an event’s having a seemingly particular (tensed) time is analogous to an object having a seemingly particular (determinate) value for a property.  An event happened in the past (or future) relative to another time.  Likewise, an object_L has a determinate value relative to some parameter.

By relativizing the property of an object, what we are doing is explicitly changing the focus of our discussion from that of the properties of an object_G to that of the properties of an object_L.  So in (17) and (18) it is to X_G that we are referring, but in (19) and (20) it is to X_L that we are referring.  In the tense case, we change our focus from one concept of “now” in pair (13) and (14), to a different, relative concept of “now” in the pair (15) and (16).

The question then becomes, what are the relativizing parameters Y and Y’?  For Saunders, the parameters are worlds, or branches, at a particular time. In light of this, consider again (9).  In analogy with the B-theory, one recourse for explaining its truth (when it is in fact true) is to say that it is true because there is a determinate relative fact that consists in the coffee cup being on my desk.  Each possible fact about the value for the coffee cup_G’s position occurs in a different world.  So Saunders makes sense of the truth of a proposition like (9) by relativizing the cup_L’s position value to the world in which it finds itself.  Relative to being in this world, the coffee cup_L is on my desk; relative to being in a different world, the coffee cup_L is in the Mariana Trench; relative to being in yet another world, the coffee cup_L is in my mother’s kitchen.  Thus, the fact that makes (9) true is a relation between the position value for the coffee cup_L and the world in which the coffee cup_L finds itself. (For more discussion and criticism of Saunders’ relational interpretation see Barrett 1999, Conroy 2010, Laudisa and Rovelli 2008.)

b. The Relative Facts Interpretation

Another attempt to read a metaphysical picture from EQM is the relative facts interpretation [RFI]. It takes seriously Everett’s claim of the centrality of the notion of relative states and adds no additional principles to his pure wave mechanics. It bears a great deal of similarity to Saunders’ relational interpretation, but whereas Saunders’ view is a many-worlds view, the RFI takes there to be just one world in which all objects generally have relational properties (Conroy 2010, 2012, 2016).

Consider the problem described in the last section of resolving the contradiction between (17) and (18). The RFI can do so by defining the parameters Y and Y’ as being the (relative) state of the complement of the system that we are considering.  So we can make sense of the truth of a proposition like (9) by relativizing a coffee cup_L’s position to the state of the complement of the system of which it is a part.  My coffee cup_L is on my desk relative to my having put it there, my desk being in my office, to my not having knocked the cup off, and so forth.  The fact that my desk is in my office is relative to some other collection of relative facts about the complement of its system (the movers that put it there being a part of that), and this goes on ad infinitum.  Likewise, my coffee cup_L is in the Mariana Trench relative to my having decided to take a cruise in the Pacific, and my having dropped my coffee cup over the side of the ship at the right time, and so on. The truthmaker for a proposition like (9), in the RFI, is a relative fact.
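The idea of relativizing a system’s value to the state of its complement can be sketched with a schematic two-term state. The coefficients a and b, the subscripts X (the cup) and C (its complement), and the state labels are all assumed here for illustration; a realistic state would contain vastly more terms (the \text command assumes the amsmath package).

```latex
% Schematic entangled state of the cup X and the complement C of its system.
\[
  |\Psi\rangle
  = a\,|\text{on desk}\rangle_{X}\,|y\rangle_{C}
  + b\,|\text{in trench}\rangle_{X}\,|y'\rangle_{C},
  \qquad \langle y \,|\, y' \rangle = 0
\]

% Relative states in Everett's sense:
%   relative to C being in |y>,  X is in |on desk>;
%   relative to C being in |y'>, X is in |in trench>.
% Neither |on desk> nor |in trench> is the state of X considered on its own.
```

Read this way, (19) and (20) each report one term of the superposition, which is why they do not contradict one another.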

That there can be a metaphysics of relative facts has been argued in the context of quantum mechanics in a general sense. Because entanglement is an inescapable part of the quantum mechanical world, several philosophers have argued that non-separable, entangled quantum mechanical states imply that there are relations that fail to be supervenient upon or be grounded by non-relational properties of their relata, and that this leads to quantum holism and a metaphysics consisting of non-reducible relations. Thus it seems reasonable to take a relational metaphysics and apply it to an Everett interpretation given the importance he places on the notion of relative states. (For more on the development of a relational metaphysics in light of considerations of quantum entanglement see Cleland 1984, Teller 1986, Teller 1989, French 1989, Healey 1991, Esfeld 2003, Esfeld 2004, Schaffer 2010, Calosi 2014, McKenzie 2014, and Esfeld 2016.)

Both the RFI and Saunders’ relational interpretation can explain determinate measurement results by saying that an observer has a determinate relative result. While Saunders has worked on solving the problem of probability (see section 4b above), the RFI still lacks a development of a clear sense of how probability is meant to function.

6. Conclusion

Although there is no consensus about the correct way to interpret the outcome of quantum mechanical experiments, one very influential way of doing so is due to Hugh Everett III, who suggested that we drop the collapse postulate and take the universe to be such that its quantum state is an incredibly complex superposition that never collapses. The work that Everett did in the late 1950s has led many philosophers and philosophers of physics to attempt to build a metaphysical picture of the world based on the physics that he proposed. This article has surveyed the major developments in the work that was inspired by Everett’s 1957 dissertation. Just as there is no consensus on whether or not Everett has the best physics for the description of the world, there is no consensus on the best way to read a metaphysical picture off of the world Everett described. It is left to the reader to adjudicate these debates.

7. References and Further Reading

  • Albert, David Z. Quantum Mechanics and Experience.  Cambridge: Harvard University Press, 1992.
  • Albert, David Z. “Probability in the Everett Picture.” In Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 355–368.
  • Albert, David Z. and Barry Loewer.  “Interpreting the Many Worlds Interpretation.”   Synthese 77 (1988): 195–213.
  • Bacciagaluppi, Guido and Jenann Ismael. “Book Review: The Emergent Multiverse: Quantum Theory According to the Everett Interpretation.” Philosophy of Science 82, no. 1 (January 2015): 129–148.
  • Baker, David J. “Measurement Outcomes and Probability in Everettian Quantum Mechanics.” Studies in History and Philosophy of Modern Physics 38 (2007): 153–69.
  • Barnum, Howard, Carlton M. Caves, Jerome Finkelstein and Ruediger Schack. “Quantum Probability from Decision Theory.” Proceedings of the Royal Society of London A, 456 (2000): 1175–1182.
  • Barrett, Jeff.   The Quantum Mechanics of Minds and Worlds.  New York: Oxford University Press, 1999.
  • Barrett, Jeff.  “Ithaca Interpretation of Quantum Mechanics.” In Compendium of Quantum Physics. Greenberger, Daniel, Klaus Hentschel and Friedel Weinert (eds.). Berlin: Springer, 2009: 325–6.
  • Barrett, Jeff. “A Structural Interpretation of Pure Wave Mechanics.” Humana Mente 13  (April 2010): 225–235.
  • Barrett, Jeffrey and Peter Byrne. The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Princeton, NJ: Princeton University Press, 2012.
  • Bevers, Brett. “Everett’s ‘Many Worlds’ Proposal.” Studies in History and Philosophy of Modern Physics, 42 (1) (February 2011): 3–12.
  • Birkhoff, Garrett and John von Neumann.  “The Logic of Quantum Mechanics.” Annals of Mathematics 37, no. 4 (1936): 823–843.
  • Birman, Fernando.  “Quantum Mechanics, Correlations, and Relational Probability.”  CRÍTICA, Revista Hispanoamericana de Filosofía 41, no. 121 (April 2009): 3–22.
  • Bohm, David. “A suggested interpretation of the quantum theory in terms of ‘hidden’ variables, I and II.” Physical Review, 85 (1952): 166–193.
  • Bohr, Niels.  “Quantum Mechanics and Physical Reality.” Nature 136 (1935): 1025–1026. Reprinted in Quantum Theory and Measurement.  Wheeler, John. A., and Wojciech. H. Zurek, (eds.).  Princeton: Princeton University Press, 1983: 144.
  • Bohr, Niels.  “Causality and Complementarity.” Philosophy of Science 4, no. 3 (July 1937): 289–98.
  • Bohr, Niels.  “Quantum Physics and Philosophy: Causality and Complementarity” (1958a).  In Essays 1958-1962 on Atomic Physics and Human Knowledge.  Vol. 3 of The Philosophical Writings of Niels Bohr.  Woodbridge, Conn.: Ox Bow Press, 1987: 1–7.
  • Bohr, Niels.  Atomic Physics and Human Knowledge.  New York: Wiley, 1958b.
  • Born, Max. “Zur Quantenmechanik der Stoßvorgänge.” Zeitschrift für Physik 37, No. 12 (December 1926): 863–67. English translation, “On the Quantum Mechanics of Collisions.” In Quantum Theory and Measurement.  Wheeler, John. A., and Wojciech. H. Zurek, (eds.).  Princeton: Princeton University Press, 1983: 52–55.
  • Brown, Matthew. J.  “Relational Quantum Mechanics and the Determinacy Problem.” British Journal for the Philosophy of Science 60 (2009): 679–95.
  • Butterfield, Jeremy. “Some Worlds of Quantum Theory.” In Scientific Perspectives on Divine Action. Robert John Russell, Philip Clayton, Kirk Wegter-McNelly, and John Polkinghorne (eds.). Vatican City: Vatican Observatory Publications, 2001.
  • Byrne, Peter. The Many Worlds of Hugh Everett III.  New York: Oxford University Press, 2010.
  • Calosi, Claudio. “Quantum Mechanics and Priority Monism.” Synthese 191, no. 5 (2014): 915–28.
  • Cleland, Carol. E.  “Space:  An Abstract System of Non-Supervenient Relations.” Philosophical Studies 46, no. 1 (July 1984): 19–40.
  • Clifton, Rob.  Ed.  Perspectives on Quantum Reality.  Boston: Kluwer Academic Press, 1996.
  • Conroy, Christina. “A relative facts interpretation of Everettian quantum mechanics.” PhD Thesis. Irvine: University of California, 2010.
  • Conroy, Christina. “The Relative Facts Interpretation and Everett’s note added in proof.” Studies in History and Philosophy of Modern Physics 43 (2012): 112–120.
  • Conroy, Christina.  “Branch-Relative Identity.” In Individuals Across the Sciences. Edited by Alexandre Guay and Thomas Pradeu. Oxford University Press: New York, 2016: 250–271.
  • Dawid, Richard and Karim Thébault. “Against the Empirical Viability of the Deutsch-Wallace-Everett Approach to Quantum Mechanics.” Studies in the History and Philosophy of Modern Physics, 47 (2014): 55–61.
  • Dizadji-Bahmani, Foad. “The probability problem in Everettian quantum mechanics persists.” British Journal for the Philosophy of Science 0. (2013): 1–27
  • de Broglie, Louis. Solvay Congress (1927). Electrons et Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24 au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay, Paris: Gauthier-Villars, 1928.
  • Deutsch, David. The Fabric of Reality. New York: Penguin Books, 1997.
  • Deutsch, David. “Quantum Theory of Probability and Decisions.” Proceedings of the Royal Society of London A458 (1999): 2911–23.
  • Deutsch, David. “Apart from Universes.” 2010. In Many Worlds? Everett, Quantum Theory, & Reality. Saunders, Simon, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press: 542-552.
  • DeWitt, Bryce. S. Letter to John Wheeler. 1957. Reprinted in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne (eds.). Princeton: Princeton University Press, 2012: 242–251.
  • DeWitt, Bryce. “Everett-Wheeler Interpretation of Quantum Mechanics.”  1968. In Battelle Rencontres. DeWitt, Cecile M. and John. A. Wheeler, (eds.).  New York: Benjamin, 1 January 1968.
  • DeWitt, Bryce.  1970. “Quantum Mechanics and Reality.”  Physics Today 23, no. 9.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973.
  • DeWitt, Bryce.  “The Many-Universes Interpretation of Quantum Mechanics.”   Foundations of Quantum Mechanics.  New York: Academic Press Inc., 1971.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973.
  • DeWitt, Bryce. S., and Neill Graham, (eds.).  The Many-Worlds Interpretation of Quantum Mechanics. Princeton: Princeton University Press, 1973.
  • DeWitt, Cecile M. and John. A. Wheeler, eds.  Battelle Rencontres. New York: Benjamin, 1 January 1968.
  • Esfeld, Michael.  “Do Relations Require Underlying Intrinsic Properties?  A Physical Argument for a Metaphysics of Relations.” Metaphysica: International Journal for Ontology and Metaphysics 4, no. 1 (2003): 5–25.
  • Esfeld, Michael.  “Quantum Entanglement and a Metaphysics of Relations.” Studies in History and Philosophy of Modern Physics 35 (2004): 601–617.
  • Esfeld, Michael. “The Reality of Relations: The Case from Quantum Physics.” In The Metaphysics of Relations. Marmodoro, Anna and David Yates, (eds.). New York: Oxford University Press, 2016: 218–34.
  • Everett, Hugh III. “On the Foundations of Quantum Mechanics.”  PhD Thesis.  Princeton University. 1957a.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 3–140.
  • Everett, Hugh III.  “’Relative State’ Formulation of Quantum Mechanics.” Reviews of Modern Physics 29 (1957b): 454–462. Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 141–49.
  • Everett, Hugh III. Letter to Bryce DeWitt.  May 31, 1957c. Reprinted in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 252–256.
  • Everett, Hugh III. “The Theory of the Universal Wave Function.” 1973. In The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 3–140.
  • Everett, Hugh III. “Quantitative Measure of Correlation.” Printed in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 61–63.
  • French, Steven.  “Individuality, supervenience and Bell’s Theorem.” Philosophical Studies 55 (1989): 1–22.
  • Ghirardi, Giancarlo, Alberto Rimini and Tullio Weber. “Unified dynamics for microscopic and macroscopic systems.” Physical Review D34 (1986): 470.
  • Graham, Neill.  “The Measurement of Relative Frequency.” 1973. In The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 229–253.
  • Greaves, Hilary. “Understanding Deutsch’s Probability in a Deterministic Multiverse.” Studies in History and Philosophy of Modern Physics 35 (2004): 423–56.
  • Greaves, Hilary. “Probability in the Everett Interpretation.” Philosophy Compass 2, 1 (January 2007): 109–128.
  • Greaves, Hilary. “On the Everettian Epistemic Problem.” Studies in History and Philosophy of Modern Physics 38, 1 (March 2007): 120–152.
  • Greaves, Hilary and Wayne Myrvold. “Everett and Evidence.” In Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010: 264–304.
  • Healey, Richard A.  “How Many Worlds?” Noûs 18 (1984): 591–616.
  • Healey, Richard A.  “Holism and Nonseparability.” Journal of Philosophy 88 (1991): 393–421.
  • Hemmo, Meir and Itamar Pitowsky. “Quantum Probability and Many Worlds.” Studies in the History and Philosophy of Modern Physics 38 (2007): 333–350.
  • Hooker, Clifford A. The Logico-Algebraic Approach to Quantum Mechanics, vol. 1.  Dordrecht: D. Reidel, 1975.
  • Ismael, Jenann. “How to Combine Chance and Determinism: Thinking About the Future in an Everett Universe.” Philosophy of Science 70 (2003): 776–90.
  • Jammer, Max.  The Philosophy of Quantum Mechanics.  New York: McGraw Hill, 1974.
  • Jansson, Lina. “Everettian Quantum Mechanics and Physical Probability: Against the Principle of ‘State Supervenience’.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 53 (February 2016): 45–54.
  • Joos, Erich, H. Dieter Zeh, Claus Kiefer, Domenico Giulini, Joachim Kupsch, and Ion-Olimpiu Stamatescu, (eds.). Decoherence and the Appearance of a Classical World in Quantum Theory. Berlin: Springer; second revised edition, 2003.
  • Kastner, Ruth E. “‘Einselection’ of Pointer Observables: The New H-Theorem?” Studies in History and Philosophy of Modern Physics 48 (2014): 56–58.
  • Kent, Adrian. “One World Versus Many: The Inadequacy of Everettian Accounts of Evolution, Probability and Scientific Confirmation” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 307–354.
  • Laudisa, Federico and Carlo Rovelli.  “Relational Quantum Mechanics.” Stanford Encyclopedia of Philosophy, (2008). http://plato.stanford.edu/entries/qm-relational/.
  • Lewis, David. “A Subjectivist’s Guide to Objective Chance” in Studies in Inductive Logic and Probability, Volume II. Richard C. Jeffrey, (ed.). Berkeley: University of California Press. 1980: 263–293.
  • Lewis, Peter. “Interpretations of Quantum Mechanics.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/int-qm/.
  • Lewis, Peter. “Uncertainty and Probability for Branching Selves.” Studies in History and Philosophy of Modern Physics 38 (2007): 1–14.
  • Lewis, Peter. “Probability, Self-Location and Quantum Branching.” Philosophy of Science 76 (5), (December 2009): 1009–1019.
  • Lewis, Peter.  “Probability in Everettian Quantum Mechanics.” Manuscrito, 33 (2010): 285–306.
  • Lewis, Peter. Quantum Ontology. New York: Oxford University Press, 2016.
  • Mackey, George W.  “Quantum Mechanics and Hilbert Space.” American Mathematical Monthly 64 (1957): 45–57.
  • Mackey, George W.  Foundations of Quantum Mechanics.  New York: W. A. Benjamin, 1963.
  • Markosian, Ned.  “Time.” Stanford Encyclopedia of Philosophy (2008). http://plato.stanford.edu/entries/time.
  • Maudlin, Tim.  Quantum Non-Locality and Relativity: Metaphysical Intimations of Modern Physics. Oxford: Blackwell, 2002.
  • Maudlin, Tim. “Can the World Be Only Wavefunction?” in Many Worlds? Everett, Quantum Theory, & Reality. Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 121–143.
  • McKenzie, Kerry. “Priority and Particle Physics: Ontic Structural Realism as a Fundamentality Thesis.” British Journal for the Philosophy of Science 65 no. 2, (2014): 353–80.
  • McTaggart, J. M. Ellis. “The Unreality of Time.” Mind 17 (October 1908): 457–474.
  • Mermin, David. “What is Quantum Mechanics Trying to Tell Us?” American Journal of Physics 66 (1998): 753–767.
  • Ney, Alyssa and David Albert. The Wave Function. New York: Oxford University Press, 2013.
  • Price, Huw. “Decisions, Decisions, Decisions: Can Savage Salvage Everettian Probability?” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 369–390.
  • Putnam, Hilary. “Is Logic Empirical?” Boston Studies in the Philosophy of Science, vol. 5. Cohen, Robert S. and Marx W. Wartofsky, (eds.). Dordrecht: D. Reidel, 1968: 216–241.
  • Reichenbach, Hans. Philosophical Foundations of Quantum Mechanics. Berkeley: University of California Press, 1944.
  • Rovelli, C. “Relational quantum mechanics.” International Journal of Theoretical Physics 35, no. 8 (1996): 1637–1678.
  • Saunders, Simon.  “Decoherence, Relative States, and Evolutionary Adaptation.” Foundations of Physics 23, no. 12 (1993): 1553–1585.
  • Saunders, Simon.  “Time, Quantum Mechanics, and Decoherence.” Synthese 102, no. 2 (1995): 235–266.
  • Saunders, Simon.  “Relativism.”  (1996a). In Perspectives on Quantum Reality. Clifton, Rob, (ed.).  Boston: Kluwer Academic Press, 1996.
  • Saunders, Simon.  “Time, Quantum Mechanics and Tense.” Synthese 107 (1996b): 19–53.
  • Saunders, Simon. “Naturalizing Metaphysics.”  The Monist 80. no. 1 (1997): 44–69.
  • Saunders, Simon.  “Time, Quantum Mechanics and Probability.” Synthese 114 (1998): 373–404.
  • Saunders, Simon. “What is Probability?” In Quo Vadis Quantum Mechanics. Avshalom C. Elitzur, Shahar Dolev, and Nancy Kolenda, (eds.). Berlin: Springer-Verlag, 2005.
  • Saunders, Simon. “Chance in the Everett Interpretation” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010: 181–205.
  • Saunders, Simon, Jonathan Barrett, Adrian Kent and David Wallace (eds.).  Many Worlds? Everett, Quantum Theory, & Reality. New York: Oxford University Press. 2010.
  • Saunders, Simon and David Wallace. “Branching and Uncertainty.” (2008a). British Journal for the Philosophy of Science 59, (3): 293–305.
  • Saunders, Simon and David Wallace. “Reply.” (2008b). British Journal for the Philosophy of Science 59, (3): 315–37.
  • Schaffer, Jonathan. “Monism: The Priority of the Whole.” Philosophical Review 119 no.1 (2010): 31–76. Reprinted in Spinoza on Monism. Goff, Philip, (ed.). New York: Palgrave, 2012: 149–66.
  • Schaffer, Jonathan and Jenann Ismael. “Quantum Holism: Nonseparability as Common Ground.” Synthese (2016): online published first.
  • Schlosshauer, Maximilian. Decoherence and the Quantum-to-Classical Transition. Heidelberg: Springer. 2007.
  • Schrödinger, Erwin. “Die gegenwärtige Situation in der Quantenmechanik.” Naturwissenschaften, 23 (1935): 807–812, 823–828, 844–849; English translation by Trimmer, J. D. “The Present Situation in Quantum Mechanics: A Translation of Schrödinger’s ‘Cat Paradox’ Paper.” Proceedings of the American Philosophical Society, 124: 323–338, reprinted in Wheeler and Zurek (1983).
  • Sebens, Charles and Sean M. Carroll. “Self-Locating Uncertainty and the Origin of Probability in Everettian Quantum Mechanics.” British Journal for the Philosophy of Science. First published online July 5, 2016.
  • Tappenden, Paul. “Saunders and Wallace on Everett and Lewis” (2008). British Journal for the Philosophy of Science, 59: 307–314.
  • Tappenden, Paul. “Evidence and Uncertainty in Everett’s Multiverse,” British Journal for the Philosophy of Science, 62 (2011): 99–123.
  • Teller, Paul.  “Relational Holism and Quantum Mechanics.” The British Journal for the Philosophy of Science 37, no. 1 (March 1986): 71–81.
  • Teller, Paul. “Relativity, Relational Holism and the Bell Inequalities.” In Philosophical consequences of quantum theory: Reflections on Bell’s theory. Cushing, James and Ernan McMullin, (eds.).  South Bend, IN: University of Notre Dame Press. (1989): 208–23.
  • Vaidman, Lev. “On the Schizophrenic Experiences of the Neutron or Why We Should Believe in the Many-Worlds Interpretation of Quantum Theory.” International Studies in the Philosophy of Science 12, 3 (1998): 245–261.
  • Vaidman, Lev. “Probability in the Many-Worlds Interpretation of Quantum Mechanics.” In Probability in Physics, The Frontiers Collection XII. Yemima Ben-Menahem and Meir Hemmo, (eds.). Berlin: Springer. (2012): 299–311.
  • Vaidman, Lev. “Review: David Wallace The Emergent Multiverse: Quantum Theory According to the Everett Interpretation.” The British Journal for the Philosophy of Science, 0 (2014): 1–4.
  • van Fraassen, Bas C. “Rovelli’s World.” Foundations of Physics 40, no. 4 (2009): 390–417.
  • Wallace, David. “Quantum Probability and Decision Theory Revisited.” (2002). http://arxiv.org/abs/quant-ph/0211104
  • Wallace, David. “Everettian Rationality: Defending Deutsch’s Approach to Probability in the Everett Interpretation.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 34, 3 (September 2003): 415–439.
  • Wallace, David. “Everett and Structure.” (2003). Studies in History and Philosophy of Science, Part B 34, (1): 87–105.
  • Wallace, David. “Language Use in a Branching Universe.” (2005). Preprint: http://philsci-archive.pitt.edu/archive/00002554/
  • Wallace, David. “Epistemology Quantized: Circumstances in Which We Should come to Believe in the Everett Interpretation.” (2006). British Journal for the Philosophy of Science 57, (4): 655–689.
  • Wallace, David. “Quantum Probability from Subjective Likelihood: Improving on Deutsch’s Proof of the Probability Rule.” (2007) Studies in History and Philosophy of Modern Physics 38, 311–32.
  • Wallace, David. “Decoherence and Ontology” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010a: 53–72.
  • Wallace, David. “How to Prove the Born Rule” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010b: 227–263.
  • Wallace, David. The Emergent Multiverse: Quantum Theory According to the Everett Interpretation. New York: Oxford University Press, 2012.
  • Wheeler, John A.  Geons, Black Holes and Quantum Foam.  New York: W. W. Norton, 1998.
  • Wheeler, John. A., and Wojciech. H. Zurek, (eds.). Quantum Theory and Measurement. Princeton: Princeton University Press, 1983.
  • Wilson, Alastair. “Objective Probability in Everettian Quantum Mechanics.” British Journal for the Philosophy of Science 64 (4) (2013): 709–737.
  • Wilson, Alastair. “The Quantum Doomsday Argument.” British Journal for the Philosophy of Science 0 (2015): 1–19.
  • Xavier University Conference Transcript. October 1–5, 1962.  Cincinnati, Ohio. Printed in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 267–279.
  • Zeh, H. Dieter. “On the Interpretation of Measurement in Quantum Theory.” (1970). Foundations of Physics 1: 69–76. Reprinted in Quantum Theory and Measurement. Wheeler, John A., and Wojciech H. Zurek, (eds.). Princeton: Princeton University Press, 1983: 342–49.
  • Zeh, H. Dieter. “Toward a Quantum Theory of Observation.” (1973). Foundations of Physics 3: 109–16.
  • Zeh, H. Dieter. “Basic Concepts and Their Interpretation.” (1995). In Decoherence and the Appearance of a Classical World in Quantum Theory. Giulini, Domenico, Erich Joos, Claus Kiefer, Joachim Kupsch, Ion-Olimpiu Stamatescu, and H. Dieter Zeh, (eds.). Berlin: Springer, 2003: Chapter 2.
  • Zurek, Wojciech H. “Pointer Basis of Quantum Apparatus: Into what Mixture does the Wave Packet Collapse?” (1981). Physical Review D (24): 1516–1525.
  • Zurek, Wojciech H. “Environment-Induced Superselection Rules.” (1982). Physical Review D (26): 1862–1880.
  • Zurek, Wojciech H. “Decoherence and the Transition from Quantum to Classical.” (1991). Physics Today 44 (October): 36–44.
  • Zurek, Wojciech H. “Decoherence and the Transition from Quantum to Classical – Revisited.” (2002). Los Alamos Science 27: 2–25.

 

Author Information

Christina Conroy
Email: c.conroy@moreheadstate.edu
Morehead State University
U. S. A.

Integrated Information Theory of Consciousness

Integrated Information Theory (IIT) offers an explanation for the nature and source of consciousness. Initially proposed by Giulio Tononi in 2004, it claims that consciousness is identical to a certain kind of information, the realization of which requires physical, not merely functional, integration, and which can be measured mathematically according to the phi metric.

The theory attempts a balance between two different sets of convictions. On the one hand, it strives to preserve the Cartesian intuitions that experience is immediate, direct, and unified. This, according to IIT’s proponents and its methodology, rules out accounts of consciousness, such as functionalism, that explain experience as a system operating in a certain way; it also rules out eliminativist theories that deny the existence of consciousness. On the other hand, IIT takes neuroscientific descriptions of the brain as a starting point for understanding what must be true of a physical system in order for it to be conscious. (Most of IIT’s developers and main proponents are neuroscientists.) IIT’s methodology involves characterizing the fundamentally subjective nature of consciousness and positing the physical attributes necessary for a system to realize it.

In short, according to IIT, consciousness requires a grouping of elements within a system that have physical cause-effect power upon one another. This in turn implies that only reentrant architecture consisting of feedback loops, whether neural or computational, will realize consciousness. Such groupings make a difference to themselves, not just to outside observers. This constitutes integrated information. Of the various groupings within a system that possess such causal power, one will do so maximally. This local maximum of integrated information is identical to consciousness.

IIT claims that these predictions square with observations of the brain’s physical realization of consciousness, and that, where the brain does not instantiate the necessary attributes, it does not generate consciousness. Bolstered by these apparent predictive successes, IIT generalizes its claims beyond human consciousness to animal and artificial consciousness. Because IIT identifies the subjective experience of consciousness with objectively measurable dynamics of a system, the degree of consciousness of a system is measurable in principle; IIT proposes the phi metric to quantify consciousness.

Table of Contents

  1. The Main Argument
    1. Cartesian Commitments
      1. Axioms
      2. Postulates
    2. The Identity of Consciousness
      1. Some Predictions
    3. Characterizing the Argument
  2. The Phi Metric
    1. The Main Idea
    2. Some Issues of Application
  3. Situating the Theory
    1. Some Prehistory
    2. IIT’s Additional Support
    3. IIT as Sui Generis
    4. Relation to Panpsychism
      1. Relation to David Chalmers
  4. Implications
    1. The Spectrum of Consciousness
    2. IIT and Physics
    3. Artificial Consciousness
      1. Constraints on Structure/Architecture
      2. Relation to “Silent Neurons”
  5. Objections
    1. The Functionalist Alternative
      1. Rejecting Cartesian Commitments
      2. Case Study: Access vs. Phenomenal Consciousness
      3. Challenging IIT’s Augmentation of Naturalistic Ontology
    2. Aaronson’s Reductio ad Absurdum
    3. Searle’s Objection
  6. References and Further Reading

1. The Main Argument

IIT takes certain features of consciousness to be unavoidably true. Rather than beginning with the neural correlates of consciousness (NCC) and attempting to explain what about these sustains consciousness, IIT begins with its characterization of experience itself, determines the physical properties necessary for realizing these characteristics, and only then puts forward a theoretical explanation of consciousness, as identical to a special case of information instantiated by those physical properties. “The theory provides a principled account of both the quantity and quality of an individual experience… and a calculus to evaluate whether a physical system is conscious” (Tononi and Koch, 2015).

a. Cartesian Commitments

IIT takes Descartes very seriously. Descartes located the bedrock of epistemology in the knowledge of our own existence given to us by our thought. “I think, therefore I am” reflects an unavoidable certainty: one cannot deny one’s own existence as a thinker even if one’s particular thoughts are in error. For IIT, the relevance of this insight lies in its application to consciousness. Whatever else one might claim about consciousness, one cannot deny its existence.

i. Axioms

IIT takes consciousness as primary. Before speculating on the origins or the necessary and sufficient conditions for consciousness, IIT gives a characterization of what consciousness means. The theory advances five axioms intended to capture just this. Each axiom articulates a dimension of experience that IIT regards as self-evident.

First, following from the fundamental Cartesian insight, is the axiom of existence. Consciousness is real and undeniable; moreover, a subject’s consciousness has this reality intrinsically; it exists from its own perspective.

Second, consciousness has composition. In other words, each experience has structure. Color and shape, for example, structure visual experience. Such structure allows for various distinctions.

Third is the axiom of information: the way an experience is distinguishes it from other possible experiences. An experience specifies; it is specific to certain things, distinct from others.

Fourth, consciousness has the characteristic of integration. The elements of an experience are interdependent. For example, the particular colors and shapes that structure a visual conscious state are experienced together. As we read these words, we experience the font-shape and letter-color inseparably. We do not have isolated experiences of each and then add them together. This integration means that consciousness is irreducible to separate elements. Consciousness is unified.

Fifth, consciousness has the property of exclusion. Every experience has borders. Precisely because consciousness specifies certain things, it excludes others. Consciousness also flows at a particular speed.

ii. Postulates

In isolation, these axioms may seem trivial or overlapping. IIT labels them axioms precisely because it takes them to be obviously true. IIT does not present them in isolation. Rather, they motivate postulates. Sometimes the IIT literature refers to phenomenological axioms and ontological postulates. Each axiom leads to a corresponding postulate identifying a physical property. Any conscious system must possess these properties.

First, the existence of consciousness implies a system of mechanisms with a particular cause-effect power. IIT regards existence as inextricable from causality: for something to exist, it must be able to make a difference to other things, and vice versa. (What would it even mean for a thing to exist in the absence of any causal power whatsoever?) Because consciousness exists from its own perspective, the implied system of mechanisms must do more than simply have causal power; it must have cause-effect power upon itself.

Second, the compositional nature of consciousness implies that its system’s mechanistic elements must have the capacity to combine, and that those combinations have cause-effect power.

Third, because consciousness is informative, it must specify, or distinguish one experience from others. IIT calls the cause-effect powers of any given mechanism within a system its cause-effect repertoire. The cause-effect repertoires of all the system’s mechanistic elements taken together, it calls its cause-effect structure. This structure, at any given point, is in a particular state. In complex structures, the number of possible states is very high. For a structure to instantiate a particular state is for it to specify that state. The specified state is the particular way that the system is making a difference to itself.

Fourth, consciousness’s integration into a unified whole implies that the system must be irreducible. In other words, its parts must be interdependent. This in turn implies that every mechanistic element must have the capacity to act as a cause on the rest of the system and to be affected by the rest of the system. If a system can be divided into two parts without affecting its cause-effect structure, it fails to satisfy the requirement of this postulate.
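The irreducibility requirement can be illustrated with a deliberately crude structural check. The sketch below is not IIT's own calculus. It assumes that a system can be summarized as a directed graph of "can causally affect" links (the `affects` mapping, the element names, and both function names are hypothetical), and it asks whether some division into two parts leaves one part with no influence on the other. Under these assumptions a purely feed-forward chain fails, while a closed loop passes.

```python
from itertools import combinations

def crossing_links(affects, part_a, part_b):
    """Count direct causal links from elements of part_a to elements of part_b."""
    return sum(1 for src in part_a for dst in affects[src] if dst in part_b)

def is_irreducible(affects):
    """Toy proxy: True if no two-part division severs all influence in one direction.

    `affects` maps every element to the elements it directly affects.
    Every proper subset is tried as part_a, so each division is checked
    in both directions (once as part_a, once as the complement).
    """
    elements = sorted(affects)
    for size in range(1, len(elements)):
        for chosen in combinations(elements, size):
            part_a = set(chosen)
            part_b = set(elements) - part_a
            if crossing_links(affects, part_a, part_b) == 0:
                return False
    return True

# Hypothetical three-element systems.
feed_forward = {"A": ["B"], "B": ["C"], "C": []}   # C affects nothing upstream
loop = {"A": ["B"], "B": ["C"], "C": ["A"]}        # every element feeds back

print(is_irreducible(feed_forward))  # False
print(is_irreducible(loop))          # True
```

The check amounts to asking whether the graph is strongly connected. It captures only the structural side of the postulate; it says nothing about how much the parts constrain one another, which is what the theory's quantitative measure is meant to capture.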

Fifth, the exclusivity of the borders of consciousness implies that the state of a conscious system must be definite. In physical terms, the various simultaneous subgroupings of mechanisms in a system have varying cause-effect structures. Of these, only one will have a maximally irreducible cause-effect structure. This is called the maximally irreducible conceptual structure, or MICS. Others will have smaller cause-effect structures, at least when reduced to non-redundant elements. The MICS, and only the MICS, is the conscious state.
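The idea that exactly one grouping wins out can be sketched as a search over candidate groupings. The example below is a toy under stated assumptions, not the procedure used in the IIT literature: it replaces the theory's measure of irreducibility with a hypothetical stand-in score, `weakest_cut`, which counts the fewest one-way causal links crossing any two-part division of a grouping. The `affects` mapping, the element names, and the function names are all invented for illustration.

```python
from itertools import combinations

def weakest_cut(affects, group):
    """Stand-in score: fewest one-way links across any two-part division of group."""
    members = sorted(group)
    best = None
    for size in range(1, len(members)):
        for chosen in combinations(members, size):
            part_a = set(chosen)
            part_b = set(members) - part_a
            links = sum(1 for src in part_a for dst in affects[src] if dst in part_b)
            best = links if best is None else min(best, links)
    return best

def maximal_group(affects):
    """Return (score, group) for the grouping of two or more elements with the top score."""
    elements = sorted(affects)
    candidates = []
    for size in range(2, len(elements) + 1):
        for group in combinations(elements, size):
            candidates.append((weakest_cut(affects, group), group))
    return max(candidates, key=lambda pair: pair[0])

# Hypothetical system: A, B, C are densely interlinked; D hangs off A alone.
system = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A"],
}

score, group = maximal_group(system)
print(group, score)  # ('A', 'B', 'C') 2
```

In this toy system the whole (A, B, C, D) scores lower than the tightly coupled core, because D is attached by a single link in each direction. That mirrors the structure of the postulate: the maximum need not be the largest grouping.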

b. The Identity of Consciousness

IIT accepts the Cartesian conviction that consciousness has immediate, self-evident properties, and outlines the implications of these phenomenological axioms for conscious physical systems. This characterization does not exhaustively describe the theoretical ambition of IIT. The ontological postulates concerning physical systems do not merely articulate necessities, or even sufficiencies, for realizing consciousness. The claim is much stronger than this. IIT identifies consciousness with a system’s having the physical features that the postulates describe. Each conscious state is a maximally irreducible conceptual structure, which just is and can only be a system of irreducibly interdependent physical parts whose causal interaction constitutes the integration of information.

An example may help to clarify the nature of IIT’s explanation of consciousness. Our experience of a cue ball integrates its white color and spherical shape, such that these elements are inseparably fused. The fusion of these elements constitutes the structure of the experience: the experience is composed of them. The nature of the experience informs us about whiteness and spherical shape in a way that distinguishes it from other possible experiences, such as of a blue cube of chalk. This is just a description of the phenomenology of a simple experience (perhaps necessarily awkward, because it articulates the self-evident). Our brain generates the experience through neurons physically communicating with one another in systems linked by cause-effect power. IIT interprets this physical communication as the integration of information, according to the various constraints laid out in the postulates. The neurobiology and phenomenology converge.

Theories of consciousness need to account for what is sometimes termed the “binding problem.” This concerns the unity of conscious experience. Even a simple experience like viewing a cue ball unites different elements such as color, shape, and size. Any theory of consciousness will need to make sense of how this happens. IIT’s account of the integration of information may be understood as a response to this problem.

According to IIT, the physical state of any conscious system must converge with phenomenology; otherwise the kind of information generated could not realize the axiomatic properties of consciousness. We can understand this by contrasting two kinds of information. First, there is Shannon information: When a digital camera takes a picture of a cue ball, the photodiodes operate in causal isolation from one another. This process does generate information; specifically, it generates observer-relative information. That is, the camera generates the information of an image of a cue ball for anyone looking at that photograph. The information that is the image of the cue ball is therefore relative to the observer; such information is called Shannon information. Because the elements of the system are causally isolated, the system does not make a difference to itself. Accordingly, although the camera gives information to an observer, it does not generate that information for itself. By contrast, consider what IIT refers to as intrinsic information: Unlike the digital camera’s photodiodes, the brain’s neurons do communicate with one another through physical cause and effect; the brain does not simply generate observer-relative information, it integrates intrinsic information.  This information from its own perspective just is the conscious state of the brain. The physical nature of the digital camera does not conform to IIT’s postulates and therefore does not have consciousness; the physical nature of the brain, at least in certain states, does conform to IIT’s postulates, and therefore does have consciousness.
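The contrast between the camera and the brain can be made concrete with a small sketch. It assumes, purely for illustration, a four-photodiode sensor in which each diode independently reads light or dark with equal probability, and a four-neuron ring in which each neuron affects the next. All names here are hypothetical, and counting links is only a placeholder for what IIT means by cause-effect power.

```python
import math

def shannon_entropy_bits(probabilities):
    """Shannon entropy, in bits, of one discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def count_internal_links(affects):
    """Number of direct causal links among the elements of a system."""
    return sum(len(targets) for targets in affects.values())

# Assumed toy systems: isolated photodiodes versus a ring of neurons.
camera_affects = {"d1": [], "d2": [], "d3": [], "d4": []}
brain_affects = {"n1": ["n2"], "n2": ["n3"], "n3": ["n4"], "n4": ["n1"]}

# Independent diodes: the entropy of the whole image is the sum over diodes.
camera_bits = len(camera_affects) * shannon_entropy_bits([0.5, 0.5])

print(camera_bits)                           # 4.0 bits available to an observer
print(count_internal_links(camera_affects))  # 0 links: no element affects another
print(count_internal_links(brain_affects))   # 4 links: each neuron affects the next
```

The camera carries four bits of Shannon information for whoever reads the image, yet nothing inside it makes a difference to anything else inside it. On IIT's account, that absence of internal cause and effect, rather than any shortage of bits, is why the camera has no intrinsic information.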

To identify consciousness with such physical integration of information constitutes an ontological claim. The physical postulates do not describe one way or even the best way to realize the phenomenology of consciousness; the phenomenology of consciousness is one and the same as a system having the properties described by the postulates. It is even too weak to say that such systems give rise to or generate consciousness. Consciousness is fundamental to these systems in the same way as mass or charge is basic to certain particles.

i. Some Predictions

IIT’s conception of consciousness as mechanisms systematically integrating information through cause and effect lends itself to quantification. The more complex the MICS, the higher the level of consciousness: the corresponding metric is phi. Sometimes the IIT literature uses the term “prediction” to refer to implications of the theory whose falsifiability is a matter of controversy. This section will focus on more straightforward cases of prediction, where the evidence is consistent with IIT’s claims. These cases provide corroborative evidence that enhances the plausibility of IIT.

Deep sleep states are less experientially rich than waking ones. IIT predicts, therefore, that such sleep states will have lower phi values than waking states. For this to be true, analysis of the brain during these contrasting states would have to show a disparity in the systematic complexity of non-redundant mechanisms. On IIT, this disparity of MICS complexity directly implies a disparity in the amount of conscious integrated information, because the MICS is identical to the conscious state. The neuroscientific findings bear out this prediction.

IIT cites similar evidence from the study of patients with brain damage. For example, we already know that among vegetative patients, there are some whose brain scans indicate that they can hear and process language: When researchers prompt such patients to think about playing tennis, the appropriate areas of the brain become activated. Other vegetative patients do not respond this way. Naturally, this suggests that the former have a richer degree of consciousness than the latter. When analyzed according to IIT’s theory, the former have a higher phi metric than the latter; once again, IIT has made a prediction that receives empirical confirmation. IIT also claims that findings in the analysis of patients under anesthesia corroborate its claims.

In all these cases, one of two things happens. First, as consciousness fades, cortical activity may become less global. This reversion to local cortical activity constitutes a loss of integration: The system no longer is communicating across itself in as complex a way as it had. Second, as consciousness fades, cortical activity may remain global, but become stereotypical, consisting in numerous redundant cause-effect mechanisms, such that the informational achievement of the system is reduced: a loss of information. As information either becomes less integrated or becomes reduced, consciousness fades, which IIT takes as empirical support of its theory of consciousness as integrated information.

c. Characterizing the Argument

IIT combines Cartesian commitments with claims about engineering that it interprets, in part by citing corroborative neuroscientific evidence, as identifying the nature of consciousness. This borrows from recognizable traditions in the field of consciousness studies, but the structure of the argument is novel. While IIT’s proponents strive for clarity in the exposition of their work by breaking it down into the simpler elements of axioms, postulates, and identity claims, the nature of the relations between these parts remains largely implicit in the IIT literature. To evaluate the explanatory success or failure of IIT, it should prove helpful to attempt an explication of these logical relations. This requires characterizing the relationship of the axioms with the postulates, and of the identity claims with the axioms, postulates, and supporting neuroscientific evidence.

The axioms, of course, count as premises. These premises seem to lead to the postulates: each postulate flows from a corresponding axiom. At the same time, IIT describes these postulates as unproven assumptions, which seems at odds with their being conclusions drawn from the axioms. Consider the first axiom and its postulate, concerning existence. The axiom states that consciousness exists, and more specifically, exists intrinsically; the postulate holds that this requires a conscious system to have cause-effect power, and more specifically, to have this power over itself. The link involves, in part, the claim that existence implies cause-effect power. This claim that for a thing to exist, it must be able to make a difference, is plausible, but not self-evident. Nor does the axiomatic premise alone deductively imply this postulate. Epiphenomenalists, for example, claim that conscious mental states, although existent and caused, do not cause further events; they do not make a difference. Epiphenomenalists certainly do not go on to identify consciousness with physically causal systems, as IIT does.

Tononi (2015) adopts the position that the move from the axioms to the postulates is one of inference to the best explanation, or abduction. On this line, while the axioms do not deductively imply the postulates, the postulates have more than mere statistical inductive support. For example, consider the observation that human brains, which on IIT are conscious systems, have cause-effect power over themselves. Minimally, this offers a piece of inductive support for describing conscious systems in general as having such a power. Tononi takes a stronger line, claiming that a system’s property of having cause-effect power over itself most satisfyingly explains its intrinsic existence. So, what makes the brain a system at all, capable of having its own consciousness, is its ability to make a difference to itself. This illustrates the relation of postulates such as the first, concerning cause-effect power, to axioms such as the first axiom, concerning intrinsic existence, by appeal to something like explanatory fit, or satisfactoriness, which is to characterize that relation abductively.

In any case, IIT moves from the sub-conclusion about the postulates to a further conclusion, the identity claim: consciousness is identical to a system’s having the physical properties laid out by the postulates, which realize the phenomenology described by the axioms. Here again, the abductive interpretation remains an option. On this interpretation, the conjunction of the physical features of the postulates provides the most satisfactory explanation for the identity of consciousness.

This breakdown of the argument reveals the separability of the two parts. A less ambitious version of IIT might have limited itself to the first part, claiming that the physical features described by the postulates are the actual and/or best ways of realizing consciousness, or more strongly that they are necessary and/or sufficient, without going on to say that consciousness is identical to a system having these properties. The foregoing paragraphs outlined the possible motivation for the identification claim as lying in the abductive interpretation.

The notion of best explanation is notoriously slippery, but also ubiquitous in science. From an intuitive point of view one might regard the content of the conjunction of the postulates as apt for accounting for the phenomenology, but one might, motivated by theoretical conservatism, stop short of describing this as an identity relation. One clue as to why IIT does not take this tack may lie in IIT’s methodological goal of parsimony, something the literature mentions with some regularity. Perhaps the simplicity of identifying consciousness with a system’s having certain intuitively apt physical properties outweighs the non-conservatism of the claim that consciousness is fundamental to such systems the way mass is fundamental to a particle.

2. The Phi Metric

a. The Main Idea

IIT strives, among other things, not just to claim the existence of a scale of complexity of consciousness, but to provide a theoretical approach to the precise quantification of the richness of experience for any conscious system. This requires calculating the maximal amount of integrated information in a system. IIT refers to this as the system’s phi value, which can be expressed numerically, at least in principle.

Digital photography affords particularly apt illustrations of some of the basic principles involved in quantifying consciousness.

First, a photodiode exemplifies integrated information in the simplest way possible. A photodiode is a system of two elements, which together render it sensitive to two states only: light and dark. After initial input from the environment, the elements communicate input physically with one another, determining the output. So, the photodiode is a two-element system that integrates information. A photodiode not subsumed in another system of greater phi value is the simplest possible example of consciousness.

This consciousness, of course, is virtually negligible. The photodiode’s experience of light and dark is not rich in the way that ours is. The level of information of a state depends upon its specifying that state as distinct from others. The repertoire of the photodiodes allows only for the most limited differentiation (“this” vs. “that”), whereas the repertoire of a complex system such as the brain allows for an enormous amount of differentiation. Even our most basic experience of darkness distinguishes it not only from light, but from shapes, colors, and so forth.

Second, the causal arrangement of a digital camera’s photodiodes neatly exemplifies the distinction between integrated and non-integrated information. Putting to one side that each individual photodiode integrates information as simply as possible, those photodiodes do not take input from or give output to one another, so the information does not get integrated across the system. For this reason, the camera’s image is informative to us, but not to itself.

Each isolated photodiode has integrated information in the most basic way, and would therefore have the lowest possible positive value of phi. The camera’s photodiodes taken as a system do not integrate information and have a phi value of zero.

IIT attributes consciousness to certain systems, or networks. These can be understood abstractly as models. A computer’s hardware may be modelled by logic circuits, which represent its elements and their connections as interconnected logic gates. The way a particular connection within this network mediates input and output determines what kind of logic gate it is. For example, consider a connection that takes two inputs, either of which can be True or False, and then gives one output, True or False. The AND logic gate would give an output of True only if both inputs were True; in every other case it gives False. In other words, if both the one AND the other are True, the AND gate gives a True output. Such modelling captures the dynamics of binary systems, with “True” corresponding to 1 and “False” to 0. The arrangement of a network’s various logic gates (which include not only AND, but also OR, NOT, XOR, among others) determines how any particular input to the system at time-step 1 will result in output at time-step 2, and so on for that system. The brain can be modelled this way too. Input can come from a prior brain state, or from other parts of the nervous system, through the senses. The input causes a change in brain state, depending on the organization of the particular brain, which can be articulated in abstract logic.
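The idea of a network of logic gates stepping from one state to the next can be made concrete with a small sketch. The three-element wiring below is a hypothetical illustration, not a model taken from the IIT literature: the names `WIRING` and `step` and the choice of gates are assumptions made only for this example. An AND gate here returns True only when both of its inputs are True.

```python
from itertools import product

# Truth functions for three common two-input gates.
def AND(a, b):
    return a and b

def OR(a, b):
    return a or b

def XOR(a, b):
    return a != b

# Hypothetical wiring (an assumption for illustration only): each element's
# next value is a gate applied to the current values of the other two.
WIRING = {
    "A": (OR, "B", "C"),
    "B": (AND, "A", "C"),
    "C": (XOR, "A", "B"),
}

def step(state):
    """Return the network state at the next time-step."""
    return {name: gate(state[x], state[y])
            for name, (gate, x, y) in WIRING.items()}

if __name__ == "__main__":
    # Print the full truth table of the AND gate.
    for a, b in product([False, True], repeat=2):
        print(a, b, "-> AND:", AND(a, b))
    # Follow the network over a few time-steps from one starting state.
    state = {"A": True, "B": False, "C": False}
    for t in range(1, 4):
        print("time-step", t, state)
        state = step(state)
```

Because every element takes input from the others and feeds back into them, this toy network is re-entrant in the loose sense used in this article; replacing the wiring with elements that read no inputs would model the causally isolated photodiodes of the camera.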

In order to measure the level of consciousness of a system, IIT must describe the amount of its integrated information. This is done by partitioning the system in various ways. If the digital camera’s photodiodes are partitioned, say, by dividing the abstract model of its elements in half, no integrated information is lost, because all the photodiodes are in isolation from each other, and so the division does not break any connections. If no logically possible partition of the system results in a loss of connection, the conclusion is that the system does not make a difference to itself. So, in this case, the system has no phi.
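The partitioning idea can be sketched in a deliberately crude way. The code below only counts how many directed connections each two-way split of a system severs; this count is an assumption standing in for integrated information and is not IIT's phi, whose calculation is considerably more involved. The element names and both example systems are hypothetical.

```python
from itertools import combinations

def bipartitions(elements):
    """Yield every split of the elements into two non-empty parts, once each."""
    elements = list(elements)
    first, rest = elements[0], elements[1:]
    for r in range(len(rest) + 1):
        for chosen in combinations(rest, r):
            part_a = {first, *chosen}
            part_b = set(elements) - part_a
            if part_b:  # skip the trivial split that leaves one side empty
                yield part_a, part_b

def connections_cut(edges, part_a):
    """Count directed connections whose endpoints fall on opposite sides."""
    return sum(1 for src, dst in edges if (src in part_a) != (dst in part_a))

def all_cuts(elements, edges):
    """Return the number of severed connections for every bipartition."""
    return [connections_cut(edges, a) for a, _ in bipartitions(elements)]

# Hypothetical "camera": four photodiodes with no connections between them.
camera = ["p1", "p2", "p3", "p4"]
camera_edges = []

# Hypothetical re-entrant loop: each element drives the next, the last the first.
loop = ["n1", "n2", "n3", "n4"]
loop_edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n4", "n1")]

if __name__ == "__main__":
    print("camera cuts:", all_cuts(camera, camera_edges))
    print("loop cuts:  ", all_cuts(loop, loop_edges))
```

For the camera, every split severs nothing, which mirrors the claim that dividing it loses no integrated information. For the loop, every split severs at least one connection, so no way of dividing it leaves its causal structure intact.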

Systems of interest to IIT will have connections that will be lost by some partitions and not by others. Some partitions will sever from the system elements that are comparatively low in original degree of connectivity to the system, in other words elements whose (de)activation has few causal consequences upon the (de)activation of other elements. A system where all or most elements have this property will have low phi. The lack of strong connectivity may be the result of relative isolation, or locality, an element not linking to many other elements, directly or indirectly. Or it could be from stereotypicality, where the element’s causal connections overlap in a largely redundant way with the causal connection of other elements. A system whose elements are connected more globally and non-redundantly will have higher phi. A partition that not only separates all elements that do not make a difference to the rest of the system for reasons of either isolation or redundancy from those that do make a difference, but also separates those elements whose lower causal connectivity decreases the overall level of integration of the system from those that do not, will thereby have picked out the maximally irreducible conceptual structure (MICS), which according to IIT is conscious. The degree of that consciousness, its phi, depends upon its elements’ level of causal connectivity. This is determined by how much information integration would be lost by the least costly further partition, or, in other words, how much the cause-effect structure of the system would be reduced by eliminating the least causally effective element within the MICS.

It is important to note that not every system with phi has consciousness. A sub- or super-system of an MICS may have phi but will not have consciousness.

If we were to take a non-MICS subsystem of a network, which in isolation still has causal power over itself, articulable as a logic circuit, then that would have phi. Were it indeed in isolation, it would have its own MICS, and its phi would correspond to that system’s degree of consciousness. It is, however, not in isolation, but rather part of a larger system.

IIT interprets the exclusion axiom—that any conscious system is in one conscious state only, excluding all others—as implying a postulate that holds that, at the level of the physical system, there be no “double counting” of consciousness. So, although a system may have multiple subsystems with phi, only the MICS is conscious, and only the phi value of the MICS (sometimes called phi max) measures conscious degree. The other phi values measure degrees of non-conscious integrated information. So, for example, each of a person’s visual cortices does not enjoy its own consciousness, but parts of each belong to a single MICS, which is the person’s one unitary consciousness.

If we were to take a supersystem of an MICS, one that includes the MICS and also other associated elements with lower connectivity, we could again assign it a phi value, but this would not measure the local maximum of integrated information. The supersystem integrates information, but not maximally, and its phi is therefore not a measure of consciousness. This is probably best understood by example: a group of people in a discussion integrate information, but the connective degree among them is lower than the degree of connectivity within each individual. The group as such has no consciousness, but each individual person—or, more properly, the MICS of each—does. The individuals’ MICSs are local maxima of integrated information and therefore conscious.

b. Some Issues of Application

The number of possible partitions of a system, known as the Bell number, grows immensely as the number of elements increases. For example, the tiny nematode, a simple species of worm, has 302 neurons, “and the number of ways that this network can be cut into parts is the hyperastronomical 10 followed by 467 zeros” (Koch, 2012). Calculating phi precisely for much more complex systems such as brains eludes computation pragmatically, although not in principle. In the absence of precise phi computation, IIT employs mathematical “heuristics, shortcuts, and approximations” (ibid.). The IIT literature includes several different mathematical interpretations of phi calculation, each intended to replace the last; it is not yet clear that IIT has a settled account of it. Proponents of IIT hold that the mathematical details will enable the application, but not bear on the merits, of the deeper theoretical claims. At least one serious objection to IIT, however, attempts a reductio ad absurdum of those deeper claims via analysis of the mathematical implications.
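The growth in the number of partitions can be seen with a short calculation. The sketch below computes Bell numbers with the standard Bell triangle construction; the function name `bell_number` is chosen here for illustration, and the code says nothing about how phi itself is computed.

```python
def bell_number(n):
    """Return the n-th Bell number: the number of ways to partition n elements."""
    row = [1]  # first row of the Bell triangle
    for _ in range(n):
        # Each new row starts with the last entry of the previous row; every
        # further entry adds the entry to its left and the entry above that one.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

if __name__ == "__main__":
    for n in (2, 3, 5, 10, 20):
        print(n, "elements:", bell_number(n), "partitions")
    # Size of the count for a 302-element system, reported as a digit count.
    print("digits for 302 elements:", len(str(bell_number(302))))
```

Even ten elements already admit more than a hundred thousand partitions, which gives a sense of why exhaustive evaluation becomes impractical long before a system reaches the size of a nervous system.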

It is clear that, whatever the mathematical details, the basic principles of phi imply that biological nervous systems such as the brain will be capable of having very high phi, because neurons often have thousands of connections to one another. On the other hand, a typical circuit in a standard CPU only makes a few connections to other circuits, limiting the potential phi value considerably. It is also clear that even simple systems will have at least some phi value, and that these systems, provided they constitute local maxima, will have a corresponding measure of consciousness.

IIT does not intend phi as a measure of the quality of consciousness, only of its quantity. Two systems may have the same phi value but different MICS organizations. In this case, each would be conscious to the same degree, but the nature of the conscious experience would differ. The phi metric captures one dimension, the amount of integrated information, of a system. IIT does address the quality of consciousness abstractly, although not with phi. A system’s model includes its elements and their connections, whose logic can be graphed as a constellation of (de)activated points with lines between them representing (de)activated connections. This is, precisely, its conceptual structure. Recall that the maximally irreducible conceptual structure, or MICS, is, on IIT, conscious. A graph of the MICS, according to IIT, captures its unique shape in quality space, the shape of that particular conscious state. In other words, this is the abstract form of the quality of the experience, or the experience’s form “seen from the outside.” The perspective “from the inside” is available only to the system itself, whose making differences to itself intrinsically, integrating information in one of many possible forms, just precisely is its experience. The phenomenological nature of the experience, its “qualia,” is evident only from the perspective of the conscious system, but the logical graph of its structure is a complete representation of its qualitative properties.

3. Situating the Theory

a. Some Prehistory

IIT made its explicit debut in the literature in 2004, but has roots in earlier work. Giulio Tononi, the theory’s founder and major proponent, worked for many years with Gerald Edelman. Their work rejected the notion that mental events such as consciousness will ever find full explanation by reference to the functioning of a system. Such functionalism, to them, ignores the crucial issue of the physical substrate itself. They especially emphasized the importance of re-entry. To them, only a system composed of feedback loops, where input may also serve as output, can integrate information. Feed-forward systems, then, do not integrate information. Even before the introduction of IIT, Tononi was claiming that integrated information was essential to the creation of a scene in primary consciousness.

Christof Koch, now a major proponent of IIT, collaborated for a long time with Francis Crick. Much of their earlier work focused on identifying the neural correlates of consciousness (NCC), especially in the visual system. While such research advances particular knowledge about the mechanisms of one set of conscious states, Crick and Koch came to see this work as failing to address the deeper problems of explaining consciousness generally. Koch also rejects the idea that identifying the functional dynamics of a system aptly treats what makes that system conscious. He came to regard information theory as the correct approach for explaining consciousness.

So, the two thinkers who became IIT’s chief advocates arrived at that position after close neuroscientific research with Nobel Laureates who eschewed functional approaches to consciousness, favoring investigation of the relation of physical substrate to information generation.

b. IIT’s Additional Support

As of 2016, Tononi, IIT’s creator, runs the Center for Sleep and Consciousness at the University of Wisconsin-Madison. The Center has more than forty researchers, many of whom work in its IIT Theory Group. Koch, a major supporter of IIT, heads the prestigious Allen Institute for Brain Science. The Institute has links with the White House Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, as well as the European Human Brain Project (HBP). IIT’s body of literature continues to grow, often in the form of publications associated with the Center and the Institute. The public reputation of these organizations, as well as of Tononi and Koch’s earlier work, lends a certain authority or celebrity to IIT. The theory has enjoyed ample attention in mainstream media. Nevertheless, IIT remains a minority position among neuroscientists and philosophers.

c. IIT as Sui Generis

IIT does not fit neatly into any other school of thought about consciousness; there are points of connection to and departures from many categories of consciousness theory.

IIT clearly endorses a Cartesian interpretation of consciousness as immediate; such an association is unusual for a self-described naturalistic or scientific theory of consciousness. Cartesian convictions do inform IIT’s axioms and so motivate its overall methodological approach. To label IIT as a Cartesian theory generally, however, would be misleading. For one thing, like most modern theories of consciousness, it dissociates itself from the idea of a Cartesian theatre, or single point in the brain that is the seat of consciousness. Moreover, it is by no means clear how IIT stands in relation to dualism. Certainly, IIT does not advertise itself as positing a mental substance that is separate from the physical. At the same time, it draws an analogy between its identification of consciousness as a property of certain integrated information systems and physics’ identification of mass or charge as a property of particles. One might interpret such introduction of immediate experience into the naturalistic ontology as having parallels with positing a new kind of mental substance. The literature will occasionally describe IIT as a form of materialism. It is true that IIT theorists do focus on the material substrate of informational systems, but again, one might challenge whether a theory that asserts direct experience as fundamental to substrates with particular architectural features is indeed limiting itself to reference to material in its explanation.

In describing the features of conscious systems, IIT will make reference to function, but IIT rejects functionalism outright. To IIT theorists, articulating the functional dynamics of a system alone will never do justice to the immediate nature of experience.

d. Relation to Panpsychism

The IIT literature, not only from Tononi but also from Koch and others, refers with some regularity to panpsychism—broadly put, any metaphysical system that attributes mental properties to basic elements of the world—as sharing important ground with IIT. Panpsychism comes in different forms, and the precise relationship between it and IIT has yet to be established. Both IIT and panpsychism strongly endorse Cartesian commitments concerning the immediate nature of experience. IIT, however, only attributes mental properties to re-entrant architectures, because it claims that only these will integrate information; this is inconsistent with any version of panpsychism that insists upon attributing mental properties to even more basic elements of the structure of existence.

i. Relation to David Chalmers

Of the various contemporary philosophical accounts of consciousness, IIT intersects perhaps most frequently with the work of David Chalmers. This makes sense, not only given Chalmers’s panpsychist leanings, but also given the express commitment of both to a Cartesian acceptance of the immediacy of experience, and a corresponding rejection of functionalist attempts to explain consciousness.

Moreover, Chalmers’s discussion of the relation of information to consciousness strongly anticipates IIT. Before the introduction of IIT, Chalmers had already endorsed the view of information as involving specification, or reduction of uncertainty. IIT often echoes this, especially in connection with the third axiom and postulate. Chalmers also characterizes information as making a difference, which relates to IIT’s first postulate especially. These notions of specification and difference-making are familiar from the standard Shannon account of information, but the point is that both Chalmers and IIT choose to understand consciousness partly by reference to these concepts rather than to information about something.

A major theme in Chalmers’s work involves addressing the problem of the precise nature of the connection between physical systems and consciousness. Before IIT, Chalmers speculated that information seems to be the connection between the two. If this is the case, then phenomenology is the realization of information. Chalmers suggests that information itself may be primitive in the way that mass or charge is. Now, this does not directly align with IIT’s later claims of how consciousness is fundamental to certain informational systems in the same way that mass or charge is fundamental to particles, but the parallels are clear enough. Similarly, Chalmers’s description of a “minimally sufficient neural system” as the neural correlate of consciousness resembles IIT’s discussion of the MICS. Both also use the term “core” in this context. It comes as no surprise that IIT and Chalmers have intersected when we read Chalmers’s earlier claims: “Perhaps, then, the intrinsic nature required to ground the information states is closely related to the intrinsic nature present in phenomenology. Perhaps one is even constitutive of the other” (Chalmers, 1996).

Still, the relationship between Chalmers’s work and IIT is not one of simple alliance. Despite the apparent similarity of their positions on what is fundamental, there is an important disagreement. Chalmers takes the physical to derive from the informational, and grounds the realization of phenomenal space—the instantiation of conscious experience—not upon causal “differences that make a difference,” but upon the intrinsic qualities of and structural relations among experiences. IIT regards consciousness as being intrinsic to certain causal structures, which might be read as the reverse of Chalmers’s claim.  In describing his path to IIT, Koch endorses Chalmers as “a philosophical defender of information theory’s potential for understanding consciousness” while faulting Chalmers’s work for not addressing the internal organization of conscious systems (Koch, 2012). To Koch, treating the architecture is necessary because consciousness does not alter in simple covariance with change in amounts of bits of information. Because IIT addresses the physical organization, it struck Koch as superior.

There has been some amount of cross-reference between IIT and Chalmers in the literature, although important differences are apparent. For example, Chalmers famously discusses the possibility of a zombie, or operating physical match of a human that does not have experience. On IIT, functional zombies are possible, but not zombies whose nervous connections duplicate our own. In other words, if a machine were built to imitate the behavior of a human perfectly, but whose hardware involved feed-forward circuits, then it would generate possibly no phi, or more likely low, local phi, rather than the high phi of human consciousness. But if we posit that the machine replicates the connections of the human down to the level of the hardware, then it would follow that the system would integrate the same level of phi and would be equally conscious.

Chalmers (2016) writes that IIT “can be construed as a form of emergent panpsychism,” which is true in a sense, but requires qualification. By “emergent panpsychism” Chalmers means that IIT posits consciousness as fundamental not to the merest elements of existence but to certain structures that emerge at the level of particles’ relations to one another. This is a fair assessment, whether or not IIT’s advocates choose to label the theory this way. But in the difference between emergent and non-emergent lies the substance of IIT: what precisely makes one structure “emerge” conscious and another not is what IIT hopes to explain. Non-emergent panpsychism of the sort associated with Chalmers by definition pitches its explanation at a different level. Indeed, it does not necessarily grant the premise that there exist non-conscious elements, let alone structures, in the first place. Despite the similarities between IIT and some of Chalmers’s work, the two should not be confused.

4. Implications

a. The Spectrum of Consciousness

It is widely accepted that humans experience varying degrees of consciousness. In sleep, for example, the richness of experience diminishes, and sometimes we do not experience at all. IIT implies that brain activity during this time will generate either less information or less integrated information, and interprets experimental results as bearing this out. On IIT, the complexity of physical connections in the MICS corresponds to the level of consciousness. By contrast, the cerebellum, which has many neurons, but neurons that are not complexly interconnected and so do not belong to the MICS, does not generate consciousness.

There does not exist a widely accepted position on non-human consciousness. IIT counts among its merits that the principles it uses to characterize human consciousness can apply to non-human cases.

On IIT, consciousness happens when a system makes a difference to itself at a physical level: elements causally connected to one another in a re-entrant architecture integrate information, and the subset of these with maximal causal power is conscious. The human brain offers an excellent example of re-entrant architecture integrating information, capable of sustaining highly complex MICSs, but nothing in IIT limits the attribution of consciousness to human brains only.

Mammalian brains share similarities in neural and synaptic structure: the human case is not obviously exceptional. Other, non-mammalian species demonstrate behavior associated in humans with consciousness. These considerations suggest that humans are not the only species capable of consciousness. IIT makes a point of remaining open to the possibility that many other species may possess at least some degree of consciousness. At the same time, further study of non-human neuroanatomy is required to determine whether and how this in fact holds true. As mentioned above, even the human cerebellum does not have the correct architecture to generate consciousness, and it is possible that other species have neural organizations that facilitate complex behavior without generating high phi. The IIT research program offers a way to establish whether these other systems are more like the cerebellum or the cerebral cortex in humans. Of course, consciousness levels will not correspond completely to species alone. Within conscious species, there will be a range of phi levels, and even within a conscious phenotype, consciousness will not remain constant from infancy to death, wakefulness to sleep, and so forth.

IIT claims that its principles are consistent with the existence of cases of dual consciousness within split-brain patients. In such instances, on IIT, two local maxima of integrated information exist separately from one another, generating separate consciousness. IIT does not hold that a system need have only one local maximum, although this may be true of normal brains; in split-brain patients, the re-entrant architecture has been severed so as to create two. IIT also takes its identification of MICSs through quantification of phi as a potential tool for assessing other actual or possible cases of multiple consciousness within one brain.

Such claims also allow IIT to rule out instances of aggregate consciousness. The exclusion principle forbids double-counting of consciousness. A system will have various subsystems with phi value, but only the local maxima of phi within the system can be conscious. A normal waking human brain has only one conscious MICS, and even a split-brain patient’s conscious systems do not overlap but rather are separate. One’s conscious experience is precisely what it is and nothing else. All this implies that, for example, the United States of America has no superordinate consciousness in addition to the consciousness of its individuals. The local maxima of integrated information reside within the skulls of those individuals; the phi value of the connections among them is much lower.

Although IIT allows for a potentially very wide range of degrees of consciousness and conscious entities, this has its limits. Some versions of panpsychism attribute mental properties to even the most basic elements of the structure of the world, but the simplest entity that IIT admits as conscious would have to be a system of at least two elements that have cause-effect power over one another. Otherwise no integrated information exists. Objects such as rocks and grains of sand have no phi, whether in isolation or heaped into an aggregate, and so no consciousness.

IIT’s criteria for consciousness are consistent with the existence of artificial consciousness. The photodiode, because it integrates information, has a phi value; if not subsumed into a system of higher phi, this will count as a local maximum: the simplest possible MICS or conscious system. Many or most instances of phi and consciousness may be the result of evolution in nature, independent of human technology, but this is a contingent fact. Often technological systems involve feed-forward architecture that lowers or possibly eliminates phi, but if the system is physically re-entrant and satisfies the other criteria laid out by IIT, it may be conscious. In fact, according to IIT, we may build artificial systems with a greater degree of consciousness than humans.

b. IIT and Physics

IIT has garnered some attention in the physics literature. Even if one accepts the basic principles of IIT, it remains possible to offer a different account of the physical particulars. Tononi and the other proponents of IIT coming from neuroscientific backgrounds tend to offer description at a classical grain. They frame integrated information by reference to neurons and synapses in the case of brains, and to re-entrant hardware architecture in the case of artificial systems. Such descriptions stay within the classical physics paradigm. This does not exhaust the theoretical problem space for characterizing integrated information.

One alternative (Barrett, 2014) proposes that consciousness comes from the integration of information intrinsic to fundamental fields. This account calls for reconceiving the phi metric, which in its 2016 form applies only to discrete systems, and not to electromagnetic fields. Another account (Tegmark, 2015) also proposes a non-classical physical description of conscious, integrated information. This generalizes beyond neural or neural-type systems to quantum systems, suggesting that consciousness is a state of matter, whimsically labelled “perceptronium.”

c. Artificial Consciousness

IIT’s basic arguments imply, and the IIT literature often explicitly claims, certain important constraints upon artificial conscious systems.

i. Constraints on Structure/Architecture

At the level of hardware, computation may process information with either feed-forward or re-entrant architecture. In feed-forward systems, information gets processed in only one direction, taking input and giving output. In re-entrant systems, which consist of feedback loops, signals are not confined to movement in one direction only; output may operate as input also.

IIT interprets the integration axiom (the fourth axiom, which says that each experience’s phenomenological elements are interdependent) as entailing the fourth postulate, which claims that each mechanism of a conscious system must have the potential to relate causally to the other mechanisms of that system. By definition, in a feed-forward system, mechanisms cannot act as causes upon those parts of the system from which they take input. A purely feed-forward system would have no phi, because although it would process information, it would not integrate that information at the physical level.

One implication for artificial consciousness is immediately clear: Feed-forward architectures will not be conscious. Even a feed-forward system that perfectly replicated the behavior of a conscious system would only simulate consciousness. Artificial systems would need to have re-entrant structure to generate consciousness.
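The structural distinction at work here can be illustrated with a minimal sketch. It assumes a system is described simply as a list of directed connections between named mechanisms, and it checks only whether any signal path loops back on itself. It does not compute phi; re-entrance is treated here as a necessary structural condition, not a sufficient one. The function name `has_feedback` and the three-element example systems are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def has_feedback(edges):
    """Return True if the directed graph given by (source, target) pairs has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    # Depth-first search with three states per node.
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {n: UNVISITED for n in nodes}

    def visit(node):
        state[node] = ON_PATH
        for nxt in graph[node]:
            if state[nxt] == ON_PATH:
                # An edge back into the current path: output re-enters as input.
                return True
            if state[nxt] == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNVISITED and visit(n) for n in nodes)


# Hypothetical three-element systems.
feed_forward = [("input", "hidden"), ("hidden", "output")]
re_entrant = feed_forward + [("output", "hidden")]

print(has_feedback(feed_forward))  # False
print(has_feedback(re_entrant))    # True
```

On this toy description, the first system could at most simulate integration, since no mechanism acts causally on anything from which it takes input; only the second meets the structural precondition that IIT lays down.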

Furthermore, re-entrant systems may still generate very low levels of phi. Conventional CPUs have transistors that communicate with only a few others. By contrast, each neuron of the conscious network of the brain connects with thousands of others, a far more complex re-entrant structure, making a difference to itself at the physical level in such a way as to generate much higher phi value. For this reason, brains are capable of realizing much richer consciousness than conventional computers. The field of artificial consciousness, therefore, would do well to emulate the neural connectivity of the brain.

Still another constraint applies, this one associated with the fifth postulate, the postulate of exclusion. A system may have numerous phi-generating subsystems, but according to IIT, only the network of elements with the greatest cause-effect power to integrate information—the maximally irreducible conceptual structure, or MICS—is conscious. Re-entrant systems may have local maxima of phi, and therefore small pockets of consciousness. Those attempting to engineer high degrees of artificial consciousness need to focus their design on creating a large MICS, not simply small, non-overlapping MICSs.

If IIT is correct in placing such constraints upon artificial consciousness, deep convolutional networks such as GoogLeNet and advanced projects like Blue Brain may be unable to realize high levels of consciousness.

ii. Relation to “Silent Neurons”

IIT’s third postulate has a somewhat counterintuitive implication. The third axiom claims that each conscious experience is precisely what it is; that is, it is distinct from other experiences. The third postulate claims that, in order to realize this feature of consciousness, a system must have a range of possible states, describable by reference to the cause-effect repertoires of its mechanistic elements. The system realizes a specific conscious state by instantiating one of those particular physical arrangements.

An essential component of the phenomenology of a conscious state is the degree of specificity: a photodiode that registers light specifies one of only two possible states. IIT accepts that such a simple mechanism, if not subsumed under a larger MICS, must be conscious, but only to a negligible degree. On the other hand, when a human brain registers light, it distinguishes it from countless other states; not only from dark, but from different shades of color, from sound, and so forth. The brain state is correspondingly more informative than the photodiode state.
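The contrast in specificity can be given a rough numerical feel. The sketch below counts distinguishable states and converts the count to bits, assuming all states are equally likely. This is a crude, Shannon-style count used only for illustration; it is not IIT's intrinsic, cause-effect measure of information, and the figure of 2^20 states for the richer system is an arbitrary assumption, not an estimate for any real brain.

```python
import math


def bits_to_specify(n_states):
    """Bits needed to pick out one state among n equally likely alternatives."""
    return math.log2(n_states)


# A photodiode distinguishes light from dark: two states.
photodiode_bits = bits_to_specify(2)

# Hypothetical richer system with 2**20 distinguishable states (arbitrary figure).
richer_bits = bits_to_specify(2 ** 20)

print(photodiode_bits)  # 1.0
print(richer_bits)      # 20.0
```

The point of the comparison is only that being in one state rules out far more alternatives in the richer system than in the photodiode.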

This means that not only active neuronal firing, but also neuronal silence, determines the nature of a conscious state. Inactive parts of the complex, as well as active ones, contribute to the specification at the physical level, which IIT takes as the realization of the conscious state.

It is important not to conflate silent, or inactive, neurons of this kind with de-activated neurons. Only neurons that genuinely fall within the cause-effect repertoires of the mechanistic elements of the system count as contributing to specification, and this applies to inactive as well as to active ones. If a neuron was incapable all along of having causal power within the MICS, its inactivity plays no role in generating phenomenology. Likewise, IIT predicts that should neurons otherwise belonging to cause-effect repertoires of the system be rendered incapable of such causation (for example, by optogenetics), their inactivity would not contribute to phenomenology.

5. Objections

a. The Functionalist Alternative

According to functionalism, mental states, including states of consciousness, find explanation by appeal to function. The nature of a certain function may limit the possibilities for its physical instantiation, but the function, and not the material details, is of primary relevance. IIT differs from functionalism on this basic issue: on IIT, the conscious state is identified with the way in which a system embodies the physical features that IIT’s postulates describe.

Their opposing views concerning constraints upon artificial consciousness nicely illustrate the contrast between functionalism and IIT. For the functionalist, any system that functions identically to, for example, a conscious human, will by definition have consciousness. Whether the artificial system uses re-entrant or feed-forward architecture is a pragmatic matter. It may turn out that re-entrant circuitry more efficiently realizes the function, but even if the system incorporates feed-forward engineering, so long as the function is achieved, the system is conscious. IIT, on the other hand, expressly claims that a system that performed in a way completely identical to a conscious human, but that employed feed-forward architecture, would only simulate, but not realize consciousness. Put simply, such a system would operate as if it were integrating information, but because its networks would not take output as input, would not actually integrate information at the physical level. The difference would not be visible to an observer, but the artificial system would have no conscious experience.

i. Rejecting Cartesian Commitments

Those who find functionalism unsatisfactory often take it as an inadequate account of phenomenology: no amount of description of functional dynamics seems to capture, for example, our experience of the whiteness of a cue ball. Indeed, IIT entertains even broader suspicions. Beginning with descriptions of physical systems may never lead to explanations of consciousness. Rather, IIT’s approach begins with what it takes to be the fundamental features of consciousness. These self-evident, Cartesian descriptors of phenomenology then lead to postulates concerning their physical realization; only then does IIT connect experience to the physical.

This methodological respect for Cartesian intuitions has a clear appeal, and the IIT literature largely takes this move for granted, rather than offering outright justification for it. In previous work with Edelman (2000), Tononi discusses machine-state functionalism, an early form of functionalism that identified a mental state entirely with its internal, “machine” state, describable in functional terms. Noting that Putnam, machine-state functionalism’s first advocate, came to abandon the theory because meanings are not sufficiently fixed by internal states alone, Tononi rejects functionalism generally. More recently, Koch (2012) describes much work in consciousness as “models that describe the mind as a number of functional boxes” where one box is “magically endowed with phenomenal awareness.” Koch confesses to being guilty of this in some of his earlier work. He then points to IIT as an exception.

Functionalism is not receiving a full or fair hearing in these instances. Machine-state functionalism is a straw man: contemporary versions of functionalism do not commit to an entirely internal explanation of meaning, and not all functionalist accounts are subject to the charge of arbitrarily attributing consciousness to one part of a system. The success or failure of functionalism turns on its treatment of the Cartesian intuitions we all have that consciousness is immediate, unitary, and so on. Rather than taking these intuitions as evidence of the unavoidable truth of what IIT describes in its axioms, functionalism offers a subtle alternative. Consciousness indeed seems to us direct and immediate, but functionalists argue that this “seeming” can be adequately accounted for without positing a substantive phenomenality beyond function. Functionalists claim that the seeming immediacy of consciousness receives sufficient explanation as a set of beliefs and dispositions to believe that consciousness is immediate. The challenge lies in giving a functionalist account of such beliefs: no mean feat, but not the deep mystery that non-functionalists construe consciousness as posing. If functionalism is correct in this characterization of consciousness, it undercuts the very premises of IIT.

ii. Case Study: Access vs. Phenomenal Consciousness

Function may be understood in terms of access. If a conscious system has cognitive access to an association or belief, then that association or belief is conscious. In humans, access is often taken to be demonstrated by verbal reporting, although other behaviors may indicate cognitive access. Functionalists hold that cognitive access exhaustively describes consciousness (Cohen and Dennett, 2012). Others hold that subjects may be phenomenally conscious of stimuli without cognitively accessing them.

Interpretation of the relevant empirical studies is a matter of controversy. The phenomenon known as “change blindness” occurs when a subject fails to notice subtle differences between two pictures, even while reporting thoroughly perceiving each. Dennett’s version of functionalism, at least, interprets this as the subject not having cognitive access to the details that have changed, and moreover as not being conscious of them. The subject overestimates the richness of his or her conscious perception. Certain non-functionalists claim that the subject does indeed have the reported rich conscious phenomenology, even though cognitive access to that phenomenal experience is incomplete. Block (2011), for instance, holds this interpretation, claiming that “perceptual consciousness overflows cognitive access.” On this account, phenomenal consciousness may occur even in the absence of access consciousness.

IIT’s treatment of the role of silent neurons aligns with the non-functionalist interpretation. On IIT, a system’s consciousness grows in complexity and richness as the number of elements that could potentially relate causally within the MICS grows. Such elements, even when inactive, contribute to the specification of the integrated information, and so help to fix the phenomenal nature of the experience. In biological systems, this means that silent but potentially active neurons matter to consciousness.

Such silent neurons are not accessed by the system. These non-accessed neurons still contribute to consciousness. As in Block’s non-functionalism, access is not necessary for consciousness. On IIT, it is crucial that these neurons could potentially be active, so they must be accessible to the system. Block’s account is consistent with this in that he claims that the non-accessed phenomenal content need not be inaccessible. (Koch, separately from his support of IIT, takes the non-functionalist side of this argument (Koch and Tsuchiya, 2007); so do Fahrenfort and Lamme (2012); for a functionalist response to the latter, see Cohen and Dennett (2011, 2012).)

Non-functionalist accounts that argue for phenomenal consciousness without access make sense given a rejection of the functionalist claim that phenomenality may be understood as a set of beliefs and associations, rather than a Cartesian, immediate phenomenology beyond such things.

iii. Challenging IIT’s Augmentation of Naturalistic Ontology

Any account of consciousness that maintains that phenomenal experience is immediately first-personal stands in tension with naturalistic ontology, which holds that even experience in principle will receive explanation without appeal to anything beyond objective, or third-personal, physical features. Among theories of consciousness, those versions of panpsychism that attribute mental properties to basic structural elements depart perhaps most obviously from the standard scientific position. Because IIT limits its attribution of consciousness to particular physical systems, rather than to, for example, particles, it constitutes a somewhat more conservative position than panpsychism. Nevertheless, IIT’s claims amount to a radical reconception of the ontology of the physical world.

IIT’s allegiance to a Cartesian interpretation of experience from the outset lends itself to a non-naturalistic interpretation, although not every step in IIT’s argumentation implies a break from standard scientific ontology. IIT counts among its innovations the elucidation of integrated information, achieved when a system’s parts make a difference intrinsically, to the system itself. This differs from observer-relative, or Shannon, information, but by itself stays within the confines of naturalism: for example, IIT could have argued that integrated information constitutes an efficient functional route to realizing states of awareness.

Instead, IIT makes the much stronger claim that such integrated information, provided it is locally maximal, is identical to consciousness. The IIT literature is quite explicit on this point, routinely offering analogies to other fundamental physical properties. Consciousness is fundamental to integrated information in the same way as it is fundamental to mass that space-time bends around it. The degree and nature of any given phenomenal feeling follow basically from the particular conceptual structure that is the integrated information of the system. Consciousness is not a brute property of physical structure per se, as it is in some versions of panpsychism, but it is inextricable from physical systems with certain properties, just as mass or charge is inextricable from some particles. So, IIT is proposing an addition to what science admits into its ontology.

The extraordinary nature of the claim does not necessarily undermine it, but it may be cause for reservation. One line of objection to IIT might claim that this augmentation of naturalistic ontology is non-explanatory, or even ad hoc. We might accept that biological conscious systems possess neurology that physically integrates information in a way that converges with phenomenology (as outlined in the relation of the postulates to the axioms) without taking this as sufficient evidence for an identity relation between integrated information and consciousness. In response, IIT advocates might claim that the theory’s postulates give better ontological ground than functionalism for picking out systems in the first place.

b. Aaronson’s Reductio ad Absurdum

The computer scientist Scott Aaronson (on his blog Shtetl-Optimized; see Horgan (2015) for an overview) has compelled IIT to admit a counterintuitive implication. Certain systems, which are computationally simple and seem implausible candidates for consciousness, may have values of phi higher even than those of human brains, and would count as conscious on IIT. Aaronson’s argument is intended as a reductio ad absurdum; the IIT response has been to accept its conclusion, but to deny the charge of absurdity. Aaronson’s basic claim involves applying phi calculation. Advocates of IIT have not questioned Aaronson’s mathematics, so the philosophical relevance lies in the aftermath.

IIT refers to richly complex systems such as human brains or hypothetical artificial systems in order to illustrate high phi value. Aaronson points out that systems that strike us as much simpler and less interesting will sometimes yield a high phi value. The physical realization of an expander graph (his example) could have a higher phi value than a human brain. A graph has points that connect to one another, making the points vertices and the connections edges. This may be thought of as modelling communication between points. Expander graphs are “sparse” – each point has relatively few connections – but the graph as a whole is nonetheless highly connected, and this connectivity means that the points have strong communication with one another. In short, such graphs have the right properties for generating high phi values. Because it is absurd to accept that a physical model of an expander graph could have a higher degree of consciousness than a human being, the theory that leads to this conclusion, IIT, must be false.
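The combination of sparseness and strong connectivity can be seen in a small sketch. It compares two ten-point graphs: a plain ring, and the Petersen graph, used here only as a convenient small example of a sparse but well-connected graph (it is not Aaronson's own construction). The sketch measures connectivity by the graph's diameter, the largest number of steps needed to get from any point to any other. It does not compute phi, and the helper names `build` and `diameter` are introduced for illustration.

```python
from collections import deque


def build(n, edges):
    """Build an undirected adjacency map over vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def diameter(adj):
    """Longest shortest-path length over all vertex pairs (graph assumed connected)."""
    longest = 0
    for start in adj:
        # Breadth-first search from each vertex in turn.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        longest = max(longest, max(dist.values()))
    return longest


ring_edges = [(i, (i + 1) % 10) for i in range(10)]
petersen_edges = (
    [(i, (i + 1) % 5) for i in range(5)]              # outer pentagon
    + [(i, i + 5) for i in range(5)]                  # spokes
    + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]    # inner pentagram
)

ring = build(10, ring_edges)
petersen = build(10, petersen_edges)

print(len(ring_edges), diameter(ring))          # 10 5
print(len(petersen_edges), diameter(petersen))  # 15 2
```

With only three connections per point, and 15 of the 45 possible edges overall, every point in the second graph is within two steps of every other. Structures of this kind, scaled up, are what Aaronson has in mind.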

Tononi (2014) responds directly to this argument, conceding that Aaronson has drawn out the implications of IIT and phi fairly, even ceding further ground: a two-dimensional grid of logic gates, even simpler than an expander graph, would have a high phi value and would, according to IIT, have a high degree of consciousness. Tononi has already argued that a photodiode has minimal consciousness; to him, accepting where Aaronson’s reasoning leads is just another case of the theory producing surprising results. After all, science must be open to theoretical innovation.

Aaronson’s rejoinder challenges IIT by arguing that it implicitly holds inconsistent views on the role of intuition. In his response to Aaronson’s original claims, Tononi disparages intuitions regarding when a system is conscious: Aaronson should not be as confident as he is that expander graphs are not conscious. Indeed, the open-mindedness here suggested seems in line with the proper scientific attitude. Aaronson employs a thought-experiment to draw out what he takes to be the problem. Imagine that a scientist announces that he has discovered a superior definition of temperature and has constructed a new thermometer that reflects this advance. It so happens that the new thermometer reads ice as being warmer than boiling water. According to Aaronson, even if there is merit to the underlying scientific work, it is a mistake for the scientist to use the terms “temperature” or “heat” in this way, because it violates what we mean by those terms in the first place: “heat” means, partly, what ice has less of than boiling water. So, while IIT’s phi metric may have some merit, it is not in measuring consciousness degree, because “consciousness” means, partly, what humans have and expander graphs and logic gates do not have.

One might, in defense of IIT, respond by claiming that the cases are not as similar as they seem, that the definition of heat necessitates that ice has less of it than boiling water and that the definition of consciousness does not compel us to draw conclusions about expander graphs’ non-consciousness, strange as that might seem. Aaronson’s argument goes further, however, and it is here that the charge of inconsistency comes into play. Tononi’s answer to Aaronson’s original reductio argument partly relies upon claiming that facts such as that the cerebellum is not conscious are totally well-established and uncontroversial. (IIT predicts this because the wiring of the cerebellum yields a low phi and is not part of the conscious MICS of the brain.) Here, argues Aaronson, Tononi is depending upon intuition, but it is possible that although the cerebellum might not produce our consciousness, it may have one of its own. Aaronson is not arguing for the consciousness of the cerebellum, but rather pointing out an apparent logical contradiction. Tononi rejects Aaronson’s claim that expander graphs are not conscious because it relies on intuition, but here Tononi himself is relying upon intuition. Nor can Tononi here appeal to common sense, because IIT’s acceptance of expander graphs and logic gates as conscious flies in the face of common sense.

It is possible that IIT might respond to this serious charge by arguing that almost everyone agrees that the brain is conscious, and that IIT has more success than any other theory in accounting for this while preserving many of our other intuitions (that animals, infants, certain patients with brain-damage, and sleeping adults all have dimmer consciousness than adult waking humans, to give several examples). Because this would accept a certain role for intuitions, it would require walking back the gloss on intuition that Tononi has offered in response to Aaronson’s reductio. Moreover, Aaronson’s arguments show that such a defense of the overall intuitive plausibility of IIT will face difficult challenges.

c. Searle’s Objection

In one of very few published discussions of IIT by a philosopher, John Searle (2013a) has come out against it, criticizing its emphasis on information as a departure from the more promising “biological approach.” His objections may be divided into two parts; Koch and Tononi (2013) have offered a response.

First, Searle claims that in identifying consciousness with a certain kind of information, IIT has abandoned causal explanation. Appeal to cause should be the proper route for scientific explanation, and Searle maintains, as he has throughout his career, that solving the mystery of consciousness will depend upon the explication of the causal powers special to the brain that give rise to experience. Information fails aptly to address the problem because it is observer-relative; indeed, it is relative to the conscious observer. A book or computer, to take typical examples of objects associated with information, does not contain information except insofar as it is endowed by the conscious subject. Information is in the eye of the beholder. The notion of information presupposes consciousness, rather than explaining it.

Second, according to Searle, IIT leads to an absurdity, namely panpsychism, which is sufficient reason to reject it. He interprets IIT as imputing consciousness to all systems with causal relations, and so it follows on IIT that consciousness is “spread over the universe like a thin veneer of jam.” A successful theory of consciousness will have to appreciate that consciousness “comes in units” and give a principled account of how and why this is the case.

Koch and Tononi’s response (2013) addresses both strands of Searle’s argument. First, they agree that Shannonian information is observer-relative, but point out that integrated information is non-Shannonian. IIT defines integrated information as necessarily existing with respect to itself, which they understand in expressly causal terms, as a system whose parts make a difference to that system. Integrated information systems therefore exist intrinsically, rather than relative to observers. Not only does IIT attend to the observer-relativity point, then, but it also does so in a way that, contrary to Searle’s characterization, crucially incorporates causality.

Second, they deny that IIT implies the kind of panpsychism that Searle rejects as absurd. As they point out, IIT only attributes consciousness to local maxima of integrated information (MICS), and although that implies that some simple systems such as the isolated photodiode have a minimal degree of consciousness, it provides a principle to determine which “units” are conscious, and which are not. As Tononi had already put it, before Searle’s charges: “How close is this position to panpsychism, which holds that everything in the universe has some kind of consciousness? Certainly, the IIT implies that many entities, as long as they include some functional mechanisms that can make choices between alternatives, have some degree of consciousness. Unlike traditional panpsychism, however, the IIT does not attribute consciousness indiscriminately to all things. For example, if there are no interactions, there is no consciousness whatsoever. For the IIT, a camera sensor as such is completely unconscious…” (Tononi, 2008).

Although Searle offers a rejoinder (2013b) to Tononi and Koch’s response, it largely rehearses the original claims. Regardless of whether IIT is true, Tononi and Koch have given good reason to read it as addressing precisely the concerns that Searle raises. Arguably, then, Searle might have reason to embrace IIT as a theory of consciousness that at least attempts a principled articulation of the special causal powers of the brain, which Searle has regarded for many years as the proper domain for explaining consciousness.

6. References and Further Reading

  • Barrett, Adam. “An Integration of Integrated Information Theory with Fundamental Physics.” Frontiers in Psychology, 5 (63). 2014.
    • Calls for a re-conception of phi with respect to electromagnetic fields.
  • Block, Ned. “Perceptual Consciousness Overflows Cognitive Access.” Trends in Cognitive Sciences, 15 (12). 2011.
    • Argues for the distinction between access and phenomenal consciousness.
  • Chalmers, David. The Conscious Mind. New York: Oxford University Press. 1996.
    • A major work, relevant here for its discussions of information, consciousness, and panpsychism.
  • Chalmers, David. “The Combination Problem for Panpsychism.” In L. Jaskolla and G. Brüntrup (Eds.) Panpsychism. Oxford University Press. 2016.
    • Updated take on a classical problem; Chalmers makes reference to IIT here.
  • Cohen, Michael, and Daniel Dennett. “Consciousness Cannot be Separated from Function.” Trends in Cognitive Sciences, 15 (8). 2011.
    • Argues for understanding phenomenal consciousness as access consciousness.
  • Cohen, Michael, and Daniel Dennett. “Response to Fahrenfort and Lamme: Defining Reportability, Accessibility and Sufficiency in Conscious Awareness.” Trends in Cognitive Sciences, 16 (3). 2012.
    • Further defends understanding phenomenal consciousness as access consciousness.
  • Dennett, Daniel. Consciousness Explained. Little, Brown and Co. 1991.
    • Classic, comparatively accessible teleofunctionalist account of consciousness.
  • Dennett, Daniel. Sweet Dreams: Philosophical Obstacles to a Science of Consciousness. London, England: The MIT Press. 2005.
    • A concise but wide-ranging, updated defense of functionalist explanation of consciousness.
  • Edelman, Gerald. The Remembered Present: A Biological Theory of Consciousness. New York: Basic Books. 1989.
    • Influential upon Tononi’s early thinking.
  • Edelman, Gerald, and Giulio Tononi. A Universe of Consciousness: How Matter Becomes Imagination. New York: Basic Books. 2000.
    • Puts forward many of the arguments that later constitute IIT.
  • Fahrenfort, Johannes, and Victor Lamme. “A True Science of Consciousness Explains Phenomenology: Comment on Cohen and Dennett.” Trends in Cognitive Sciences, 16 (3). 2012.
    • Argues for the access/phenomenal division supported by Block.
  • Horgan, John. “Can Integrated Information Theory Explain Consciousness?” Scientific American. 1 December 2015. http://blogs.scientificamerican.com/cross-check/can-integrated-information-theory-explain-consciousness/
    • Gives an overview of an invitation-only workshop on IIT at New York University that featured Tononi, Koch, Aaronson, and Chalmers, among others.
  • Koch, Christof. Consciousness: Confessions of a Romantic Reductionist. The MIT Press. 2012.
    • Intellectual autobiography, in part detailing the author’s attraction to IIT.
  • Koch, Christof, and Naotsugu Tsuchiya. “Phenomenology Without Conscious Access is a Form of Consciousness Without Top-down Attention.” Behavioral and Brain Sciences, 30 (5-6) 509-10. 2007.
    • Also argues for the access/phenomenal division supported by Block.
  • Koch, Christof, and Giulio Tononi. “Can a Photodiode Be Conscious?” New York Review of Books, 7 March 2013.
    • Responds to Searle’s critique.
  • Oizumi, Masafumi, Larissa Albantakis, and Giulio Tononi. “From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0.” PLOS Computational Biology, 10 (5). 2014. Doi: 10.1371/journal.pcbi.1003588.
    • Technically-oriented introduction to IIT.
  • Searle, John. “Minds, brains and programs.” In Hofstadter, Douglas and Daniel Dennett, (Eds.). The Mind’s I: Fantasies and Reflections on Self and Soul (pp. 353-373). New York: Basic Books. 1981.
    • Searle’s classic paper on intentionality and information.
  • Searle, John. The Rediscovery of the Mind. Cambridge, MA: The MIT Press. 1992.
    • Fuller explication of Searle’s views on intentionality and information.
  • Searle, John. “Can Information Theory Explain Consciousness?” New York Review of Books 10 January 2013(a).
    • Objects to IIT.
  • Searle, John. “Reply to Koch and Tononi.” New York Review of Books. 7 March 2013(b).
    • Rejoinder to Koch and Tononi’s response to his objections to IIT.
  • Tegmark, Max. “Consciousness as a State of Matter.” Chaos, Solitons & Fractals. 2015.
    • Proposes a re-conception of phi, at the quantum level.
  • Tononi, Giulio. “An Information Integration Theory of Consciousness.” BMC Neuroscience, 5:42. 2004.
    • The earliest explicit introduction to IIT.
  • Tononi, Giulio. “Consciousness as Integrated Information: A Provisional Manifesto.” Biological Bulletin, 215: 216–242. 2008.
    • An early overview of IIT.
  • Tononi, Giulio. “Integrated Information Theory.” Scholarpedia, 10 (1). 2015. http://www.scholarpedia.org/w/index.php?title=Integrated_information_theory&action=cite&rev=147165.
    • A thorough synopsis of IIT.
  • Tononi, Giulio, and Gerald Edelman. “Consciousness and Complexity.” Science, 282 (5395). 1998.
    • Anticipates some of IIT’s claims.
  • Tononi, Giulio, and Christof Koch. “Consciousness: Here, There and Everywhere?” Philosophical Transactions of the Royal Society, Philosophical Transactions B, 370 (1668). 2015. Doi: 10.1098/rstb.2014.0167
    • Perhaps the most accessible current introduction to IIT, from the perspective of its founder and chief proponent.

 

Author Information

Francis Fallon
Email: Fallonf@stjohns.edu
St. John’s University
U. S. A.

Scientific Realism and Antirealism

Debates about scientific realism concern the extent to which we are entitled to hope or believe that science will tell us what the world is really like. Realists tend to be optimistic; antirealists do not. To a first approximation, scientific realism is the view that well-confirmed scientific theories are approximately true; the entities they postulate do exist; and we have good reason to believe their main tenets. Realists often add that, given the spectacular predictive, engineering, and theoretical successes of our best scientific theories, it would be miraculous were they not to be approximately correct. This natural line of thought has an honorable pedigree yet has been subject to philosophical dispute since modern science began.

In the 1970s, a particularly strong form of scientific realism was advocated by Putnam, Boyd, and others. When scientific realism is mentioned in the literature, usually some version of this is intended. It is often characterized in terms of these commitments:

  • Science aims to give a literally true account of the world.
  • To accept a theory is to believe it is (approximately) true.
  • There is a determinate mind-independent and language-independent world.
  • Theories are literally true (when they are) partly because their concepts “latch on to” or correspond to real properties (natural kinds, and the like) that causally underpin successful usage of the concepts.
  • The progress of science asymptotically converges on a true account.

 

Table of Contents

  1. Brief History before the 19th Century
  2. The 19th Century Debate
    1. Poincaré’s Conventionalism
    2. The Reality of Forces and Atoms
    3. The Aim of Science: Causal Explanation or Abstract Representation?
  3. Logical Positivism
    1. General Background
    2. The Logical Part of Logical Positivism
    3. The Positivism Part of Logical Positivism
  4. Quine’s Immanent Realism
  5. Scientific Realism
    1. Criticisms of the Observational-Theoretical Distinction
    2. Putnam’s Critique of Positivistic Theory of Meaning
    3. Putnam’s Positive Account of Meaning
    4. Putnam’s and Boyd’s Critique of Positivistic Philosophy of Science
    5. Inference to the Best Explanation
  6. Constructive Empiricism
    1. The Semantic View of Theories and Empirical Adequacy
    2. The Observable-Unobservable Distinction
    3. The Argument from Empirically Equivalent Theories
    4. Constructive Empiricism, IBE, and Explanation
  7. Historical Challenges to Scientific Realism
    1. Kuhn’s Challenge
    2. Laudan’s Challenge: The Pessimistic Induction
  8. Semantic Challenges to Scientific Realism
    1. Semantic Deflationism
    2. Pragmatist Truth Surrogates
    3. Putnam’s Internal Realism
  9. Law-Antirealism and Entity-Realism
  10. NOA: The Natural Ontological Attitude
  11. The 21st Century Debates
    1. Structuralism
    2. Stanford’s New Induction
    3. Selective Realism
  12. References and Further Reading

1. Brief History before the 19th Century

The debate begins with modern science. Bellarmine advocated an antirealist interpretation of Copernicus’s heliocentrism—as a useful instrument that saved the phenomena—whereas Galileo advocated a realist interpretation—the planets really do orbit the sun. More generally, 17th century protagonists of the new sciences advocated a metaphysical picture: nature is not what it appears to our senses—it is a world of objects (Descartes’ matter-extension, Boyle’s corpuscles, Huygens’ atoms, and so forth) whose primary properties (Cartesian extension, or the sizes, shapes, and hardness of atoms and corpuscles, and/or forces of attraction or repulsion, and so forth) are causally responsible for the phenomena we observe. The task of science is “to strip reality of the appearances covering it like a veil, in order to see the bare reality itself” (Duhem 1991).

This metaphysical picture quickly led to empiricist scruples, voiced by Berkeley and Hume. If all knowledge must be traced to the senses, how can we have reason to believe scientific theories, given that reality lies behind the appearances (hidden by a veil of perception)? Indeed, if all content must be traced to the senses, how can we even understand such theories? The new science seems to postulate “hidden” causal powers without a legitimate epistemological or semantic grounding. A central problem for empiricists becomes that of drawing a line between objectionable metaphysics and legitimate science (portions of which seem to be as removed from experience as metaphysics seems to be). Kant attempted to circumvent this problem and find a philosophical home for Newtonian physics. He rejected both a veil of perception and the possibility of our representing the noumenal reality lying behind it. The possibility of making judgments depends on our having structured what is given: experience of x qua object requires that x be represented in space and time, and judgments about x require that x be located in a framework of concepts. What is real and judgable is just what is empirically real—what fits our system of representation in the right way—and there is no need for, and no possibility of, problematic inferences to noumenal goings-on. In pursuing this project Kant committed himself to several claims about space and time—in particular that space must be Euclidean, which he regarded as both a priori (because a condition of the possibility of our experience of objects) and synthetic (because not derivable from analytical equivalences)—which became increasingly problematic as 19th century science and mathematics advanced.

2. The 19th Century Debate

Many features of the contemporary debates were fashioned in 19th century disputes about the nature of space and the reality of forces and atoms. The principals of these debates—Duhem, Helmholtz, Hertz, Kelvin, Mach, Maxwell, Planck, and Poincaré—were primarily philosopher-physicists. Their separation into realists and antirealists is complicated, but Helmholtz, Hertz, Kelvin, Maxwell, and Planck had realist sympathies and Duhem, Mach, and Poincaré had antirealist doubts.

a. Poincaré’s Conventionalism

By the late 19th century several consistent non-Euclidean geometries, mathematically distinct from Euclidean geometry, had been developed. Euclidean geometry has a unique-parallel axiom and an angle sum of triangles equal to 180º, whereas, for example, spherical geometry has a zero-parallel axiom and an angle sum of triangles greater than 180º. These geometries raise the possibility that physical space could be non-Euclidean. Empiricists think we can determine whether physical space is Euclidean through experiments. For example, Gauss allegedly attempted to measure the angles of a triangle between three mountaintops to test whether physical space is Euclidean. Realists think physical space has some determinate geometrical character even if we cannot discover what character it has. Kantians think that physical space must be Euclidean because only Euclidean geometry is consistent with the form of our sensibility.

Poincaré (1913) argued that empiricists, realists, and Kantians are wrong: the geometry of physical space is not empirically determinable, factual, or synthetic a priori. Suppose Gauss’s experiment gave the angle-sum of a triangle as 180º. This would support the hypothesis that physical space is Euclidean only under certain presuppositions about the coordination of optics with geometry: that the shortest path of an undisturbed light ray is a Euclidean straight line. Instead, for example, the 180º measurement could also be accommodated by presupposing that light rays traverse shortest paths in spherical space but are disturbed by a force, so that physical space is “really” non-Euclidean: the true angle-sum of the triangle is greater than 180º, but the disturbing force makes it “appear” that space is Euclidean and the angle-sum of the triangle is 180º.

Arguing that there is no fact of the matter about the geometry of physical space, Poincaré proposed conventionalism: we decide conventionally that geometry is Euclidean, forces are Newtonian, light travels in Euclidean straight lines, and we see if experimental results will fit those conventions. Conventionalism is not an “anything-goes” doctrine, since not all stipulations will accommodate the evidence; it is the claim that the physical meaning of measurements and evidence is determined by conventionally adopted frameworks. Measurements of lines and angles typically rely on the hypothesis that light travels along shortest paths. But this lacks physical meaning unless we decide whether shortest paths are Euclidean or non-Euclidean. These conventions cannot be experimentally refuted or confirmed since experiments only have physical meaning relative to them. Which group of conventions we adopt depends on pragmatic factors: other things being equal, we choose conventions that make physics simpler, more tractable, more familiar, and so forth. Poincaré, for example, held that, because of its simplicity, we would never give up Euclidean geometry.

b. The Reality of Forces and Atoms

Ever since Newton, a certain realist ideal of science was influential: a theory that would explain all phenomena as the effects of moving atoms subject to forces. By the 1880s many physicists came to doubt the attainability of this ideal since classical mechanics lacked the tools to describe a host of terrestrial phenomena: “visualizable” atoms that are subject to position-dependent central forces (so successful for representing celestial phenomena) were ill-suited for representing electromagnetic phenomena, “dissipative” phenomena in heat engines and chemical reactions, and so forth. The concepts of atom and force became questionable. The kinetic theory of gases lent support to atomism, yet no consistent models could be found (for example, spectroscopic phenomena required atoms to vibrate while specific heat phenomena required them to be rigid). Moreover, intermolecular forces allowing for internal vibration and deformation could not be easily conceptualized as Newtonian central forces. Newtonian action-at-a-distance forces also came under pressure with the increasing acceptance of Maxwell’s theory of electromagnetism, which attributed electromagnetic phenomena to polarizations in a dielectric medium propagated by contiguous action. Many thought that physics had become a disorganized patchwork of poorly understood theories, lacking coherence, unity, empirical determinacy, and adequate foundations. As a result, physicists became increasingly preoccupied with foundational efforts to put their house in order. The most promising physics required general analytical principles (for example, conservation of energy and action, Hamilton’s principle) that could not be derived from Newtonian laws governing systems of classical atoms. The abstract concepts (action, energy, generalized potential, entropy, absolute temperature) needed to construct these principles could not be built from the ordinary intuitive concepts of classical mechanics. 
They could, however, be developed without recourse to “hidden mechanisms” and independently of specific hypotheses about the reality underlying the phenomena. Most physicists continued to be realists: they believed in a deeper reality underlying the phenomena that physics can meaningfully investigate; for them, the pressing foundational problem was to articulate the concepts and develop the laws that applied to that reality. But some physicists became antirealists. Some espoused local antirealism (antirealist about some kinds of entities, as Hertz (1956) was about forces, while not espousing antirealism about physics generally).

c. The Aim of Science: Causal Explanation or Abstract Representation?

Others espoused global antirealism. Like contemporary antirealists, they questioned the relationship among physics, common sense and metaphysics, the aims and methods of science, and the extent to which science, qua attempt to fathom the depth and extent of the universe, is bankrupt. While their realist colleagues hoped for a unified, explanatorily complete, fundamental theory as the proper aim of science, these global antirealists argued on historical grounds that physics had evolved into its current disorganized mess because it had been driven by the unattainable metaphysical goal of causal explanation. Instead, they proposed freeing physics from metaphysics, and they pursued phenomenological theories, like thermodynamics and energetics, which promised to provide abstract, mathematical organizations of the phenomena without inquiring into their causes. To justify this pursuit philosophically, they proposed a re-conceptualization of the aim and scope of physics that would bring order and clarity to science and be attainable. On their various accounts, the aim of science is: economy of thought (science is a useful instrument without literal significance (Mach 1893)), the discovery of real relations between hidden entities underlying the phenomena (Poincaré 1913), or the discovery of a “natural classification” of the phenomena (a mathematical organization of the phenomena that is the reflection of a hidden ontological order (Duhem 1991)). The affinities between 19th century global antirealism and 20th century antirealism mask fundamental differences. The former is driven by methodological considerations concerning the proper way to do physics whereas the latter is driven by traditional metaphysical or epistemological concerns (about the meaningfulness and credibility of claims about goings-on behind the veil of appearances).

3. Logical Positivism

Logical positivism began in Vienna and Berlin in the 1910s and 1920s and migrated to America after 1933, when many of its proponents fled Nazism. The entire post-1960 conversation about scientific realism can be viewed as a response to logical positivism. More a movement than a position, the positivists adopted a set of philosophical stances: pro-science (including pro-verification and pro-observation) and anti-metaphysics (including anti-cause, anti-explanation, anti-theoretical entities). They are positivists because of their pro-science stance; they are logical positivists because they embraced and used the formal logic techniques developed by Frege, Russell, and Wittgenstein to clarify scientific and philosophical language.

a. General Background

As physics developed in the early 20th century, many of the 19th century methodological worries sorted themselves out: Perrin’s experiments with Brownian motion persuaded most of the reality of atoms; special relativity unified mechanics and electromagnetism and signaled the demise of traditional mechanism; general relativity further unified gravity with special relativity; quantum mechanics produced an account of the microscopic world that allowed atoms to vibrate and was spectacularly supported empirically. Moreover, scientific developments undermined several theses formerly taken as necessarily true. Einstein’s famous analysis of absolute simultaneity showed that Newtonian absolute space and time were incorrect and had to be replaced by the space-time structure of Special Relativity. His Theory of General Relativity introduced an even stranger notion of space-time: a space-time with a non-Euclidean structure of variable curvature. This undermined Kant’s claims that space has to be Euclidean and that there is synthetic a priori knowledge. Moreover, quantum mechanics, despite its empirical success, led to its own problems, since quantum particles have strange properties—they cannot have both determinate position and momentum at a given time, for example—and the quantum world has no unproblematic interpretation. So, though everyone was converted to atomism, no one understood what atoms were.

Logical positivism developed within this scientific context. Nowadays the positivists are often depicted as reactionaries who developed a crude, ahistorical philosophical viewpoint with pernicious consequences (Kuhn 1970, Kitcher 1993). In their day, however, they were revolutionaries, attempting to come to grips with the profound changes that Einstein’s relativity and Bohr’s quantum mechanics had wrought on the worldview of classical physics and to provide firm logical foundations for all science.

Logical positivism’s philosophical ancestry used to be traced to Hume’s empiricism (Putnam 1962, Quine 1969). On this interpretation, the positivist project provides epistemological foundations for problematic sentences of science that purport to describe unobservable realities, such as electrons, by reducing sentences employing these concepts to unproblematic sentences that describe only observable realities. Friedman (1999) offers a different Kantian interpretation: their project provides objective content for science, as Kant had attempted, by showing how it organizes our experience into a structured world of objects, but without commitment to scientifically outdated aspects of Kant’s apparatus, such as synthetic a priori truths or the necessity of Euclidean geometry. Whichever interpretation is correct, the logical positivists clearly began with traditional veil-of-perception worries (§1) and insisted on a distinction that both Hume and Kant advocated—between meaningful science and meaningless metaphysics.

b. The Logical Part of Logical Positivism

This distinction rests on their verificationist theory of meaning, according to which the meaning of a sentence is its verification conditions and understanding a sentence is knowing its verification conditions. For example, knowing the meaning of “This is blue” is being able to pick out the object referred to by “this” and to check that it is blue. While this works only for simple sentences built from terms that directly pick out their referents and predicates with directly verifiable content, it can be extended to other sentences. To understand “No emerald is blue” one need only know the verification conditions for “This is an emerald”, “This is blue” and the logical relations of such sentences to “No emerald is blue” (for example, that “no emerald is blue” implies “if this is an emerald, then this is not blue”, and so forth). Simple verification conditions plus some logical knowledge buys a lot. But it does not buy enough. For example, what are the verification conditions expressed by “This is an electron”,  where “this” does not pick out an ostendible object and where “is an electron” does not have directly verifiable content?

To deal with this, the positivists, especially Carnap, hit upon an ingenious program. First, they distinguished two kinds of linguistic terms: observational terms (O-terms), like “is blue”, which have relatively unproblematic, directly verifiable content, and theoretical terms (T-terms), like “is an electron”, which have more problematic content that is not directly verifiable. Second, they proposed to indirectly interpret the T-terms, using logical techniques inherited from Frege and Russell, by deductively connecting them within a theory to the directly interpreted O-terms. If each T-term could be explicitly defined using only O-terms, just as “x is a bachelor” can be defined as “x is an unmarried male human”, then one would understand the verification conditions for a T-term just by understanding the directly verifiable content of the O-terms used to define it, and a theory’s theoretical content would be just its observational content.

Unfortunately, the content of “is an electron” is open-ended and outstrips observational content so that no explicit definition of it in terms of a finite list of O-terms can be given in first-order logic. From the 1930s to the 1950s, Carnap (1936, 1937, 1939, 1950, 1956) struggled with this problem by using ever more elaborate logical techniques. He eventually settled for a less ambitious account: the meaning of a T-term is given by the logical role it plays in a theory (Carnap 1939). Although T-terms cannot be explicitly defined in first-order logic, the totality of their logical connections within the theory to other T-terms and O-terms specifies their meaning. Intuitively, the meaning of a theoretical term like “electron” is specified by: “electron” means “the thing x that plays the Θ-role”, where Θ is the theory of electrons. (This idea can be rendered precisely in second-order logic by a “Ramseyified” definition: “electron” means “the thing x such that Θ(x)”, where “Θ(x)” is the result of taking the theory of electrons Θ (understood as the conjunction of a set of sentences) and replacing all occurrences of “is an electron” with the (second-order) variable “x” (Lewis 1970).)
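
The Ramseyified definition just described can be set out schematically. The following is only a sketch: the notation (the description operator ι, the sign for definitional equality, and the capital variable X for the second-order variable) is an assumption of this illustration and is not taken from Carnap or Lewis.

```latex
\begin{align*}
\text{Theory of electrons:} \quad & \Theta(\textit{electron}) \\
\text{Ramsey sentence:} \quad & \exists X\, \Theta(X) \\
\text{Ramseyified definition:} \quad & \textit{electron} =_{\mathrm{df}} \iota X\, \Theta(X)
\end{align*}
```

The first line displays the theory with its theoretical term in place; the second replaces that term with a bound variable, so that the sentence says only that something plays the Θ-role; the third uses a definite description to name whatever uniquely plays that role.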

Two features of this theory of meaning lay groundwork for later discussion. First, the meaning of any T-term is theory-relative since it is determined by the term’s deductive connections within a theory. Second, the positivists distinguished analytic truths (sentences true in virtue of meaning) and synthetic truths (sentences true in virtue of fact). “All bachelors are unmarried” and “All electrons have the property of being the x such that Θ(x)” are analytic truths, whereas “Kant was a bachelor” and “Electrons exist” are synthetic truths. The positivists inherited this distinction from Kant, but, unlike Kant, they rejected synthetic a priori truths. For them, there are only analytic a priori truths (all pure mathematics, for example) and synthetic a posteriori truths (all statements to the effect that a given claim is verified).

c. The Positivism Part of Logical Positivism

The positivists distinguished legitimate positive science, whose aim is to organize and predict observable phenomena, from illegitimate metaphysics, whose aim is to causally explain those phenomena in terms of underlying unobservable processes. We should restrict scientific attention to the phenomena we can know and banish unintelligible speculation about what lies behind the veil of appearances. This distinction rests on the observational-theoretical distinction (§3b): scientific sentences (even theoretical ones like “Electrons exist”) have meaningful verifiable content; sentences of metaphysics (like “God exists”) have no verifiable content and are meaningless.

Because of their hostility to metaphysics, the positivists “diluted” various concepts that have a metaphysical ring. For example, they replaced explanations in terms of causal powers with explanations in terms of law-like regularities so that “causal” explanations become arguments. According to the deductive-nomological (DN) model of explanation, pioneered by Hempel (1965), “Event b occurred because event a occurred” is elliptical for an argument like: “a is an event of kind A, b is an event of kind B, and if any A-event occurs, a B-event will occur; a occurred; therefore b occurred”. The explanandum logically follows from the explanantia, one of which is a law-like regularity.

Because they advocated a non-literal interpretation of theories, the positivists are considered to be antirealists. Nevertheless, they do not deny the existence or reality of electrons: for them, to say that electrons exist or are real is merely to say that the concept electron stands in a definite logical relationship to observable conditions in a structured system of representations. What they deny is a certain metaphysical interpretation of such claims—that electrons exist underlying and causing but completely transcending our experience. It is not that physical objects are fictions; rather, all there is to being a real physical object is its empirical reality—its system of relations to verifiable experience.

4. Quine’s Immanent Realism

Quine, an early critic of logical positivism, acknowledged their rejection of transcendental questions such as “Do electrons really exist (as opposed to being just useful fictions)?” Our evidence for molecules is similar to our evidence for everyday bodies, he argued; in each case we have a theory that posits an arrangement of objects that organizes our experience in a way that is simple, familiar, predictive, covering, and fecund. This is just what it is to have evidence for something. So, if we have such an organizing theory for molecules, then we can no more doubt the existence of molecules than we can doubt the existence of ordinary physical bodies (Quine 1955). Quine thus arrived at a realism not unlike the empirical realism of the logical positivists.

However, Quine rejected their theory of meaning and its central analytic-synthetic distinction, arguing that theoretical content cannot be analytically welded to observational content. The positivists, he argued, confuse the event of positing with the object posited. Yes, scientists conventionally introduce posits (an event) as Stoney introduced the term “electron” in 1894: “electron” means “the fundamental unit of electric charge that permanently attaches to atoms”. But no, scientists do not treat the conventions as analytic truths that cannot be revised without a change of meaning. Scientists did not treat Stoney’s definition as binding analytic truth and “Electrons exist” as a synthetic hypothesis whose truth must be verified. More generally, Quine argued, once the explicit definitional route failed by Carnap’s allowing the meaning of “electron” to be a function of the totality of its logical connections within a theory, Carnap had already adopted meaning holism, according to which one cannot separate the analytic sentences, whose truth-values are determined by the contribution of language, from the synthetic sentences, whose truth-values are determined by the contribution of fact.

Quine accepted meaning holism together with another thesis, epistemological holism, a doctrine often called “the Quine-Duhem Thesis”, because Duhem used it to argue against Poincaré’s conventionalism. The Quine-Duhem thesis says that only a group of hypotheses can be falsified because only a group of hypotheses has observational consequences. If a single hypothesis, H, implies an observational consequence O and we get evidence for not-O, then we can deduce not-H. But a single hypothesis will typically not imply any observational consequence. Take, for example, Gauss’s supposed mountaintop triangulation experiment to test whether space is Euclidean (§2a). Let H = “Space is Euclidean” and O = “The measured angle-sum of the triangle equals 180º”. Clearly H does not entail O without auxiliary assumptions: for example, A1 = “Light travels the shortest Euclidean paths”, A2 = “No physical force appreciably disturbs the light”, A3 = “The triangle is large enough for deviations from rectilinear paths to be experimentally detectable”, and so forth. Consequently, if the experiment yields not-O = “The measured angle-sum of the triangle is not equal to 180º”, we cannot deduce not-H = “Space is non-Euclidean”. We can only deduce not-(H and A1 and A2 and A3 and so forth); that is, we can only deduce that one or more of the hypothesis and the auxiliary assumptions is false. Perhaps space is Euclidean but some force is distorting the light paths to make it look non-Euclidean. Poincaré and the positivists reply that it is conventional or analytic that space is Euclidean; there is no fact of the matter. In rejecting conventionalism, Duhem and Quine claim that we may keep H and reject one of the auxiliary assumptions (A1, A2, A3, and so forth) to accommodate not-O: any statement may be held true in light of disconfirming experience. [It is misleading, however, to call epistemological holism “the Quine-Duhem thesis”. 
For Duhem, epistemological holism holds only for physical theories for rather special reasons; it does not extend to mathematics or logic and is not connected with theses about meaning. Quine extends epistemological holism from physics to all knowledge, including all knowledge traditionally regarded as a priori, including allegedly analytic statements.] Quine, but not Duhem, believed that our reluctance to revise mathematics and logic (because of their centrality to our belief-systems) does not entail their a prioricity (irrevisability based on evidence).
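
The purely logical core of the falsification argument above can be checked by brute force. The following Python sketch is an illustration only and not part of the original discussion; the function name `entails` and the variable names are assumptions introduced here. It enumerates every truth assignment to confirm that not-O refutes a lone hypothesis H, and refutes the conjunction of H with its auxiliaries, but does not single out H once auxiliaries are involved.

```python
from itertools import product

def entails(premises, conclusion, n_vars):
    """Return True if every truth assignment that satisfies all
    premises also satisfies the conclusion (classical entailment)."""
    return all(
        conclusion(*values)
        for values in product([True, False], repeat=n_vars)
        if all(premise(*values) for premise in premises)
    )

# Case 1: a single hypothesis H that by itself implies O.
single_case = entails(
    [lambda h, o: (not h) or o,   # H implies O
     lambda h, o: not o],         # evidence: not-O
    lambda h, o: not h,           # conclusion: not-H
    2,
)

# Case 2: H implies O only together with auxiliaries A1, A2, A3.
premises = [
    lambda h, a1, a2, a3, o: (not (h and a1 and a2 and a3)) or o,
    lambda h, a1, a2, a3, o: not o,
]

# not-O refutes the conjunction taken as a whole ...
refutes_bundle = entails(
    premises, lambda h, a1, a2, a3, o: not (h and a1 and a2 and a3), 5
)
# ... but it does not force the rejection of H in particular.
refutes_h_alone = entails(premises, lambda h, a1, a2, a3, o: not h, 5)

print("single hypothesis refuted:", single_case)
print("conjunction refuted:", refutes_bundle)
print("H alone refuted:", refutes_h_alone)
```

The last check fails because there are assignments in which H is true, some auxiliary is false, and O is false. That is the formal counterpart of keeping “Space is Euclidean” while blaming a disturbing force.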

Moreover, if the analytic-synthetic distinction collapses, so too does the positivist separation of metaphysics from science. For Quine, metaphysical questions are just the most general and abstract questions we ask and are decided on the grounds we use to decide whether electrons exist. All questions are “internal” in the sense that they must be formulated in our home language and answered with our standard procedures for gathering and weighing evidence. In particular, questions about the reality of some putative objects are to be answered in terms of whether they contribute to a useful organization of experience and whether they withstand the test of experience.

5. Scientific Realism

In the 1970s, a particularly strong form of scientific realism (SR) was advocated by Putnam, Boyd, and others (Boyd 1973, 1983; Putnam 1962, 1975a, 1975b). When scientific realism is mentioned in the literature, usually some version of SR is intended. SR is often characterized in terms of two commitments (van Fraassen 1980):

SR1     Science aims to give a literally true account of the world.

SR2     To accept a theory is to believe it is (approximately) true.

However, scientific realists’ arguments and their interpretation of SR1 and SR2 often presuppose further commitments:

SR3     There is a determinate mind-independent and language-independent world.

SR4     Theories are literally true (when they are) partly because their concepts “latch on to” or correspond to real properties (natural kinds, and the like) that causally underpin successful usage of the concepts.

SR5     The progress of science asymptotically converges on a true account.

a. Criticisms of the Observational-Theoretical Distinction

Critics of positivism argued that there is no workable, well-motivated distinction between observational and theoretical vocabulary that would make the former unproblematic and the latter problematic (for example, Putnam 1962; Maxwell 1962; van Fraassen 1980). First, O-terms apply to apparently theoretical entities (for example, red corpuscle) and T-terms apply to apparently observable entities (for example, the moon is a satellite). Second, if T-terms were epistemologically or semantically problematic, that would have to be due to the unobservable nature of their referents. But in the continuous gradation between seeing with the unaided eye, with binoculars, with an optical microscope, with an electron microscope, and so on, there is no sharp cut-off between being observable and being unobservable where we could non-arbitrarily say: beyond this we cannot trust the evidence of our senses or apply terms with confidence. Third, the “able” in “observable” cannot be specified in a way that motivates a plausible distinction. Most “theoretical” entities can be detected (like electrons) with scientific instruments or theoretically calculated (like lunar gravity). The positivist may respond that they cannot be directly sensed, and are thus unobservable, but why should being directly sensed be the criterion for epistemological or semantic confidence? Fourth, observation is theory-infected: what we can both observe and employ as evidence is a function of the language, concepts, and theories we possess. A primitive Amazonian may observe a tennis ball (he notices it), but without the relevant concepts he cannot use it as evidence for any claims about tennis. Such arguments undermine a central distinction of the positivist program.

b. Putnam’s Critique of Positivistic Theory of Meaning

Putnam (1975a, 1975b) provides a general argument against all classical theories of meaning (Frege, Russell, Carnap, Kuhn), including positivist theories, which are classical in the relevant sense. Classical concepts have two characteristics: they determine their extensions in the world, and we can “grasp” them. To know the meaning of a directly interpretable O-term is to associate it with a concept (verification condition) which determines the term’s extension. In turn, to know the meaning of an indirectly interpretable T-term is to know its logical connections to directly interpretable terms. These two features of the classical view are:

(1)  To know the meaning of F is to be in a certain psychological state (of grasping F’s associated concept and knowing it is the meaning of “F”);

(2)  The meaning of F determines the extension of F in the sense that, if two terms have the same meaning, they must have the same extension.

If the meaning of “water” is the concept the clear, tasteless, potable, nourishing liquid found in lakes and rivers, then by (1) I must associate that concept with “water” if I’m to know its meaning and by (2) something will be water just in case it satisfies that concept.

Putnam’s famous Twin Earth argument (Putnam 1975b) is intended to show that all classical theories fail because (1) and (2) are not co-tenable. Suppose the year is 1740, when speakers did not know that water is H2O. Suppose too that another planet, Twin-Earth, is just like Earth except that a different liquid, whose chemical nature is XYZ, is the clear, tasteless, potable, nourishing liquid found in lakes and rivers. Suppose finally that Earthling Oscar and Twin-Earthling Twin-Oscar are duplicates and share the very same internal psychological states so that Oscar thinks “water is the clear, tasteless, potable, nourishing liquid found in lakes and rivers” if and only if Twin-Oscar thinks “water is the clear, tasteless, potable, nourishing liquid found in lakes and rivers”. In other words, they grasp the same meaning and associate it with the word “water”; (1) is satisfied. But then (2) cannot be satisfied: meaning does not determine extension because the extension of “water” (in English) = H2O yet XYZ = the extension of “water” (in Twin-English). If (1), then not-(2). Conversely, if meaning does determine extension, then since the extension of “water” (on Earth) differs from the extension of “water” (on Twin-Earth), Oscar and Twin-Oscar must associate different meanings with the term. Consequently, either (1) or (2) must go. Putnam keeps (2) and revises (1).

c. Putnam’s Positive Account of Meaning

How is extension determined, if not classically? Putnam develops a causal-historical account of reference for natural kind terms (“water”) and physical magnitude terms (“temperature”). Think of these terms being introduced into the language via an introducing event or baptism. The introducer points to an object (or phenomenon) and intones: “let ‘t’ apply to all and only objects that are relevantly similar (same kind, same magnitude) to this sample (or to whatever is the cause of this phenomenon)”. Later t-users learn conditions that normally pick out the referent of t, use these conditions to triangulate their usage with that of others and with extra-linguistic conditions, and intend their t-utterances to conform to the t-practices initiated in the introducing event. The term passes through the community so that reference is preserved. Then, on Putnam’s view, the extension of the term is part of the meaning of the term, the kind or magnitude that the term “locked on to” in the course of its introduction and historical development. So H2O is part of the English meaning of “water” and (2) is satisfied: meaning determines extension since extension is part of the meaning. This gives an intuitively plausible reading of the Twin-Earth scenario: Oscar is talking about water (H2O) and Twin-Oscar is talking about Twin-water (XYZ).

On classical accounts, a speaker S correctly uses a term “t” to refer to an object x only if x uniquely satisfies a concept, description, verification procedure, or theory that S associates with “t”. In the 1740s English-speakers lacked such uniquely identifying knowledge, though we would naturally say they were using “water” as we do — to refer to H2O. On Putnam’s account, S correctly uses t to refer to x only if S is a member of a linguistic community whose t-usage (via their linguistic and extra-linguistic interactions) is causally or historically tied to the things or stuff that are of the same kind as x. Realistic semantics ties correct usage to things in the world using causal relations. Because truth is defined in terms of reference (for example, “a is F” is true if and only if the referent of “a” has the property expressed by “F”), truth on Putnam’s account is also a causal notion.

We now see why SR is committed to SR3 and SR4 above. Clearly SR1 requires SR3: science can aim at a literally true account of the world only if the world is some determinate way that an account can be literally true of. But Putnam’s semantics requires more: that there be natural kinds and magnitudes that our terms lock onto, which is SR4. Note SR5 also seems to require SR3 and SR4. To many realists who accept SR3, SR4 seems extravagant and mysterious. Natural kinds seem to be an unnecessary traditional philosophical apparatus imposed on realism without the support of, and indeed undermined by, science. Our best science suggests that natural kinds do not exist: water, for example, is not a simple natural kind, H2O, but a more complicated structure of constantly changing polymeric variations, and biological species are anything but simple kinds. And even if there were natural kinds, it seems unreasonable to expect that language could neatly lock onto them: why should our accidental encounters with various samples in our limited part of the universe put us in a position to lock onto universal kinds? Continuity of reference of the kind advocated by Putnam may be too crude. More fine-grained accounts have been proposed (Kitcher 1993; Wilson 1982, 2006) which acknowledge the complicated evolution of science and language yet avoid metaphysical extravagance.

d. Putnam’s and Boyd’s Critique of Positivistic Philosophy of Science

A common argument for SR is the following:

  1. An acceptable philosophy of science should be able to explain standard scientific practice and its instrumental success.
  2. Only SR can explain standard scientific practice and its instrumental success.
  3. Thus SR is the only acceptable philosophy of science.

This is an instance of inference to the best explanation (§5e). Here we look at premise 2, which follows logically from:

2a. There are only two contending explanations: SR and Idealism.

2b. Idealism fails to explain the practice and its success, while SR succeeds.

Premise 2a: For Putnam the distinction between realism and idealism is fundamentally semantic. In realist (or externalist) semantics the world leads and content follows: content is determined causally and historically by the way the world is; the content of “water” is H2O. In idealist (or internalist) semantics content drives and the world follows: the world is whatever satisfies the descriptive content of our thoughts; the content of “water” is the clear, tasteless, potable, nourishing liquid found in lakes and rivers. Idealism is a blanket category covering any account of meaning (including positivist and Kuhnian and pragmatist accounts (§§7-8)) in the family of classical theories (§5b).

Premise 2b: Idealism fails to explain scientific practice and success in several ways: (i) For the positivist, “Electrons exist” means “Θi implies ‘electrons exist’ and Θi is observationally correct” and “‘electron’ refers to x” means “x is a member of the kind X such that Θi(X)” (§3b). Existence, reference, and truth are all theory-relative. Take “electron” in Thomson’s 1898 theory, in Bohr’s 1911 theory, and in full quantum theory (late 1920s). Since the meaning of “electron” changes from theory to theory and meaning determines reference, the referent of “electron” changes from theory to theory. So, Thomson, early Bohr, later Bohr, Heisenberg, and Schrödinger were (a) talking about a different entity and (b) changing the meaning of “electron”. Putnam argues that this is a bizarre re-description of what we would normally say: they were (a) talking about the same entity and (b) making new discoveries about it. By contrast, realist truth and reference are trans-theoretic: once “electron” was introduced into the language by Stoney, it causally “locked onto” the property being an electron; then the various theorists were talking about that entity and making new discoveries about it. So realism, unlike positivism, saves our ordinary ways of talking and acting.

(ii) The conjunction objection: in practice we conjoin theories we accept. Realist truth has the right kind of properties, such as closure under the logical operation of conjunction (if T1 is true and T2 is true, then (T1 and T2) is true), to underwrite this conjunction practice. But positivist surrogates for truth, reference, and acceptance cannot underwrite this practice. From “T1 is observationally correct” and “T2 is observationally correct”, it does not follow that (T1 and T2) is observationally correct—their theoretical parts could contradict each other, for example, so that their conjunction would imply all observational sentences, both true and false. Again realism, but not positivism, succeeds. Similarly, the practice of conjoining auxiliary hypotheses with a theory to extend and test the theory cannot be accounted for by positivism. In {Newton’s theory of gravitation + there is no transneptunian planet}, “gravitation” has one meaning; in {Newton’s theory of gravitation + there are transneptunian planets}, it has another meaning. But the discovery that the latter was true and the former false should not be described as a change of meaning or reference of the word “gravitation”. Again realism succeeds where positivism fails.
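The contrast between truth and observational correctness under conjunction can be illustrated with a toy propositional model. The following is only a sketch under stated assumptions and is not drawn from Putnam or Boyd: the atoms `t`, `o1`, and `o2`, the two toy theories, and the stipulated observable facts are all hypothetical, and “observationally correct” is simplified to “entails no false observational literal”.

```python
from itertools import product

# Hypothetical vocabulary: "t" is a theoretical atom; "o1", "o2" are observational.
ATOMS = ["t", "o1", "o2"]
OBSERVATIONAL = ["o1", "o2"]
ACTUAL = {"o1": True, "o2": False}  # stipulated observable facts (an assumption)


def valuations():
    """Yield every assignment of truth values to the atoms."""
    for values in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(theory, sentence):
    """A theory entails a sentence iff the sentence holds in every model of the theory."""
    return all(sentence(v) for v in valuations() if theory(v))


def observationally_correct(theory):
    """Simplified test: no observational literal entailed by the theory is actually false."""
    for atom in OBSERVATIONAL:
        if entails(theory, lambda v, a=atom: v[a]) and not ACTUAL[atom]:
            return False
        if entails(theory, lambda v, a=atom: not v[a]) and ACTUAL[atom]:
            return False
    return True


def t1(v):
    """Toy theory T1: t, and t implies o1."""
    return v["t"] and ((not v["t"]) or v["o1"])


def t2(v):
    """Toy theory T2: not-t, and not-t implies o1."""
    return (not v["t"]) and (v["t"] or v["o1"])


def t1_and_t2(v):
    """The conjunction of T1 and T2; their theoretical parts contradict each other."""
    return t1(v) and t2(v)


print(observationally_correct(t1))          # each conjunct passes on its own
print(observationally_correct(t2))
print(observationally_correct(t1_and_t2))   # the conjunction does not
print(entails(t1_and_t2, lambda v: v["o2"]))  # it even entails the false o2
```

Because the conjunction has no models, it vacuously entails every observational sentence, including the false ones, which is the failure of closure that the objection describes.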

(iii) The No-Miracles Argument (NMA): everyone agrees that science is instrumentally successful and increasingly so. Scientists believe that newly proposed theories stand a better chance of success if they resemble current successful theories or if they are tested by methods informed by such theories, and they construct scientific instruments, experiments, and applications relying on current theories. Moreover, scientists are getting better at doing this—consider improvements in microscopy over the past three centuries. Their actions are successful and rely on their beliefs that current theories can be depended upon to produce a likelihood of success. These successes are a miracle on positivist principles. Why should reliance on observationally correct theories be expected to produce success, unless we believe what they say about unobservables? In contrast, SR explains these successes: scientists’ actions rely upon their belief that the theories they use are approximately true; those actions have a high degree of success; the best explanation of their success is that the theories relied upon are approximately true.

e. Inference to the Best Explanation

Argument 1-3 (§5d) is an instance of inference to the best explanation (IBE), an inferential principle that realists endorse and antirealists reject. IBE is the rule that we should infer the truth of the theory (if there is one) that best explains the phenomena. Thus we should infer SR because it best explains scientific practice and its instrumental success.

First, a few clarifications of IBE are in order. If IBE is to be non-trivial, “best” must not simply mean “antecedently most likely”, since it goes without saying that we should infer the truth of the most likely explanation. Rather the best explanation must be characterized in terms of properties like “loveliest” or “most explainey” (Lipton 2004). Traditional examples of such properties are: it has wide scope and precision; it appeals to plausible mechanisms; it is simple, smooth, elegant, and non-ad hoc; and it underwrites contrasts (why this rather than that). Then IBE says we should accept the theory that optimizes such explanatory virtues when explaining the phenomena. The caveat “if there is one” blocks inferences to the best of a bad lot: the best explanation may not reach a minimally acceptable threshold. Finally, as with any inferential principle that amplifies our knowledge, conclusions inferred by IBE are fallible: while they are more likely to be true, they could be false.

Second, the “justification” for IBE is two-fold. (1) It is needed for science. Simple enumerative induction (which entitles us to move probabilistically from “All observed As are Bs” to “All As are Bs”) cannot handle inferences from observed phenomena to their “hidden” causes. For example, we cannot inductively infer “Galaxy X is receding” from “Light from Galaxy X is red-shifted”, but we can infer by IBE that Galaxy X is receding because that is the best explanation of why its light is red-shifted. More strongly, Harman (1965) argues that IBE is needed to warrant straight enumerative induction: we are entitled to make the induction from “All observed As are Bs” to “All As are Bs” only if “All As are Bs” provides the best explanation of our total evidence. (2) Scientific uses of IBE are grounded in, and are just sophisticated applications of, a principle we use in everyday inferential practice. If I see nibbled cheese and little black deposits in my kitchen and hear scratching noises in the walls, I reasonably infer that I have mice, because that best explains my evidence. IBE thus needs no more justification than does modus ponens: each is part of the very practices that constitute what rational inference is.
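The shape of the rule, including the caveat “if there is one”, can be sketched as a selection procedure. This is an illustration only: the numerical “virtue scores”, the threshold value, and the function name `infer_best_explanation` are hypothetical, and nothing in the literature cited here claims that explanatory virtues reduce to a single number.

```python
def infer_best_explanation(candidates, threshold):
    """Return the candidate with the highest (hypothetical) virtue score,
    or None when even the best candidate falls below the threshold,
    which models the refusal to infer to the best of a bad lot."""
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= threshold else None


# Made-up scores for the kitchen example; only their ordering matters.
kitchen = {
    "mice in the walls": 0.9,
    "a cheese-loving burglar": 0.3,
    "sheer coincidence": 0.1,
}
bad_lot = {"explanation A": 0.2, "explanation B": 0.1}

print(infer_best_explanation(kitchen, threshold=0.5))
print(infer_best_explanation(bad_lot, threshold=0.5))
```

The first call selects the mice hypothesis; the second returns `None`, since no candidate reaches the minimally acceptable threshold.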

Realists employ IBE at different levels. At the ground-level, they observe surprising regularities like the phenomenological gas laws relating pressure, temperature, and volume. These cannot be just cosmic coincidences. Realists argue that observed gas behavior is as it is because of underlying molecular behavior; we have reason to believe the molecular hypothesis (by IBE) because it best explains the observed gas behavior. At this level, antirealist rejections of IBE seem stretched: it seems unsatisfactory to say either that we do not need an explanation (since it appears to be a guiding aim of inquiry to explain regularities where possible) or that observed gas behavior is as it is because gases behave as if they are composed of molecules (since ordinary and scientific practice distinguishes genuine explanations from just-so stories).

Realists also employ IBE at a meta-level (§5d): we should be realists about our current theories because only realism can explain how our methodological reliance on them leads to the construction of empirically successful theories (Boyd) or only realism can explain the way in which scientific theories succeed each other and the methodological constraints scientists impose on themselves when constructing new theories (Putnam). Relativity theorists felt bound to have Newton’s theory derivable in the limit from Einstein’s theory. Why? The realist answer is: “because a partially correct account of a theoretical object (as the gravitational field) must be replaced by a better account of the same theory-independent object (as the metric structure of spacetime)”. Similarly, realists claim that scientific progress is best explained by SR5, the thesis that science is converging on a true account of the world. As Putnam says, realism is the only hypothesis that does not make the success of science a miracle. At the meta-level, the alleged phenomenon is that our best scientific traditions and theories are instrumentally and methodologically successful; SR is alleged to be the best (or only) explanation of that phenomenon; thus we should infer SR. As we will see (§§6d, 7, 11b), it is not clear that these uses of IBE are legitimate, because the alleged phenomenon itself is questionable, or the SR-“explanation” does not explain, or no explanation may be needed, or alternative antirealist explanations may be better.

6. Constructive Empiricism

Van Fraassen (1980) proposed constructive empiricism (CE), arguing that we can preserve the epistemological spirit of positivism without subscribing to its letter. Van Fraassen’s is an antirealism concerning unobservable entities. Recognizing the difficulties of basing antirealism on a “broken-backed” linguistic distinction between O-terms and T-terms, he allows our judgments about unobservables to be literally construed but, he argues, our evidence can never entitle us to our beliefs about unobservables. CE is consistent with SR3 and SR4 (though it does not commit to them, it has no quarrels with realist objectivity or semantics) but replaces SR1, SR2, and SR5 respectively with:

CE1     Science aims to provide empirically adequate theories of the phenomena.

CE2     To accept a theory is to believe it is empirically adequate, but acceptance has further non-epistemic/pragmatic features.

CE5     The progress of science produces increasing empirical adequacy.

A theory T is empirically adequate if and only if what T says about all actual observable things and events is true (that is, T saves all the phenomena, or T has a model that all actual phenomena fit in). Empirical adequacy is logically weaker than truth: T’s truth entails its empirical adequacy but not conversely. But it is still quite strong: an empirically adequate theory must correctly represent all the phenomena, both observed and unobserved. CE2 distinguishes epistemic and pragmatic aspects of acceptance. Epistemic acceptance is belief; beliefs are either true or false. Pragmatic acceptance involves non-epistemic commitments to use the theory in certain ways (basing research, experiments, and explanations on it, for example); commitments are neither true nor false; they are either vindicated or not. CE5 acknowledges that there is instrumental progress without trying to explain it. CE concedes a realist semantics (“electron”-talk is not highly derived talk about observables) but preserves the spirit of positivism by recommending agnosticism about a theory’s literal claims about unobservables.

a. The Semantic View of Theories and Empirical Adequacy

On the positivist view, a theory T is a syntactic object: T is the set of theorems in a language generated from a set of axioms (the laws of T) and derivation rules. The empirical content (the entire literal content) of T is T/O, the theorems expressible in the observational vocabulary. A theory T is empirically (observationally) adequate if T/O is the class of all true observational sentences.  Two theories, T and T’, are empirically (observationally) equivalent if T/O = T’/O. Since such theory pairs have the same literal content and differ only in their non-literal, theoretical content, they are merely inter-definable variants of a common observational basis: they say the same thing but express it differently. There is no fact of the matter whether T or T’ is true (both are or neither are), and whether we work with T or T’ is purely a pragmatic matter concerning which is simpler, more convenient, and so forth. For SR and CE there is a fact of the matter: at most one of T, T’ can be true. For SR there may be reasons to believe one of T, T’. For CE there can be no epistemic reason to believe one over the other, though there may be pragmatic reasons to accept (commit to using) one over the other. Van Fraassen needs a different account of theories if he is to agree with realists about literal content and there being a fact of the matter about empirically equivalent theories.

For him, a theory T is a semantic object, the class of models, A = <D, R1, R2, …, Rn>, that satisfy its laws (where D is a set of objects and Ri are properties and relations defined on them). For example, D might contain billiard balls and molecules; the property is elastic in A might be instantiated by both billiard balls and molecules, is a molecule by some members of D, and is a billiard ball by others. Now let A’ = <D’, R’1, R’2, …, R’m> (where m < n, D’ is a proper subset of D, and R’i = Ri/D’ (Ri restricted to D’)). Intuitively A’ is obtained from A by removing all unobservables, so D’ would contain billiard balls but not molecules, is elastic would now be restricted to billiard balls, is a molecule would not be instantiated, and so forth. Then A’ is an empirical substructure of A, the result of restricting the original domain to observables and its properties and relations accordingly. T is empirically adequate if and only if T has an empirical substructure that all observables fit in. Two theories, T and T’, are empirically equivalent if all the observables in a model of T are isomorphic to the observables in a model of T’. Such theory pairs agree in what they say about observables but may disagree in what they say about unobservables. Thus CE can agree with SR that at most one of T, T’ can be true and to be a realist about that theory is to believe it is true (SR2). Yet CE can preserve the spirit of positivism by holding that we can never have reason to believe a theory; at most we have reason to believe it is empirically adequate. Friedman (1982) questions whether van Fraassen achieves this.
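The construction of an empirical substructure can be made concrete with a small sketch of the billiard-ball example. It is a simplification under stated assumptions: the class `Structure`, the function `restrict`, and the object names are hypothetical, and relations that end up uninstantiated are kept as empty sets rather than dropped (the text's m < n suggests they would be dropped).

```python
from dataclasses import dataclass


@dataclass
class Structure:
    """A model A = <D, R1, ..., Rn>: a domain plus named relations (sets of tuples)."""
    domain: frozenset
    relations: dict


def restrict(structure, observables):
    """Build A' from A: keep only observable objects and restrict each relation to them."""
    new_domain = structure.domain & frozenset(observables)
    new_relations = {
        name: {tup for tup in tuples if all(x in new_domain for x in tup)}
        for name, tuples in structure.relations.items()
    }
    return Structure(new_domain, new_relations)


# Hypothetical model containing two billiard balls and two molecules.
model = Structure(
    domain=frozenset({"ball1", "ball2", "mol1", "mol2"}),
    relations={
        "is_elastic": {("ball1",), ("ball2",), ("mol1",), ("mol2",)},
        "is_molecule": {("mol1",), ("mol2",)},
        "is_billiard_ball": {("ball1",), ("ball2",)},
    },
)

substructure = restrict(model, observables={"ball1", "ball2"})
print(sorted(substructure.domain))
print(substructure.relations["is_elastic"])
print(substructure.relations["is_molecule"])
```

After restriction the domain contains only the billiard balls, is_elastic applies only to them, and is_molecule has no instances, matching the informal description of A’.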

b. The Observable-Unobservable Distinction

Since CE recommends agnosticism about unobservables but permits belief about observables, the policy requires an epistemologically principled distinction between the two. Though rejecting the positivists’ distinction between T-terms and O-terms, van Fraassen defends a distinction between observable and unobservable objects and properties, a distinction that grounds his policy of agnosticism concerning what science tells us about unobservables. There is a fact of the matter about what is observable-for-humans: given the nature of the world and of the human sensory apparatus, some objects/events/properties possess the property is observable-for-humans; others lack that property; the former are observables, the latter unobservables. For example, Jupiter’s moons are observable because a human could travel close enough to see them unaided, but electrons are unobservable because a human could never see one (that is just the nature of humans and electrons). Van Fraassen also claims that the limits of observation are disclosed by empirical science and not by philosophical analysis—what is observable is simply a fact disclosed by science. It should be noted that the distinction, as he draws it, has no a priori ontological implications: flying horses are observable but do not exist; electrons may exist but are unobservable.

Critics (Churchland 1985; Musgrave 1985; Fine 1986; Wilson 1985) complain that this distinction cannot ground a sensible epistemological policy. First, van Fraassen runs together different notions, none of which has special epistemological relevance. What is observable is variously taken as: what is detectable by human senses without instruments (Jupiter’s moons); what can be “directly” measured as opposed to “indirectly” calculated; what is detectable by humans-qua-natural-measuring-instruments (as thermometers measure temperature, humans “measure” observables). Critics ask why any of these should divide the safe from the risky epistemic bet. Why is it legitimate to infer the presence of mice from casual observation of their tell-tale signs but illegitimate to infer the presence of electrons from careful and meticulous observation of their tell-tale ionized cloud-chamber tracks?

Second, many critics find van Fraassen’s agnosticism about unobservables unwarrantedly selective. CE claims that we ought to believe what science tells us about all observables (both observed and unobserved) but not about unobservables. In each case there is a gap between our evidence (what has been observed) and what science arrives at (claims about all observables (CE) or claims about all observables and unobservables (SR)). Why is it legitimate to infer from what we have observed in our spatiotemporally limited surroundings to everything observable but not to what is unobservable (though detectable with reliable instruments or calculable with reliable theories)? Our experience is limited in many ways, including lacking direct access to: medium-sized events in spatiotemporally remote regions, events involving very small or very large dimensions, very small or very large mass-energy, and so forth. Why should inductions to claims about the first be legitimate but not to claims about the others?

Third, CE’s epistemic policy is pragmatically self-defeating or incoherent. Suppose a scientific theory T tells us “A is unobservable by humans”. In order to use T to set our epistemic policy we must accept T; that is, believe what T tells us about observables, but we should be agnostic about what T tells us about unobservables, including whether A is observable or unobservable. But if we should be agnostic about A’s observability, then we do not know whether or not we should believe in As. A consistent constructive empiricist will have trouble letting science determine what is unobservable and using that determination to guide her epistemic policy—often she will not know what not to believe.

Finally, if we interpret the language of science literally (as van Fraassen does), then we ought to accept that we see tables if and only if we see collections of molecules subject to various kinds of forces. But then if we are willing to assert there are tables we should be willing to assert that there are collections of molecules (Friedman 1982; Wilson 1985).

c. The Argument from Empirically Equivalent Theories

As realists rely on IBE, antirealists rely on the argument from empirically equivalent theories (EET):

  1. If T and T’ are empirically equivalent, then any evidence E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n.
  2. If (E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n), then we have no reason to believe T rather than T’ or vice versa.
  3. For any T, there exists a distinct empirically equivalent T’.
  4. Thus, for any theory T, we have no reason to believe it rather than its empirically equivalent rivals.
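The logical form of steps 1 to 4 can be checked mechanically. The following Lean 4 sketch is one possible regimentation, not the only one: the names `EmpEquiv`, `SameSupport`, and `NoReason` are hypothetical labels for “is empirically equivalent to”, “is confirmed/infirmed to the same degree by any evidence as”, and “we have no reason to believe one rather than the other”, and the conclusion is read as saying that every theory has a distinct, empirically equivalent rival over which it cannot be preferred.

```lean
-- Hypothetical vocabulary; none of these names comes from the literature.
theorem eet_valid {Theory : Type}
    (EmpEquiv SameSupport NoReason : Theory → Theory → Prop)
    -- Premise 1: empirical equivalence implies equal evidential support.
    (p1 : ∀ T T' : Theory, EmpEquiv T T' → SameSupport T T')
    -- Premise 2: equal evidential support leaves no reason to prefer either theory.
    (p2 : ∀ T T' : Theory, SameSupport T T' → NoReason T T')
    -- Premise 3: every theory has a distinct empirically equivalent rival.
    (p3 : ∀ T : Theory, ∃ T' : Theory, T' ≠ T ∧ EmpEquiv T T') :
    -- Conclusion 4, on the reading described above.
    ∀ T : Theory, ∃ T' : Theory, T' ≠ T ∧ EmpEquiv T T' ∧ NoReason T T' :=
  fun T =>
    match p3 T with
    | ⟨T', hne, hee⟩ => ⟨T', hne, hee, p2 T T' (p1 T T' hee)⟩
```

On this reading, premises 1 and 2 do the inferential work and premise 3 ensures that the conclusion is not vacuous. The proof establishes validity only; it says nothing about whether the premises are true.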

The argument appears to be valid, but each of its premises can be challenged (Boyd 1973; Laudan and Leplin 1991). Premise 1 is under-specified. Any abstract, sufficiently general theory (for example, Newton’s theory of gravitation) has no empirical consequences on its own. Trivially, two such theories are empirically equivalent since each has no empirical consequences; so any evidence equally confirms/infirms each. But no realist will worry about this. In order to give Premise 1 bite, the theories must have empirical consequences, which they will have only with the help of auxiliary hypotheses, A (§4). But then Premise 1 becomes:

1A. If (T and A) and (T’ and A) are empirically equivalent, then any evidence E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n.

Whether 1A is plausible depends on what A is. If A is any hypothesis which has been accepted to date, then 1A is false because current empirical indistinguishability does not entail perpetual empirical indistinguishability, since evidence and auxiliary hypotheses change over time as we discover new instruments, methods, and knowledge. But if A is any hypothesis whatsoever, then there is no reason to think that the antecedent of Premise 1A is true, and thus 1A is again a trivial, vacuous truth. Moreover, the connection between empirical equivalence (agreement about observables in the sense of §6a) and evidential support is questionable (Laudan and Leplin 1991). Premise 1 presupposes that all and only what a theory says or implies about observables is evidentially relevant to that theory. But this is false: Brownian motion, though not an empirical consequence of atomic theory, supported it. Thus T and T’ could be empirically equivalent, yet one could have better evidential support than the other; for example, T, but not T’, might be derivable from a more comprehensive theory that entails evidentially well-supported hypotheses.

Some IBE-realists resist Premise 2: T and T’ may be equally confirmed by the evidence, yet one of them may possess superior explanatory virtues (§5e) that make it the best explanation of the evidence and thus, by IBE, more entitled to our assent—especially if the other is a less natural, ad hoc variant of the “nice” theory. The success of this response depends on whether explanatorily attractive theories are more likely to be true—why should nature care that we prefer simpler, more coherent, more unified theories?—and on whether a convincing case can be made for the claim that we are evolutionarily equipped with cognitive abilities that tend to select theories that are more likely to be true because their explanatory virtues appeal to us (Churchland 1985).

The very strong, very general conclusion of EET, however, depends on the very strong, very general Premise 3, which, critics argue, is typically supported by “toy” examples of theory-pairs from the history of physics, by contrived examples of theories, one of which is transformed from the other by a general algorithm (Kukla 1998), or by some tricks of formal logic or mathematics. None is likely to convince any realist (Musgrave 1985; Stanford 2001).

d. Constructive Empiricism, IBE, and Explanation

For van Fraassen, a theory’s explanatory virtues (simplicity, unity, convenience of expression, power) are pragmatic—a function of its relationship to its users. This implies that explanatory power is not a rock bottom virtue like consistency (Newton could decline to explain gravity, but he could not decline to be consistent) and does not confer likelihood of truth or empirical adequacy (Newton’s theory explained lots of phenomena but is neither true nor empirically adequate). The fact that a theory satisfies our pragmatic desiderata has no implications for its being true or empirically adequate, contrary to what IBE-realists maintain.

IBE is a rule guiding rational choice among rival hypotheses. But there is always the option of declining to choose, of remaining agnostic. To undercut this general option, van Fraassen argues, the realist must commit to some claim like: every regularity and coincidence must be explained. Van Fraassen challenges this alleged requirement. First, the quest for explanation has to stop somewhere; even “realist” explanations must bottom out in brute fundamental laws; so, why cannot an antirealist bottom out in brute phenomenological laws? Second, scientists do not consider themselves bound by a principle that demands that every correlation be explained. In quantum mechanics, for example, spin states of entangled particles are perfectly correlated, yet every reasonable explanation-candidate has failed, and scientists no longer insist that they must be explained, contrary to what realists allegedly require (Fine 1986). However, these arguments may be directed at a straw man, since no realist is likely to require that every regularity be explained. Musgrave (1985), for example, suggests that these arguments confuse realism (the view that science aims to explain the phenomena where possible) with essentialism (the view that science aims to find theories that are fundamentally self-explanatory): it is not antirealist to claim that Newton explained a host of phenomena in terms of gravity but declined to explain gravity itself.

Van Fraassen also denies that only realism can explain the phenomena. There are rival explanations that are compatible with CE, and some of them are more plausible than realism. In §5e we distinguished ground-level and meta-level uses of IBE and suggested that this antirealist strategy might be more promising against the latter than against the former. Recall the realists’ meta-level reasoning: there is a surprising phenomenon (our current scientific theories, scientific methodology, and the history of modern science are surprisingly successful) which cries out for explanation; the only explanation is that the theories are approximately true; thus, by IBE, realism. But there is a more mundane explanation: many very smart people construct our scientific theories and methods, throwing out the unsuccessful ones (which we tend to ignore (Magnus and Callender 2004)) and refining and keeping only the successful ones. A variant of this success-by-design-and-trial-and-error is explanation of success in Darwinian terms: just as the mouse’s running away from its enemy the cat is better explained in Darwinian terms (only flight-successful mice survive and pass their genes along) than in representational terms (the mouse “sees” that the cat is its enemy and therefore runs), so too the instrumental success of science is better explained in Darwinian terms (only the successful theories survive) than in realist terms (they are successful because they are approximately true). These rival antirealist explanations of success are controversial, however (Musgrave 1985). The success-by-design explanation does not seem right, since scientists often construct theories that make completely unexpected, novel predictions.

7. Historical Challenges to Scientific Realism

A range of arguments attempt to show that scientific realism is often supported by an implausible history of science. (In what follows, T* and T are successor and predecessor theories in a sequence of theories; for example, think of the sequence <Aristotelian physics, Medieval physics, Cartesian physics, Newtonian physics, (Newtonian + Maxwellian physics), Special Theory of Relativity (STR), (General Theory of Relativity (GTR) + Quantum Mechanics (QM)), …> as ordered under the relation T* succeeds T.)  Both realists and empiricists think of science as being cumulative and progressive. For empiricists, cumulativeness requires at least that T* have more true (and perhaps less false) observational consequences than T. Since the content of a theory on logical positivists’ views is exhausted by its observational consequences, if T* has more true observational consequences than T, then T* is “more true than” T. However, SR-realists require more. Because of SR5, they are committed to a historical thesis: that science asymptotically converges on the truth. Because of their externalist semantics, they are committed to theses about reference: theoretical terms genuinely refer, reference is trans-theoretic, and reference is preserved in T-T* transitions (so that “electron” in Bohr’s earlier and later theories refers to the same object and the later theory provides a more adequate conception of that object). Finally, because of their meta-level appeals to IBE, they are committed to SR5 because it best explains the instrumental success of our best theories and the increasing instrumental success of sequences of theories (where T* is more successful than T because T* is closer to the truth than T), and so forth.

a. Kuhn’s Challenge

According to Kuhn (1970), the standard view of science as steadily cumulative (presupposed by both positivism and realism) rests on a myth that is inculcated by science education and fostered by Whiggish historiography of science. When the myth is deconstructed, we see science as historically unfolding through stable cycles of cumulativeness, punctuated by periods of crisis and revolution.

During periods of normal science, practitioners subscribe to a paradigm. They have the same background beliefs about: the world, its fundamental ontology, processes, and laws (statements that are not to be given up); correct mathematical and linguistic expression; scientific values, goals, and methods; scientifically relevant questions and problems; and experimental and mathematical techniques. Within a given paradigm P—for example, Newtonian physics—there is a relatively stable background: a world of Newtonian particles moving in space and time subject to Newtonian forces (like gravity) and obeying Newton’s laws. There are exemplary methods and techniques—for example, to solve a problem of motion, bring it under the equation, F = ma, which manifests itself across the board and is treated as counterexample-free. And there are shared values—for example, unified mathematical representation of phenomena—and problems (for example, the solution of the arbitrary n-body problem for a system of gravitationally attracting bodies or the resolution of the anomaly in the orbit of Uranus) that require further articulation of the theory. In normal science, cumulativeness occurs: the theory becomes extended to answer its own questions and cover its phenomena. (Kuhn thinks that clean views of history come from focusing too much on normal science.) But sooner or later anomalies crop up that the paradigm cannot handle (for example, the failure to bring electromagnetism, black body radiation, and Mercury’s orbit under the Newtonian scheme). There is a crisis that only a revolutionary new paradigm (for example, STR, QM, and GTR) can handle. Once in place, the new paradigm P* provides a radically new way of looking at the world.

Kuhn (1970) was interpreted (wrongly, but with some justice given his sometimes incautious language) as arguing for an extremely radical constructivist/relativist position: P and P* are incommensurable in the sense that they are so radically distinct that they cannot be compared; the P and P* scientists work “in different worlds”, “see different things”, use different maps (theories and conceptual schemes) and also have different rules for map-making (methods), different languages, and different goals and values. As a result, during the transition, scientists have to learn a new way of seeing and understanding phenomena—Kuhn likens the experience to a “gestalt switch” or “religious conversion”. There is no commonality—in ontology, methodology, observational base, or goals/values—that P and P* scientists can use to rationally adjudicate their disagreements. There is no paradigm-independent reason for preferring P* over P, since such reasons would have to appeal to something common (common observations, methods, or norms), and they share no commonality. Even more strongly, there is no paradigm-independent, objective fact of the matter concerning which of them is correct. If this were true, then all standard theses about progress would be undermined. There is no referential or meaning continuity across paradigms; no sense can attach to theses like “T* is more true than T”, “T is a limiting case of T*”, or “T* preserves all T’s true observational consequences”, since such theses presuppose T-T* commensurability.

Critics have pointed out that this view is too extreme (McMullin 1991). The history of science shows more continuity and fewer radical revolutions than this account attributes to it. Scientists make rational choices between “paradigms” (for example, most scientists who were skeptical of atoms came to reasonably believe in them as a result of Perrin’s experiments). Many scientists work within two traditions without experiencing gestalt shifts (for example, 19th century energetics and molecular theories). T and T* advocates often argue, criticize each other, and rationally persuade each other that one of the two is incorrect. How could this be, if the radical interpretation of Kuhn were correct?

Kuhn clearly did not intend the radical reading, and in later writings (1970 Postscript, 1977) he distinguishes his views from such radical, subjectivist, and relativist interpretations. Paradigm transitions and incommensurability, he argues, are never as total as the radical interpretation assumes: enough background (history, instrumentation, and every-day and scientific language) is shared by P- and P*-adherents to underwrite good reasons they can employ to mount persuasive arguments. Moreover, he lists several properties any theory should have—accuracy (of description of experimental data), consistency (internal and with accepted background theories), scope (T should apply beyond original intended applications), fecundity (T should suggest new research strategies, questions, problems), and simplicity (T should organize complex phenomena in a simple tractable structure). Application of these criteria accounts for progress and theory choice. However, these are “soft” values that guide choices rather than “hard” rules that determine choices. Unlike rules, (i) they are individually imprecise and incomplete, and (ii) they can collectively conflict (and there is no a priori method to break ties or resolve conflicts). Moreover, Kuhn argues, an individual’s choice is guided by a mixture of objective (accuracy, and so forth) and subjective (individual preferences like cautiousness and risk-taking, and so forth) factors, the latter influencing her interpretation and weighing of the criteria. A cautious scientist may be unwilling to risk a high probability of being wrong for a small probability of being informative in novel ways, and vice versa for the risk-taker. In this way Kuhn (1977) offers a middle ground between theory choices being completely subjective and being objective (qua being determined by rules applied to evidence). 
This “softer” view of science, he argues, enables new theories to get off the ground: progress can be made only if there are values to allow rational discussion and argument but not hard rules that would pre-determine an answer (because then everyone would conform to the rule and not risk proposing new alternatives).

Kuhn has shown that evidence and reasons are sometimes incapable of deciding between P and P*. But a realist may concede that hard choices occur: at most one of P or P* is correct, and we may have to wait and see which, if either, pans out. Temporary gridlock need not amount to permanent undecidability: the lack of decisive reasons at a time does not imply that there will be no decisive reasons forever; when more evidence is acquired and its relevance better understood, convincing reasons usually emerge. Realists should concede these points; many in the 21st century do. But no SR-realist can accept the thesis, never abandoned by Kuhn, that there is no fact of the matter whether P or P* is correct.

b. Laudan’s Challenge: The Pessimistic Induction

Although it is widely agreed that our best theories are instrumentally successful and many T-T* sequences show increasing success, Laudan (1981) disputes that success and progress are to be explained in realist SR5-terms of increasing approach to the truth. The history of science, Laudan argues, shows that referential success is neither necessary nor sufficient for empirical success: not necessary because the central terms of many successful theories did not refer (19th century ether, caloric, and phlogiston theories, for example); not sufficient because the central terms of many failing theories did, by our lights, refer (18th century chemical atomism, Prout’s hypothesis for most of the 19th century, Wegener’s theory of continental drift in the first half of the 20th century, and so forth).

Moreover, realist notions of approximate truth and convergence-to-the-truth are problematic. Despite best efforts, no satisfactory metric has emerged that would characterize distance from the truth or the truth-distance between T and T* (Laudan 1981; Miller 1974; Niiniluoto 1987). For some T-T* sequences in mathematical physics, there are limit theorems whereby T can be derived as a special case of T* under appropriate limiting conditions. For example, special relativity passes asymptotically into Newtonian mechanics as $(v/c)^2$ approaches 0. Such theorems suggest that Newtonian mechanics yields close to correct answers for applications close to the relativistic limits (not too fast). In this way realists can appeal to them to argue that T* extends and improves upon T. However, for many T-T* sequences there are no analogous limit theorems: Lavoisier’s oxygen theory is a progressive successor of Priestley’s phlogiston theory, yet there is no neat mathematical relationship indicating that phlogiston theory is a limiting case of oxygen theory. Moreover, even for cases where T* approaches T as some parameter approaches a limit, it is controversial what to conclude. If reference is determined by meaning (§5b), then “$\text{mass}_{\text{Newton}}$” and “$\text{mass}_{\text{Einstein}}$” refer to different things, and the fact that there is a derivation of classical mass-facts from relativistic mass-facts under certain conditions does nothing to show that T* provides a more global, more accurate description of mass-facts than T (since they’re talking about different things); the limit theorems show at most that some structure of abstract relations but not semantic content gets preserved in the T-T* transition (§11a).
But if reference is determined by causal-historical relations (§5c), then the references of some key terms of T get lost in the transition to T*—“ether” was a key referring term of classical physics, but there is no ether in special relativity; so how can classical physics capture part of the same facts that special relativity captures when all its claims about the ether are either plainly false or truth valueless?
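
To make the limit-theorem claim above concrete, here is a brief worked sketch. The choice of momentum and kinetic energy as illustrations is ours rather than the source's, and the symbols are assumptions of the sketch: m is rest mass, v is speed, c is the speed of light, and γ is the Lorentz factor.

```latex
\begin{align*}
\gamma &= \left(1 - \frac{v^2}{c^2}\right)^{-1/2}
        = 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \cdots \\[4pt]
p_{\text{rel}} &= \gamma m v
        = m v\left(1 + \frac{1}{2}\frac{v^2}{c^2} + \cdots\right),
        \qquad \frac{p_{\text{rel}}}{m v} \longrightarrow 1
        \ \text{ as } (v/c)^2 \to 0 \\[4pt]
E_{k,\text{rel}} &= (\gamma - 1)\, m c^2
        = \frac{1}{2} m v^2\left(1 + \frac{3}{4}\frac{v^2}{c^2} + \cdots\right),
        \qquad \frac{E_{k,\text{rel}}}{\tfrac{1}{2} m v^2} \longrightarrow 1
        \ \text{ as } (v/c)^2 \to 0
\end{align*}
```

The expansion concerns only the formal relation between the equations. Whether it shows that the two theories describe the same mass-facts is precisely what the preceding discussion leaves in dispute.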

These are serious challenges to SR. On one hand, it is hard to shake the idea that theories are successful because they are “onto something”. Yes, we build them to be successful, but their scope and novel predictions generally greatly outstrip our initial intentions. Realists tend to see the history of science as supporting an optimistic meta-induction: since past theories were successful because they were approximately true and their core terms referred, so too current successful theories must be approximately true and their central terms refer. On the other hand, skeptics see the history of science as supporting a pessimistic meta-induction: since some (many, most) past successful theories turned out to be false and their core terms not to refer, so too current successful theories may (are likely to) turn out to be false and their key terms not to refer. Realists must be careful not to interpret history of science blindly (ignoring the successes of ether theories and the failures of early atomic theories, for example) or Whiggishly (begging questions by wrongly attributing to our predecessors our referential intentions—by assuming, for example, that Newton’s “gravity” referred to properties of the space-time metric).

8. Semantic Challenges to Scientific Realism

Realist truth and reference are word-world/thought-world correspondences (SR4), an intuitively plausible view with a respectable pedigree going back to Aristotle. Moreover, some IBE-realists argue that real correspondences are needed to explain the successful working of language and science: we use representations of our environment to perform tasks; our success depends on the representations causally “tracking” environmental information; truth is a causal-explanatory notion. Several philosophical positions challenge this idea.

a. Semantic Deflationism

Tarski showed how to define the concept is true-in-L (where L is a placeholder for some particular language). Treating “is true” as predicated of sentences in a formal language, he provided a definition of the concept that builds it up recursively from a primitive reference relation that is specified by a list correlating linguistic items syntactically categorized with extra-linguistic items semantically categorized. Thus, for example, a clause like “‘electron’ refers to electrons” would be on this list if the language were English. Although Tarski’s definition is technically sophisticated, the main points for our purposes are these. First, it satisfies an adequacy condition (referred to as Convention T): for every sentence P (of L), when P is run through the procedure specified by the definition, “P” is true (in L) if and only if P. Thus, for example, “Electrons exist” is true-in-English if and only if electrons exist, and so forth. Second, truth and reference are disquotational devices: because of the T-equivalences, to assert that “snow is white” is true (in English) is just to assert that snow is white; similarly, to assert that “snow” refers (in English) to some stuff is just to assert that the stuff is snow.
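
The recursive construction described above can be sketched in code. What follows is a minimal illustration, assuming a toy language of our own devising (atomic predications plus “not”, “and”, and “or”); the dictionaries NAME_REFERS and PRED_REFERS and the function is_true are hypothetical names introduced for the example, not Tarski's notation.

```python
# Toy Tarski-style truth definition for a tiny formal language L.
# Every name and object below is an illustrative assumption.

# The primitive reference relation is given as a bare list (here, dictionaries):
# names are paired with objects, predicates with sets of objects.
NAME_REFERS = {"e1": "electron-1", "s1": "snowball-1"}
PRED_REFERS = {
    "Electron": {"electron-1"},
    "White": {"snowball-1"},
}


def is_true(sentence):
    """Recursively define true-in-L from the reference list.

    Sentences are nested tuples:
      ("atom", predicate, name), ("not", s), ("and", s1, s2), ("or", s1, s2)
    """
    kind = sentence[0]
    if kind == "atom":
        _, pred, name = sentence
        # An atomic sentence is true iff the named object is in the extension.
        return NAME_REFERS[name] in PRED_REFERS[pred]
    if kind == "not":
        return not is_true(sentence[1])
    if kind == "and":
        return is_true(sentence[1]) and is_true(sentence[2])
    if kind == "or":
        return is_true(sentence[1]) or is_true(sentence[2])
    raise ValueError(f"not a sentence of L: {sentence!r}")


snow_white = ("atom", "White", "s1")
snow_electron = ("atom", "Electron", "s1")

print(is_true(snow_white))                                   # True
print(is_true(("and", snow_white, ("not", snow_electron))))  # True
```

In this sketch nothing beyond the two lookup tables fixes what the words refer to, which is the feature deflationists emphasize and Field (1972) finds unsatisfying.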

Semantic deflationists (Fine 1996; Horwich 1990; Leeds 1995, 2007) argue that Tarski’s theory provides a complete account of truth and reference: truth and reference are not causal explanatory notions; they are merely disquotational devices that are uninformative though expressively indispensable—useful predicates that enable us to express certain claims (like “Everything Putnam said is true”) that would be otherwise inexpressible. So long as a truth theory satisfies Convention T, these things will be expressible, and a trivial list-like definition of reference (“P” refers to x iff x is P) will suffice to generate the T-sentences. As native speakers, we know, without empirical investigation, that “electron” refers to electrons just by having mastered the word “refers” in our language. Our beliefs about electrons could be mistaken, but not our belief that “electron” applies to electrons. In particular, we cannot coherently suppose that “electron” does not refer to electrons because this is but a step away from a formal contradiction—some electrons are not electrons. Deflationists argue that such “thin” concepts and trivial relations cannot bear the explanatory burdens that scientific realists expect of them.

Deflationism is a controversial position. Field, before he endorsed deflationism, argued that Tarski merely reduced truth to a list-like definition of reference, but such a definition is physicalistically unacceptable (Field 1972). Chemical valence was originally defined by a list pairing chemical elements with their valence numbers, but later this definition was unified in terms of the number of outer shell electrons in the element’s atoms. Field argued that reference should be similarly reduced to physical notions. While this seems an implausibly strong requirement, many philosophers think it obvious that the success of action depends on the truth of the actors’ beliefs: John’s success in finding rabbits in the upper field, they argue, depends on his rabbit-beliefs corresponding to the local rabbits (Liston 2005). Deflationists respond that John’s success is explained by there being rabbits there (no need to mention ‘true’), but deflated explanations become strained when the successful agent is not an English thinker. If John is replaced by Jean, a French speaker, the sentences Jean holds true (‘Des lapins habitent le champ supérieur’) must first be translated into sentences we hold true and then disquoted—a strategy known as extended disquotationalism—and it is difficult to see why Jean’s success has anything to do with his sentences translating into ours.

Deflationists reject SR4 and SR5, but this does not mean they cannot believe what our best scientific theories tell us: deflationists can and typically do accept SR3 as well as all the object-level inferences that science uses, including object-level IBE (Leeds 1995, 2007). It means only that deflationists reject the meta-level IBE deployed by realists (§5e)—such inferences must be rejected if truth is not an explanatory notion.

b. Pragmatist Truth Surrogates

Pragmatists question metaphysical realism (SR3): it presupposes a relation between our representations (to which we have access) and a mind-independent world (to which we lack access), and there cannot be such a relation, because mind-independent objects are in principle beyond our cognitive reach. Thus SR3 (and correspondence truth) is either vacuous or unintelligible. For them, word-world relations are between words and objects-as-conceived by us. If we cannot reach out to mind-independent objects, we must bring them into our linguistic and conceptual range.

Pragmatists also tend to supplement Tarski’s understanding of truth, like philosophers in a broadly idealist tradition (including Hume, Kant, the positivists, and Kuhn) who employ truth-surrogates that structure the “world” side of the correspondence relation in some way (impressions, sense data, phenomena, a structured given) that would render the correspondence intelligible. Depending on the kind of idealism adopted, “p is true” might be rendered “p is warrantedly assertible”, “p is derivable from theory Θ”, or “p is accepted in paradigm P”, all of the form “p is E” where E is some epistemic surrogate for “true”. We have already seen (§5d) how realists object to this move: it assigns to the concepts truth and reference the wrong properties (it makes them intra-theoretic rather than trans-theoretic) and thus cannot properly capture key features of practice. More generally, Putnam argues, truth cannot be identified with any epistemic notion E: take any revisable proposition p that satisfies E; we already know that p might not be true, so being E does not amount to being true. For example, that Venus has CO2 in its atmosphere is currently warrantedly assertible, but future investigation could lead us to discover that it is not true. Thus, Putnam thinks, truth is epistemically transcendent: it cannot be captured by any epistemic surrogate (Putnam 1978).

c. Putnam’s Internal Realism

In his SR period, Putnam held that only real word-world correspondences could capture the epistemic transcendence and causal explanatory features of truth. In the late 1970s Putnam came to doubt SR3, reversed his position, and proposed a new program, internal realism (Putnam 1981). IR has negative and positive components.

The main negative component rejects metaphysical realism (SR3) and the associated thesis that truth and reference are word-world correspondences (SR4). The primary argument for this rejection is Putnam’s model-theoretic argument (Merrill 1980; Putnam 1978, 1981). Take our language and total theory of the world. Suppose the intended reference scheme (which correlates our word uses with objects in the world) is that which satisfies all the constraints our best theory imposes. This supposition is problematic because those constraints would fix at best the truth conditions of every sentence of our language; they would not determine a unique assignment of referents for our terms. Proof: Assume there are n individuals in the world W, and our theory T is consistent. Model theory tells us that since T is consistent it has a model M of cardinality n; that is, all the sentences of T will be true-in-M. Now define a 1-1 mapping f from the domain of M, D(M), to the domain of W, D(W), and use f to define a reference relation R* between L(T) (the language of our theory) and objects in D(W) as follows: if x is an object in D(W) and P is a predicate of L(T), then P refers* to x if and only if P refers-in-M to $f^{-1}(x)$. Then any sentence S will be true* (of W) if and only if S is true-in-M. Intuitively, truth* and reference* are not truth and reference but gerrymandered relations that mimic truth-in-M and refers-in-M, where M can be entirely arbitrary, provided it has enough objects in its domain. Unfortunately, anything we do to specify the correct reference scheme for our language and incorporate it into our total theory is subject to this permutation argument. One might object, for example, that a necessary condition for (real) reference is that P refer to x only if x causes P and P is not causally related to the objects it refers* to (Lewis 1984).
But if we add this condition to our theory, then we can redeploy a permutation whereby “x causes* P (in W)” will mimic “$f^{-1}(x)$ causes P (in-M)”; and instead of failing to fix the real reference relation we will be failing to fix the real causal relations. This formal result is the basis of Putnam’s argument that even our best theory must fail to single out its intended model (reference scheme). The permutation move is so global that no matter what trick X one uses to distinguish reference from reference*, the argument will be redeployed so that if X relates to cats in a way that it does not to cats*, then X* (a permutation of X) will relate to cats* in the same sort of way, and there will be no way of singling out whether we’re referring to X or X*.
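
The core of the permutation construction can be checked on a very small case. The sketch below is only an illustration under stated assumptions: the three-element domains, the predicates P and Q, the two sentence forms, and all function names are hypothetical, and the causal-constraint step of the argument is not modeled.

```python
from itertools import permutations

# Toy setup; every name and value here is an illustrative assumption.
D_M = [0, 1, 2]                     # D(M): domain of the model M
D_W = ["cat", "cherry", "mat"]      # D(W): domain of the world W, same size

# refers-in-M: each predicate of L(T) gets an extension inside D(M).
REFERS_IN_M = {"P": {0, 1}, "Q": {1}}

SENTENCES = [("some", "P", "Q"), ("all", "Q", "P"), ("all", "P", "Q")]


def evaluate(sentence, domain, ext):
    """Truth value of 'some A is B' or 'all A are B' under an interpretation."""
    kind, a, b = sentence
    if kind == "some":
        return any(x in ext[a] and x in ext[b] for x in domain)
    if kind == "all":
        return all(x in ext[b] for x in domain if x in ext[a])
    raise ValueError(f"unknown sentence form: {kind!r}")


def refers_star(f):
    """Build reference*: P refers* to x iff P refers-in-M to f^{-1}(x).

    f is a dict encoding a 1-1 mapping from D(M) onto D(W).
    """
    return {pred: {f[m] for m in ext} for pred, ext in REFERS_IN_M.items()}


def freeze(scheme):
    """Hashable form of a reference scheme, used only to count distinct ones."""
    return tuple(sorted((pred, tuple(sorted(ext))) for pred, ext in scheme.items()))


distinct_schemes = set()
for image in permutations(D_W):
    f = dict(zip(D_M, image))       # one 1-1 mapping from D(M) onto D(W)
    star = refers_star(f)
    distinct_schemes.add(freeze(star))
    for s in SENTENCES:
        # truth* (of W) matches truth-in-M, whichever mapping f was chosen
        assert evaluate(s, D_W, star) == evaluate(s, D_M, REFERS_IN_M)

print(len(distinct_schemes))  # 6 different reference* schemes, same truth values
```

Each of the six mappings yields a different assignment of worldly objects to P and Q, yet every sentence keeps the truth value it has in M. In this toy case, then, the truth values of the sentences do not single out one reference scheme.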

The positive component of internal realism replaces SR3 and SR4 with IR3 and IR4:

IR3 We can understand a determinate world only as a world containing objects and properties as it would be described in the ideal theory we would reach at the limit of human inquiry;

IR4 Theories are true, when they are, partly because their concepts correspond to objects and properties that the ideal theory carves out;

and it reinterprets references to truth in SR1, SR2, and SR5 in terms of IR3 and IR4.

IR3 replaces allegedly problematic, inaccessible mind-independent objects with unproblematic, accessible objects that would be produced by the conceptual scheme we would reach in the ideal theory, and IR4 relates our words to the world as it would be carved up according to the ideal theory. When truth, reference, objects, and properties are thus relativized to the ideal theory, then IR1, IR2, and IR5 are just IR counterparts of their SR analogs: we aim to give accounts that would be endorsed in the ideal theory; to accept a theory is to believe it approximates the ideal theory; science (trivially) progresses toward the ideal theory. Putnam believes he can avoid unintelligible correspondences to an inaccessible, God’s-eye view of the world yet still have a concept of truth that is explanatory and epistemically transcendent. While truth-in-the-ideal-limit is an epistemic concept—it is relativized to what humans can know—it transcends any particular epistemic context; so we can have the best reasons to believe that Venus has CO2 in its atmosphere though it may be false (for it may turn out not to be assertible in the ideal theory).

Objects and properties, according to IR3, are as much made as discovered. To many realists, this seems to be an extravagant solution to a non-problem (Field 1982): extravagant to claim we have a hand in making stars or dinosaurs; a non-problem, because many realists think the content of metaphysical realism (SR3) is just that there is a mind-independent world in the sense that stars and dinosaurs exist independently of what humans say, do, or think. The problem is not how to extend our epistemic and semantic grasp to objects separated from us by a metaphysical chasm; it is the more ordinary, scientific problem of how to extend our grasp from nearby middle-sized objects with moderate energies to objects that are very large, very small, very distant from us spatiotemporally, and so forth (Kitcher 2001; Liston 1985). Moreover, realists point out, true-in-the-ideal-theory falls short of true. We know that either string theory is true and the material universe is composed of tiny strings or this is not the case. But it is conceivable that no amount of human inquiry, even taken to the ideal limit, will decide which; so though one disjunct is true, neither may be assertible in the ideal limit. Consequently, internalist truth lacks the properties of truth. (It is noteworthy that Putnam recanted internalist truth in his last writing on these matters (Putnam 2015).)

Rorty is another pragmatist who rejects, in a far more radical manner than Putnam, the fundamental presuppositions of the realist-antirealist debate (Rorty 1980).

9. Law-Antirealism and Entity-Realism

Cartwright (1983) and Hacking (1983) represent this mix of theoretical law antirealism and theoretical entity realism. The kind of account that Cartwright rejects has three main components. First is the facticity view of fundamental physical laws: adequate fundamental laws must be (approximately) true. The basic equations of Newton, Maxwell, Einstein (STR/GTR), quantum mechanics, relativistic quantum mechanics, and so forth, are typical examples of such laws. Second is the covering law (or DN) model of explanation (Hempel 1965, §3c): a correct explanation of a phenomenon or phenomenological law is a sound deduction of the explanandum from fundamental laws together with statements describing, for example, compositional details of the system, boundary and initial conditions, and so forth. The deduction renders the explanandum intelligible by showing it to be a special case of the general laws. Thus, for example, Galileo’s law of free fall is explained as a special case of Newtonian fundamental laws by its derivation from Newton’s gravitational theory plus background conditions close to the earth’s surface. Third is IBE: the success of DN-explanations in rendering large classes of phenomena intelligible can justify our inferring the truth of the covering laws. The fact that Galileo’s law, Kepler’s laws, the ideal gas laws, tidal phenomena, the behavior of macroscopic solids, liquids, and gases all find a deductive home under Newton’s laws provides warrant for belief in the facticity of Newton’s laws.

Cartwright rejects all three components. She begins by challenging the first two components: there is a trade-off between facticity and explanatory power. Newton’s law of gravitation, $F_G = G m_1 m_2 / r_{12}^2$, tells us what the gravitational force between two massive bodies is. Coulomb’s law, $F_C = k q_1 q_2 / r_{12}^2$, tells us what the electrostatic force between two charged bodies is. Each law gives the total force only for bodies where no other forces are acting. But most actual bodies are charged and massive and have other forces acting on them; thus the laws either are not factive (if read literally) or do not cover (if read as subject to the ceteris paribus modifier “provided no other forces are acting”). In physics, we explain by combining the forces: the actual force acting on a charged massive body is $F_A = F_G + F_C$, the vector-sum of the Newton and Coulomb forces, which determines the actual acceleration and path. Cartwright objects that (a) we lack general laws of interaction allowing us to add causal influences in this way, (b) there is no reason to think that we can get super-laws that will be true and cover, (c) in nature there is only the actual cause and resultant trajectory. But if the facticity and explanatory components clash in this way, the third component is in trouble also. Realists cannot appeal to IBE to justify belief in factive fundamental covering laws because good explanations that cover a host of phenomena rarely proceed from true (factive) laws. Consequently, the explanatory success of fundamental laws cannot be cited as evidence for their truth.
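
The textbook procedure that Cartwright is questioning, computing each force from its own law and then adding the vectors, can be sketched as follows. This is a minimal illustration under stated assumptions: two point bodies, SI units, rounded constants, and a hypothetical function name forces_on_body1. It shows the standard calculation only and takes no side on whether the component forces are real.

```python
import math

G = 6.674e-11   # gravitational constant in N m^2 / kg^2 (rounded)
K = 8.988e9     # Coulomb constant in N m^2 / C^2 (rounded)


def forces_on_body1(m1, q1, pos1, m2, q2, pos2):
    """Return (F_G, F_C, F_A) acting on body 1 due to body 2.

    Each force is an (x, y, z) tuple in newtons; positions are in metres.
    """
    sep = [b - a for a, b in zip(pos1, pos2)]    # vector from body 1 to body 2
    r = math.sqrt(sum(c * c for c in sep))       # separation r_12
    unit = [c / r for c in sep]                  # unit vector toward body 2

    # Newton's law: gravity is attractive, so it points toward body 2.
    f_g = tuple(G * m1 * m2 / r**2 * u for u in unit)
    # Coulomb's law: like charges repel, so the sign flips when q1*q2 > 0.
    f_c = tuple(-K * q1 * q2 / r**2 * u for u in unit)
    # The "actual" force is taken to be the vector sum of the two.
    f_a = tuple(g + c for g, c in zip(f_g, f_c))
    return f_g, f_c, f_a


# Two 1 kg bodies, each carrying 1 microcoulomb, placed 1 m apart on the x-axis.
f_g, f_c, f_a = forces_on_body1(1.0, 1e-6, (0, 0, 0), 1.0, 1e-6, (1, 0, 0))
print(f_g[0], f_c[0], f_a[0])
```

In this example the electrostatic repulsion is many orders of magnitude larger than the gravitational attraction, so neither law read on its own gives the force that determines the body's acceleration.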

Cartwright’s own account has three corresponding components. First, fundamental laws are non-factive: they describe idealized objects in abstract mathematical models, not natural systems. In nature there are no purely Newtonian gravitational systems or purely electromagnetic systems. These are mathematical idealizations. Only messy phenomenological laws (describing empirical regularities and fairly directly supported by experiment) truly describe natural systems. Second, we should replace the DN model of explanation with a simulacrum account: explanations confer intelligibility by fitting staged mathematical descriptions of the phenomena to an idealized mathematical model provided by the theory by means of modeling techniques that are generally “rigged” and typically ignore (as negligible) disturbing forces or mathematically incorporate them (often inconsistently). To explain a phenomenon is to fit it in a theory so that we can derive fairly simple analogs of the messy phenomenological laws that are true of it. Intelligibility, not truth, is the goal of theoretical explanation. Third, although we should reject IBE, we should embrace inference to the most likely cause (ILC). Whereas theoretical explanations allow acceptable alternatives and need not be true, causal explanations prohibit acceptable alternatives and require the cause’s existence. ILC, on Cartwright’s view, can justify belief in unobservables that are experimentally detectable as the causes of phenomena. Thus, for example, Perrin’s experiments showed that the most likely cause of Brownian motion was molecular collisions with the Brownian particles; Rutherford’s experiments showed that the most likely cause of the backward scattering of α-particles fired at gold foil was collisions with the nuclei of the gold atoms.

The laws of physics lie, Cartwright claims, and the hope of a true, unified, explanatory theory of physics is either based on a misunderstanding of physics practice or a vestige of 17th century metaphysical hankering for a neatly designed mechanical universe. The practice of physicists, she argues, indicates that we ought to be antirealists about fundamental laws and points instead to a messy, untidy universe that physicists cope with by constructing unified abstract stories (Cartwright 1999). Thus Cartwright is anti-realist about fundamental laws: contrary to realists, they are not (even approximately) true; contrary to van Fraassen, she is not recommending agnosticism—we now know they are non-factive. On the other hand, also contrary to van Fraassen, scientific practice indicates that we should be realists about “unobservable” entities that are the most likely causes of the phenomena we investigate.

Critics complain that Cartwright confuses metaphysics and epistemology: even if we lack general laws of interaction, it does not follow that there are none. Cartwright replies that the unifying ideal of such super-laws is merely a dogma. However, practice seems Janus-faced here: the history of modern physics is one of disunity leading to unity leading to disunity, and so forth. Each time distinct fundamental laws resist combination, a new unifying theory emerges that combines them: electrodynamics and eventually Einstein’s theories succeeded in combining Newton and Coulomb forces. The quest for unity is a powerful force guiding progress in physics, and, while the ideal of a unified “theory of everything” continues to elude us, Cartwright’s examples hardly show that it is a vain quest. Moreover, Cartwright arguably conflates different kinds of laws: in classical settings, the fundamental laws are Newton’s laws of motion, and his F = ma is the super-law that combines Newton’s gravitational and Coulomb’s electrostatic laws (Wilson 1998).

Cartwright’s distinction between “theoretical” and “causal” explanations has also been criticized. Nothing about successful theoretical explanations, she claims, requires their truth, whereas successful causal explanations require the existence of the cause. To many this move seems fallacious—if “successful” means correct, then the truth of the former follows as much as the existence of the latter; if “successful” does not mean correct, then neither follows. Presumably, in the IBE context, “successful” does not entail truth, but similarly in the ILC context, “successful” does not entail existence: the most likely cause could turn out not to exist (for example, caloric flow or phlogiston escape) just as the best explanation could turn out to be false (caloric or phlogiston theory).

10. NOA: The Natural Ontological Attitude

Fine (1996, 1986) presented NOA, an influential response to the debates amounting to a complete rejection of their presuppositions. We generally trust what our senses tell us and take our everyday beliefs as true. We should similarly trust what scientists tell us: they can check what is going on behind the appearances using instruments that extend our senses and methods that extend ordinary methods. This is NOA: we should accept the certified results of science on a par with homely truths. Both realists and antirealists accept this core position, but each adds an unnecessary and flawed philosophical interpretation to it.

Realists add to the core position the redundant word “REALLY”: “electrons REALLY exist”. SR realists add substantive word-world correspondences, a policy that serves no useful purpose. The only correct notion of correspondence is the disquotational one: “P” refers to (or is true of) x if and only if x is P. Realist appeals to IBE are problematic for two reasons. First, they beg the question against antirealists, who ab initio question any connection between explanatory success and approximate truth. Moreover, there is no inferential principle that realists could employ and antirealists would accept. Straight induction will not work: we can induce from the observed to the unobserved, because the unobserved can be later observed to check the induction; but we cannot induce to unobservables, because there can be no such independent check (according to the antirealist). Second, IBE does not work without some logical connection between success and (approximate) truth. But the inference from success to (approximate) truth is either invalid if read as a deductive move (because many successful theories turned out to be false (§7b)), weak if read as an inductive move (because nearly all successful past theories turned out to be false), or circular if read as a primitive IBE move. The antirealist, by contrast, has a ready answer: if a scientific theory or method worked well in the past, tinker with it, and try it again. Finally, Fine argues, contrary to what realists often claim, realism blocks rather than promotes scientific progress. In the Einstein-Bohr methodological debates about the completeness of quantum mechanics, the realist Einstein saw QM as a degenerate theory, while the instrumentalist Bohr saw QM as a progressive theory. Subsequent history favored Bohr over Einstein.

However, antirealism is no better off. Empiricists attempt to set limits: we should believe only what science tells us about observables. Fine criticizes these limits for reasons given in §5a and §6b: the observable-unobservable distinction cannot be drawn in a manner that would motivate skepticism or agnosticism about unobservables but not about observables. We have standard ways of cross-checking to ensure that what we are “seeing” with an instrument or calculating with a theory is reliable even if not “directly” observable. Fine concludes that the checks that science itself uses should be the ones we appeal to when in doubt. Pragmatists and constructivists react to the inaccessible, unintelligible word-world correspondences posited by realists by pulling back and trying to reformulate the correspondences in terms of some accessible surrogate for truth and reference (§8). Fine reiterates the criticisms of §5d and §8: truth has properties that any epistemic truth-surrogate lacks.

Both realists and antirealists view science as a practice in need of a philosophical interpretation. In fact, science is a self-interpreting practice that needs no philosophical interpretation. It has local aims and goals, which are reconfigured as science progresses. Asking about the (global) aim of science is like asking about the meaning of life: it has no answer and needs none. NOA takes science on its own terms, a practice whose history and methods are rooted in, and are extensions of, everyday thinking (Miller 1987). NOA accepts ordinary scientific practices but rejects apriorist philosophical ideas like the realist’s God’s-Eye view and antirealist’s truth-surrogates.

Critics see NOA as a flight from, rather than a response to, the scientific realism question (Musgrave 1989). The core position, they argue, is difficult to characterize in a philosophically neutral manner that does not invite a natural line of philosophical questioning. Once one accepts that science delivers truths and explanations, it is natural to ask what that means, and realist and antirealist replies will naturally ensue—as they always have, since these interpretations are as old as philosophy itself. Moreover, it may be difficult to characterize NOA non-tendentiously: ground-level IBE and correspondence truth, for example, are arguably rooted in common sense and ought to be included in NOA; but then any antirealism that rejects them is incompatible with NOA.

11. The 21st Century Debates

Between 1990 and 2016 new versions of the debates, many focusing on Laudan’s PI (§7b), have emerged.

a. Structuralism

Structural Realism claims that: science aims to provide a literally true account only of the structure of the world (StR1); to accept a theory is to believe it approximates such an account (StR2); the world has a determinate and mind-independent structure (StR3); theories are literally true only if they correctly represent that structure (StR4); and the progress of science asymptotically approaches a correct representation of the world’s structure (StR5). (Here we replace each SR thesis in §5 with an analogous StR thesis.)

Structuralism comes from philosophy of mathematics. Consider the abstract structure <ω, o, ξ>, where ω is an infinite sequence of objects, o an initial object, and ξ a relation that well-orders the sequence. This structure is distinct from its many exemplifications: for example, the natural numbers ordered under successor, <0, 1, 2, 3, …>; the even natural numbers in their natural order, <0, 2, 4, 6, …>; and so forth. We can similarly consider the offices of the U.S. President, Vice-President, Speaker of the House, and so forth, as a collection of objects defined by the structure of relations given in the U.S. Constitution, distinct from its particular exemplars at a given time: Bush, Cheney, Pelosi (January 2007), and Obama, Biden, Boehner (January 2011). Similarly, structuralists suggest, the structure of relations that obtain between scientific objects is distinct from the nature of those objects themselves. The structure of relations is typically expressed (at least in physics) by mathematical equations of the theory (Frigg and Votsis 2011). For example, Hooke’s law, F = -ks, describes a structure, the set of all pairs of reals <x, y> in ℝ² such that y = -kx, which is distinct from any of its concrete exemplifications like the direct proportionality between the restoring force F for a stretched spring and its elongation s. If the world is a structured collection of objects (StR3), then StR1 says that science aims to describe only the structure of the objects but not their intrinsic natures.
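The contrast between a structure and its exemplifications can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions: the names `prefix` and `same_structure_on_prefix`, and the choice to represent an exemplification as a pair consisting of an initial object and a successor function, are hypothetical and are not drawn from the structuralist literature. A finite check on an initial segment can suggest, but cannot prove, that two infinite sequences share a structure.

```python
from typing import Callable, List, Tuple

# Hypothetical representation (an assumption of this sketch):
# an exemplification is a pair (initial object, successor map).
Exemplar = Tuple[int, Callable[[int], int]]


def prefix(exemplar: Exemplar, length: int) -> List[int]:
    """Return the first `length` objects generated from the initial object."""
    initial, successor = exemplar
    items = [initial]
    while len(items) < length:
        items.append(successor(items[-1]))
    return items


def same_structure_on_prefix(a: Exemplar, b: Exemplar, length: int) -> bool:
    """Check, on a finite prefix only, that pairing the k-th object of `a`
    with the k-th object of `b` is one-to-one and preserves both the
    initial object and the successor relation."""
    xs, ys = prefix(a, length), prefix(b, length)
    # The pairing must be one-to-one: no object may recur in either prefix.
    if len(set(xs)) != length or len(set(ys)) != length:
        return False
    pairing = dict(zip(xs, ys))
    initial_a, successor_a = a
    initial_b, successor_b = b
    if pairing[initial_a] != initial_b:
        return False
    # Successor in `a` must be mapped onto successor in `b`.
    return all(
        pairing[successor_a(x)] == successor_b(pairing[x]) for x in xs[:-1]
    )


naturals: Exemplar = (0, lambda n: n + 1)       # 0, 1, 2, 3, ...
evens: Exemplar = (0, lambda n: n + 2)          # 0, 2, 4, 6, ...
cycle: Exemplar = (0, lambda n: (n + 1) % 3)    # 0, 1, 2, 0, ... (repeats)
```

On this sketch, `same_structure_on_prefix(naturals, evens, 50)` holds, since the two sequences differ in their objects but agree in the pattern of relations, whereas the comparison with `cycle` fails because a repeating sequence does not exemplify the same structure.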

Structuralism is not new: precursors include Poincaré and Duhem in the 19th century (§2c), Russell (1927), Ramseyfied-theory versions of logical positivism (§3b), Quine (§4), and Maxwell (1970). Russell claimed that we can directly know (by acquaintance) only our percepts, but we can indirectly know (by structural description) the mind-independent objects that give rise to them. This approach presupposes a problematic distinction between acquaintance and description and a problematic isomorphism between the percept and causal-entity structures. Worse, it runs afoul of a devastating critique by the mathematician M.H.A. Newman (1928), closely related to Putnam’s model-theoretic argument (§8c), and never satisfactorily answered by Russell. Newman argues that a fixed structure of percepts can be mapped 1-1 onto a host of different causal-entity structures provided there are enough objects in the latter; thus the structural knowledge that science allegedly delivers is trivial—it merely amounts to a claim that the world has a certain cardinality, the size of the percept-structure. (The Ramseyfied-theory approach encounters similar problems (Psillos 2001).)
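Newman’s point can be made concrete with a small Python sketch. It is a deliberately simplified illustration, not a reconstruction of Newman’s own formulation: the toy percepts, the toy relation, the list `world`, and the function name `impose_structure` are all assumptions introduced here. The sketch shows that any domain with enough distinct objects can be made to carry a given structure, whatever those objects are and however they are paired with the percepts.

```python
from itertools import permutations
from typing import Hashable, List, Optional, Set, Tuple

Relation = Set[Tuple[Hashable, Hashable]]


def impose_structure(
    percepts: List[Hashable],
    relation: Relation,
    world_objects: List[Hashable],
) -> Optional[Relation]:
    """Pair each percept with a distinct world object and carry the percept
    relation across the pairing. Returns None when the world has too few
    distinct objects, which is the only way the construction can fail."""
    distinct = list(dict.fromkeys(world_objects))  # drop repeats, keep order
    if len(distinct) < len(percepts):
        return None
    mapping = dict(zip(percepts, distinct))
    return {(mapping[a], mapping[b]) for a, b in relation}


# Toy data (hypothetical): three percepts ordered in a chain.
percepts = ["p1", "p2", "p3"]
percept_relation: Relation = {("p1", "p2"), ("p2", "p3")}

# An arbitrary three-object "world"; the nature of the objects is irrelevant.
world = ["electron", "sock", "prime number"]

# Every one-to-one pairing yields a relation on the world with the same
# structure as the percept relation.
carried = [
    impose_structure(percepts, percept_relation, list(order))
    for order in permutations(world)
]
```

Each of the six pairings succeeds, and the construction fails only when the world contains fewer than three distinct objects. This is the sense in which a purely structural claim, taken by itself, constrains only the cardinality of the world.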

Contemporary proponents, beginning with Worrall (1989), hold that structuralism steers a middle path between standard versions of scientific realism and antirealism. StR, they argue, provides the best of both worlds by acknowledging and reconciling the pull of both pessimistic and optimistic inductions on the history of science. Pessimistic inductions (PI) argue against SR (§7b): the ontology of our current best theories (quarks, for example) will likely be discarded just like that of past best theories (for example, ether). Optimistic inductions (like the NMA) argue for SR (§5d): because past successful theories must have been approximately true, current more successful theories must be closer to the truth. Structuralists respond that, though ontologies come and go, our grip on the underlying structure of the world steadily improves. Underlying ontology need not be (and is not) preserved in theory change, but the mathematical structure is both preserved and improved upon: Fresnel’s correct claims about the structure of light (as a wave phenomenon) were retained in later theories, while his incorrect claims about the nature of light (as a mechanical vibration in a mechanical medium, the ether) were later discarded. Structuralists can also resist the argument from empirically equivalent theories (§6c)—to the extent that the theories are structurally equivalent they would capture the same structural facts, which is all a theory needs to capture—and do so without embracing a particular realist ontology occupying the nodes of the structure.

But can the needed distinction between structure and nature be drawn and can structures be rendered intelligible without the ontology that gives them flesh (Psillos 1995, 1999, 2001)? Two possible StR answers are suggested.

First, there is epistemological structural realism (EStR), endorsed by Poincaré, Worrall, and logical positivists in the Ramseyfied-theory tradition: electrons are objects as Obama is an object, but, unlike Obama, science can never discover anything about electrons’ natures other than their structural relations. For EStR to be a realist position, it will not suffice to say: we can know only observable objects (like Obama) and their (observable) structural relations; we must be agnostic about unobservable objects and their relations. This is merely a CE version of structuralism, as van Fraassen points out (2006, 2008), and inherits many problems of CE (§6). To be a realist position, EStR has to presuppose that, in addition to the structure of the phenomena whose objects are knowable, there is a mind-independent, knowable “underlying” structure, whose objects are unknowable. But now one must distinguish Obama from electrons so that Obama’s nature is knowable but electrons’ natures are not; the problematic observable-unobservable distinction (§§5a, 6b) has returned.

Critics argue that there is no sharp, epistemologically significant distinction between form (structure) and content (nature) of the kind needed for EStR. First, our knowledge of the nature of electrons is bound up with our knowledge of their structural relations so that we come to know them together: saying what an electron is includes saying how it is structured; our knowledge of its nature forms a continuum with our knowledge of its structure. Second, EStR requires a variant of the NMA (restricted to retention of structure) to uphold StR5. But this requires that, in progressive theory-change, structure (retained and improved) is what explains increased empirical success. But structure alone (without auxiliary hypotheses describing non-structural features of the world) never suffices to derive new empirical content. Finally, critics object to structuralists’ interpretations of the history. Worrall, for example, argues that Fresnel’s structural claims about light (the mathematics) were retained, but not his commitments to a mechanical ether; his critics question whether Fresnel could have been “just” right about the structure of light-propagation and completely wrong about the nature of light.

Second, there is ontological structural realism (OStR), advocated by Ladyman and others (Ladyman and Ross 2007) and similar to Quine’s realism (§4). OStR bites the bullet: we can know only structure because only structure exists. Obama is no more an object than electrons are; each is itself a structure; more strongly, everything is structure. Some of the attraction of this strange metaphysical position comes from its promise to handle problems in quantum mechanics that are orthogonal to our debates. Its proponents argue that it can account, for example, for apparently indistinguishable particles in entangled quantum states. In the context of our debates, OStR is supposed to avoid the epistemological problems of EStR: qua objects understood as structural nodes, electrons are in principle no more unknowable (or knowable) than Obama or ordinary physical objects. However, it runs into its own metaphysical problems, since it threatens to lose touch with concrete reality altogether. Even if God created nothing concrete, it would still be a structural (mathematical) fact that neutrons and protons, if they exist, form an isospin doublet related by SU(2) symmetry. For this to be a concrete (physical) fact, God would have had to create some objects (nucleons with symmetrically related isospin states or some more fundamental objects that compose nucleons) to occupy the neutron- and proton-nodes of the SU(2) group-structure. Even if those objects had only structural properties, they would have to have one non-structural property: existence (van Fraassen 2006, 2008). So, not everything is structure; there is a distinction between empty mathematical structures and realized physical structures; OStR cannot capture that distinction.

b. Stanford’s New Induction

Kyle Stanford’s new induction provides the latest historical challenge to SR (Stanford 2001, 2006, 2015). Following Duhem (1991), Stanford poses what he calls the Problem of Unconceived Alternatives (PUA): for any fundamental domain of inquiry at any given time t there are alternative scientific hypotheses not entertained at t but which are consistent with (and even equally confirmed by) all the actual evidence available at t. PUA, were it true, would seem to create a serious underdetermination problem for SR: we opt for our current best confirmed theory, but there is a distinct alternative that is equally supported by all the evidence we possess, but which we currently lack the imagination to think of. (Two things about PUA are worth noting. First, it concerns the actual evidence we have at a time; it is not that the theory and the alternatives are underdetermined by all possible evidence; the underdetermination may be transient; future evidence may decide that the theory we have selected is not correct. Second, the unconceived alternative hypotheses are ordinary scientific hypotheses, not recherché philosophical hypotheses involving brains-in-vats, and so forth.)

Stanford argues that PUA is our general predicament. His New Induction on the history of science, he argues, shows that our epistemic situation is one of recurrent, transient underdetermination. Virtually all T-T* transitions in the past were affected by PUA: the earlier T-theorists selected T as the best supported theory of the available alternatives; they did not conceive of T* as an alternative; T* was conceived only later yet T* is typically better supported than T. At any given time, we could only conceive a limited set of hypotheses that were confirmed by all the evidence then available, yet subsequent inquiry revealed distinct alternatives that turned out to be equally or better confirmed by that evidence. We thus have good inductive reasons to believe we are now in the same predicament—our current best theories will be replaced by incompatible and currently unconceived successors that account for all the currently available evidence.

Stanford proposes a new instrumentalism. Like van Fraassen’s (§6), his instrumentalism is epistemic: it distinguishes claims we ought literally to believe from claims we ought only to accept as instrumentally reliable and argues that instrumental acceptance suffices to account for scientific practice. Unlike van Fraassen, Stanford bases his distinction, not on an observable-unobservable dichotomy, but on whether our access to a domain is based primarily on eliminative inference subject to PUA challenges: if it is, then we should adopt an instrumentalist stance; if it is not (as, for example, our access to the common sense world is not), then we may literally believe.

c. Selective Realism

Many debates in the early 21st century focus on historical inductions, especially on what representative basis would warrant an inductive extrapolation. Putnam and Boyd were aware that care was needed with the NMA and sometimes restricted their claims to mature theories so that we discount ab initio some theories on Laudan’s troublesome list—like the theory of crystalline spheres or of humoral medicine. Mature theories (with the credentials to warrant optimistic induction) must have passed a “take-off” point: there must be background beliefs that indicate their application boundaries and guide their theoretical development; their successes must be supported by converging but independent lines of inquiry and so forth. Moreover, many realists argue, a theory is suitable for optimistic induction only if it has yielded novel predictions; otherwise it could just have been rigged to fit the phenomena. Roughly, a prediction P (whether known or unexpected) is novel with respect to a theory T if no P-information is needed for the construction of T and no other available theory predicts P. Thus, for example, Newton’s prediction of tidal phenomena was novel because those phenomena were not used in (and not needed for) Newton’s construction of his theory and no other theory predicted the tides (Leplin 1997; Psillos 1999). Nevertheless, even thus restricted, the induction will not meet Laudan’s challenge, for that challenge includes an undermining argument (Stanford 2003a): many discarded yet empirically successful theories were mature and yielded novel predictions—for example, Newton’s theory, caloric theory, and Fresnel’s theory of light—so, if our current theories are correct, these theories were false.

More recent responses to these counterexamples attempt to steer a middle course between optimistic inductions like Putnam’s NMA (§5d) and pessimistic inductions like Laudan’s and Stanford’s (§§7b, 11b). These responses typically have a two-part normal form: (1) they concede to the pessimists that some parts of past empirically successful theories are discarded, yet (2) they argue with the optimists that some parts of past successful theories are retained, improved upon, and explain the successes of the old theories. Advocates of this “divide and conquer” strategy (Psillos 1999) try to have their cake and eat it too.

Variants of the strategy depend on how one separates those “good” features of past theories that are preserved, that explain empirical success, and that warrant optimistic induction from those “bad” features that are discarded. Structuralists, we saw, argue that structure (form), but not nature (content), is what is both preserved and responsible for success. Kitcher (1993) distinguishes a theory’s working and presuppositional posits. The term “light-wave” in Fresnel’s usage referred to light, no matter what its constitution is, in some contexts and to what satisfies the description “the oscillations of ethereal molecules” in other contexts. In the former contexts, “light-wave” referred to high frequency electromagnetic waves, a mode of reference that was doing explanatory and inferential work and was retained in later theories. In the latter contexts, “light-wave” referred to the ether (that is, nothing), a mode of reference that was presupposed yet empty, idle, and not retained in later theories.

Other variants rely on the causal theory of reference. Hardin and Rosenberg (1982) exploit the idea that one can successfully refer to X (by being suitably causally linked to X) while having (largely) false beliefs about X. Thus, Fresnel and Maxwell were referring to the electromagnetic field when they used the term “ether”, and, though they had many false beliefs about it (that it was a mechanical medium, for example), the electromagnetic field was causally responsible for their theories’ success and was retained in later theories. A big problem with this response is that referential continuity does not suffice for partial or approximate truth (Laudan 1984; Psillos 1999). Psillos (1999) employs causal descriptivism to deal with this problem: “ether” in 19th century theories refers to the electromagnetic field, since that (and only that) object has the properties (medium of light-propagation that is the repository of energy and transmits it locally) that are causally responsible for the relations between measurements we get when we perform optical experiments. By contrast, “phlogiston” does not refer since nothing has the properties that the phlogiston theorists mistakenly believed to be responsible for the body of information they had about oxidation of metals, and so forth. During theory change, the causal-theoretical descriptions of some terms are retained and thereby their references also; these are the essential parts of the theory that contribute to its success; but this is consistent with less central parts being completely wrong.

The latest twist to these divide and conquer strategies is Chakravartty’s doctrine of semirealism (Chakravartty 1998, 2007). Taking his cue from Hacking-Cartwright (§9), Chakravartty distinguishes detection and auxiliary properties. The former are causal properties of objects (and the structure of real relations between them) that are well-confirmed by experimental manipulation because they underwrite the causal interactions we and our instruments exploit in experimental set-ups; the latter are merely theoretical and inferential aids. The former are retained in later theories; the latter are not. Past theories that were on the right track were so because they mathematically coded in systematic ways the detection properties (as opposed to the idle auxiliary properties).

Any of these strategies must meet two further challenges, emphasized in (Stanford 2003a, 2003b). First, they must answer the undermining challenge (above) in a way that is not ad hoc, question-begging, or transparently Whiggish. Simply arguing (with Hardin and Rosenberg) for preservation of reference via preservation of causal role is too easy: do Aristotle’s natural place, Newton’s gravitational action, and Einstein’s space-time curvature all play the same causal role in explaining free-fall phenomena? And if we tighten the account by claiming that continuity requires retention of core causal descriptions (Psillos) or detection property clusters (Chakravartty), are we engaged in a self-serving enterprise? Are we using our own best theories to determine the core causal properties/descriptions and then “reading” those back into the past discarded theories?

Second, they must respond to the trust argument. Divide and conquer strategies argue that successful past theories were right about some things but wrong about others. But then we should expect our own theories to be right about some things and wrong about others. Though perhaps an advance, this does not provide us with a good reason to trust any particular part of our own theories, especially any particular assessment we make (from our vantage point) of the features of a past discarded theory that were responsible for its empirical success. We judge that X-s in a past theory were working posits (Kitcher), essentially contributing causes of success (Psillos), detection properties (Chakravartty), while Y-s in that theory were merely presuppositional posits, idle, or auxiliary properties. But the past theorists were generally unable to make these discriminations, so why do we think we can now make them in a reliable manner? Stanford argues that realists can avoid this problem only if they can provide prospectively applicable criteria of selective confirmation, that is, criteria that past theorists could have used to distinguish the good from the bad in advance of future developments and that we could now use. But they did not have such criteria, nor do we.

12. References and Further Reading

  • Boyd, R. (1973), “Realism, Underdetermination and the Causal Theory of Evidence”, Nous 7, 1-12.
  • Boyd, R. (1983), “On the Current Status of the Issue of Scientific Realism”, Erkenntnis, 19, 45–90.
  • Carnap, R. (1936), “Testability and Meaning”, Philosophy of Science 3, 419-471.
  • Carnap, R. (1937), “Testability and Meaning–Continued”, Philosophy of Science 4, 1-40.
  • Carnap, R. (1939), “Foundations of Logic and Mathematics”, International Encyclopedia of Unified Science 1(3), Chicago: The University of Chicago Press.
  • Carnap, R. (1950), “Empiricism, Semantics and Ontology”, Revue Internationale de Philosophie 4, 20-40.
  • Carnap, R. (1956), “The Methodological Character of Theoretical Concepts”, in H. Feigl and M. Scriven (eds), Minnesota Studies in the Philosophy of Science I, Minneapolis: University of Minnesota Press.
  • Cartwright, N. (1983), How the Laws of Physics Lie. Oxford: Clarendon Press.
  • Cartwright, N. (1999), The Dappled World. Cambridge: Cambridge University Press.
  • Chakravartty, A. (1998), “Semirealism”, Studies in the History and Philosophy of Science 29 (3), 391-408.
  • Chakravartty, A. (2007), A Metaphysics for Scientific Realism: Knowing the Unobservable. Cambridge: Cambridge University Press.
  • Churchland, P. (1985), “The Ontological Status of Observables: In Praise of the Superempirical Virtues”, in Churchland and Hooker 1985.
  • Churchland, P. and C. Hooker (eds) (1985), Images of Science: Essays on Realism and Empiricism, (with a reply from Bas van Fraassen). Chicago: University of Chicago Press.
  • Duhem, P. (1991/1954/1906), The Aim and Structure of Physical Theory, trans. P. Wiener, intro. Jules Vuillemin, Princeton: Princeton University Press.
  • Field, H. (1972), “Tarski’s Theory of Truth”, Journal of Philosophy 69 (13), 347-375.
  • Field, H. (1982), “Realism and Relativism”, Journal of Philosophy 79 (10), 553-567.
  • Fine, A. (1996/1986), The Shaky Game. Chicago: University of Chicago Press.
  • Friedman, M. (1982), “Review of The Scientific Image”, Journal of Philosophy 79 (5), 274-283.
  • Friedman, M. (1999), Reconsidering Logical Positivism. Cambridge: Cambridge University Press.
  • Frigg, R. and I. Votsis. (2011), “Everything You Always Wanted to Know about Structuralism but Were Afraid to Ask”, European Journal for the Philosophy of Science 1, 227-276.
  • Hacking, I. (1983), Representing and Intervening. Cambridge: Cambridge University Press.
  • Hardin, C. and A. Rosenberg. (1982), “In Defense of Convergent Realism”, Philosophy of Science 49, 604-615.
  • Harman, G. (1965), “The Inference to the Best Explanation”, The Philosophical Review 74, 88–95.
  • Hempel, C. G. (1965), Aspects of Scientific Explanation. New York: Free Press.
  • Hertz, H. (1956), The Principles of Mechanics. New York: Dover.
  • Horwich, P. (1990), Truth. Oxford: Blackwell.
  • Kitcher, P. (1993), The Advancement of Science. Oxford: Oxford University Press.
  • Kitcher, P. (2001), “Real Realism: The Galilean Strategy”, The Philosophical Review 110 (2), 151-197.
  • Kuhn, T.S. (1970/1962), The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuhn, T.S. (1977/1974), The Essential Tension. Chicago: University of Chicago Press.
  • Kukla, A. (1998), Studies in Scientific Realism. Oxford: Oxford University Press.
  • Ladyman, J. and D. Ross. (2007), Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
  • Laudan, L. (1981), “A Confutation of Convergent Realism”, Philosophy of Science, 48, 19–48.
  • Laudan, L. (1984), “Realism without the Real”, Philosophy of Science, 51, 156-162.
  • Laudan, L. and J. Leplin. (1991), “Empirical Equivalence and Underdetermination”, Journal of Philosophy 88 (9), 449-472.
  • Leeds, S. (1995), “Truth, Correspondence, and Success”, Philosophical Studies 79 (1), 1-36.
  • Leeds, S. (2007), “Correspondence Truth and Scientific Realism”, Synthese 159, 1–21.
  • Leplin, J. (1997), A Novel Defence of Scientific Realism. Oxford: Oxford University Press.
  • Lewis, D. (1970), “How to Define Theoretical Terms”, Journal of Philosophy 67, 427-446.
  • Lewis, D. (1984), “Putnam’s Paradox”, Australasian Journal of Philosophy 62, 221-236.
  • Lipton, P. (2004/1991), Inference to the Best Explanation. London: Routledge.
  • Liston, M. (1985), “Is a God’s-Eye-View an Ideal Theory?”, Pacific Philosophical Quarterly 66.3-4, 355-376.
  • Liston, M. (2005), “Does ‘Rabbit’ refer to Rabbits?”, European Journal of Analytic Philosophy 1, 39-56.
  • Mach, E. (1893), The Science of Mechanics, trans. T. J. McCormack, 6th edition, La Salle: Open Court.
  • Magnus, P.D. and C. Callender. (2004), “Realist Ennui and the Base Rate Fallacy”, Philosophy of Science 71, 320–338.
  • Maxwell, G. (1962), “On the Ontological Status of Theoretical Entities”, in H. Feigl and G. Maxwell (eds.), Minnesota Studies in the Philosophy of Science III, Minneapolis: University of Minnesota Press.
  • Maxwell, G. (1970), “Structural Realism and the Meaning of Theoretical Terms”, in S. Winokur and M. Radner (eds.), Minnesota Studies in the Philosophy of Science IV, Minneapolis: University of Minnesota Press.
  • McMullin, E. (1991), “Rationality and Theory Change in Science”, in P. Horwich (ed.), Thomas Kuhn and the Nature of Science. Cambridge: MIT Press.
  • Merrill, G. H. (1980), “The Model-Theoretic Argument Against Realism”, Philosophy of Science 47, 69-81.
  • Miller, D. (1974), “Popper’s Qualitative Theory of Verisimilitude”, British Journal for the Philosophy of Science 25, 166–177.
  • Miller, R. (1987), Fact and Method. Princeton: Princeton University Press.
  • Musgrave, A. (1985), “Realism vs Constructive Empiricism”, in Churchland and Hooker 1985.
  • Musgrave, A. (1989), “Noa’s Ark–Fine for Realism”, Philosophical Quarterly 39, 383–398.
  • Newman, M. H. A. (1928), “Mr. Russell’s ‘Causal Theory of Perception’”, Mind 37, 137-148.
  • Niiniluoto, I. (1987), Truthlikeness. Dordrecht: Reidel.
  • Poincaré, H. (1913), The Foundations of Science. New York: The Science Press.
  • Psillos, S. (1995), “Is Structural Realism the Best of Both Worlds?”, Dialectica 49, 15-46.
  • Psillos, S. (1999), Scientific Realism: How Science Tracks Truth. London: Routledge.
  • Psillos, S. (2001), “Is Structural Realism Possible?”, Philosophy of Science 68, S13–S24.
  • Putnam, H. (1962), “What Theories Are Not”, in Putnam 1975c.
  • Putnam, H. (1975a), “Explanation and Reference”, in Putnam 1975d.
  • Putnam, H. (1975b), “The Meaning of ‘Meaning’”, in Putnam 1975d.
  • Putnam, H. (1975c), Philosophical Papers 1: Mathematics, Matter and Method. Cambridge: Cambridge University Press.
  • Putnam, H. (1975d), Philosophical Papers 2: Mind, Language and Reality. Cambridge: Cambridge University Press.
  • Putnam, H. (1978), Meaning and the Moral Sciences. London: Routledge.
  • Putnam, H. (1981), Reason, Truth and History. Cambridge: Cambridge University Press.
  • Putnam, H. (2015), “Naturalism, Realism, and Normativity”, Journal of the American Philosophical Association 1(2), 312-328.
  • Quine, W.V. (1955), “Posits and Reality”, in W. V. Quine, The Ways of Paradox and Other Essays. Cambridge: Harvard University Press (1976), 246-254.
  • Quine, W.V. (1969), “Epistemology Naturalized”, in W. V. Quine, Ontological Relativity and Other Essays. New York: Columbia University Press (1969): 69-90.
  • Rorty, R. (1980), Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
  • Russell, B. (1927), The Analysis of Matter. London: Routledge, Kegan-Paul.
  • Stanford, P. K. (2001), “Refusing the Devil’s Bargain: What Kind of Underdetermination Should We Take Seriously?”, Philosophy of Science 68 (3), S1-S12.
  • Stanford, P.K. (2003a), “Pyrrhic Victories for Scientific Realism”, Journal of Philosophy 100 (11), 553-572.
  • Stanford, P.K. (2003b), “No Refuge for Realism: Selective Confirmation and the History of Science”, Philosophy of Science 70, 917-925.
  • Stanford, P.K. (2006), Exceeding our Grasp. Oxford: Oxford University Press.
  • Stanford, P.K. (2015), “‘Atoms Exist’ is Probably True, and Other Facts That Should Not Comfort Scientific Realists”, Journal of Philosophy 112 (8), 397-416.
  • van Fraassen, B. (1980), The Scientific Image. Oxford: Clarendon Press.
  • van Fraassen, B. (2006), “Structure: its Shadow and Substance”, British Journal for Philosophy of Science 57, 275-307.
  • van Fraassen, B. (2008), Scientific Representation. Oxford: Clarendon Press.
  • Wilson, M. (1982), “Predicate Meets Property”, Philosophical Review 91(4), 549-589.
  • Wilson, M. (1985), “What can Theory Tell us about Observation?”, in Churchland and Hooker 1985.
  • Wilson, M. (1998), “Mechanics, Classical”, in Edward Craig (ed.), The Routledge Encyclopedia of Philosophy Vol. 6, 251-259, London: Routledge.
  • Wilson, M. (2006), Wandering Significance. Oxford: Oxford University Press.
  • Worrall, J. (1989), “Structural Realism: The Best of Both Worlds?”, Dialectica 43, 99–124.

 

Author Information

Michael Liston
Email: mnliston@uwm.edu
University of Wisconsin-Milwaukee
U. S. A.

Stoicism

Stoicism originated as a Hellenistic philosophy, founded in Athens by Zeno of Citium (modern day Cyprus), c. 300 B.C.E. It was influenced by Socrates and the Cynics, and it engaged in vigorous debates with the Skeptics, the Academics, and the Epicureans. The name comes from the Stoa Poikile, or painted porch, an open market in Athens where the original Stoics used to meet and teach philosophy. Stoicism moved to Rome where it flourished during the period of the Empire, alternately being persecuted by Emperors who disliked it (for example, Vespasian and Domitian) and openly embraced by Emperors who attempted to live by it (most prominently Marcus Aurelius). It influenced Christianity, as well as a number of major philosophical figures throughout the ages (for example, Thomas More, Descartes, Spinoza), and in the early 21st century saw a revival as a practical philosophy associated with Cognitive Behavioral Therapy and similar approaches. Stoicism is a type of eudaimonic virtue ethics, asserting that the practice of virtue is both necessary and sufficient to achieve happiness (in the eudaimonic sense). However, the Stoics also recognized the existence of “indifferents” (to eudaimonia) that could nevertheless be preferred (for example, health, wealth, education) or dispreferred (for example, sickness, poverty, ignorance), because they had (respectively, positive or negative) planning value with respect to the ability to practice virtue. Stoicism was very much a philosophy meant to be applied to everyday living, focused on ethics (understood as the study of how to live one’s life), which was in turn informed by what the Stoics called “physics” (nowadays, a combination of natural science and metaphysics) and what they called “logic” (a combination of modern logic, epistemology, philosophy of language, and cognitive science).

Table of Contents

  1. Historical Background
    1. Philosophical Antecedents
    2. Greek Stoicism
    3. Roman Stoicism
    4. Debates with Other Hellenistic Schools
  2. The First Two Topoi
    1. “Logic”
    2. “Physics”
  3. The Third Topos: Ethics
  4. Apatheia and the Stoic Treatment of Emotions
  5. Stoicism after the Hellenistic Era
  6. Contemporary Stoicism
  7. Glossary
  8. References and Further Readings

1. Historical Background

Classically, scholars recognize three major phases of ancient Stoicism (Sedley 2003): the early Stoa, from Zeno of Citium (the founder of the school, c. 300 B.C.E.) to the third head of the school, Chrysippus; the middle Stoa, including Panaetius and Posidonius (late II and I century B.C.E.); and the Roman Imperial period, or late Stoa, with Seneca, Musonius Rufus, Epictetus and Marcus Aurelius (I through II century C.E.). Of course, Stoicism itself originated as a modification from previous schools of thought (Schofield 2003), and its influence extended well beyond the formal closing of the ancient philosophical schools by the Byzantine Emperor Justinian I in 529 C.E. (Verbeke 1983; Colish 1985; Osler 1991).

a. Philosophical Antecedents

Stoicism is a Hellenistic eudaimonic philosophy, which means that we can expect it to be influenced by its immediate predecessors and contemporaries, as well as to be in open critical dialogue with them. These include Socratic thinking, as it has come down to us mainly through the early Platonic dialogues; the Platonism of the Academic school, particularly in its Skeptical phase; Aristotelianism of the Peripatetic school; Cynicism; Skepticism; and Epicureanism. It is worth noting, in order to put things into context, that a quantitative study of extant records concerning known philosophers of the ancient Greco-Roman world (Goulet 2013) estimates that the leading schools of the time were, in descending order: Academics-Platonists (19%), Stoics (12%), Epicureans (8%), and Peripatetics-Aristotelians (6%).

Eudaimonia was the term that meant a life worth living, often translated nowadays as “happiness” in the broad sense, or more appropriately, flourishing. For the Greco-Romans this often involved—but was not necessarily entirely defined by—excellence at moral virtues. The idea is therefore closely related to that of virtue ethics, an approach most famously associated with Aristotle and his Nicomachean Ethics (Broadie & Rowe 2002), and revived in modern times by a number of philosophers, including Philippa Foot (2001) and Alasdair MacIntyre (1981/2013).

Stoicism is best understood in the context of the differences among some of the similar schools of the time. Socrates had argued—in the Euthydemus, for instance (McBrayer et al. 2010)—that virtue, and in particular the four cardinal virtues of wisdom, courage, justice and temperance, are the only good. Everything else is neither good nor bad in and of itself. By contrast, for Aristotle the virtues (of which he listed a whopping twelve) were necessary but not sufficient for eudaimonia. One also needed a certain degree of positive goods, such as health, wealth, education, and even a bit of good looks. In other words, Aristotle expounded the rather commonsensical notion that a flourishing life is part effort, because one can and ought to cultivate one’s character, and part luck, in the form of the physical and cultural conditions that affect and shape one’s life.

Contrast this to the rather extreme (even for the time) take of the Cynics, who not only thought that virtue was the only good, like Socrates, but that the additional goods that Aristotle was worried about were actually distractions and needed to be positively avoided. Cynics like Diogenes of Sinope were famous for their ascetic and shall we say rather eclectic life style, as is epitomized by a story about him told by Diogenes Laertius (VI.37): “One day, observing a child drinking out of his hands, he cast away the cup from his wallet with the words, ‘A child has beaten me in plainness of living.’”

Diogenes and the boy without a cup

One way to think of this is that the Aristotelian approach comes across as a bit too aristocratic: if one does not have certain privileges in life, one cannot achieve eudaimonia. By contrast, the Cynics were preaching a rather extreme minimalist lifestyle, which is hard to practice for most human beings. What the Stoics tried to do, then, was to strike a balance in the middle, by endorsing the twin crucial ideas, on which I will elaborate later, that virtue is the only true good, in itself sufficient for eudaimonia regardless of one’s circumstances, but also that other things—like health, education, wealth—may be rationally preferred (Proēgmena) or “dispreferred” (Apoproēgmena), as in the case of sickness, ignorance, and poverty, as long as one did not confuse them for things with inherent value.

b. Greek Stoicism

The “Greek” phase of the Stoa covers the first and second periods, from the founding of the school by Zeno to the shifting of the center of gravity from Athens to Rome in the time of Posidonius in the I Century B.C.E., who became a friend of Cicero—not a Stoic himself, but one of our best indirect sources on early Stoicism. Stoicism was not just born, but flourished in Athens, even though most of its exponents originated from the Eastern Mediterranean: Zeno from Citium (modern Cyprus), Cleanthes from Assos (modern Western Turkey), and Chrysippus from Soli (modern Southern Turkey), among others. According to Sedley (2003), this pattern is simply a reflection of the dominant cultural dynamics of the time, affected as they were by the conquests of Alexander.

From the beginning Stoicism was squarely a “Socratic” philosophy, and the Stoics themselves did not mind such a label. Zeno began his studies under the Cynic Crates, and Cynicism always had a strong influence on Stoicism, all the way to the later writings of Epictetus. But Zeno also counted among his teachers Polemo, the head of the Academy, and Stilpo, of the Megarian school founded by Euclid of Megara, a pupil of Socrates. This is relevant because Zeno came to elaborate a philosophy that was both of clear Socratic inspiration (virtue is the Chief Good) and a compromise between Polemo’s and Stilpo’s positions, as the first one endorsed the idea that there are external goods—though they are of secondary importance—while the second one claimed that nothing external can be good or bad. That compromise consisted in the uniquely Stoic notion that external goods are of ethically neutral value, but are nonetheless the object of natural pursuit.

Zeno established the tripartite study of Stoic philosophy (see the three topoi) comprising ethics, physics and logic. The ethics was basically a moderate version of Cynicism; the physics was influenced by Plato’s Timaeus (Taran 1971) and encompassed a universe permeated by an active (that is, rational) and a passive principle, as well as a cosmic web of cause and effect; the logic included both what we today refer to as formal logic and epistemology, that is, a theory of knowledge, which for the Stoics was decidedly empiricist-naturalistic.

The Stoics after Zeno disagreed on a number of issues, often interpreting Zeno’s teachings differently. Perhaps the most important example is provided by the dispute between Cleanthes and Chrysippus about the unity of the virtues: Zeno had talked about each virtue in turn being a kind of wisdom, which Cleanthes interpreted in a strict unitary sense (that is, all virtues are one: wisdom), while Chrysippus understood it in a more pluralistic fashion (that is, each virtue is a “branch” of wisdom).

The early Stoics could also be stubbornly anti-empirical in their apologetics of Zeno’s writings, as when Chrysippus insisted on defending the idea that the heart, not the brain, is the seat of intelligence. This went against pretty conclusive anatomical evidence that was already available in the Hellenistic period, and earned the Stoics the scorn of Galen (for example, Tieleman 2002), though later Stoics did update their beliefs on the matter.

Despite this faux pas, Chrysippus was arguably the most influential Stoic thinker, responsible for an overhaul of the school, which had declined under the guidance of Cleanthes, a broad systematization of its teachings, and the introduction of a number of novel notions in logic—the aspect of Stoicism that has had the most technical philosophical impact in the long run. Famously, Diogenes Laertius (2015, VII.183) wrote that “But for Chrysippus, there had been no Porch.”

In the six decades following Chrysippus there were just two heads of the Stoa, Zeno of Tarsus (south-central Turkey) and Diogenes of Babylon, whose contributions were rather less significant than those of Chrysippus himself. We have to wait until 155 B.C.E. for the next impactful event, when the heads of the three major schools in Athens—the Stoics, the Academics and the Peripatetics—were sent by the city to Rome in order to help with diplomatic efforts. (It is interesting to note, as does Sedley (2003), that the fourth large school, the Epicurean one, was missing, following their stance of political non-involvement.) The philosophers in question, including the Stoic Diogenes of Babylon, made a huge impression on the Roman public with their public performances (and, apparently, an equally worrisome one on the Roman elite, thus beginning a long tradition of tension between philosophers and high-level politicians that characterized especially the post-Republican empire), paving the road for the later shift of philosophy from Athens to Rome, as well as other centers of learning, like Alexandria.

Beginning with Antipater of Tarsus, and then more obviously Panaetius (late II Century B.C.E.) and Posidonius (early I Century B.C.E.), the Stoics revisited their relationship with the Academy, especially in light of the above mentioned importance of the Timaeus for Stoic cosmology. Apparently, what particularly interested Posidonius was the fact that Plato’s main character in the dialogue is a Pythagorean, a school that Posidonius somewhat anachronistically managed to link to Stoicism.

It appears that the broader project pursued by both Panaetius and Posidonius was one of seeking common ground (Sedley 2003 uses the term “syncretism”) among Academicism, Aristotelianism and Stoicism itself, that is, the three branches of Socratic philosophy. This process seems to have been in part responsible for the further success of Stoicism once the major philosophers of the various schools moved from Athens to Rome, after the diaspora of 88-86 B.C.E.

c. Roman Stoicism

If the visit to Rome by the heads of various philosophical schools in 155 B.C.E. was crucial for bringing philosophy to the attention of the Romans, the political events of 88-86 B.C.E. changed the course of Western philosophy in general, and Stoicism in particular, for the remainder of antiquity.

At that time philosophers, particularly the Peripatetic Athenion and—surprisingly—the Epicurean Aristion, were politically in charge at Athens, and made the crucial mistake of siding with Mithridates against Rome (Bugh 1992). The defeat of the King of Pontus, and consequently of Athens, spelled disaster for the latter and led to a diaspora of philosophers throughout the Mediterranean.

To be fair, we have no evidence of the continuation of the Stoa as an actual school in Athens after Panaetius (who often absented himself to Rome anyway), and we know that Posidonius taught in Rhodes, not Athens. However, according to Sedley (2003), it was the events of 88-86 B.C.E. that finally and permanently moved the center of gravity of Stoicism away from its Greek cradle to Rome, Rhodes (where an Epicurean school also flourished), and Tarsus, where a Stoic was at one point chosen by Augustus to govern the city.

Most crucially, however, Stoicism became important in Rome during the fraught time of the transition between the late Republic and the Empire, with Cato the Younger eventually becoming a role model for later Stoics because of his political opposition to the “tyrant” Julius Caesar. Sedley highlights two Stoic philosophers of the late First Century B.C.E., Athenodorus of Tarsus and Arius Didymus, as precursors of one of the greatest and most controversial Stoic figures, Seneca. Both Athenodorus and Arius were personal counselors to the first emperor, Augustus, and Arius even wrote a letter of consolation to Livia, Augustus’ wife, addressing the death of her son, which Seneca later hailed as a reference work of emotional therapy, the sort of work he himself engaged in and became famous for.

Once we get to the Imperial period (Gill 2003), we see a decided shift away from the more theoretical aspects of Stoicism (the “physics” and “logic,” see below) and toward more practical treatments of the ethics. However, as Gill points out, this should not lead us to think that the vitality of Stoicism had taken a nose dive by then: we know of a number of new treatises produced by Stoic writers of that period, on everything ranging from ethics (Hierocles’ Elements of Ethics) to physics (Seneca’s Natural Questions), and the Summary of the Traditions of Greek Theology by Cornutus is one of a handful of complete Stoic treatises to survive from any period of the history of the school. Still, it is certainly the case that the best known Stoics of the time were either teachers like Musonius Rufus and Epictetus, or politically active, like Seneca and Marcus Aurelius, thus shaping our understanding of the period as a contrast to the foundational and more theoretical one of Zeno and Chrysippus.

Importantly, it is from the late Republic and Empire that we also get some of the best indirect sources on Stoicism, particularly several books by Cicero (2014; for example, Paradoxa Stoicorum, De Finibus Bonorum et Malorum, Tusculanae Quaestiones, De Fato, Cato Maior de Senectute, Laelius de Amicitia, and De Officiis) and Diogenes Laertius’ Lives of the Eminent Philosophers (Book VII, 2015). And this literature went on to influence later writers well after the decline of Stoicism, particularly Plotinus (205-270 C.E.) and even the 6th Century C.E. Neoplatonist Simplicius.

All of the above notwithstanding, what is most vital about Stoicism during the Roman Imperial period is also what arguably made the philosophy’s impact reverberate throughout the centuries, eventually leading to two revivals, the so-called Neostoicism of the Renaissance, and the current “modern Stoicism” movement to which I will turn at the end of this essay. The sources of such vitality were fundamentally two: on the one hand charismatic teachers like Musonius and Epictetus, and on the other hand influential political figures like Seneca and Marcus. Indeed, Musonius was, in a sense, both: not only was he a member of the Roman “knight” class, and the teacher of Epictetus, he was also politically active, openly criticizing the policies of both Nero and Vespasian, and getting exiled twice as a result. Others were not so lucky: Stoic philosophers suffered a series of persecutions from displeased emperors, which resulted in murders or exile for a number of them, especially during the reigns of Nero, Vespasian and Domitian. Seneca famously had to commit suicide on Nero’s orders, and Epictetus was exiled to Greece (where he established his school at Nicopolis) by Domitian.

It is also important to appreciate different “styles” of being Stoic among the major Roman figures. As Gill (2003) points out, Epictetus was rather strict, harking back to the Cynic model of quasi-asceticism (see, for instance, his “On Cynicism” in Discourses III.22). Musonius was a sometimes odd combination of “conservative” and “progressive” Stoic, advocating the importance of marriage and family, but also stating very clearly that women are just as capable of practicing virtue and philosophizing as men are, and moreover that it is hypocritical of men to consider their extramarital sexual activities differently from those of women! Seneca was not only more open to the pursuit of “preferred indifferents” (he was a wealthy Senator, but it seems unfair to accuse him of endorsing a simplistic self-serving philosophy: see the nuanced biographies by Romm 2014 and Wilson 2014), but explicitly stated that he was critical of some of the doctrines of the early Stoics, and that he was open to learning from other schools, including the Epicureans. Famously, Marcus Aurelius was open—one would almost want to say agnostic—about theology, at several points in the Meditations (1997) explicitly stating the two alternatives of “Providence” (Stoic doctrine) or “Atoms” (the Epicurean take), for instance: “Either there is a fatal necessity and invincible order, or a kind Providence, or a confusion without a purpose and without a director. If then there is an invincible necessity, why do you resist? But if there is a Providence that allows itself to be propitiated, make yourself worthy of the help of the divinity. But if there is a confusion without a governor, be content that in such a tempest you have yourself a certain ruling intelligence” (XII.14); or: “With respect to what may happen to you from without, consider that it happens either by chance or according to Providence, and you must neither blame chance nor accuse Providence” (XII.24). More is said about this specific topic in the section on Stoic metaphysics and teleology.

There is ample evidence, then, that Stoicism was alive and well during the Roman period, although the emphasis did shift—somewhat naturally, one might add—from laying down the fundamental ideas to refining them and putting them into practice, both in personal and social life.

d. Debates with Other Hellenistic Schools

One should understand the evolution of all Hellenistic schools of philosophy as being the result of continuous dialogue amongst themselves, a dialogue that often led to partial revisions of positions within any given school, or to the adoption of a modified notion borrowed from another school (Gill 2003). To have an idea of how this played out for Stoicism, let us briefly consider a few examples, related to the interactions between Stoicism and Epicureanism, Aristotelianism, and Platonism—without forgetting the direct influence that Cynicism had on the very birth of Stoicism and all the way to Epictetus.

Epictetus is pretty explicit about his—negative—opinions of the Epicureans, drawing as sharp a contrast as possible between the latter’s concern with pleasure and pain and the Stoic focus on virtue and integrity of character. For example, Discourses I.23 is entitled “Against Epicurus,” and begins: “[1] Even Epicurus realizes that we are social creatures by nature, but once he has identified our good with the shell, he cannot say anything inconsistent with that. [2] For he further insists—rightly—that we must not respect or approve anything that does not share in the nature of what is good.” “The shell” here is the body, a reference to the Epicureans’ insistence on pleasure and the absence of pain as what leads to ataraxia, or tranquillity of mind—a term interestingly different from the one preferred by the Stoics, apatheia, or lack of disturbing emotions, as shall be seen below.

A longer section, II.20, is entitled “Against the Epicureans and the Academics,” at the beginning of which Epictetus calls the bluff, in his mind, on the rivals’ theories, which he understands as clearly impractical and contrary to common sense: “[1] Even people who deny that statements can be valid or impressions clear [that is, the Skeptics] are obliged to make use of both. You might almost say that nothing proves the validity of a statement more than finding someone forced to use it while at the same time denying that it is sound.” Epictetus even goes so far as suggesting that Epicurus is incoherent, as he advises a life of retired tranquility away from society, and yet bothers to write books about it, thus showing himself to be concerned about the welfare of society after all: “[15] What urged him to get out of bed and write the things he wrote was, of course, the strongest element in a human being—nature—which subjected him to her will despite his loud resistance.”

Attacking the Skeptics among the Academics, Epictetus turns up the rhetoric significantly: “What a travesty! [28] What are you doing? You prove yourself wrong on a daily basis and still you won’t give up these idle efforts. When you eat, where do you bring your hand—to your mouth, or to your eye? What do you step into when you bathe? When did you ever mistake your saucepan for a dish, or your serving spoon for a skewer?” And he sees his invective as justified—in sure Stoic fashion—not on theoretical grounds, but on practical ones: “[35] We could give adulterers grounds for rationalizing their behavior; such arguments could provide pretexts to misappropriate state funds; a rebellious young man could be emboldened further to rebel against his parents. So what, according to you, is good or bad, virtuous or vicious—this or that?”

Even so, not all Stoics rejected either Academic or Epicurean ideas altogether. I have mentioned Marcus Aurelius’ relative “agnosticism” about Providence vs. Atoms (though he clearly preferred the first option, in line with standard Stoic teaching), and Seneca is often sympathetic to Epicurean views, though, as Gill (2003, note 58) comments, this is in the spirit of showing that even some of the rival school’s ideas are congruent with Stoic ones. He very clearly states, however, in Natural Questions: “I do not agree with [all] the views of our school” (2014, VII.22.1).

Cicero, in Book III of De Finibus, provides us with some glimpses of the disagreement between Stoics and Aristotelians, by way of his imaginary dialogue with Cato the Younger. At [41] he writes: “Carneades never ceased to contend that on the whole so-called ‘problem of good and evil,’ there was no disagreement as to facts between the Stoics and the Peripatetics, but only as to terms. For my part, however, nothing seems to me more manifest than that there is more of a real than a verbal difference of opinion between those philosophers on these points.” He continues: “The Peripatetics say that all the things which under their system are called goods contribute to happiness; whereas our school does not believe that total happiness comprises everything that deserves to have a certain amount of value attached to it,” referring to the different treatment of “external goods” between Aristotelians and Stoics.

There are well documented examples of Stoic opinions changing in direct response to challenges from other schools, for instance the modified position on determinism that was adopted by Philopator (80-140 C.E.), a result of criticism from both the Peripatetic and the Middle Platonist philosophers. We also have clear instances of Stoic ideas being incorporated by other schools, as in the case of Antiochus of Ascalon (130-69 B.C.E.), who introduced Stoic notions in his revision of Platonism, justifying the move by claiming that Zeno (and Aristotle, for that matter) developed ideas that were implicit in Plato (Gill 2003). Finally, Stoicism found its way into Christianity via Middle Platonism, at least since Clement of Alexandria (150-215 C.E.).

2. The First Two Topoi

A fundamental aspect of Stoic philosophy is the twofold idea that ethics is central to the effort, and that the study of ethics is to be supported by two other fields of inquiry, what the Stoics called “logic” and “physics.” Together, these form the three topoi of Stoicism.

We will take a closer look at each topos in turn, but it is first important to see why and how they are connected. Stoicism was a practical philosophy, the chief goal of which was to help people live a eudaimonic life, which the Stoics identified with a life spent practicing the cardinal virtues (next section). Later in the Roman period the emphasis shifted somewhat to the achievement of apatheia, but this too was possible because of the practice of the topos of ethics.

This, in turn, was to be supported by the study of the other two topoi, “logic,” which was more expansive than the modern technical meaning of the term, including logic sensu stricto, but also a theory of knowledge (that is, epistemology), as well as cognitive science, and “physics,” by which the Stoics meant roughly what we would today identify as a combination of natural science and metaphysics (the latter including theology). Roughly, then, “logic” means the study of how to reason about the world, while “physics” means the study of that world.

Logic and Physics are related to Ethics because Stoicism is a thoroughly naturalistic philosophy. Even when the Stoics are talking about “God” or “soul,” they are referring to physical entities, respectively identified with the rational principle embedded in the universe itself and with whatever makes human rationality possible. Stoics often invoked creative imagery to explain the relationship among Physics, Logic and Ethics, as found in Diogenes Laertius (VII.39), for instance. Perhaps the most famous of such analogies is the one using an egg, where the shell is the Logic, the white the Ethics, and the red part the Physics. However, given how the three topoi were meant to relate to each other, this is probably misleading, possibly due to a misunderstanding of the biology of eggs (the Physics is supposed to be nurturing the Ethics, which means that the former should be the white and the latter the red part of the egg). The best simile in my mind is that of a garden: the fence is the Logic—defending the precious inside and defining its boundaries; the fertile soil is the Physics—providing the nutritive power by way of knowledge of the world; and the resulting fruits are the Ethics—the actual focal objective of Stoic teachings.

While the Stoics disagreed on the sequence in which the three topoi should be presented to students (that is, just like faculty in a modern university, they had contrasting opinions about the merits of different curricula!), the crucial point is that of a naturalistic philosophy where there is no sharp distinction between “is” and “ought,” as assumed in much modern moral philosophy, because what an agent ought to do (Ethics) is in fact closely informed by that agent’s knowledge of the workings of the world (Physics) as well as her capacity to reason correctly (Logic). This section describes the first two topoi and the next describes Ethics.

a. “Logic”

Stoics made important early contributions to both epistemology (Hankinson 2003) and logic proper (Bobzien 2003), and much has, deservedly, been written about both. While Stoics held that the Sage, who was something of an ideal figure, could achieve perfect knowledge of things, in practice they relied on a concept of cognitive progress, as well as moral progress, since both logic and physics are related to, and indeed function in the service of, ethics. They referred to this idea as prokopê (making progress), and they engaged in a long-running dispute with Academic Skeptics about just how defensible this notion actually is.

Unlike the Epicureans, Stoics did not maintain that all impressions are true, but rather that some of them were “cataleptic” (that is, leading to comprehension) and others were not. Diogenes Laertius explains the difference (VII.46): “the cataleptic, which [the Stoics] hold to be the criterion of matters, is that which comes from something existent and is in accordance with the existent thing itself, and has been stamped and imprinted; the non-cataleptic either comes from something non-existent, or if from something existent then not in accordance with the existent thing; and it is neither clear, nor distinct.”

So the Stoics did admit that one’s perception can be wrong, as in cases of hallucinations, or dreams, or other sources of phantasma (that is, impressions on the mind, the result of automatic—we would say unconscious—judgment), but also that proper training allows one to make progress in distinguishing cataleptic from non-cataleptic impressions (that is, impressions to which we may reasonably give or withhold assent). Chrysippus even suggested that it is important to absorb a number of impressions, since it is the accumulation of impressions that leads to concept-formation and to making progress. In this sense, the Stoic account of knowledge was eminently empiricist in nature, and—especially after relentless Skeptical critiques—relied on something akin to what moderns call inference to the best explanation (Lipton 2003), as in their conclusion that our skin must have holes based on the observation that we sweat.

It is important to realize that a cataleptic impression is not quite knowledge. The Stoics distinguished among opinion (weak, or false), apprehension (characterized by an intermediate epistemic value), and knowledge (which is based on firm impressions unalterable by reason). Giving assent to a cataleptic impression is a step on the way to actual knowledge, but the latter is more structured and stable than any single impression could be. In a sense, then, the Stoics held to a coherentist view of justification (for example, Angere 2007), and ultimately, like all ancients, to a correspondence theory of truth (for example, O’Connor 1975).

Hankinson (2003) comments on an interesting aspect of the dispute between Stoics and Academic Skeptics, concerning the epistemic warrant to be granted to cataleptic impressions. What, precisely, makes them “clear and distinct,” a Stoic terminology that clearly anticipates Descartes (who, obviously, was not an empiricist)? If clarity and distinctness are internal features of cataleptic impressions, then these are phenomenal features, and it is easy to come up with counterexamples where they do not seem to work (for instance, the common occurrence of mistaking one member of a pair of twins for the other one).

This is where we encounter one of the many episodes of growth of Stoic thought in response to external pressure. Cicero tells us (2014, in Academica II.77) that Zeno was aware that the same impression could derive from something that did or did not exist, so he modified his stance (as Diogenes Laertius reports: VII.50), adding the following clause: “of such a type as could not come from something non-existent.” Of course this does not solve the issue, but it builds on the Stoic metaphysical assumption that there cannot be two things that are exactly alike, as much as at times it may appear so to us. Frede (1983) advanced the further view that what makes a cataleptic impression clear and distinct is not any internal feature of that impression, but rather an external causal feature related to its origin. According to this account, then, Stoic epistemology is externalist (for example, Almeder 1995), rather than internalist (for example, Goldman 1980). Indeed, there is evidence that they became—again as a result of criticism from the Skeptics—reliabilists about knowledge (Goldman 1994). Athenaeus tells the story of Sphaerus, a student of Cleanthes and colleague of Chrysippus, who was shown at a banquet what turned out to be birds made of wax. After he reached to pick one up he was accused of having given assent to a false impression. To which he—rather cleverly, but indicatively—replied that he had merely assented to the proposition that it was reasonable to think of the objects as actual birds, not to the stronger claim that they actually were birds.

When it comes to the area of Stoic “logic” that is closest to our, much narrower, conception of the field, the school made major contributions. Their system of syllogistics recognized that not all valid arguments are syllogisms and significantly differs from Aristotle’s, having more in common with modern-day relevance logic (Bobzien 2006). To simplify quite a bit (but see Bobzien 2003 for a somewhat in-depth treatment), Stoic syllogistics was built on five basic types of syllogisms, and complemented by four rules for arguments that could be deployed to reduce all other types of syllogisms to one of the basic five.
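The text above does not list the five basic argument forms, so the following is only an illustrative sketch. It encodes the five "indemonstrables" as they are usually reconstructed in the scholarship and checks each by brute-force truth table. Two simplifying assumptions are made that the Stoics themselves might not have accepted: the conditional is treated as the modern material conditional, and "either ... or" is treated as exclusive disjunction. The names `is_valid`, `implies`, `xor`, and `INDEMONSTRABLES` are hypothetical helpers introduced here.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars=2):
    """Return True if no truth assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Assumption: material conditional, not necessarily the Stoic reading of "if".
implies = lambda a, b: (not a) or b
# Assumption: exclusive "either ... or".
xor = lambda a, b: a != b

# Each entry: (list of premises, conclusion), all as functions of p and q.
INDEMONSTRABLES = {
    "first":  ([lambda p, q: implies(p, q), lambda p, q: p],     lambda p, q: q),
    "second": ([lambda p, q: implies(p, q), lambda p, q: not q], lambda p, q: not p),
    "third":  ([lambda p, q: not (p and q), lambda p, q: p],     lambda p, q: not q),
    "fourth": ([lambda p, q: xor(p, q),     lambda p, q: p],     lambda p, q: not q),
    "fifth":  ([lambda p, q: xor(p, q),     lambda p, q: not p], lambda p, q: q),
}

results = {name: is_valid(prem, concl) for name, (prem, concl) in INDEMONSTRABLES.items()}

# For contrast, a well-known invalid form: "if p then q; q; therefore p".
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
)
```

Under these assumptions every entry in `results` comes out valid, while `affirming_consequent` does not. A truth table is a modern device, of course; the Stoics reduced complex arguments to the basic forms by rules rather than by enumeration.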

The broader Stoic approach to logic has been characterized as a type of propositional logic, anticipating aspects of Frege’s work (Beaney 1997). Stoic logic made a fundamental distinction between “sayables” and “assertibles.” The former are a broader category that includes assertibles as well as questions, imperatives, oaths, invocations and even curses. The assertibles then are self-complete sayables that we use to make statements. For instance, “If Zeno is in Athens then Zeno is in Greece” is a conditional composite assertible, constructed out of the individual simple assertibles “Zeno is in Athens” and “Zeno is in Greece.” A major difference between Stoic assertibles and Fregean propositions is that the truth or falsehood of assertibles can change with time: “Zeno is in Athens” may be true now but not tomorrow, and it may become true again next month. It is also important to note that truth or falsehood are properties of assertibles, and indeed that being either true or false is a necessary and sufficient condition for being an assertible (that is, one cannot assert, or make statements about, things that are neither true nor false).
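One rough way to picture the time-dependence of assertibles is to model an assertible as a sentence paired with a function from moments to truth values. This is a minimal sketch, not a reconstruction of Stoic theory: the class `Assertible`, the helper `conditional`, the use of integer day numbers, the made-up itinerary in `ATHENS_DAYS`, and the material reading of "if ... then" are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Assertible:
    text: str
    truth_at: Callable[[int], bool]  # hypothetical: maps a day number to a truth value

def conditional(antecedent: Assertible, consequent: Assertible) -> Assertible:
    """Build a composite assertible, read here (by assumption) as a material conditional."""
    return Assertible(
        f"If {antecedent.text} then {consequent.text}",
        lambda day: (not antecedent.truth_at(day)) or consequent.truth_at(day),
    )

# Made-up itinerary: Zeno is in Athens on day 0 and day 30, elsewhere in Greece otherwise.
ATHENS_DAYS = {0, 30}

in_athens = Assertible("Zeno is in Athens", lambda day: day in ATHENS_DAYS)
in_greece = Assertible("Zeno is in Greece", lambda day: True)
rule = conditional(in_athens, in_greece)

today, tomorrow, next_month = 0, 1, 30
history = [in_athens.truth_at(d) for d in (today, tomorrow, next_month)]
```

Here the simple assertible is true today, false tomorrow, and true again next month, while the composite conditional holds on every day in this toy itinerary. A Fregean proposition, by contrast, would correspond to fixing the day once and for all.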

The Stoics were concerned with the validity of arguments, not with logical theorems or truths per se, which again is understandable in light of their interest in using logic to guard the fruits of their garden, the ethics. They also introduced modality into their logic, most importantly the modal properties of necessity, possibility, non-possibility, impossibility, plausibility and probability. This was a very modern and practically useful approach, as it directed attention to the fact that some assertibles induce assent even though they may be false, as well as to the observation that some assertibles have a higher likelihood of being true than not. Finally, the Stoics, and Chrysippus in particular, were sensitive to and attempted to provide an account of logical paradoxes such as the Liar and Sorites cases along lines that we today recognize as related to a semantics of vagueness (Tye 1994).

b. “Physics”

The Stoic topos of Physics includes what we today would classify as natural science (White 2003), metaphysics (Brunschwig 2003), and theology (Algra 2003). Let us briefly look at each in turn.

When it comes to natural science and cosmology, recall that the Stoics sought to “live according to nature,” which requires us to make our best efforts to understand nature. This also implies a very different view of natural science from the modern one: its study is not an end in itself, but rather subordinate to the goal of helping us live a eudaimonic life.

Stoics thought that everything real, that is, everything that exists, is corporeal—including God and soul. They also recognized a category of incorporeals, which included things like the void, time, and the “sayables” (meanings, which played an important role in Stoic Logic). This may appear as a contradiction, given the staunchly materialist nature of Stoic philosophy, but is really no different from a modern philosophical naturalist who nonetheless grants that one can meaningfully talk about abstract concepts (“university,” “the number four”) which are grounded in materialism because they can only be thought of by corporeal beings such as ourselves.

They embraced what we might call a “vitalist” understanding of nature, which is permeated by two principles: an active one (identified with reason and God, referred to as the Logos) and a passive one (substance, matter). The active principle is un-generated and indestructible, while the passive one—which is identified with the four classical elements of water, fire, earth and air—is destroyed and recreated at every, eternally recurring, cosmic conflagration, a staple of Stoic cosmology. The cosmos itself is a living being, and its rational principle (Logos) is identified with aether, or the Stoic Fire (not to be confused with the elemental fire that is part of the passive principle). Consequently, God is immanent in the universe, and it is in fact identified with the creative cosmic Fire. This also means that the Stoics, unlike the Aristotelians, did not recognize the concept of a prime mover, nor of a Christian-type God outside of time and space, on the ground that something incorporeal cannot act on things, because it has no causal powers. From all of this, as White (2003) puts it, emerges a biological, rather than a mechanical picture of causation, which is significantly different from post-Cartesian and Newtonian mechanical philosophy.

Cosmic conflagrations, for the Stoics, repeat themselves in exact manner, apparently because God/Nature laid out things in the best possible way the previous time around, and there is therefore no reason to change (though one would get the same outcome from an entirely deterministic causal model of the universe). It is interesting to muse about the fact that some modern cosmological models also predict either identical or varied recurring universes (Ungerer and Smolin 2014), but of course do away with the concept of Providence altogether. According to Eusebius (quoted by White), during the phase of cosmic conflagration, the creative Fire is “a sort of seed, which possesses the principles of all things and the causes of all things that have occurred, are occurring, and will occur—the interweaving and ordering of which is fate, knowledge, truth, and a certain inevitable and inescapable law of the things that exist.”

Cicero, in De Fato, lays out the Stoic theory of causality and actually equates fate with antecedent causes. Chrysippus had argued that there is no possibility of motion without causes, deducing that therefore everything has a cause. This concept of universal causality led the Stoics to accept divination as a branch of physics, not a superstition, as explained again by Cicero in De Divinatione, and this makes sense once one understands the Stoic view of the cosmos: predicting the future is not something that one does by going outside the laws of physics, but by intelligently exploiting such laws.

Metaphysically the Stoics were determinists (Frede 2003). Here is Cicero: “[the Stoics] say, that it is impossible, when all the circumstances surrounding both the cause and that of which it is a cause are the same, that things should not turn out a certain way on one occasion but that they should turn out that way on some other occasion” (De Fato, 199.22-25). The Stoics did have a concept of chance, but they thought of it (much like modern scientists) as a measure of human ignorance: random events are simply events whose causes are not understood by humans.

The consequences of Stoic physics for their ethics are clear, and are summarized again by Cicero, when he says that Chrysippus aimed at a middle position between what we today would call strict incompatibilism and libertarianism (Griffith 2013). White (2003) interestingly notes in this respect that—just like Spinoza—the Stoics shifted the emphasis from moral responsibility to moral worth and dignity.

In terms of fundamental ontology, the Stoics were anti-corpuscularian (unlike the pre-Socratic Atomists, and the Stoics’ chief rivals, the Epicureans), on the grounds that the idea of atoms violated their concept of a seamless unity of the cosmos. It is tempting to see this as in the same ballpark as modern quantum mechanical theories that see the entire universe as constituted of a single “wave function” (Ladyman and Ross 2009), but of course this would be an anachronistic interpretation.

3. The Third Topos: Ethics

Stoic Ethics was not just another theoretical subject, but an eminently practical one. Indeed, especially for the later Stoics, ethics—understood as the study of how to live one’s life—was the point of doing philosophy. It was no easy task: Epictetus famously said (in Discourses III.24.30): “The philosopher’s lecture room is a hospital: you ought not to walk out of it in a state of pleasure, but in pain—for you are not in good condition when you arrive!” The starting point for Epictetus was the famous dichotomy of control, as expressed at the very beginning of the Enchiridion: “We are responsible for some things, while there are others for which we cannot be held responsible” (also translated as “Some things are up to us, other things are not up to us”).

The early Stoics were somewhat more theoretical in their approach, with Zeno, Cleanthes and Chrysippus attempting to both systematize their doctrines and defend them from critiques from both Epicurean and especially Academic-Skeptic quarters. The early Stoa’s famous motto in ethics was “follow nature” (or “live according to nature”), by which they meant both the rational-providential aspect of the cosmos (see Physics above) and more specifically human nature, which they conceived as that of a social animal capable of bringing rational judgment to bear on problems posed by how to live one’s life. (It appears that Zeno’s original articulation of the principle was “live consistently” to which Cleanthes added the clarifying clause “with nature”: Schofield 2003.) Tightly related to this idea of following (human) nature was the Stoic concept of oikeiôsis, often translated as affinity, or appropriation. For the Stoics human beings have natural propensities to develop morally, propensities that begin as what we today would call instincts and can then be greatly refined with the onset of the age of reason at the childhood stage and beyond. It is interesting to note that this naturalistic account of the roots of virtuous/moral behavior is highly compatible with modern findings in both evolutionary and cognitive science (for example, Putnam and others 2014).

Specifically, we naturally: (i) behave in a fashion as to advance our interests and goals (health, wealth, and so forth); (ii) identify with other people’s interests (initially our parents, then friends, then countrymen); (iii) figure out ways to practically navigate the vicissitudes of life. The Stoics related these propensities directly to the four cardinal virtues of temperance, courage, justice and practical wisdom. Temperance and courage are required to pursue our goals, justice is a natural extension of our concern for an ever-increasing circle of people, and practical wisdom (phronêsis) is what best allows us to deal with whatever happens.

Which brings us to the matter of how the virtues are related to each other. To begin with, the Stoics recognized the above-mentioned four cardinal virtues, but also a number of more specific ones within each major category (complete list in Sharpe 2014, derived from Stobaeus): for instance, practical wisdom included good judgment, discretion, resourcefulness; temperance could be broken down into propriety, sense of honor, self-control; courage was divided into perseverance, confidence, magnanimity; and justice comprised piety, kindness, sociability. Even so, they held to a view of virtue that is much more unitary than it may come across from this kind of list (Schofield 2003). The cardinal virtues are derived from Socrates, especially in Plato’s Republic, and so is a certain unifying way of considering the virtues. Justice can be conceptualized as practical wisdom applied to social living; courage as wisdom concerning endurance; and temperance as wisdom with regard to matters of choice. Chrysippus further elaborated this idea of pluralism within an underlying unity, making the virtues essentially inseparable, so that, say, one cannot be courageous and yet intemperate—in the Stoic sense of those words.

Hadot (1998) draws a series of parallels between the four virtues, the three topoi and what are referred to as the three Stoic disciplines: desire, action, and assent. The discipline of desire, sometimes referred to as Stoic acceptance, is derived from the study of physics, and in particular from the idea of universal cause and effect. It consists in training oneself to desire what the universe allows and not to pursue what it does not allow. A famous metaphor here, used by Epictetus, is that of a dog leashed to a cart: the dog can either fight the cart’s movement at every inch, thus hurting himself and ending up miserable; or he can decide to gingerly go along with the ride and enjoy the panorama. This is a version of what Nietzsche eventually called amor fati (love your fate), and that is encapsulated in Epictetus’ phrase “endure [what the universe throws your way] and renounce [what the universe does not allow]” (Fragments 10). Consequently, according to Hadot, the discipline of desire is linked to the virtues of courage (to follow the order of the cosmos) and temperance (to be able to control one’s desires).

The second discipline, of action, is also called Stoic “philanthropy” and is tied to justice, the most prosocial of the cardinal virtues. The basic idea is that human beings ought to develop their natural concern for others in a way that is congruent with the exercise of the virtue of justice. Here the area of study most directly connected to the discipline is that of ethics itself. A representative quote is perhaps the one found in Marcus Aurelius’ Meditations (VIII.59): “Men exist for the sake of one another. Teach them then or bear with them.” The first sentence is a statement of philanthropy (in the Stoic, not modern, sense), while the second one makes it clear that for the emperor this was a duty to be performed either by engaging other people positively or at the very least by suffering their non-virtuous behavior, if that is the case.

The last discipline is that of assent, referred to as Stoic “mindfulness” (not to be confused with the variety of Buddhist concepts by the same name, especially the Zen one). I will get back to the concept of assent in the next section, as it is related to the Stoic treatment of the (moral) psychology of emotions, but for now suffice it to say that the discipline concerns the necessity to make decisions about what to accept or reject of our experience of the world, that is, how to make proper judgments. It is therefore linked to the virtue of practical wisdom, as well as to the area of study of logic. If we had to summarize it in a single sentence, Seneca’s “bring the mind to bear upon your problems” (On Tranquility of Mind, X.4) may be appropriate.

As we have seen so far, Stoic ethics is concerned exclusively with the concept of virtue (and associated disciplines)—whether understood as a unitary thing with a number of facets or otherwise. In this the Stoics were akin to the Cynics and unlike the Peripatetics, who instead allowed that a number of other things are necessary for a eudaimonic life, including (some) wealth, health, education, and so forth. The Peripatetics would not have assented to the idea of a eudaimonic Sage on the rack, a classic Stoic concept.

However, Stoic ethics actually attempts to strike a balance between the asceticism of the Cynics and the somewhat elitist views of the Peripatetics. It does so through the introduction of the Stoic concept of preferred and dispreferred “indifferents” briefly mentioned at the beginning. This is found already in Zeno’s book on Ethics, which is now lost, but about which we know from Diogenes Laertius (VII.4). Zeno distinguished between indifferents that have value (axia) and those that have disvalue (apaxia). The first group included things like health, wealth and education, while the second group was comprised of things like sickness, poverty and ignorance. The move was a brilliant one: as I argued above, it allowed the Stoics to get the best of both the Cynic and the Peripatetic worlds: yes, it is true that—if they don’t get in the way of practicing virtue—some indifferents are preferred; but they are called indifferents for a reason: they do not truly matter for the pursuit of the (moral) eudaimonic life. In other words, while it is undeniable that people naturally and rationally seek the preferred indifferents, it is also the case that one can be a person of moral integrity, achieving eudaimonia, regardless of one’s material circumstances.

There is much more to be said about Stoic ethics, of course, but before closing this introductory sketch let me comment on an issue that does not fail to come up, and which I have already briefly mentioned above: the connection between the undeniably teleological-providential views of the cosmos advanced by Stoic physics and the actual practice of Stoic ethics. The issue is this: given that the Stoics themselves insisted that the study of physics (and of logic) influences how we understand ethics, and given that they believed in the providential nature of the cosmos, does that mean that only people who accept the latter view can pursue eudaimonia? The generally accepted answer is no.

Gregory Vlastos (referred to in Schofield 2003) convincingly argued that what he called the “theocratic” principle does affect one’s conception of the relation between virtue and the order of the cosmos, specifically because it tells us that being virtuous is in agreement with such order. Crucially, however, Vlastos maintains that this does not change the content of virtue, nor does it affect one’s conception of eudaimonia. This is so because although the “physics” (which, remember, is a combination of natural sciences and metaphysics, and hence theology) does inform the ethics, it does so in what modern philosophers would call an underdetermined fashion: while ethics is not independent of physics (or logic), in the Stoic system, it also cannot be read directly off it. Stoic ethics is naturalistic, and thus very modern in nature, but it—to put it in rather anachronistic terms—does not simplistically erase Hume’s is/ought divide.

Vlastos’ position finds plenty of textual support from a number of Stoic sources, perhaps no more obviously so than in Marcus, as already reported. There are, however, other passages in the classical Stoic literature that do not lend themselves to a clear cut position on the matter, such as this one from Epictetus: “What does it matter to me […] whether the universe is composed of atoms or uncompounded substances, or of fire and earth? Is it not sufficient to know the true nature of good and evil, and the proper bounds of our desires and aversions, and also of our impulses to act and not to act; and by making use of these as rules to order the affairs of our life, to bid those things that are beyond us farewell? It may very well be that these latter things are not to be comprehended by the human mind, and even if one assumes that they are perfectly comprehensible, well what profit comes from comprehending them? And ought we not to say that those men trouble in vain who assign all this as necessary to the philosopher’s system of thought? […] What Nature is, and how she administers the universe, and whether she really exists or not, these are questions about which there is no need to go on to bother ourselves” (Fragments 1). Please remember that for the Stoics “nature” was synonymous with “god.”

Indeed, it is because of this and other passages that Ferraiolo (2015), for instance, concludes that: “metaphysical doctrines about the nature and existence of God, and a rationally governed cosmos, are rather cleanly separable from Stoic practical counsel, and its conductivity to a well-lived, eudaimonistic life. Stoicism may have developed within a worldview infused with presuppositions of a divinely-ordered universe … but the efficacy of Stoic counsel is not dependent upon creation, design, or any form of intelligent cosmological guidance.”

On balance, it seems fair to say that the ancient Stoics did believe in a (physical) god that they equated with the rational principle organizing the cosmos, and which was distributed throughout the universe in a way that can be construed as pantheistic. While it is the case that they maintained that an understanding of the cosmos informs the understanding of ethics, construed as the study of how to live one’s life, it can also be reasonably argued that Stoic metaphysics underdetermined—on the Stoics’ own conception—their ethics, thus leaving room for a “God or Atoms” position that may have developed as a concession to the criticisms of the Epicureans, who were atomists.

4. Apatheia and the Stoic Treatment of Emotions

The naturalistic system of ethics developed by the Stoics bridges what would later be referred to as the is/ought gap by way of a sophisticated account of human developmental moral psychology (Brennan 2003). This section focuses on a related, major difference between Stoics and Epicureans, beginning with the respective use of two key terms indicating a desirable state of mind according to the two schools, and continuing with a broader discussion of the Stoic classification of emotions (or “passions”).

As we have seen, Epictetus explains in a number of places where the Stoa differs from the Garden (for example, “Against Epicurus,” Discourses I.23), while Seneca tells his friend Lucilius that he happily borrows from Epicurus when it makes sense, as it is his “custom to cross even into the other camp, not as a deserter but as a spy” (Letter II, A beneficial reading program, in the new translation by Graver and Long 2015).

Recall that the Stoics thought the pivotal thing in life is virtue and its cultivation, while the Epicureans thought that the point was to seek moderate pleasure and especially avoid pain. Nonetheless, both schools thought that a crucial component of eudaimonia (the flourishing life) was something very similar, which the Stoics referred to as apatheia and the Epicureans as ataraxia. There are, however, some differences between the two concepts, especially in the way the two schools taught how one could achieve, or at the least approximate, the respective states of mind.

The IEP article on Epictetus defines the two terms in the following fashion:

apatheia: freedom from passion, a constituent of the eudaimôn life

ataraxia: imperturbability, literally “without trouble,” sometimes translated as “tranquillity”; a state of mind that is a constituent of the eudaimôn life

So, both apatheia and ataraxia are components of the eudaimonic life, and indeed, while the second term is usually associated with the Epicureans, both schools used it.

As far as the Stoics are concerned, however, it is good to remember that “passion” did not mean what we now mean by that term, and indeed it did not even exactly overlap with the term “emotion” in the modern sense of the word. That is why it is grossly incorrect to say that the Stoics aimed at a passionless life, or at the suppression of emotions. Rather, the Stoics divided the “passions” into unhealthy and healthy ones. The first group included pain, fear, craving, and pleasure. The second included “discretion,” “willing,” and “delight.” The latter were the opposites of those in the first group, except for pain, which does not have a positive counterpart. Here is a summary diagram:

[Figure: a diagram of the Stoic passions]

For the Stoics, then, the “passions” are not automatic, instinctive reactions that we cannot avoid experiencing. Instead, they are the result of a judgment, giving “assent” to an “impression.” So even when you read a familiar word like “fear,” don’t think of the fight-or-flight response that is indeed unavoidable when we are suddenly presented with a possible danger. What the Stoics meant by “fear” was what comes after that: your considered opinion about what caused said instinctive reaction. The Stoics realized that we have automatic responses that are not under our control, and that is why they focused on what is under our control: the judgment rendered on the likely causes of our instinctive reactions, a judgment rendered by what Marcus Aurelius called the ruling faculty (in modern cognitive science terminology: the executive function of the brain).

The Stoic view of emotions finds very nice parallels in modern neuroscience. For instance, Joseph LeDoux (2015) makes the important, if often neglected, point that there is a difference between what neuroscientists mean by “emotion” and what psychologists mean. Neuroscientifically, fear, for example, is the result of a defense and reaction mechanism that is involuntary and nonconscious, and whose major neural correlate is the amygdala. But what psychologists refer to when they talk of “fear” is a more complex emotion, constructed in part of the basic defense and reaction mechanism, to which the conscious mind adds cognitive interpretation, something very similar to the Stoic concept. The two meanings are not in contradiction, but are rather complementary. The cognitive interpretation of the raw emotion of fear, then, is brought about by a combination of one’s memories, cultural upbringing, deliberative thinking, and so forth. The Stoics clearly referred to the psychological, not the neuroscientific meaning of emotion as “passions,” and LeDoux’s own research seems to support the Stoic account and the practicability of their discipline of assent, seen in the previous section.

Going back to the above diagram: pain is not the simple sensation of pain, but the failure to avoid something that we mistakenly judge bad. Similarly for the other pathê: fear is the irrational expectation of something bad or harmful; craving is the irrational striving for something mistakenly judged as good; and pleasure is the irrational elation over something that is actually not worth choosing. Contrariwise, the eupatheiai are the result of a rational aversion of vice and harmful things (discretion), a rational desire for virtue (willing), and a rational elation over virtue (delight). (It should be clear now why there is no such thing as a rational emotional pain.)

All of the above is why apatheia is best construed as equanimity in the face of what the world throws at us: if we apply reason to our experience, we will not be concerned with the things that do not matter, and we will correspondingly rejoice in the things that do matter.

There is another crucial difference between the two schools to be highlighted here: they get to apatheia/ataraxia by very different routes. The Epicureans sought ataraxia as a goal, achieved most of all through the avoidance of pain, which meant especially to withdraw from social and political life. It was good, for Epicurus, to cultivate your close friendships, but attempting to play a full role in the polis was a sure way to experience pain (physical or mental), and therefore it was to be avoided. For the Stoics, on the contrary, the goal was the exercise of virtue, which led them to embrace their social role. Marcus Aurelius, for instance, constantly writes in the Meditations that we need to get up in the morning and do the job of a human being, which he interprets to mean to be useful to society. Hierocles elaborated on the Cynic/Stoic concept of cosmopolitanism. The motto of the school was “follow nature,” by which it was meant, as we have seen, the human nature of a social animal capable of rational judgment. And of course one of the four virtues examined in the previous section is justice, and one of the three disciplines is that of action—both explicitly prosocial. Apatheia, then, was not a goal for Stoics, but an advantageous byproduct (a preferred indifferent, so to speak) of living the virtuous life.

5. Stoicism after the Hellenistic Era

As Long (2003) has remarked, Stoicism has had a pervasive, yet largely unacknowledged influence on Western philosophical thought throughout the Middle Ages, Renaissance, and into modern times. Among the philosophers that he lists as being directly or indirectly affected by Stoicism are Augustine, Thomas More, Descartes, Spinoza, Leibniz, Rousseau, Adam Smith, and Kant, to which we can easily add David Hume. During the Renaissance both Stoic books, particularly Epictetus’ Enchiridion and Seneca’s Letters, and books favorable to Stoicism, like Cicero’s De Officiis, were widely read.

Christianity was far more sympathetic to Stoicism than to its main rival, Epicureanism (and it also absorbed elements of Platonism in its “neo” form). The Epicurean emphasis on pleasure, as well as their metaphysics of cosmic chaos, were prima facie incompatible with Christian theology. The case of Stoicism was more complex. On the one hand, the Stoic insistence on materialism and pantheism was criticized and rejected; on the other hand, the idea of the Logos could easily be adapted—if in a fashion that the Stoics themselves would not have recognized—and the emphasis on virtue was often seen as pretty much the best that people could manage before the coming of Christ.

This is why we find an interestingly mixed record of Christian attitudes toward Stoicism. Augustine initially wrote favorably about it, while later on he was more critical. Tertullian was positively inclined toward Stoicism, and versions of the Enchiridion were commonly used (with Paul replacing Socrates) in monasteries. Peter Abelard and John of Salisbury were influenced by Stoic ethics too, while Thomas Aquinas was critical, especially of an early attempt at reviving Stoicism made by David of Dinant at the beginning of the 13th century.

A major revival of Stoicism did eventually take place, during the Renaissance, largely because of the work of Justus Lipsius (1547-1606). He was a humanist and classical philologist who published critical editions of Seneca and Tacitus. His major opus was De Constantia (1584), where he argued that Christians can draw on the resources of Stoicism during troubled times, while at the same time carefully pointing out aspects of Stoicism that are unacceptable for a Christian. Lipsius also drew on Epictetus, whose Enchiridion had first been translated into English a few years earlier. Other Neostoics included the French statesman Guillaume Du Vair, the churchman Pierre Charron, the Spanish author Francisco de Quevedo, and most importantly Michel de Montaigne, who wrote one of his essays in defense of Seneca.

The reception of Neostoicism was mixed. Even before Lipsius, Calvin had strongly criticized the “novi Stoici” for their revival of the idea of apatheia, and later critics included Pascal. In part in order to preempt such reactions, according to Sellars, one of the Neostoic texts began with the following cautious endorsement: “philosophie in generall is profitable unto a Christian man, if it be well and rightly used: but no kinde of philosophie is more profitable and neerer approaching unto Christianitie than the philosophie of the Stoicks.” Despite the interest in Stoicism displayed by other Renaissance figures, even outside of philosophy (for example, the poet Petrarch), Neostoicism never really became a movement, and its import largely rests on the impact of Lipsius’ writings, and perhaps on the influence of Montaigne.

Arguably the most important modern philosopher to be influenced by Stoicism is Spinoza, who was in fact accused by Leibniz of being a leader of the “sect” of the new Stoics, together with Descartes (Long 2003). There are indeed a number of striking similarities between the Stoic conception of the world and Spinoza’s. In both cases we have an all-pervasive God that is identified with Nature and with universal cause and effect. While it is true that the Stoic understanding of the cosmos was essentially dualistic—in contrast with Spinoza’s monism—the Stoic “active” and “passive” principles were nonetheless completely entwined, ultimately yielding an essentially unitary reality. Long points out, however, that a major difference was Spinoza’s concept of God’s infinite attributes and extension, in marked contrast to the finite (if eternal) God of the Stoics: “the upshot of both systems is a broadly similar conception of reality—monistic in its treatment of God as the ultimate cause of everything, dualistic in its two aspects of thought and extension, hierarchical in the different levels or modes of God’s attributes in particular beings, strictly determinist and physically active through and through.” He goes on to remark that the similarities are even more marked in terms of ethics, and “Spinoza’s ethics becomes transparently and profoundly Stoic.” That said, another major difference is that Spinoza did not believe in an underlying teleology to the world. For him Nature has no aim and God does not direct the cosmic drama. Indeed, as Long puts it: “If the Stoics had taken Spinoza’s route of denying divine providence, they would have avoided a battery of objections brought against them from antiquity onward.” In an important sense, perhaps, one can think of Spinoza as updating the Stoic system to modern times, a project that is currently seeing a number of concerted efforts.

Finally, there is also a connection between the Stoics and Kant, particularly in their shared concept of duty which transcends the specific consequences of one’s action. But as Long again points out, the differences are also quite striking: while Kant arrived at his system by a priori reasoning, the Stoics were eminently naturalistic and empiricist at heart. This is a major distinction between a deontological system like Kant’s and a eudaemonistic one like the Stoic, and it is only with the recent resurgence of virtue ethics in contemporary philosophy (Foot 1978, 2001; MacIntyre 1981/2013; Nussbaum 1994) that the ground was laid out for yet another revival of Stoicism as a practical moral philosophy.

6. Contemporary Stoicism

The 21st century is seeing yet another revival of virtue ethics in general and of Stoicism in particular. The already mentioned work by philosophers like Philippa Foot, Alasdair MacIntyre, and Martha Nussbaum, among others, has brought back virtue ethics as a viable alternative to the dominant Kantian-deontological and utilitarian-consequentialist approaches, so much so that a survey of professional philosophers by David Bourget and David Chalmers (2013) shows that deontology is (barely) the leading endorsed framework (26% of respondents), followed by consequentialism (24%) and not too far behind by virtue ethics (18%), with a scatter of other positions gathering less support. Of course ethics is not a popularity contest, but these numbers indicate the resurgence of virtue ethics in contemporary professional moral philosophy.

When it comes more specifically to Stoicism, new scholarly works and translations of classics, as well as biographies of prominent Stoics, keep appearing at a sustained rate. Examples include the superb Cambridge Companion to the Stoics (Inwood 2003), individual chapters of which have been cited throughout this entry; an essay on the concept of Stoic sagehood (Brouwer 2014); a volume on Epictetus (Long 2002); a contribution on Stoicism and emotion (Graver 2007); the first new translation of Seneca’s letters to Lucilius in a century (Graver and Long 2015); a new translation of Musonius Rufus (King 2011); a biography of Cato the Younger (Goodman 2012); one of Marcus Aurelius (McLynn 2009); and two of Seneca (Romm 2014 and Wilson 2014); and the list could continue.

In parallel with the above, Stoicism is, in some sense, returning to its roots as practical philosophy, as the ancient Stoics very clearly meant their system to be primarily a guide for everyday life, not a theoretical exercise. Epictetus in particular is very clear in his disdain for purely theoretical philosophy: “We know how to analyze arguments, and have the skill a person needs to evaluate competent logicians. But in life what do I do? What today I say is good tomorrow I will swear is bad. And the reason is that, compared to what I know about syllogisms, my knowledge and experience of life fall far behind” (Discourses, II.3.4-5). Or consider Marcus’ famous injunction: “No longer talk at all about the kind of man that a good man ought to be, but be such” (Meditations, X.16).

The Modern Stoicism movement traces its roots to Viktor Frankl’s logotherapy (Sahakian 1979), as well as to early versions of Cognitive Behavioral Therapy, for instance in the work of Albert Ellis (Robertson 2010). But Stoicism is a philosophy, not a therapy, and it is in the works of philosophers such as William Irvine (2008), John Sellars (2003), and Lawrence Becker (1997) that we find articulations of 21st century Stoicism, though the more self-help oriented contribution by CBT therapist Donald Robertson (2013) is also worthy of note. All of these authors attempt to distance the philosophical meaning of “Stoic”—even in a modern setting—from the common English word “stoic,” indicating someone who goes through life with a stiff upper lip, so to speak. While there are commonalities between “Stoic” and “stoic,” for instance the emphasis on endurance, the latter is a diminutive version of the former, and the two should accordingly be kept distinct.

Perhaps the most comprehensive and scholarly attempt to update (as opposed to simply explain) Stoicism for modern audiences comes from Becker (1997), though a more accessible treatment is offered by Irvine (2008). One of Irvine’s major contributions is shifting from Epictetus’ famous dichotomy of control to a more reasonable trichotomy: some things are up to us (chiefly, our judgments and actions), some things are not up to us (major historical events, natural phenomena), but on a number of other things we have partial control. Irvine recasts the third category in terms of internalized goals, which makes more sense of the original dichotomy. Consider his example of playing a tennis match. The outcome of the game is under your partial control, in the sense that you can influence it; but it is also the result of variables that you cannot control, such as the skill of your opponent, the fairness of the referee, or even random gusts of wind interfering with the trajectory of the ball. Your goal, then, suggests Irvine, should not be to win the game—because that is not entirely within your control. Rather, it should be to play the best game you can, since that is within your control. By internalizing your goals you can therefore make good sense of even the original Epictetean dichotomy. As for the outcome, it should be accepted with equanimity.

Becker (1997) is more comprehensive and even includes a lengthy appendix in which he demonstrates that the formal calculus he deploys for his normative Stoic logic is consistent, suggesting also that it is complete. There are three important differences between his New Stoicism and the ancient variety: (i) Becker defends an interpretation of the inherent primacy of virtue in terms of maximization of one’s agency, and builds an argument to show that this is, indeed, the preferred goal of agents that are relevantly constituted like a normal human being; (ii) he interprets the Stoic dictum “follow nature” as “follow the facts” (that is, abide by whatever picture of the universe our best science allows), consistently with Stoic sources attesting to their respect for what we would today call scientific inquiry, as well as with an updated Stoic approach to epistemology; and (iii) Becker does away with the ancient Stoic teleonomic view of the cosmos, precisely because it is no longer supported by our best scientific understanding of things. This is also what leads him to make his argument for virtue-as-maximization-of-agency referred to in (i) above. Whether Becker’s (or Irvine’s, or anyone else’s) attempt will succeed or not remains to be seen in terms of further scholarship and the evolution of the popular movement.

That movement has grown significantly in the early 21st century, manifesting itself in a number of forms. There is a good number of high quality blogs devoted to practical modern Stoicism. There is also a significant presence on social networks, for instance the Stoicism Group on Facebook.

7. Glossary

The Stoics were well known (some would say infamous) for having developed a rich technical vocabulary. Cicero, in book III of De Finibus, explicitly says that Zeno invented a number of new terms, and he feels that Latin is not a sufficiently sophisticated tongue to render all the subtleties of Greek thought. Below are some of the major Stoic terms and their meanings.

Andreia = courage, fortitude, one of the four Stoic cardinal virtues.

Apatheia = tranquility, overcoming disturbing desires and emotions.

Apoproēgmena = dispreferred indifferents, externals, outside of virtue that—other things being equal—should be avoided.

Aretê = virtue, excellence at one’s function. For Becker this is equivalent to the perfection of agency.

Ataraxia = absence of fear, largely an Epicurean concept, but also adopted by the Stoics.

Dikaiosynê = justice, integrity, one of the Stoic cardinal virtues.

Eudaimonia = flourishing, by means of living an ethical life.

Eupatheiai = the healthy passions cultivated by the Sage.

Hormê = the discipline of action.

Kathēkon = appropriate, rational, action, the thing one ought to do.

Logos = rational principle governing the universe.

Oikeiôsis = something properly yours, leading to Hierocles’ circle of expanding affection, Stoic cosmopolitanism.

Orexis = the discipline of desire.

Philanthrôpia = love of mankind, related to the concept of Oikeiôsis.

Phronêsis = practical wisdom, one of the Stoic cardinal virtues.

Proēgmena = preferred indifferents, externals, outside of virtue that—other things being equal—can be pursued unless they compromise one’s virtue.

Propatheiai = involuntary emotional reactions, to which one has not yet given or withdrawn assent.

Prosochê = applying key ethical precepts to the present moment, mindfulness.

Sôphrosynê = self-discipline, temperance, one of the Stoic cardinal virtues.

Sunkatathesis = the discipline of assent.

 

8. References and Further Readings

  • Algra, K. (2003) Stoic theology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Almeder, R. (1995) Externalism and justification. Philosophia 24:465-469.
  • Angere, S. (2007) The defeasible nature of coherentist justification. Synthese 157:321-335.
  • Beaney, M. (1997) The Frege Reader. Blackwell.
  • Becker, L.C. (1997) A New Stoicism. Princeton University Press.
  • Bobzien, S. (2003) Logic. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Bobzien, S. (2006) Ancient logic. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/logic-ancient/ (accessed on 22 December 2015)
  • Bourget, D. and Chalmers, D.J. (2013) What do philosophers believe? Philosophical Studies 3:1-36.
  • Brennan, T. (2003) Stoic moral psychology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Broadie, S. & Rowe, C. (eds.) (2002) Aristotle — Nicomachean Ethics. Oxford University Press.
  • Brouwer, R. (2014) The Stoic Sage: The Early Stoics on Wisdom, Sagehood and Socrates. Cambridge University Press.
  • Brunschwig, J. (2003) Stoic metaphysics. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Bugh, G.R. (1992) Athenion and Aristion of Athens. Phoenix 46:108-123.
  • Cicero, M.T. (2014) Complete Works. Delphi Classics.
  • Colish, M. (1985) The Stoic Tradition from Antiquity to the Early Middle Ages. E.J. Brill.
  • Diogenes Laertius (trans. by R.D. Hicks) (2015) Lives of the Eminent Philosophers. Delphi Classics.
  • Epictetus (trans. by R. Dobbin) (2008) Discourses and Selected Writings. Penguin.
  • Ferraiolo, W. (2015) God or Atoms: Stoic Counsel With or Without Zeus. International Journal of Applied Philosophy 29:199-205.
  • Foot, P. (1978) Virtues and Vices. Blackwell.
  • Foot, P. (2001) Natural Goodness. Clarendon Press.
  • Frede, M. (1983) Stoics and skeptics on clear and distinct impressions. In: M.F. Burnyeat (ed.) The Skeptical Tradition. University of California Press, pp. 65-93.
  • Frede, D. (2003) Stoic determinism. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Gill, C. (2003) The School in the Roman Imperial period. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Goldman, A. (1980) The internalist conception of justification. Midwest Studies in Philosophy 5:27-52.
  • Goldman, A. (1994) Naturalistic epistemology and reliabilism. Midwest Studies in Philosophy 19:301-320.
  • Goodman, R. (2012) Rome’s Last Citizen: The Life and Legacy of Cato, Mortal Enemy of Caesar. Thomas Dunne.
  • Goulet, R. (2013) Ancient philosophers: a first statistical survey. In: M. Chase, R.L. Clark, and M. McGhee (eds.) Philosophy as a Way of Life: Ancients and Moderns — Essays in Honor of Pierre Hadot. John Wiley & Sons.
  • Graver, M. (2007) Stoicism and Emotion. University of Chicago Press.
  • Graver, M. and Long, A.A. (translators) (2015) Letters on Ethics: To Lucilius. University of Chicago Press.
  • Griffith, M. (2013) Free Will: The Basics. Routledge.
  • Hadot, P. (1998) The Inner Citadel: the Meditations of Marcus Aurelius. Trans. by M. Chase, Cambridge University Press.
  • Hankinson, R.J. (2003) Stoic epistemology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Inwood, B. (ed.) (2003) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Irvine, W.B. (2008) A Guide to the Good Life: The Ancient Art of Stoic Joy. Oxford University Press.
  • King, C. (2011) Musonius Rufus: Lectures and Sayings. CreateSpace.
  • Ladyman, J. and Ross, D. (2009) Every Thing Must Go: Metaphysics Naturalized. Oxford University Press.
  • LeDoux, J. (2015) Anxious: Using the Brain to Understand and Treat Fear and Anxiety. Viking.
  • Lipton, P. (2003) Inference to the Best Explanation. Routledge.
  • Long, A.A. (2002) Epictetus: A Stoic and Socratic Guide to Life. Oxford University Press.
  • Long, A.A. (2003) Stoicism in the philosophical tradition: Spinoza, Lipsius, Butler. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • MacIntyre, A. (1981/2013) After Virtue. A&C Black.
  • Marcus Aurelius (trans. by G. Long) (1997) Meditations. Dover.
  • McBrayer, G.A., Nichols, M.P., and Schaeffer, D. (2010) Euthydemus. Focus.
  • McLynn, F. (2009) Marcus Aurelius: A Life. Da Capo Press.
  • Nussbaum, M. (1994) The Therapy of Desire: Theory and Practice in Hellenistic Ethics. Princeton University Press.
  • O’Connor, D.J. (1975) The Correspondence Theory of Truth. Hutchinson.
  • Osler, M.J. (1991) Atoms, pneuma and tranquillity: Epicurean and Stoic Themes in European Thought. Cambridge University Press.
  • Putnam, H., Neiman, S., and Schloss, J.P. (eds.) (2014) Understanding Moral Sentiments: Darwinian Perspectives? Transaction Publishers.
  • Robertson, D. (2010) The Philosophy of Cognitive-behavioural Therapy (CBT): Stoic Philosophy as Rational and Cognitive Psychotherapy. Karnac Books.
  • Robertson, D. (2013) Stoicism and the Art of Happiness – Ancient Tips For Modern Challenges: Teach Yourself. Teach Yourself.
  • Romm, J. (2014) Dying Every Day: Seneca at the Court of Nero. Knopf.
  • Sahakian, W.S. (1979) Logotherapy’s Place in Philosophy. In: Logotherapy in Action. J. Fabry, R. Bulka, and W.S. Sahakian (eds.), foreword by Viktor Frankl. Jason Aronson.
  • Schofield, M. (2003) Stoic ethics. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Sedley, D. (2003) The School, from Zeno to Arius Didymus. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Sellars, J. (2003) The Art of Living: The Stoics on the Nature and Function of Philosophy. Ashgate.
  • Seneca, L.A. (2014) Complete Works of Seneca the Younger. Delphi Classics.
  • Sharpe, M. (2014) Stoic virtue ethics. In: S. van Hooft and N. Athanassoulis (eds.), The Handbook of Virtue Ethics. Acumen Publishing.
  • Taran, L. (1971) The creation myth in Plato’s Timaeus. In: J.P. Anton, G.L. Kustas and A. Preus (eds.) Essays in Ancient Greek Philosophy. State University of New York Press, 372-407.
  • Tieleman, T. (2002) Galen on the seat of the intellect: anatomical experiment and philosophical tradition. In: C. Tuplin (ed.) Science and Mathematics in Ancient Greek Culture. Oxford University Press, pp. 256-273.
  • Tye, M. (1994) Sorites paradoxes and the semantics of vagueness. Philosophical Perspectives 8:189-206.
  • Unger, R.M. and Smolin, L. (2014) The Singular Universe and the Reality of Time: A Proposal in Natural Philosophy. Cambridge University Press.
  • Verbeke, G. (1983) The Presence of Stoicism in Medieval Thought. Catholic University of America Press.
  • White, M.J. (2003) Stoic natural philosophy. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Wilson, E. (2014) The Greatest Empire: A Life of Seneca. Oxford University Press.

 

Author Information

Massimo Pigliucci
Email: mpigliucci@ccny.cuny.edu
City University of New York
U. S. A.

Aesthetics in Continental Philosophy

Klee, Angelus Novus, 1920

Although aesthetics is a significant area of research in its own right in the analytic philosophical tradition, aesthetics frequently seems to be accorded less value than philosophy of language, logic, epistemology, metaphysics, and other areas of value theory such as ethics and political philosophy. Many of the most prominent analytic philosophers have not written on aesthetics at all. Matters stand very differently in continental philosophy, where aesthetics has been given an important place by nearly every major thinker and tradition. There are undoubtedly important extra-philosophical reasons for this—such as the importance of art in European education and tradition and the French model of the philosophe as philosopher-writer—but there are also clearly philosophical reasons. In the analytic tradition, meaning and truth are frequently thought to be exemplified by logic, science, and the formal structures of language, whereas in continental philosophy, art has often taken this role of exemplifying meaning and truth. As such, art becomes akin to a philosophical activity insofar as it is thought to produce meaning and truth, and aesthetics takes an important place because it is seen as a branch of philosophy which gives access to some of philosophy’s perennially central concerns. Moreover, while the analytic tradition tends to abstract aesthetic questions from other concerns, the continental tradition tends to think about art’s role in relation to epistemology and metaphysics, to emphasise art’s historical and social situatedness, and to ask questions concerning its role and value in culture, politics, and everyday life. However, and in further contrast to analytic aesthetics, there is no general consensus concerning central topics of debate in continental aesthetics. Instead, and following a method of organisation typical of continental philosophy, this area of aesthetics may be approached according to major traditions and thinkers.
This article gives a synoptic overview of these traditions and thinkers in the twentieth and twenty-first centuries. The ideas developed by each often remain highly distinctive, yet they have also influenced and reacted against each other (and these points of contact are marked within the article). Most of these developments have taken place in critical relation with modern and nineteenth-century aesthetics, especially as exemplified by the works of Immanuel Kant, G.W.F. Hegel, and Friedrich Nietzsche. Kant’s Critique of the Power of Judgement (1790) has been particularly important in shaping debates in later continental aesthetics, since on the one hand it stakes out aesthetics as a domain autonomous in relation to other areas of philosophical concern, such as epistemology and practical philosophy, and on the other it shows how this domain has relevance for other areas. (In Kant’s system, aesthetics provides a model for how judgement acts as a power that can unify the other branches of philosophical interest.)

Table of Contents

  1. The Position of Aesthetics in Continental Philosophy
  2. Phenomenology and Existentialism
    1. Introduction
    2. Heidegger
    3. Merleau-Ponty
  3. Hermeneutics
    1. Introduction
    2. Gadamer
  4. Psychoanalysis
    1. Introduction
    2. Freud
    3. Lacan
  5. Critical Theory
    1. Introduction
    2. Benjamin
    3. Adorno
  6. Poststructuralism
    1. Introduction
    2. Derrida
    3. Lyotard
    4. Deleuze
  7. Developments in the Early 21st Century
    1. Introduction
    2. Rancière
  8. References and Further Reading

1. The Position of Aesthetics in Continental Philosophy

The importance and scope of aesthetics in continental philosophy may be indicated at the outset by taking the relatively ‘canonical’ example of Heidegger’s reading of Nietzsche on art. While the specific views about art and aesthetics expressed in this reading do not extend their influence to all traditions and thinkers within continental philosophy, the example gives a good indication of the dominant role aesthetics frequently takes in such traditions. During his first lecture course on Nietzsche, ‘The Will to Power as Art’, Heidegger sets out five statements on art:

  1. Art is the most perspicuous and familiar configuration of will to power;
  2. Art must be grasped in terms of the artist;
  3. According to the expanded concept of artist, art is the basic occurrence of all beings; to the extent that they are, beings are self-creating, created;
  4. Art is the distinctive countermovement to nihilism;
  5. Art is worth more than ‘the truth’.

In addition to the above, and taking preeminent place as an expression of Nietzsche’s entire thinking on art, Heidegger adds the following major statement on art: art is the greatest stimulant of life.

These theses indicate that for (Heidegger’s) Nietzsche, art is far more than a pleasant diversion; it has profound ontological, cultural, political, and existential significance, and is even worth more than truth itself. Heidegger expands these theses as follows. Nietzsche’s ontology is that of the ‘will to power’, in which Being as a whole is understood in terms of shifting relations of confluent and conflictual forces, producing the creation and destruction of particular beings. Art is privileged both as an expression of will to power, and as a being which gives us special insight into the nature of Being as a whole as will to power.  The first statement suggests that of all types of being, art is that which is most clearly accessible to us in its essence. Moreover, art does not simply illuminate itself as a particular type of being, but illuminates Being as a whole. The second statement shifts the significance of art from its reception to its creation, and this shift opens up the ontological scope of aesthetics. Art, considered from the perspective of the artist, is then understood in terms of the creative act itself, which illuminates the way that beings in general are ‘brought forth’. Aesthetics, as meditation on art, may then be understood not simply as a consideration of beautiful things, but as ontology, the thinking of the essence of Being as a whole. The third statement tells us that, according to Nietzsche’s ontology, the will to power becomes visible as and with art, understood as paradigmatic for all creation, or productive ‘bringing forth’.

The fourth and fifth statements give art a practical dimension in Nietzsche’s philosophy. Heidegger insists that ‘truth’ in the fifth statement (and in all of Nietzsche’s philosophy) must be understood in a specifically Platonic sense as referring to the supposedly true supersensuous world of the Ideas, in contrast with the untrue sensuous world of mere appearances. For Nietzsche, the old values he associates with nihilism—the decadence of culture and the devaluation of life—are essentially grounded in this Platonic conception of truth through its dominance in the religion, morality, and philosophy of the Western tradition. Art then operates as a countermovement to nihilism on two essential points: first, as a sensuous thing, its very nature is to affirm the value of the sensuous world that nihilism denies; and second, as a paragon expression of will to power it helps us to understand what Nietzsche posits as the necessary grounding principle for the creation of new, non-nihilistic values (the will to power itself). Heidegger develops this reading of Nietzsche’s philosophy of art in order to critique it (see section 2. b. below). Nevertheless, in its clear emphasis on the ontological and practical roles of art, this reading indicates well the significance and scope that aesthetics (understood as philosophical reflection on art in general) has had for twentieth and twenty-first century continental philosophers.

2. Phenomenology and Existentialism

 a. Introduction

Phenomenology is a philosophical method which focuses on the close examination of phenomena, that which appears. In its contemporary sense, it was founded by Edmund Husserl in the early part of the twentieth century. Many philosophers influenced by Husserl’s work developed phenomenology in ways which contributed significantly to aesthetics, including Martin Heidegger, Roman Ingarden, Jean-Paul Sartre, Mikel Dufrenne, Maurice Merleau-Ponty, and Michel Henry. While developed in varying ways by each of these thinkers, in general phenomenology has offered an approach which displaces the traditional aesthetic categories of subject and object. Instead, phenomenology has focused on the examination of aesthetic experience and the work of art in terms of appearance and the conditions of appearance, thought as prior to the categorisation of ourselves and the world into subject and object. Moreover, phenomenological aesthetics has examined the ontology of the work of art as a special kind of thing which appears, with a distinctive character marking it out from the rest of appearances. Art has frequently been given a privileged status as affording us special insight into the way in which things in general come to appear, and how meaning as such is constituted. In this way, for many phenomenologists aesthetics has been closely connected with ontology, epistemology, and value theory.

Existentialism, while stemming from nineteenth century thinkers such as Søren Kierkegaard and Nietzsche, intersects with phenomenology in the thought of many of its prominent twentieth century exponents (for example, Heidegger, Sartre, de Beauvoir, and Merleau-Ponty). Existentialism focuses on the concrete existence of human life. It rejects the adequacy of traditional philosophical thought, which proceeds by way of abstract essences and general categories, to do justice to the lived experience of the individual. For existentialists, art—and, in particular, literature (many existentialist philosophers were also literary authors)—has an advantage over philosophy insofar as it is able to dramatise concrete experience through singular imaginary examples and ‘indirectly communicate’ existential truths. Art is also better able to evoke the irrational—such as sensation, affect, feeling, mood, and everyday, non-theoretical modes of thinking—which existentialists believe is necessary to do justice to the full range of human experience. Existentialism has typically emphasised human freedom, especially the freedom to create values, and art has been taken as a testament to, and an exemplary model for, such creative activity. I will develop some of these themes in further detail here by focusing on two of the most well-known and influential contributors to aesthetics in the tradition of existential phenomenology: Heidegger and Merleau-Ponty.

 b. Heidegger

Edmund Husserl’s most prominent student, Martin Heidegger, combined the method of phenomenology with a deep attention to the history of philosophy, especially that of ancient Greece, to forge one of the most influential philosophical bodies of work of the twentieth century. Heidegger turns his attention to art and aesthetics as part of his wider philosophical project, which seeks to uncover the meaning and truth of Being. Heidegger contends that the history of philosophy, and of western culture generally, has seen a decline with respect to Being, such that today Being has practically become nothing. (This is how he interprets Nietzsche’s thesis of nihilism, the negation of meaning and value as such.) Heidegger understands Being as the way particular beings (or ‘entities’) come to appear as what they are, with the meaning that they have, within particular historical epochs. For Heidegger, Being is a historical process, such that beings appear differently in different epochs. He specifies three main epochs, the ancient, medieval, and modern, to each of which corresponds a leading meaning of Being; that is, a main way in which beings are revealed. Heidegger’s reflections on art and aesthetics appear within his critique of the modern epoch and his attempts to retrieve a deeper understanding of Being from the near-oblivion into which he believes it has fallen.

Heidegger specifies that ‘aesthetics’ is itself a part of this modern tradition: a particular way of viewing art which first became explicit with Alexander Baumgarten’s Aesthetica (1750), then was quickly taken up in successive influential formulations, in particular by Kant (Critique of the Power of Judgement, 1790) and Hegel (Lectures on Fine Art, c. 1818–29). For Heidegger, the distinctive feature of the modern way of revealing beings is to give them the character of subject and object, as per Descartes’ philosophy. The aesthetic view of art follows suit by positioning the artwork as an object, ‘experienced’ by the one who takes in and appreciates it, positioned as subject. Heidegger wants to critique aesthetics as one aspect of modern philosophy, which he critiques in general, in order to allow a different view of art to emerge. According to Heidegger, the modern worldview covers over a more primordial relation of being-in-the-world in which we are immersed in and alongside other beings in the world. Heidegger’s own form of phenomenological philosophy aims to be attentive to beings as they appear in order to overcome the modern presupposition of thinking, which distributes things according to the subject/object divide before we truly encounter them, and to reveal the more primordial character of things which such presuppositions hide. For Heidegger, the modern tendency is particularly pernicious because it views beings according to a framework he calls Gestell, which determines them as resources available to be put to use (Bestand). Such a view distracts us from being attentive to the many other ways beings can be revealed and impoverishes the meaningfulness of the world we inhabit by reducing everything—including human beings themselves—to this narrow scope.

‘The Origin of the Work of Art’ is the most significant and well-known of Heidegger’s essays in which he tries to draw attention to the enigma of art and to sketch an alternative ontology of the artwork. Here, the artwork is described as the setting-to-work of truth. Truth must be understood not in its usual meaning as thought corresponding to facts in the world, but in Heidegger’s specific determination as aletheia, meaning disclosure, uncovering, or revealing. For Heidegger, truth means the way in which beings come to light as what they are and with the meaning they have. Art for him, therefore, has a privileged relation to truth because great art can be an occurrence of truth; that is, it has the power to reveal not just itself, but other beings in a particular way. He expresses this by saying that art ‘sets up a world’ and ‘sets forth the earth’, and that the artwork enacts the ‘strife’ which is the turbulent relation between world and earth. As in much of Heidegger’s writings, these terms are more suggestive than determinate, and remain open to contesting interpretations. They may, however, be roughly glossed as follows. ‘World’ is the network or system of interpretations in which beings appear as what they are and form an open but interrelated whole. The world, in short, is the set of shared meanings of a historical culture. It is the meaning of the artwork as interpreted in that culture, and the effect on cultural meaning in general that the work has. ‘Earth’, on the other hand, refers to the material and sensory thing that the work necessarily must be, and to its capacity to hold dimensions of potential meaning currently concealed but which might, in the future, come to be revealed. The earth is the inexhaustibility of the work, its irreducibility to interpretation by a critic or a culture.
For Heidegger, every artwork, at least if it is ‘great’, contains both these dimensions, which exist in a state of tension or strife: world opening up to reveal meaning, earth drawing back to keep meaning opaque. While these terms themselves may seem opaque, arguably they do successfully describe something of the mysterious character of artworks in their ability to simultaneously reward and thwart our desire to understand them. Heidegger also asserts in this essay that the essence of all art is poetry. ‘Poetry’ needs to be understood here in a specific sense, not simply as the pleasing combination of words, but insofar as Heidegger understands poetry as the essence of language, which again has a significant relation to truth. ‘Poetry’ in this sense designates the capacity of language to reveal beings and determine them as what they are in their specific character by naming them, and he asserts here that poetry has a privileged place in the system of the arts by virtue of being that art which most exemplifies the ontological power of all art to reveal. In sum, Heidegger’s phenomenological approach to art aims to subvert the aesthetic tradition’s determination of art as the object of aesthetic experience, and to uncover a deeper meaning in which art may be understood as a site of ontological revealing. This is consonant with his wider project of overcoming the metaphysical tradition, and modern philosophy more specifically, in order to rethink the meaning and truth of Being.

 c. Merleau-Ponty

Albeit in a different way, Merleau-Ponty also seeks to understand art in the context of an ontology which moves beyond the subject/object division characteristic of modern philosophy. This is most evident in his late essay ‘Eye and Mind’ (1960), while the earlier essay ‘Cézanne’s Doubt’ (1945) presents a more existentialist perspective. For Merleau-Ponty, it is as if painting were phenomenological research pursued by other means. In fact—and this contra Sartre—he holds art in a privileged position over philosophy and literature, and accords to painting a higher position at least than music, which for him remains too amorphous to properly render phenomenological insight. The most distinctive feature of Merleau-Ponty’s own brand of phenomenology is the emphasis he puts on the embodied nature of human existence only noted in passing by Heidegger. The body takes on primary importance for Merleau-Ponty as he sees it as the fundamental condition for the appearing of a world. Such appearing takes place through the body’s perceptual system. For Merleau-Ponty, science and philosophy have distorted both our idea of the body and the way perception works by freezing them in idealised third-person representations. The aim of his phenomenological ontology is to return us to a more primordial understanding of our first-person experience of the body as it is lived by us, and of perception as it reveals the world. He gives painting a privilege over science and philosophy because he sees the painter as performing a kind of natural epoche, the phenomenological ‘reduction’ which suspends our commonsense beliefs about things, while he or she tries to see how things genuinely appear to vision and to render what they see on canvas. Moreover, Merleau-Ponty emphasises the necessarily bodily activity of painting, claiming that it would be impossible for a pure mind to paint. 
This is because painting revels in the bodily conditions that abstract thought tends to idealise and cover, such as the position of the body in space from which any sight is necessarily seen (the viewpoint), the movement of the eye, prevailing lighting conditions affecting the quality of visual perception, and so on.

As well as science and philosophy in general, Merleau-Ponty’s critical targets are Descartes’ theory of vision in his Dioptrics (1637), and the ‘linear perspective’ method of constructing the space of painting developed in the Italian Renaissance, which employs precise geometric rules for creating the illusion of three-dimensional space on the two-dimensional surface of the painting. Neither is condemned outright by Merleau-Ponty. Rather, what he criticises in both is a partial truth which abstracts certain elements from our perceptual experience, and elevates them to the pretension of sole and exhaustive truth. Descartes’ theory of vision treats space as a homogeneous expanse equally accessible at all points; an abstraction useful for thought, but impossible for any actual body to experience. Linear perspective, for its part, has for centuries been treated as the formula of realism, and has been supposed to present things as they actually appear to the eye. Merleau-Ponty contests this, citing various abstractions and exclusions which operate on real perception in order to construct linear perspective. Two distinctive features on which he concentrates are depth and movement.

First, perspective presents the illusion of depth by varying the sizes of objects relative to ‘parallel’ lines which converge at a vanishing point. Because this method was presented as rendering the true nature of visual space, the theoreticians of the Renaissance had to deny the theorem of Euclid’s Geometry which states that parallel lines never converge. Second, Merleau-Ponty notes that static art such as photography, painting, and sculpture, no matter how supposedly realistic, falsifies reality by excluding time, and hence, motion. Following a suggestion made by Auguste Rodin, he asserts that the phenomenology of movement is best expressed by a paradoxical arrangement in which different aspects of the figure in motion, which would be visible at different times in real life, are presented simultaneously in the artwork. According to his analysis, the truth of movement is better expressed by (for example) Théodore Géricault’s anatomically incorrect painting of racing horses Epsom Derby (1821) than by the gaits of horses photographically captured by Étienne-Jules Marey. What the painter is able to capture, Merleau-Ponty asserts, is not the outside of the object of motion, but motion’s ‘secret cipher’: time rendered visible in an indirect, stylistic manner.

In general for Merleau-Ponty, it is a focus on the primary qualities (those which can be specified with exactness: quantified and rationally calculated, such as extension and form) which is responsible for the intellectual abstractions that distort our understanding of the body’s perception. The supposedly secondary qualities, especially colour, are what reveal more primordial truths about perception in painting.  For him, painting is capable of rendering visible the birth of perception, the way that fully-formed, recognisable objects emerge from a deeper, more primordial, inchoate visual field. This is evident in Paul Cézanne’s painting in the way that rather than beginning with lines which give form to objects and then adding colour, the reverse procedure is evident—dashes of graduated colour build up and give shape to forms. In this way, Merleau-Ponty sees Cézanne and artists in general as performing phenomenological work: they are able to reveal the conditions and processes of perception, which are usually covered over by our focus on the end product, or what is perceived.

Merleau-Ponty’s essay ‘Cézanne’s Doubt’ displays aspects of an existentialist aesthetic by giving attention to the relationship between the life of the artist and the meaning of the artist’s work. This takes place in conversation with Sigmund Freud’s psychoanalytic treatment of Leonardo da Vinci (see section 4. b. below), which is often taken to excessively reduce the meaning of the work to the artist’s psychopathology. Echoing positions worked out in The Phenomenology of Perception (1945) in response to Sartre’s views on radical human freedom, Merleau-Ponty develops a nuanced view of the relationship between freedom and concrete situation in human existence. He asserts that the meaning of the work takes its bearings from the artist’s life but cannot be wholly reduced to it, just as the free transcendence of the subject must take its bearings from the facts of the life which it is born into but to which it is not reducible. For Merleau-Ponty, the individual’s freedom is necessarily determined in relation to facts over which we have no choice (for example, Leonardo’s being abandoned by his father in early childhood), but how we respond to such facts retains an element of choice. At the same time, the artwork produced by an individual will be informed by facts about the artist’s life, yet the meaning of such works will nevertheless defy any attempt to reduce it to such facts. Thus, Merleau-Ponty develops a qualified defence of the psycho-biographical approach to art characteristic of psychoanalytical criticism. According to him, the doubt which plagued Cézanne’s life and work stems from a general feature of situated human existence (albeit one which he felt more keenly than most): our freedom to create meaning is attended by no guarantee that it will become meaningful in the fullest sense of which it is capable; that is, to be accepted in the eyes of others, to transform the world that others inhabit, and to contribute to the store of human culture. 
In these various ways, then, Merleau-Ponty’s reading of Cézanne exemplifies the concerns of an existentialist aesthetic: to see the meaning of the artwork in relation to the life of the individual artist and to see art itself as exemplifying general features of human existence.

Merleau-Ponty’s essay ‘Eye and Mind’ develops many of the themes above in the context of his late ontology of ‘the visible and the invisible’, which attempts to overcome the subject/object division by invoking new concepts such as ‘flesh’. Against the transparent lucidity of Cartesian consciousness, flesh invokes the thickness and density of the perceptual field: the ambiguous region of the mutual imbrication of the perceiving self and the perceived world in a space in which both overlap without clear dividing line or boundary, but also without either being able to be reduced to the other. The perceiver and perceived cross over and inhabit the same ambiguous space, but this space is non-homogeneous, and includes breaks, gaps, and undersides where the two fail to meet, or at least to do so in mutual accord. Merleau-Ponty gives the helpful example of seeing the world from the bottom of a pool of water. In this case, the water itself is like flesh, and the ‘distortions’ of our perception it introduces appear to be between subject and object insofar as they condition how we see objects beyond the pool. For Merleau-Ponty, our whole perceptual field, and our body’s being-in-the-world, has this in-between character in all cases, even when this is less evident. Although in this later work he abandons much of the language of ‘classical’ phenomenology, Merleau-Ponty continues the phenomenological project of bringing to light the conditions of appearances which themselves are usually not apparent. Art holds for him a privileged status in its capacity to do this. Merleau-Ponty’s ontology of painting differs markedly from Heidegger’s aesthetics by disputing the latter’s location of the essence of all art in poetry, and insisting that there is a difference between the kind of meaning apparent in language and that which emerges with the visual. (This difference is taken up and developed by Jean-François Lyotard—see section 6. c.)

3. Hermeneutics

 a. Introduction

Hermeneutics is a theory of interpretation and understanding which has roots in Biblical exegesis and developments in German romanticism, but which emerged as a significant branch of contemporary continental philosophy primarily through the work of one of Heidegger’s most prominent students: Hans-Georg Gadamer. Under Gadamer’s influence, hermeneutics has been developed in differing ways by various other leading proponents, such as Paul Ricoeur in France and Gianni Vattimo in Italy. However, it is Gadamer’s work which remains at the core of this tradition, and continues to be most influential in aesthetics.

 b. Gadamer

As a philosophical heir to Heidegger, Gadamer took up aspects of the phenomenological tradition but focused on further developing reflections on interpretation and understanding found in Heidegger’s Being and Time. The relevance of Gadamer’s hermeneutics for aesthetics can be understood to have two parts. First, in his 1960 magnum opus Truth and Method, art and aesthetic experience are used as an example to defend the kind of understanding appropriate to the human sciences from methodological scientism and to develop a general theory of hermeneutics. Second, in various later essays, Gadamer wrote more explicitly about the hermeneutic character of the experience of artworks and of specific literary and artistic works. Of these later essays, the most significant for developing a hermeneutic philosophy of art is ‘The Relevance of the Beautiful’ (1974).

Gadamer’s general approach to hermeneutics developed in Truth and Method seeks to defend the human sciences from reduction to the kind of methodology appropriate for the natural sciences. Rather than trying to gain objective knowledge about reality in the way natural sciences do, Gadamer argues that the human sciences aim at understanding works (theoretical or literary texts, artworks, and other cultural products), which themselves cannot be understood to be entirely separate from the one who seeks to understand. Rather, both are inscribed within a common horizon of tradition. Tradition (Überlieferung), he insists, must not be understood in the static sense of conserving what already exists but rather as a transmission through which works from a world different to the one we inhabit are passed down to us. Understanding then becomes something like a translation, in which the aim is a ‘fusion of horizons’ of the world of the work and the world we inhabit. Both ourselves, as interpreters, and the work interpreted are transformed by these acts of interpretation, so that we cannot speak of understanding in the human sciences on the model of a subject correctly representing external natural objects to itself. Tradition, for Gadamer, is a matter of constant change and transformation of understanding through acts of interpretation within a continuous overarching horizon.  In the first section of Truth and Method, aesthetic experience and the artwork stand as paradigm examples to show why understanding in the human sciences differs from explanation in the natural sciences, as propaedeutic to working out a general theory of hermeneutics.

Turning to the later essays and Gadamer’s more explicit ‘aesthetics’, we continue to see Heidegger’s influence as well as a number of significant differences. Like Heidegger, Gadamer seeks to challenge and overcome what has been called aesthetics in modern philosophy but for different reasons and with a different aim in mind. While Heidegger concerns himself primarily with the kind of monumental art that can open and sustain a world, Gadamer is interested in any and all kinds of aesthetic experiences, including more mundane ones. Gadamer objects to Kant’s thesis of the disinterested nature of aesthetic experience, insisting instead that such experience brings a cognitive content which connects artworks with our understanding of other features of the world, and other types of experience, including our ‘interests’. Artworks are encountered and understood as part and parcel of the general fabric of interpretations we weave as we encounter the wider world. It is this supposedly disinterested nature of aesthetic experience which Gadamer believes needs to be overcome in modern aesthetics in order to make way for a more genuine understanding of, and possibility for, encountering artworks. In this way, he asserts, modern philosophical aesthetics should be overcome by being absorbed into hermeneutics.

In ‘The Relevance of the Beautiful’, Gadamer applies his hermeneutic approach to illuminate the nature of the work of art. The problem he sets himself to understand is the nature of the artwork considered trans-historically, so that we may understand what ‘art’ means such that the same word refers to the works of the ancient world and to contemporary experimental arts, such as non-objective painting. He proposes that we may do this with reference to the anthropological basis of our experience of art, which he develops through three key ideas: play, symbol, and festival. Like Freud (see section 4. b. below), though without any direct reference, Gadamer links the phenomenon of art as a human activity to that of play. What is significant about play for Gadamer is that it is an intentional activity involving a mere repetition without any real purpose or goal. Applied to the work of art, the concept of play implies that there is no real separation between the work itself and the one who receives it: the work is constituted through a kind of playful activity of the receiver with the work. This activity constitutes the work; it brings the work’s different aspects together through the synthetic activity of interpretation and constitutes the unity of the work. (This understanding of the work again challenges the modern aesthetic tradition, which maintains a distinction between the subject and object in aesthetic experience.)

Gadamer uses the notion of the symbol to explain the way in which an artwork should be thought to be meaningful. As we have seen, Gadamer wants to underline the cognitive dimension of artworks: what they are about and the connections to our interests and involvements that they imply. At the same time, he objects to Hegel’s idealist account of artworks, which reduces them entirely to conceptual content. For Hegel, art has ‘ended’ because the conceptual content artworks were best able to express in sensuous form in classical times has been superseded by philosophy’s more perspicuous conceptual articulation. Here Gadamer again takes inspiration from Heidegger, and reiterates the latter’s idea that every revealing is also a concealing; every setting up of a world also sets forth an earth, so that every artwork always maintains something concealed within it which resists the current interpretation. Gadamer translates this into every encounter with a work of art, so that the artwork is imputed with an inexhaustible excess of meaning, gradually revealed through repeated engagements with the work (like a conversation), without the prospect that such meaning might ever be wholly revealed or exhausted.

The symbol, for Gadamer, then expresses the way that an artwork can have a meaning which is cognitive and quasi-linguistic yet excessive and inexhaustible. For Gadamer, language stands as a paragon for all experience of meaning, so that even apparently non-linguistic experiences such as encounters with artworks must be understood according to a linguistic model. Art speaks to us, it says something to us, and understanding a work of art is all about working out what it has to say. This means learning to listen to or read a work of art, which in turn means learning to understand the kind of language it speaks. He notes how both ancient and contemporary works can challenge us by appearing to speak a language we do not understand and how interpreting them requires a process of learning the appropriate language in order to approach a meaning which will, however, never be able to be summed up in a simple conceptual determination or linguistic formulation. In this manner, understanding a work of art will be an interminable affair involving repeated encounters and acts of interpretation.

Finally, the festival reveals something about the temporal character of the artwork and its affinity with human community. Like the experience of a festival or holiday, the artwork invites us into an experience of time which differs from the quantitative, calculative experience of time we have when we are engaged in work (and similar everyday activities). Gadamer calls this ‘fulfilled’ or ‘autonomous’ time; it is time which has a certain unity and cannot be dissolved into separate moments, and which stands apart—and stands us apart, as we experience it—from everyday concerns. It is here that we see the phenomenological character of Gadamer’s aesthetics: like Heidegger, he is concerned with the way that art invites us to ‘let things be’, to be open to the way they reveal themselves. It is also through festive experiences that community is formed by dissolving the usual hierarchical distances which divide citizens according to social roles.  Art, Gadamer proposes, can also be experienced in a way which does not appeal to any particular social class, but unites people in sharing the same kind of experience. While hermeneutics has sometimes been characterised as conservative because of its emphasis on tradition and community, it must be emphasised that Gadamer sees both in terms of openness and transformation. And while he reminds us of the trans-historical importance of great works such as Ancient Greek tragedy, it is notable that Gadamer also asserts the legitimacy and importance of contemporary experimental arts, of happenings and anti-art, and even of pop music.

4. Psychoanalysis

 a. Introduction

Psychoanalysis, which received more mainstream acceptance in continental Europe than in English-speaking countries, produced a body of theory which has become a major current feeding into continental philosophy, including its aesthetic reflections. Sigmund Freud, the father of psychoanalysis, extended his theoretical model of the human psyche beyond the clinical setting to develop many aspects of a general philosophical anthropology. One area he treated in a number of essays was art. In his writings, we find both some general reflections on the psychological significance of creative activity in general and the interpretation of a number of specific painters and writers, including Leonardo, Michelangelo, Fyodor Dostoevsky, and E.T.A. Hoffmann. Following Freud, other psychoanalysts, notably C.G. Jung, Melanie Klein, and Jacques Lacan, have included ruminations on art in their distinctive developments of psychoanalytic theory. Psychoanalysis has inspired many art critics to adopt its ideas and methods, and artists themselves have also been subject to its influence (most notably in the Surrealist movement). Aspects of psychoanalytic theory have been taken up by some continental philosophers, such as Julia Kristeva, Jean-François Lyotard, and Slavoj Žižek, in their own aesthetics and philosophies of art. Such philosophers have engaged deeply with the writings of psychoanalysts themselves, treating their ideas as philosophical theories, and we will do the same here, outlining some of the prominent contributions to aesthetics we find in the works of Freud and Lacan.

 b. Freud

Freud develops the outlines of a general approach to aesthetics that he calls variously ‘pathography’ or ‘psychobiography’. He finds the origins of the artist’s creative activity in children’s play, in which the child imaginatively creates a world of his or her own. He suggests that this creative activity then takes the form of phantasies in adulthood, where phantasies are understood as imaginative fulfilments of desires which remain unfulfilled in reality. Artists, he contends, are particularly neurotic people who are especially incapable of fulfilling their desires in reality and find a substitute sense of fulfilment by externalising their fantasies in works of art. However, artists are also especially gifted: their talents allow them to represent such desires in a way which makes them acceptable to others when the desires confronted directly in themselves would be repulsive (because of their brute, animalistic character—which is why they are often repressed). Artists’ own activities have a therapeutic value for them because artistic expression acts as a release for the pressures of desires which are unfulfilled and/or repressed. Moreover, Freud contends that the real reason we enjoy art is that it serves the same function for the aesthetic spectator. The formal or properly aesthetic qualities, he suggests, merely have an initial ‘incentive value’ to draw us to the work, while the real enjoyment comes from the release we feel by sharing with the artist the phantasy of fulfilling unfulfilled or repressed desires.

In general, the desires which artists represent are of two main kinds: ambitious (the desire for power, achievement, and security) and erotic (the desire for love and sexual pleasure). However, according to Freud, artists express their own unique phantasies with enough specificity that, with the help of biographical knowledge of the artist’s life, we may interpret artworks like symptoms (hence, ‘pathography’, meaning ‘marks of illness’) in order to reconstruct a picture of the artist’s psychological life (hence, ‘psychobiography’). Freud’s most famous example here comes from Leonardo. He sees in Leonardo’s painting The Virgin and Child with St. Anne (c. 1503) the outlined figure of a vulture in the Virgin’s clothing and uses this as a clue to a psychoanalytic interpretation which also draws on Leonardo’s diaries and on biographical accounts. He notes a passage in the diaries which seems to underline the significance of the vulture where Leonardo writes of a memory from early childhood in which a vulture repeatedly struck him on his open mouth with its tail. Interpreting this as a fellatio fantasy, and drawing it together with a number of other interpretive elements, Freud diagnoses Leonardo as a passive homosexual who did not actively pursue his homosexual desires but sublimated them into his work. This case has proven infamous because of Freud’s misinterpretation of the key word in Leonardo’s writings—he in fact refers to a kite, not a vulture—but the psychoanalytic approach does not stand or fall with a single example. The general approach illustrated here, that of psychobiography, has since been taken up and developed by numerous other writers in relation to many other examples. More significantly, and aside from the many criticisms to which psychoanalysis in general has been subject, this form of psychoanalytic aesthetics has been criticised for reducing the value of the work to the life of the artist (see section 2. c.) and attempting psychoanalysis without the benefit of a living and present analysand.

 c. Lacan

Jacques Lacan is arguably the second most influential psychoanalyst, both in general and in aesthetics and art theory, after Freud. Lacan’s contribution to aesthetics is at once less and more ambitious than Freud’s: less insofar as he did not produce any lengthy psycho-biographical studies of artists, yet more insofar as he moved beyond Freud’s hesitant tone where art was concerned to confidently pronounce on the motive of artistic creation and the power of visual art to fascinate. Moreover, his writings and seminars are peppered with examples from the arts used to explain psychoanalytic concepts, an approach which has proven influential on many other art writers and cultural theorists, who (questionably) assume that the reverse also holds true: that such psychoanalytic concepts tell us something about the arts that may be used to illustrate them. Lacan’s distinctive approach to psychoanalysis may be glossed as an attempt to update and formalise Freud’s teachings by applying the concepts and methods of structural linguistics (see section 6. a.). Freud’s own works lend themselves to this because of their extensive references to the importance of language in psychic functioning. Lacan’s approach is well indicated by his famous dictum: ‘the unconscious is structured like a language’.

While retaining Freud’s categories (the unconscious, id, ego, and so forth), Lacan added three ‘registers’ of psychological functioning to his model of the mind: the imaginary, the symbolic, and the real. The imaginary concerns thinking in images, the symbolic is the register of language and formal symbolisation but is also associated with the law and social custom, while the real designates what falls outside the limits of the imaginary and the symbolic. Together, the imaginary and symbolic constitute what we think of as ‘reality’, which contrasts with ‘the real’. The real has its origins in the experience of infantile life, before the imaginary and symbolic mental functions have developed, but it returns as a kind of surplus energy to make its presence felt in those registers throughout life. Two further key ideas are necessary to understand Lacan’s specific contributions to aesthetics: the famous ‘mirror stage’ and symbolic castration. Lacan postulates that at around 6–18 months of age, the child develops an awareness of their separation from the mother’s body, but experiences their own body as disorganised, lacking unity. This unity is provided by identification with other people (as in a mirror image) but at the cost of a fundamental split and lack: contra Descartes, the thinking subject is not a self-sufficient unity but gains an identity only through a fundamental identification with an other. This identification with an other structures our own desire, as we learn to desire what the other desires (by imitation) but also in relation to what the other desires (we ask ourselves: what does the other want from me, or want me to be?).

Lacan transforms Freud’s theories of the Oedipus complex and castration anxiety by suggesting that the Oedipus complex is in fact resolved through the accomplishment of a symbolic castration. This occurs as the child learns language and also learns to attenuate their desires in relation to the law and social custom. This is attended by a further alienation and feeling of lack as the unconscious idea of a pleasurable plenitude prior to both the alienation of the mirror stage and symbolic castration develops. Lacan postulates that things or objects can take on the role of what he calls the ‘object petit a’ (the object designated with a little ‘a’, for autre, the French word for ‘other’). Human beings are principally motivated by a ‘fundamental fantasy’ which is the fantasy of fulfilling our lack through the object petit a, which we unconsciously see as a lost object (symbolically, the phallus which is lost through castration), the regaining of which would make us whole. The central aim of Lacanian psychoanalysis is to help the analysand ‘traverse the fantasy’—to realise the fundamental fantasy for what it is (precisely that, an impossible fantasy), and to accept the inevitable necessity of symbolic castration.

With these basic points in place, we are in a position to understand some of the key ideas Lacan outlines in the text which has been his most influential in aesthetics, the section called ‘Of the Gaze as object petit a’ in the transcript of his Seminar Book XI: The Four Fundamental Concepts of Psychoanalysis (delivered in 1964). Coincidentally, the first of these seminars was given in the same week that Merleau-Ponty’s posthumous work The Visible and the Invisible appeared, and Lacan takes the phenomenologist’s work as the basis from which he develops a psychoanalytic theory of art (see the section on Phenomenology and Existentialism above). At the same time, it must be understood that Lacan develops these ideas in a direction which, by focusing on the unconscious and a structuralist approach, seeks to provide an alternative account of the visual to that of existential phenomenology. The latter, for Lacan, does not adequately account for the decentred and split nature of subjectivity because it identifies it too strongly with the conscious ego. Lacan begins from Merleau-Ponty’s contention that the seeing subject is not the most primordial aspect of the visual, as such, and develops a deeper conception of the visible, and the invisible which conditions it, in terms of a distinction between ‘the eye’ and ‘the gaze’. ‘The eye’ designates the seeing subject, associated with the Cartesian analysis of vision, and the ideal point of the observer in single-point perspective painting. By contrast, ‘the gaze’ designates something in the visual field which escapes the eye’s ability to see it clearly and, in so doing, is the presence in the visual of something which escapes the Cartesian subject’s supposedly transparent self-consciousness and mastery of the objects it surveys. The skewed perspective in the visual field that the gaze suggests is famously exemplified by Lacan with Hans Holbein’s anamorphic painting The Ambassadors (1533). 
The skull at the bottom of the picture can only be seen clearly by viewing the rest of the painting from an extreme angle, illustrating both the lack of a single perspective which can master the visual field and the lack at the heart of the subject (exemplified by the skull as the symbol of death, the memento mori of the painting’s traditional interpretation).

The primordial model of this ‘gaze’ is the gaze of the mother, which evokes in the child questions that persist in our unconscious and shape our experience of vision: what does the (m)other want from me? What does she want me to be? As a seeing subject (‘the eye’), Lacan argues, my visual field is haunted by something which looks back at me, but which I cannot clearly see. This haunting of the visual field by an other which decentres my own point of vision is what Lacan understands by ‘the gaze’. As patterned on our mother’s look, the gaze is an example of the object petit a, and it evokes in us the sense of a lack which might be filled by regaining the appropriate lost object. Lacan answers the question ‘What is a picture?’ by suggesting that a picture is something the artist produces in the hope of appeasing or pacifying the gaze: it is an attempt to give the mother what she wants in the hope of regaining the fantasised, lost, utopian, perfect relationship with her, and thus fulfilling our own lack. This then leads to Lacan’s pronouncements on the meaning of all painting (and by extension, all visual art): it is ‘a trap for the gaze’.

For Lacan, visual art has a particular psychological function which works specifically in the register of the imaginary: it acts as a ‘lure’ for desire, inviting us to fantasise about the overcoming of alienation and the regaining of the lost object. This is because it is not only the painter who seeks to fulfil his or her desire by giving the other what it wants with the picture but the picture itself, as an image of self-enclosed completeness, which satisfies the spectator’s desire to fulfil the fundamental fantasy. It is in this way, Lacan suggests, that the picture can have a taming, civilizing, and fascinating power. Some interpreters (such as Lyotard; see section 6. c.) have taken this to be Lacan’s last word on art, which leaves it at a level inferior to the symbolic and the possibility of traversing the fantasy. Others, however, have suggested that Lacan’s work is open to the reading that some visual art can work at the symbolic level by deconstructing the illusion of painting from within, showing how the supposed realism of the painting is a product of artifice. Desire is then prevented from fantasising about its own fulfilment in the supposed unity and wholeness of the image and is forced to confront the arbitrary constructions of the symbolic order. This seems to be suggested, for example, by Lacan’s reading of Diego Velázquez’s Las Meninas (1656) in his May 1966 seminar. Notably, this was a response to Michel Foucault’s examination of the same painting in his immediately popular and now classic 1966 work The Order of Things, where it is examined as a representation of how representation itself was understood in what the French call the Classical period. However it is interpreted, Lacan’s idea of the gaze makes a fascinating contribution to aesthetics by suggesting that our experience of the visual is not a simple given, reducible to geometric analysis, but is conditioned by our split subjectivity and the intrigues of our desire. 
His ideas have been particularly influential in film theory, where through various formulations they have dominated the field for several decades. A key development for film theory, and beyond it, is the way that Lacan’s ideas were taken up by Louis Althusser to develop a structuralist theory of ideology according to which our subjectivity is structured through its capture in the gaze of the big Other, the symbolic authority figure which conditions social reality in any given society.

5. Critical Theory

 a. Introduction

The tradition known as critical theory is associated with the Frankfurt School, more formally known as the Institute for Social Research (Institut für Sozialforschung), which was founded in Frankfurt in 1923, moved its base of operations to New York in 1934, then returned to Frankfurt in 1951. With its unique combination of sociology and philosophy, Critical Theory is arguably the most prominent strand of Western Marxism. A number of philosophers and cultural critics working in this tradition have made contributions to aesthetics that have been highly influential throughout continental philosophy and in wider aesthetic and art-critical contexts. These thinkers have been concerned with both the fate of art under the conditions of industrial capitalism and the potential of art to critique such conditions. These themes were treated by Walter Benjamin in his highly influential essay ‘The Work of Art in the Age of Mechanical Reproduction’ (1936), and by Theodor W. Adorno in his Aesthetic Theory (1970) and numerous shorter works.

 b. Benjamin

Benjamin’s essay hypothesises that the function of the artwork has changed as the conditions of production under industrial capitalism have changed. Following Marx but seeking to update his analyses, he suggests that it has taken some time for the changes at the superstructural (cultural) level to manifest the implications of the changes at the substructural (economic) level, which Marx analysed in the nineteenth century. The key factor in this change is the technique of mechanical reproduction. Benjamin concedes that reproducibility as such has always been a concern with art since antiquity and points to developments in the history of reproducibility such as the printing press and lithography. However, he classifies these techniques as forms of manual reproduction and asserts that with mechanical reproduction we see the development of something significantly new. The main artforms he has in mind, and discusses at length in the essay, are photography and film. With these, the process of the production of images itself is largely mechanical and reproduction can no longer be said to simply copy an original. Benjamin famously claims that what artworks previously had, which they lose through mechanical reproduction, is what he calls aura. The aura of a work is the unique ‘presence’ which the original exudes in occupying a distinct time and place.  It is what gives a work its authenticity and makes it possible to distinguish between an authentic original and a forgery: the authentic original has occupied a unique series of times and places, which constitutes its history. Benjamin rightly notes that with arts such as photography and film, it no longer makes sense to draw such distinctions: one reproduction from a photographic negative, for example, is no more or less authentic than another. Instead of existing as a unique object, the work of art in the age of mechanical reproduction now exists as a multiplicity of copies.

According to Benjamin, this change also has the effect of extracting the artwork from tradition. For him, the uniqueness of a work means that it is embedded in a fabric of tradition. This traditional uniqueness is associated with the anthropological basis of artworks in ritual, and Benjamin uses Marx’s categories of use value and exchange value to suggest that ritual or cult value is the original use value of an artwork. While such value becomes secularised in the ‘cult of beauty’ that is modern aesthetics and the art world, something of the use value of the work persists in the emphasis on its authenticity. However, Benjamin argues that with the advent of mechanical reproduction, artworks are finally liberated from this cult value and instead take on an ‘exhibition value’. Copies are put into mass circulation, exhibited far more widely than would be possible with an authentic original. For Benjamin, this accords with a broader social and cultural phenomenon of ‘the mass’, a sense of the universal equivalence and exchangeability of all things in the social domain. Mechanical reproduction feeds the desire of the masses for things to be brought close, as distinct from the unique work of art which is always at a distance (even when one is ‘present’ to it in a gallery or other setting). According to Benjamin, the quantitative transformation of artworks demanded by the masses also leads to a qualitative transformation, as the nature and function of art come to be understood according to the model of the arts of mechanical reproduction. Thus, art is transformed by its loss of aura.

Moreover, Benjamin asserts that human modes of perception are historically transformable, and the arts of mechanical reproduction are altering our perceptions of the world. Techniques in photography and film such as the close-up and slow motion are not simply reproducing what we already know of the world but introducing new perceptions and knowledges as they capture things entirely unknown to the naked eye. Benjamin suggests that as art loses its ritual or cult value it takes on a political value, and while the essay lacks a clear political programme or set of prescriptions, he asserts that the concepts it develops are useful for a revolutionary communist politics of art. For him, older aesthetic traditions based around the aura can be seen as culpable in their co-optation by fascist regimes, while the transformations wrought in the arts by industrial technologies open more promising possibilities for the politicization of art through the democratic communication of ideas.

 c. Adorno

Adorno’s reflections on art and culture grew to some extent out of a critical disagreement with Benjamin over the democratizing potentials of radio and film, a disagreement which developed into a highly productive engagement. Adorno agrees with Benjamin that contemporary developments in society and the arts radically challenge modern and romantic aesthetics but has a far more pessimistic view of the mass media ‘culture industry’. For Adorno, popular culture (including most radio and film) is complicit with the contemporary social system of capitalist exploitation, which he analyses as a culmination of the logic of ‘instrumental rationality’ devolving from the Enlightenment. In this system, human beings are radically alienated from nature through the project of manipulation and control of the natural world, which has resulted not in the hoped-for emancipation of human beings but in a ‘new barbarism’ in which we are psychologically dominated by the very system which was supposed to set us free. This system is one which determines everything according to a logic of rational specification and calculation, leaving little room for any way of understanding ourselves or relating to the world other than in terms of what we take to be instrumentally useful or productive.

The culture industry acts as an ideological support of this system, keeping people blind to the real conditions of their existence. However, Adorno saw more positive potentials in ‘genuine’ art, in particular experimental modernism. Through extensive writings on musicology, literature, and the visual arts, Adorno contributed one of the most important bodies of work in continental aesthetics. These reflections culminated in Adorno’s last book, Aesthetic Theory, completed but not finally edited by him.

Aesthetic Theory deliberately employs strategies which resist any easy summation of the work into simply stated concepts and theses. Adorno developed a paratactic style of writing on the model of atonal music, in which sentences clash with each other to dissonant effect, rather than developing a clear line of argument. Moreover, Adorno deploys his own ‘negative’ dialectical style of thinking, in which pairs of contrasting concepts ‘constellate’ around topics of discussion without resolving into static propositions and conclusions. These difficulties are far from arbitrary and are, in fact, highly motivated by Adorno’s own views on how critical thought can best resist the system in which it operates. This includes a demand that thought take time and be difficult, in contrast to the expectations of quick and easy consumption which dominate in the culture industry. Nevertheless, there are several key themes the work develops which are readily appreciable in the context of the broader tradition of aesthetics in continental philosophy we are outlining here.

First, Aesthetic Theory is a critical interrogation of the tradition of philosophical aesthetics itself, especially as exemplified by the works of Kant and Hegel. Adorno’s view is that many of the categories of traditional aesthetics are outmoded because of developments in both society and the arts. Nevertheless, he seeks to rethink such categories critically rather than simply abandon the legacy of the aesthetic tradition. To take one easily appreciable example, Adorno draws attention to the limitations of the aesthetic tradition’s focus on the beautiful in the face of the apparent ugliness and dissonance characteristic of much modernist art. Second, Adorno (somewhat infamously) asserts the autonomy of the artwork. (Here he follows Kant on the autonomy of aesthetic judgement but insists that this autonomy should also be ascribed to the art object itself.) This is an insistence that the aesthetic value of an artwork is independent of other values which might be ascribed to it, such as epistemological or ethical value. Unfortunately, this claim has often been (mis)interpreted to mean that artworks should be understood to be entirely unrelated to their social context or political value. Adorno’s view is more complex and, in fact, strongly asserts the relevance of cultural context and the political import of art.

As a third major point, then, for Adorno, artworks may be understood as ‘monads’ (a concept drawn from Gottfried Leibniz): while they are independent, self-enclosed entities, they are products of the social conditions in which they are created and mirror these social conditions within them. Following on from Marx’s framework of analysis, Adorno sees the conditions of contemporary capitalist society as fundamentally contradictory, and it is these contradictions which the artwork embodies. Adorno argues that the most politically relevant artworks are not ones with explicit political content, but those which best reflect the deeply conflicted conditions of contemporary culture (such as the atonal compositions of Arnold Schoenberg or the absurdist literature of Samuel Beckett). Such works have a ‘truth content’ but not one which could be stated in clear propositions with cognitive value. Moreover, the autonomy of art in fact gives it a function of political resistance: while the artwork, like everything else under the conditions of contemporary capitalism, has a commodity form, it also resists incorporation into the instrumentally rationalised system of production and consumption through its very uselessness. For Adorno, modernist experimental art is a privileged site of politics in the contemporary world, as it can both reflect and resist the difficulties and contradictions of contemporary existence better than explicit political discourse. Nevertheless, due to its opacity, art still needs a philosophical aesthetics to aid in its comprehension, and the complex arguments of Aesthetic Theory attempt to rework the concepts of the aesthetic tradition so that they become adequate to the task. While there was no outright dialogue between them, there are palpable and interesting resonances between Adorno’s aesthetic theory and the dominant modernist school of art theory which was developed in America by Clement Greenberg, Michael Fried, and others in the twentieth century. 
As we find in Adorno’s writings, this brand of aesthetic modernism also combined a concern with formalism, autonomy, and experimentation in the arts with a belief in the socially and politically critical relevance of such works.

6. Poststructuralism

 a. Introduction

Poststructuralism is the name given in the English-speaking world to a loose collection of influential French philosophers and theorists working in the wake of structuralism, a movement which itself deserves some mention for its impact on aesthetics in continental philosophy. Structuralism came to prominence in France in the nineteen-fifties and -sixties, rivalling and, to some extent, succeeding phenomenology and existentialism as a leading methodological approach in the human sciences. It applies some basic tenets of Ferdinand de Saussure’s structural linguistics to phenomena other than language, such as the unconscious (Lacan, as we have seen above in section 4. c.), myth and ritual (Claude Lévi-Strauss), and history (Michel Foucault). Most significantly for aesthetics, Roland Barthes applied structuralist principles to literary criticism, and developed Saussure’s suggestion of a ‘semiology’, a study of signs in general (broader than the study of linguistic signs alone), applying such an approach to various forms of art and culture. Simply put, structuralism views the meaningful content of any phenomenon as given in the structured relations between basic units (signs). This structure is taken to be hidden (or deep), and interpretation of an artwork or cultural product then becomes a matter of making the structure which informs it explicit. Because of its formalism and methodological rigour, structuralism was touted by its supporters as a more ‘scientific’ method for studying the phenomena of the human sciences (that is, ‘meaningful’ phenomena), and it swept through the French academy like a revolution.

To some extent, poststructuralism can be understood as a philosophical reaction to the excessive zeal for formal method that structuralism exhibited. Most poststructuralists continued to draw on the phenomenological tradition, as well as psychoanalytic theory, and adopted aspects of structuralism while critiquing others. In short, poststructuralists tend to argue that meaning is not reducible to static structures and cannot be uncovered using a formal method. Generalising greatly, we might say that poststructuralists insist upon the necessity of some element of indeterminacy (which accounts for the genesis of the structure) that operates within the structure to generate meaning, and that constitutes an instability which threatens the coherence of the structure and may disrupt it and cause it to change. Understood as an interplay between structure and the element of indeterminacy (often called ‘the event’), meaning cannot be uncovered using a formal method, and poststructuralists have had recourse to highly unorthodox, experimental modes of thinking and writing in theorising and demonstrating those aspects of meaning or effect they believe structuralism misses. Art and aesthetics have been significant topics for all poststructuralists because, as the philosophical tradition attests, aesthetic meaning or effect seems to be a paradigm case of a kind of meaning which is not ‘scientific’. I will summarise here some of the key ideas of the two poststructuralists who have been most influential in aesthetics (Jacques Derrida and Gilles Deleuze) as well as those of the philosopher in this tradition who has engaged most extensively with art, Jean-François Lyotard.

 b. Derrida

Derrida and his philosophy of deconstruction have had an enormous influence on literary criticism and some influence as well in the wider arts and aesthetic theory. Notoriously difficult to summarise, Derrida’s works may be approached for our purposes through the observation that he develops a quasi-transcendental theory of meaning, which has implications for how meaning is understood to operate in philosophy, literature, and the arts. In post-Kantian, contemporary continental philosophy, ‘transcendental’ refers to the ‘conditions of possibility’ for a thing. The ‘quasi’ in Derrida’s case notes that while traditional transcendental thinking posits a priori structures that are taken to be universal and necessary, Derrida follows Heidegger in positing that the way things become meaningful is a function of time and subject to temporal and historical change. Derrida’s ‘principle’ of meaning, which claims to capture something of these conditions for anything whatsoever being meaningful, is ‘arche-writing’. This term indicates that it is some of the key features or properties of writing, as it has been understood in the metaphysical tradition, which are quasi-transcendental conditions of the possibility of meaning, rather than writing as such. These features are indicated by Derrida’s well-known term différance, which indicates spatial differing and temporal deferring.

This idea of différance contests the principle of meaning which has, according to Derrida, dominated throughout the Western tradition, which he calls the ‘metaphysics of presence’. This theory proposes an origin or full presence of ‘pure’ meaning in an idea held in the mind, which is then progressively corrupted by being put into spoken, then written, discourse. This supposed corruption of meaning corresponds with the spatial and temporal differing and deferring which, Derrida contends, are in fact the conditions of anything being meaningful in the first place. According to Derrida, there is no possibility of a pure, simple, original meaningful presentation, and every apparently original presentation is always already a repetition or a re-presentation. His arguments are extremely complex but may be treated summarily by noting how they draw on the traditions of phenomenology and structural linguistics. Engaging the Husserlian phenomenological tradition, which takes consciousness as the transcendental condition of meaning, Derrida reads Husserl to show that conscious experience requires a synthesis of different temporal moments, such that any ‘presence’ of something to consciousness is already subject to the passing of time, that is, temporal difference. From the structural linguistics of de Saussure, Derrida draws the idea that every linguistic meaning only functions because of the possibility of its reiteration, or what Derrida calls its ‘iterability’. Every linguistic usage draws from an already-existing store of linguistic meaning (the virtual structure of language as a whole), and in that sense is already a reiteration. Moreover, every use presupposes the possibility of the listener or reader reiterating the use in another context, because the very nature of linguistic competence (and thus the capacity to understand) depends upon the ability to use language in this appropriative and citational manner.

While Derrida has often been seen as collapsing the distinction between philosophy and literature, he is in fact drawn to the latter and deploys it to contaminate and complicate the former because of the differences he sees between them. While he seeks to deconstruct any simple opposition between philosophy and literature, such a deconstruction would not be possible without also insisting on the differences between them. Philosophy has traditionally set itself up in opposition to the ‘merely’ literary, claiming truth to be its own exclusive competence and categorising literature as belonging to the fictional or untrue. Philosophical texts have typically been tightly structured according to the metaphysics of presence, deployed in structures of binary oppositions which set up hierarchies of meaning, such as truth/falsity, essence/appearance, form/matter, presence/absence, and so on. By contrast, although Derrida sees all meaning and all texts as to some degree structured by the metaphysics of presence, he sees the virtue of literature (and especially the works of experimental writers such as Stéphane Mallarmé, James Joyce, or Antonin Artaud) as asserting and developing the ambiguities, contradictions, aporias, and playfulness of meaning that philosophical texts and modes of writing strive to suppress. Deconstruction, for Derrida, is a strategy of reading and writing which aims to identify and subvert the binary oppositions structuring a text, showing how the privileged term is in fact parasitic on the underprivileged one, and opening up the space for a play of meaning beyond simple oppositions by inventing concepts (such as the trace, différance, the hymen, and so on) which are ‘undecidable’ from the point of view of such oppositions. What Derrida finds in literature are such undecidables already in play to a much greater extent than in philosophical texts, and he affirms and reinscribes these in his own writings. 
Derrida strove to emulate literary modes of writing in his philosophical texts precisely in order to open them to a freer play of meaning. Through Derrida’s influential association with prominent literary critics such as Paul de Man and J. Hillis Miller, deconstruction became enormously popular in literary criticism from the nineteen-seventies to -nineties, often taking the form of a reductive methodology for exposing contradictions internal to a text, which Derrida himself would never have approved of.

When Derrida turns his attention to the visual arts, in texts such as The Truth in Painting, he develops concepts (such as the trait, the parergon, and the subjectile) which essentially follow the same differential logic as arche-writing. Derrida suspects any supposition of a pure presence of meaning in an image and works in various ways to complicate this, showing that images depend on an ambiguous play between concepts and categories such as the inside and outside of the frame, the visible and the invisible, word and image, single artwork and entire oeuvre, and so on. These playful movements are processes of spatial differing and temporal deferring, which work against the metaphysics of presence and underline a differential form of meaning in the visual which is similar to that which he sees operating in the written text. Moreover, Derrida also seeks to complicate any simple opposition between visual and textual meaning, seeing such an opposition as itself implying a metaphysics of presence. This complication is notably played out in his text on the Italian artist Valerio Adami in The Truth in Painting, where attention is given to the communication and interplay of meaning between Adami’s images and the text he places within, outside, and in transgression of the frame of his visual works.

 c. Lyotard

Lyotard’s Discourse, Figure (1971) stages a significant encounter between phenomenology, structuralism, and psychoanalysis, with the aim of doing justice to the aesthetic event, and in particular the visual. Lyotard insists—against structuralism, hermeneutics, and indeed much of the literature of art history and visual culture—that the visual has its own kind of meaning, which differs from and cannot be reduced to linguistic meaning. For him, it is wrong to say that a picture can be ‘read’. Instead, he tries to account for how art can leave us with the feeling of being ‘lost for words’. The first part of Discourse, Figure carefully compares the kind of meaning proper to perception, as developed in Merleau-Ponty’s phenomenology (see section 2. c.), with the kind of meaning operative in language according to structuralism. While it is clear that Lyotard thinks Merleau-Ponty gives a more adequate account of the kind of meaning specific to the visual, he also finds phenomenology ultimately inadequate. He argues that Merleau-Ponty’s notion of the ‘flesh’ remains a too-harmonious interface with the world at the level of conscious perception, and has recourse to psychoanalysis to try to find in the unconscious the source of radical creativity and sheer unexpectedness characteristic of avant-garde art.

Lyotard objects to Lacan’s structuralist reading of the unconscious, however, and believes that the latter’s interpretation of art as lodged in the register of the imaginary, acting as a lure for desire, is an affront to the grandeur of art (see section 4. c.). Employing a close reading of Freud, he develops an alternative view of the unconscious, which emphasises plastic transformations (rather than linguistic operations) of its contents. Lyotard also objects to much of Freud’s own explicit aesthetics, however, and argues that the meaning of an artwork is not to be found in the pathology of the artist. Instead, he develops Freud’s theory of the unconscious and desire, along with Merleau-Ponty’s phenomenology, to give a complex account of the artwork: it is neither simply the impression of conscious perceptions, nor the expression of unconscious desires (fantasies), but a mutual deconstruction of one by the other, which produces a new and unrecognisable form. He refers to this deconstructive element as the figural.

In Lyotard’s later work, he reconfigures the traditional aesthetic category of the sublime to account for and defend avant-garde art and the significance of the aesthetic in the contemporary world. Lyotard now follows Adorno in postulating a crisis of traditional aesthetics, both in relation to the conditions of (post)industrial capitalism and developments in the arts, and tries to update aesthetics in response (see section 5. c.). For Lyotard, there is a crisis of meaning on the level of perception in the contemporary world, because—following the analyses of Heidegger, Benjamin, Adorno, and others—scientific and technological developments, operating in tandem with capitalism, have mutated the perceptual bearings by which we coordinated ourselves in the world. Sciences and technologies have both extended our sensory capacities (seeing and hearing at a distance, through television and telephones, for example), and revealed a reality beyond our body’s capacities for sensory awareness (atoms, microbes, nebulae, and so on). According to Lyotard, these changes have meant that the basic forms of sensory experience—time and space—have been thrown into uncertainty.

Lyotard sees avant-garde art of the twentieth century as having pursued an analogous exploration of this crisis of perception. Traditionally, aesthetics has been concerned with the beautiful, understood in the arts as an ideal fit between the form and matter of a work. Lyotard sees avant-garde art, especially minimalism and abstraction, as moving away from a concern with ‘good form’ and towards an exploration of matter. Following Kant, ‘matter’ is something which defies rational calculation and specification: for example, colour in painting and timbre in music. While Kant himself only saw the sublime in art in depictions of sublime scenes in nature (mountains, storms at sea, and so forth), Lyotard suggests that the sublime is the aesthetic category appropriate to art which is less interested in exploring formal structure than in experimenting with matter, precisely because the sublime concerns feeling in relation to something ‘formless’. Lyotard characterises the sublime stake of art as ‘presenting the unpresentable’, because for him the aesthetic event is something which cannot be reduced to a ‘presentation’, understood in the Kantian sense as a ‘good form’ given to a sensation. Rather, art-events evoke thoughts and feelings in relation to works which surprise us and which we cannot make sense of at the level of either perception or concept: works that leave us feeling moved but lost for words. In his later works, the sublime is the aesthetic category which Lyotard thinks best names this feeling. However, he also seeks to update this category in relation to the way it was understood in romantic or modern aesthetics. While in such aesthetics it had invoked the Idea of the absolute through a nostalgic feeling of loss for something transcendent which is missing, Lyotard posits a postmodern, immanent sublime in which the absolute is nothing other than the formless work itself.

 d. Deleuze

Both in his writings with Félix Guattari and on his own, Gilles Deleuze made important and influential contributions to the philosophy of film, painting, literature, and music. Many of his reflections on aesthetic issues are summarised in his last book with Guattari, What Is Philosophy?, where they are accompanied by a criticism of the phenomenological approach to aesthetics and Merleau-Ponty’s notion of the ‘flesh’ in particular. As is characteristic of all his work, Deleuze sees the level of perception with which phenomenologists are preoccupied as insufficiently deep to provide a full account of reality. It is on this level that he and Guattari situate, for example, Mikel Dufrenne’s a priori of aesthetic experience, and Merleau-Ponty’s flesh (see section 2. c.). Deleuze and Guattari delve deeper to give an account of art and aesthetic experience grounded in a metaphysical description of reality, where ‘sensation’ becomes the key aesthetic issue. Sensation is posited in a register prior to the distinction between subject and object, and consists of two main types: percepts and affects. Understood in this specific sense, they are perceptions and feelings considered independently of the lived experience which reveals them and raised to the level of independent metaphysical existence. For Deleuze and Guattari, a work of art is a ‘being of sensation’, a compound of percepts and affects, which is a ‘monument’ that preserves the sensation in and as the material from which the work is made. While for them the artist undoubtedly has a role in creating the work and the spectator or auditor a role in appreciating it, the emphasis is on the independent ontological status of the work as embodying that aspect of being which is sensation. They associate Merleau-Ponty’s flesh with the lived experience which reveals sensation, but insist on two further, deeper, and necessary conditions for sensation: the ‘house’, and ‘cosmic forces’. (While terms such as these may appear strange in the context of philosophical discourse, these and others are inspired by the writings of artists and other non-philosophers, and their use indicates a characteristic poststructuralist desire to think art with artists and art itself, rather than construct an independent theory about it, on the model of traditional aesthetics.)

Briefly glossed, the ‘house’ is the structure that gives sensations some consistency, such as the frames of paintings or the walls of architectural constructions but also more abstract principles of composition. ‘Cosmic forces’ are the basic physical and metaphysical forces constituting the real. Deleuze and Guattari list gravity, heaviness, rotation, the vortex, explosion, expansion, germination, and time. The main point here is that Deleuze and Guattari want to connect the activity of art with things usually considered extraneous to art and, indeed, with the universe as a whole. One notable way this placement of art within a broad metaphysical view plays out is the claim that animals can be artists through their exploitation of the expressive qualities of materials in marking territory, attracting mates, and so on. However, Deleuze and Guattari also insist that art must be considered to be a form of thinking which thinks with sensations, just as philosophy thinks with concepts and science thinks with functions. Art thinks against the common opinions, doxa, or clichés of our perceptions and feelings and adds new varieties of sensation to the world. This insistence gives art a legitimacy equal to that of philosophy and science, again indicating the importance accorded to the aesthetic which is characteristic of continental philosophy.

7. Developments in the Early 21st Century

 a. Introduction

Contemporary continental philosophy continues to see contributions to aesthetics which build on all of the traditions previously discussed. Some twentieth-century contributions to continental aesthetics, such as Adorno’s Aesthetic Theory or Lyotard’s extensive writings on the arts, still await much-needed interpretation and discussion before the potential of their influence can be made manifest. In addition, contributing to the broad, pluralist landscape of aesthetics in continental philosophy, most of the more prominent continental philosophers in the early 21st century have written on the arts, including figures such as Slavoj Žižek, Alain Badiou, Giorgio Agamben, Jean-Luc Nancy, Michel Serres, Peter Sloterdijk, and Bernard Stiegler. Perhaps the most notable of these, however, is Jacques Rancière, whose distinctive work in aesthetics during the first fifteen years of the 21st century has revivified thinking on the relations between art and politics. We may therefore take Rancière’s work as an indicative example of early 21st century developments in continental aesthetics, while keeping in mind that this is just one of many important developments.

 b. Rancière

Rancière has become known for the idea of the ‘distribution of the sensible’, which suggests that systems of inclusion and exclusion, and of political relationships generally, operate not only on the conceptual or cognitive level but also on the sensory level. The idea of the distribution of the sensible captures the way the world is divided up according to sensations and the political implications of this. Rancière suggests that within communities there is a dimension of the sensible that is held in common by all members, allowing a common participation in the community as such, but that this is subdivided into parts, dividing members according to different areas of participation and non-participation. The distribution of the sensible concerns the circulation of words and images, the demarcation of spaces and times, and the forms of activity. It concerns the way that certain things are held to be meaningful or even self-evident to sense perception, while others are dismissed as meaningless noise. What is held to be meaningful on a sensorial level in a given context then affects what can meaningfully be thought, said, made, or done in that context. According to Rancière, social inequalities are in large part a result of this sensible distribution. A key implication of this idea is that art can be understood to be directly political on the level of the sensible (rather than indirectly, as simply representing ideas about social and political issues). Rancière’s politics is one of a non-utopian ideal of democratic emancipation, understood as the constant process of intervening in the current order to broaden spaces of participation and to open potentials of inclusion where these are closed to parts of the community through the existing distribution of the sensible. Art can play an important political role by intervening in the existing order of distributions and helping to redistribute the sensible.

Rancière has also made a notable contribution to aesthetics in contesting the category of ‘modernism’, which has dominated much of the discourse around art history and aesthetic theory in the early 21st century. According to Rancière, modernism attempts to impose a single meaning and a single historical narrative on the course of developments in the arts, a course which he sees as more complex, involving multiple meanings and temporalities. Moreover, modernism abstracts developments in the arts from other social and cultural forms of collective experience, which, on the contrary, Rancière sees as co-determining them. In place of categories which organise artistic developments according to a simple linear historical progression, Rancière proposes three ‘regimes’ of the arts. These regimes operate to some degree in a historically periodising fashion, as different regimes have predominated in different historical periods, but they also complicate and cut across such periodisation. This is because they are not most fundamentally historical categories but, rather, ways that art is thought to operate or be significant, which can function in any historical period. Significantly, more than one regime of art can be operative at a single time. These regimes of art are 1. the ethical regime of images; 2. the representative (or poetic) regime of art; and 3. the aesthetic regime of art.

The ethical regime of images predominated in Ancient Greece and is exemplified by Plato’s discussion of images. Art does not emerge as a category here. Images are understood in relation to their effect on the ethos, or mode of behaviour, of members of the community, and they are interrogated according to their origin and their end, function, or purpose. In this regime, there are images which are thought to be truer or falser and to have a beneficial or detrimental effect on the ethical community. The representative regime of art predominated from the Renaissance to the nineteenth century. Here the idea of not just art but of a system of the arts emerged. Arts were thought of in terms of poetics; that is, sets of rules which determine the different forms of expression and arrange them in a hierarchy, and which also determine which forms of expression (arts, genres) are suitable for particular types of content. It is called the representative regime because this system of categorisation of the arts is organised around the key idea of representation, or mimesis, understood as a fit between form of expression and type of content. Finally, the aesthetic regime of art roughly corresponds with the experimentations more usually categorised with terms such as ‘modernism’ or ‘the avant-garde’. With this regime, the idea of art as something truly unique and singular emerges. However, this singularity is involved in a paradox, insofar as the rules for governing the arts characteristic of the representative regime also break down. In the aesthetic regime, art is asserted as a special kind of activity but, since anything can now count as art, there are no longer any criteria for distinguishing it from other forms of activity or production. While the aesthetic regime predominates in the contemporary world with its highly pluralist art scene, Rancière insists that all three regimes are still operative today to some degree.

8. References and Further Reading

  • Adorno, Theodor W. Aesthetic Theory. Edited by Gretel Adorno and Rolf Tiedemann, translated by Robert Hullot-Kentor. London and New York: Continuum, 1997.
    • Adorno’s major work on aesthetics.
  • Benjamin, Walter. ‘The Work of Art in the Age of Mechanical Reproduction’. In Illuminations, edited by Hannah Arendt, translated by Harry Zohn, 211–44. London: Pimlico, 1999.
    • The best known of the several versions of this highly influential essay, in which Benjamin develops the concept of artistic ‘aura’.
  • Cazeaux, Clive, ed. The Continental Aesthetics Reader, 2nd ed. London and New York: Routledge, 2011.
    • A collection of classic readings in aesthetics across the major traditions in continental philosophy, accompanied by insightful introductory essays.
  • Deleuze, Gilles, and Félix Guattari. What Is Philosophy? Translated by Hugh Tomlinson and Graham Burchell. London: Verso, 1994.
    • The chapter ‘Percept, Affect, and Concept’ condenses many aspects of Deleuze’s more extensive treatments of painting, film, and literature, and positions art in relation to philosophy and science.
  • Derrida, Jacques. Acts of Literature. Edited by Derek Attridge. London and New York: Routledge, 1992.
    • An edited collection of some of Derrida’s most important writings on literary topics, including essays on Mallarmé, Joyce, Kafka, Ponge, and Celan, and an interview with Derrida on literature.
  • Derrida, Jacques. The Truth in Painting. Translated by Geoff Bennington and Ian McLeod. Chicago and London: University of Chicago Press, 1987.
    • Derrida’s most well-known application of deconstructive strategies to aesthetic topics beyond literature. Contains the essay on Valerio Adami, ‘+R (Into the Bargain)’.
  • Freud, Sigmund. Writings on Art and Literature. Stanford: Stanford University Press, 1997.
    • A selection of Freud’s writings on aesthetic topics collected from James Strachey’s Standard Edition, which however does not include the two important texts listed below.
  • Freud, Sigmund. ‘Creative Writers and Day-Dreaming’. In Jensen’s ‘Gradiva’ and Other Works. Vol. 9 of The Standard Edition of the Complete Psychological Works of Sigmund Freud, edited by James Strachey, 141–54. London: Hogarth Press, 1959.
  • Freud, Sigmund. ‘Leonardo da Vinci and a Memory of His Childhood’. In Five Lectures on Psycho-Analysis, Leonardo da Vinci, and Other Works. Vol. 11 of The Standard Edition of the Complete Psychological Works of Sigmund Freud, edited by James Strachey, 63–138. London: Hogarth Press, 1957.
  • Gadamer, Hans-Georg. Truth and Method, 2nd revised ed. Translated by Joel Weinsheimer and Donald G. Marshall. London: Bloomsbury, 2013.
  • Gadamer, Hans-Georg. The Relevance of the Beautiful and Other Essays. Edited by Robert Bernasconi, translated by Nicholas Walker. Cambridge: Cambridge University Press, 1986.
  • Heidegger, Martin. Nietzsche: Volumes One and Two. Translated by David Farrell Krell. San Francisco: Harper Collins, 1991.
    • Volume one, ‘The Will to Power as Art’, presents Nietzsche’s view of art as holding a privileged ontological status and a value higher than truth.
  • Heidegger, Martin. ‘The Origin of the Work of Art’. In Off the Beaten Track, edited and translated by Julian Young and Kenneth Haynes, 1–56. Cambridge: Cambridge University Press, 2002.
    • Heidegger’s most extensive, significant, and well-known contribution to aesthetics.
  • Kearney, Richard and David Rasmussen, eds. Continental Aesthetics: Romanticism to Postmodernism: An Anthology. London: Wiley-Blackwell, 2001.
    • A useful collection of classic readings, though unaccompanied by any guiding text.
  • Lacan, Jacques. The Seminar of Jacques Lacan, Book XI: The Four Fundamental Concepts of Psychoanalysis. Edited by Jacques-Alain Miller, translated by Alan Sheridan. New York and London: W.W. Norton & Company, 1978.
    • Transcripts of Lacan’s seminar delivered in 1964 and first published in French in 1973. Contains the most influential of Lacan’s work relating to aesthetics, ‘Of the Gaze as Objet petit a’. A problematic translation, but still the only one available.
  • Lyotard, Jean-François. Discourse, Figure. Translated by Antony Hudek and Mary Lydon. Minneapolis: University of Minnesota Press, 2011.
    • The definitive statement of Lyotard’s early aesthetics, which stages an encounter between structuralist, phenomenological, and psychoanalytic approaches.
  • Lyotard, Jean-François. Writings on Contemporary Art and Artists. Edited by Herman Parret. Leuven: Leuven University Press, 2013.
    • Extensive, though still not exhaustive, bilingual (French and English) collection of Lyotard’s writings on aesthetic topics.
  • Merleau-Ponty, Maurice. The Merleau-Ponty Aesthetics Reader. Edited by Michael B. Smith. Evanston, Illinois: Northwestern University Press, 1993.
    • Contains the essays ‘Cézanne’s Doubt’, ‘Indirect Language and the Voices of Silence’, and ‘Eye and Mind’, along with introductory essays on each and a collection of critical essays on Merleau-Ponty’s philosophy of art.
  • Merleau-Ponty, Maurice. The Visible and the Invisible. Edited by Claude Lefort, translated by Alphonso Lingis. Evanston: Northwestern University Press, 1968.
    • Merleau-Ponty’s final, unfinished book. Contains the chapter ‘The Intertwining—The Chiasm’, which outlines the ontology developed in relation to painting in ‘Eye and Mind.’
  • Rancière, Jacques. Dissensus: On Politics and Aesthetics. Edited and translated by Steven Corcoran. London: Bloomsbury, 2010.
    • A collection of some of Rancière’s most important essays on the relationship of politics and aesthetics.
  • Rancière, Jacques. The Politics of Aesthetics. Edited and translated by Gabriel Rockhill. London: Bloomsbury, 2013.
    • A brief, accessible introduction to some of Rancière’s most important ideas in aesthetics, such as the distribution of the sensible, the critique of ‘modernism’ as an aesthetic category, and the three regimes of art.
  • Rancière, Jacques. Aisthesis: Scenes from the Aesthetic Regime of Art. Translated by Zakir Paul. London: Verso, 2013.
    • Rancière’s most complete and definitive work on aesthetics to date.

 

Author Information

Ashley Woodward
Email: a.z.woodward@dundee.ac.uk
University of Dundee
United Kingdom

Epistemology and Relativism

Epistemology is, roughly, the philosophical theory of knowledge, its nature and scope. What is the status of epistemological claims? Relativists regard the status of (at least some kinds of) epistemological claims as, in some way, relative; that is to say, the truths which (some kinds of) epistemological claims aspire to are relative truths. Self-described relativists vary, sometimes dramatically, in how they think about relative truth and what a commitment to it involves. Section 1 outlines some of these key differences and distinguishes between two broad kinds of approach to epistemic relativism. Proposals under the description of traditional epistemic relativism are the focus of Sections 2-4. These are: (i) arguments that appeal in some way to the Pyrrhonian problematic; (ii) arguments that appeal to apparently irreconcilable disagreements (for example, as in the famous dispute between Galileo and Bellarmine); and (iii) arguments that appeal to the alleged incommensurability of epistemic systems or frameworks. New (semantic) epistemic relativism, a linguistically motivated form of epistemic relativism defended with the most sophistication by John MacFarlane (for example, 2014), is the focus of Sections 5-6. According to MacFarlane’s brand of epistemic relativism, whether a given knowledge-ascribing sentence is true depends on the epistemic standards at play in what he calls the context of assessment, which is the context in which the knowledge ascription (for example, ‘Galileo knows the earth revolves around the sun’) is being assessed for truth or falsity. Because the very same knowledge ascription can be assessed for truth or falsity from indefinitely many perspectives, knowledge-ascribing sentences do not get their truth values absolutely, but only relatively. The article concludes by canvassing some of the potential ramifications this more contemporary form of epistemic relativism has for projects in mainstream epistemology.

Table of Contents

  1. Relativism in Epistemology: Two Approaches
  2. Traditional Arguments for Epistemic Relativism: The Pyrrhonian Argument
  3. Traditional Arguments for Epistemic Relativism: Non-Neutrality
  4. Traditional Arguments for Epistemic Relativism: Incommensurability and Circularity
  5. New (Semantic) Epistemic Relativism: Assessment-Sensitive Semantics for ‘Knows’
  6. New (Semantic) Epistemic Relativism: Issues and Implications in Epistemology
  7. References and Further Reading

1. Relativism in Epistemology: Two Approaches

“Relativism” is notoriously difficult to define. There are however some core insights about relativism that are more or less embraced across the board amongst self-described relativists. One such insight is negative, framed in terms of what relativists are characteristically united in denying. Take for example the following epistemological claims:

  a. Copernicus’s belief that the earth revolves around the sun is justified.
  b. Edmund does not know that the man who will get the job has ten coins in his pocket.
  c. Knowledge is not factorable into component parts.
  d. Beliefs formed on the basis of direct observation are better justified than beliefs formed on the basis of drug-induced wishful thinking.

Relativists of all stripes typically deny at least one, if not all, of the following: that the truth of claims like (a)-(d) is applicable to all times and frameworks; that such claims are objective (for example, not dependent, in any non-trivial way, on our judgments or beliefs); and that they are monistic (for example, in the sense that competing claims are mutually exclusive) (see Baghramian and Carter (2015)). In some cases (a notable example here is Richard Rorty (1979)), philosophers have been labelled relativists primarily on the basis of their distinctive denial(s) of such claims about the status of these kinds of judgments.

Moreover, along with denying the sorts of claims characteristic of metaepistemological realism (for example, Cuneo 2007: Ch 3), the epistemic relativist is also committed to denying the metaepistemological analogues of non-relativist positions that are familiar territory in contemporary metaethics.

For example, contra epistemic error theory (for example, Olson 2009), which insists that claims like (a)-(d), which attribute epistemic properties, are categorically false, the epistemic relativist maintains that some such claims are true, albeit true in a way that is in some interesting sense ‘relative’. Likewise, contra the epistemic expressivist (for example, Chrisman 2007; Gibbard 1990; Field 1998), who insists that claims like (a)-(d) are expressions of attitude, the relativist is a cognitivist. Accordingly, the relativist maintains that (a)-(d) are truth-apt, while adding that the truth-aptness is not to be thought of as the realist thinks of it; expressions like (a)-(d) are relatively truth-apt in that the truths they aspire to are relative truths. (We consider shortly what this might involve, as the point is highly controversial amongst relativists.)

Another core insight about relativism, generally construed, is co-variance (for example, Baghramian 2004; 2014 and Swoyer 2014). Co-variance is the idea that some object, x, depends on some underlying, independent variable, y, such that, in some suitably specified sense, change in the latter results in a change in the former. In embracing relativism about some class of truths, one thereby embraces some kind of co-variance claim. For example, a cultural relativist about epistemic justification tells us that the truth of claims (a)-(b) varies with local cultural norms and, in doing so, holds that a change in cultural norms brings about a change in what counts as knowing, justifiably believing, and so forth.

Beyond these mostly uncontroversial ingredients of a relativist proposal—or necessary conditions for being a relativist—the matter of what is sufficient for a view to count as a relativist view is controversial. One influential approach to characterizing relativism has been put forward by Paul Boghossian (2006a). As Boghossian sees things, we can attribute to the epistemic relativist the following package of three claims: epistemic non-absolutism, epistemic relationism and epistemic pluralism.

Epistemic Relativism (Boghossian’s Formulation)

  A. There are no absolute facts about what belief a particular item of information justifies. (Epistemic non-absolutism)
  B. If a person, S’s, epistemic judgments are to have any prospect of being true, we must not construe his utterances of the form ‘E justifies belief B’ as expressing the claim E justifies belief B but rather as expressing the claim: According to the epistemic system C, that I, S, accept, information E justifies belief B. (Epistemic relationism)
  C. There are many fundamentally different, genuinely alternative epistemic systems, but no facts by virtue of which one of these systems is more correct than any of the others. (Epistemic pluralism)

Boghossian’s model is often called the replacement model for formulating epistemic relativism. This is largely due to the inclusion of claim (B), the epistemic relationism thesis. In attributing relationism to the epistemic relativist, Boghossian (2006a: 84) regards the relativist as effectively endorsing the replacement of unqualified epistemic claims with explicitly relational ones. As he puts it:

[…] the relativist urges, we must reform our talk so that we no longer speak simply about what is justified by the evidence, but only about what is justified by the evidence according to the particular epistemic system that we happen to accept, noting, all the while, that there are no facts by virtue of which our particular system is more correct than any of the others.

One of the central moves Boghossian makes against the epistemic relativist in his monograph Fear of Knowledge is to argue that epistemic relativism—formulated as such—is ultimately an incoherent position. In response, some critics—notably Martin Kusch (2010)—have replied that epistemic relativism, formulated in accordance with the replacement model, is not incoherent for the reasons Boghossian suggests—or, at least, in Kusch’s case, that there is a version of this view that is defensible.

A comparatively deeper issue, however, and one that is prior to whether the replacement model leads to incoherence, is whether the inclusion of the relationist clause is an apt way of representing the relativist’s view. Though Boghossian and Kusch disagree on the matter of whether epistemic relativism formulated within the replacement model is tractable, both think that the framework is capable of characterising the epistemic relativist’s core position.

But this point is highly controversial. Crispin Wright (2008: 383) for instance, says of Boghossian’s inclusion of the relationist clause in formulating epistemic relativism:

We can envision an epistemic relativist feeling very distant from this characterisation and of its implicit perception of the situation.

Wright’s complaint, in the main, is that insisting on the relationist clause is tantamount to insisting that the only way the relativist (who must reject absolute facts about what justifies what) can make sense of how claims of the form ‘S is justified in believing X’ are true (at all) is by construing their content in an explicitly relational way, so that the explicitly relational truths (for example, ‘S is justified in believing X, according to system A’) are themselves candidates for absolute truth.

But this, Wright says:

[…] is just to fail to take seriously the thesis that claims such as [sic … S is justified in believing X] can indeed be true or false, albeit, only relatively so. (Ibid., 383, my italics).

Wright’s complaint, as quoted in this passage, gestures to what is probably the most substantial divide in the contemporary landscape in relation to epistemic relativism. There are really two important and connected ideas that need unpacking here. The first has to do with charity, and the second has to do with inclusiveness.

Regarding charity: to the extent that one insists that epistemic relationism is an indispensable component of epistemic relativism, one is de facto excluding (by viewing as tacitly unintelligible) the thought that non-explicitly relational claims (for example, ‘S is justified in believing p’) can be true or false, albeit only relatively so. And so if it turns out that this excluded possibility is a viable one, then the attribution to the relativist of the relationist clause is not a suitably charitable way of formulating the relativist’s position.

New (semantic) relativists, whose motivations draw from analytic philosophy of language, regard this excluded possibility as not only viable but, moreover, the only legitimate way to capture a philosophically interesting kind of relativist position. The rationale for thinking this way has been articulated most notably by John MacFarlane (for example, 2007, 2011, 2014). MacFarlane’s work over the past decade has stressed that simply relativizing propositional truth to what seem like exotic parameters (that is, parameters other than worlds and times, such as judges, perspectives, or standards, including epistemic standards) is not in itself ‘enough to make one a relativist about truth in the most philosophically interesting sense’. This is because such relativization is compatible with truth absolutism, and MacFarlane’s position is that philosophically interesting relativism must part ways with the absolutist.

Consider, for example, that the epistemic contextualist (for example Cohen 1988; DeRose 1992, 2009) insists that whether ‘S knows that p’ is true can shift with different standards at play in different contexts in which the sentence ‘S knows that p’ is used. This is because, for the contextualist, my utterance of “Keith knows the bank is open” can express different propositions depending on the context in which I use this sentence. If I use the sentence in a context in which it doesn’t matter to me whether Keith knows the bank is open, what I’ve asserted can be true even if the very same sentence would come out false when uttered in a context in which it is extremely important to me that the bank is open. For the contextualist, this is so even if all other epistemically relevant features of Keith’s situation (for example, what evidence Keith has for thinking the bank is open) are held fixed across these contexts of use. When knowledge ‘is relative to an epistemic standard’ in the way that the contextualist relativizes knowledge to an epistemic standard, it remains that a particular occurrence of ‘knows’ used in a particular context gets its truth value absolutely. A philosophically interesting relativist, as MacFarlane sees it, denies this. The line between the (genuine) relativist and the non-relativist is, according to MacFarlane, best understood as a line ‘between views that allow truth to vary with the context of assessment and those that do not’ (2014, vi). A context of assessment is a possible situation in which a use of a sentence might be assessed, where the agent of the context is the assessor of the use of a sentence. This view is described in more detail in Section 5.

This brings us to the point about inclusiveness. From the perspective of a new (semantic) relativist like MacFarlane, the kind of position described by Boghossian as epistemic relativism is not really an interesting relativist position. Boghossian’s epistemic relativist, modelled on Gilbert Harman’s (1975) moral relativism, is (by MacFarlane’s lights) best understood as a version of contextualism (see MacFarlane (2014: 33, fn. 5)). After all, à la epistemic relationism, the explicitly relational claims which Boghossian regards the relativist as in the market for putting forward as true are candidates for absolute truth.

This article does not attempt to adjudicate which kind of approach to thinking about relativism, more generally, is the right one. Rather, the article is divided into two main parts: (i) arguments for epistemic relativism which do not give a context of assessment a significant semantic role (Sections 2-4), which are termed traditional arguments for epistemic relativism, and (ii) arguments that do, which are termed new (semantic) epistemic relativism (Sections 5-6). The former kinds of arguments are not primarily motivated by considerations to do with how we use language, whereas the latter kind of argument strategy (the focus of Sections 5-6) is.

2. Traditional Arguments for Epistemic Relativism: The Pyrrhonian Argument

One influential argument strategy under the banner of epistemic relativism takes as a starting point a famous philosophical puzzle traditionally associated with Pyrrhonian skepticism, namely the Pyrrhonian problematic. The most famous version of the puzzle, the ‘regress’ version of the problematic, goes as follows (the simple presentation here is owed to John Greco (2013, 179)). Suppose you claim to know that p is true but you are asked to provide a good reason for p. If it is granted that good reasons (for example, the sort of reasons good enough to epistemically justify a belief) are non-arbitrary reasons, reasons that we have good reason to believe, then a regress threatens. The idea is that, at least with the above assumptions in place, it looks as though knowledge as well as epistemic justification require an infinite number of good reasons. But it seems that this is something we do not have, and thus, as the puzzle goes, it looks like we do not know or justifiably believe anything. With reference to this puzzle, the sceptic effectively places the onus on her non-sceptical adversary to reject one or more of the assumptions underwriting the puzzle. Foundationalism, coherentism and infinitism are typically distinguished from one another with reference to which assumption(s) is rejected.

Against this background, Howard Sankey (2010; 2011; 2012) has argued, in a series of papers, that the Pyrrhonian problematic offers the tools to capture the most compelling argument strategy available to the epistemic relativist; in one place, he writes that the ancient Pyrrhonian argument “constitutes the foundation for contemporary epistemic relativism” (Sankey 2012, 184, my italics).

Sankey’s argument comes primarily in two parts: a negative part and a positive part. Before outlining the negative part, some terminology is helpful. Sankey (2013: 3) defines epistemic relativism in a restricted way: as a view about epistemic norms, where he defines an epistemic norm as ‘a criterion or rule that may be employed to justify a belief’. Epistemic relativism is then defined as the thesis that there are no epistemic norms over and above the variable epistemic norms operative in different (local) cultural settings or contexts, where these local contexts are defined as always including at least a system of beliefs and a set of norms (Sankey 2012, 187). For Sankey’s relativist, whether a belief is justified, or counts as knowledge, depends on epistemic norms, and so, given that different epistemic norms can operate in different contexts, the same belief might be rational/justified/knowledge relative to one context, and not to another.

Sankey’s ‘negative’ argument on behalf of the relativist appeals to the Pyrrhonian puzzle to generate the intermediate conclusion that all epistemic norms are on equal standing; his positive argument moves from the equal standing claim established by the negative argument to the conclusion that epistemic relativism (as he has defined it) is true. The negative argument can be summarized as follows. Take an epistemic norm, N1. Question: how is N1 to be justified? With reference to the Pyrrhonian puzzle, the options don’t look very promising. One option is to justify N1 by appealing to a further epistemic norm N2. Another option is to justify N1 by appealing to N1. Sankey says neither of these options satisfactorily justifies N1; the former generates an infinite regress, the latter is viciously circular. Now take any other epistemic norms, N3, N4 … Nn. By running through this same line of thinking with any of N3, N4 … Nn in an attempt to justify any of these norms, we end up in the same place. That is, N1 and N3, N4 … Nn are all equally lacking in justification. From here, Sankey’s positive move (for example, see Sankey 2011 §3, esp. pp. 564-566) on behalf of the relativist goes as follows:

If no norm is better justified than any other, all norms have equal standing. Since it is not possible to provide an ultimate grounding for any set of norms, the only possible form of justification is justification on the basis of a set of operative norms. Thus, the norms operative within a particular context provide justification for beliefs formed within that context. Those who occupy a different context in which different norms are operative are justified by the norms which apply in that context… the relativist is now in a position to claim that epistemic justification is relative to locally operative norms.

Sankey himself, not a relativist, attempts a naturalistically motivated overriding strategy in response to the argument, one which grants the relativistic challenge as legitimate and then attempts to meet it (2010). Carter (2016) and Seidel (2013), by contrast, have proposed undercutting responses which call into question whether the relativist can viably use the argument strategy which Sankey regards as the epistemic relativist’s strongest play. Carter (2016, Ch. 3) challenges the first (negative) part of the argument by noting that the intermediate conclusion (that all norms are equally justified) is one the would-be relativist is entitled to only if it is already granted that foundationalism, coherentism and infinitism are all unsuccessful. Sankey’s relativist, however, offers no positive case for this and instead takes it for granted.

Carter (2016) and Markus Seidel (2013, 137) have both expressed worries that, even if the first part of the argument were granted (and so, even if it were granted that the Pyrrhonian strategy is effective in establishing that all epistemic norms are on equal epistemic standing), it’s not clear how relativism is to be motivated over scepticism. As Seidel puts it, Sankey’s relativist actually travels so far down the road with the sceptic that the relativist is “at pains to provide us with reasons [for the relativist to] part company” (137). That is: once it has been claimed that all norms are equally unjustified (no norm is more justified than any other in any way), it is not apparent, as Seidel observes, how locally credible epistemic norms are supposed to have any positive epistemic status, the positive status the relativist wants to preserve when insisting that epistemic norms aspire to relative justification.

For an alternative perspective on how relativism might be better motivated than scepticism, generally speaking, see Michael Williams (for example, 1991; 2001), who defends an anti-sceptical form of relativism (though he rejects this label), specifically a Wittgensteinian-inspired brand of contextualism (compare DeRose 1992), as an alternative to both scepticism and metaepistemological realism.

3. Traditional Arguments for Epistemic Relativism: Non-Neutrality

Another kind of argument for traditional epistemic relativism is what Harvey Siegel (2011: 205) has termed the non-neutrality argument. A much-discussed reference point for this argument strategy is Rorty’s (1979) discussion of the famous dispute between Galileo and Cardinal Bellarmine about Copernican heliocentrism. In short, Galileo and Cardinal Bellarmine could not agree about the truth of Copernican heliocentrism, but even more, they also could not agree about what evidential standards were even relevant to settling the matter. Galileo had argued for the Copernican picture on the basis of telescopic evidence. Cardinal Bellarmine dismissed Galileo’s suggestion that Earth revolves around the sun as heretical, by appeal to Scripture. From these disparate starting points, Rorty noted, it looked as though neither was in a position to appeal to neutral ground in the service of rational adjudication—each was operating within a different “grid which determines what sorts of evidence there could be for statements about the movements of the planets” (Rorty 1979: 330-331).

Siegel (2011: 205-206) captures, with reference to this case, the relativist’s reasoning as follows:

The relativist here claims that there can be no non-relative resolution of the dispute concerning the existence of the moons, precisely because there is no neutral, non-question-begging way to resolve the dispute concerning the standards. Any proposed meta-standard that favors regarding naked eye observation, Scripture, or the writings of Aristotle as the relevant standard by which to evaluate “the moons exist” will be judged by Galileo as unfairly favoring his opponents since he thinks he has good reasons to reject the epistemic authority of all these proposed standards; likewise, any proposed metastandard that favors Galileo’s preferred standard, telescopic observation, will be judged to be unfair by his opponents, who claim to have good reasons to reject that proposed standard. In this way, the absence of neutral (meta-) standards seems to make the case for relativism.

The pro-relativist argument that is motivated by the Galileo/Bellarmine dispute, which Siegel (2011: 206) calls “No Neutrality, Therefore Relativism”, as represented in Siegel’s passage, can be pared down to the following argument:

“No Neutrality, Therefore Relativism”

  1. There can be a non-relative resolution of the dispute concerning the existence of the moons, only if there is an appropriately neutral meta-norm available.
  2. In the context of the dispute between Galileo and Bellarmine, no such metanorm is available.
  3. Therefore, it is not the case that there can be a non-relative resolution of the dispute concerning the existence of the moons.
  4. Therefore, epistemic relativism is true.

As stated, the argument is not valid. To make it valid, a further ‘bridge’ premise (or premises) would be needed to get from (3), the intermediate conclusion that there can be no non-relative resolution of the dispute concerning the moons (or some similar such dispute), to the conclusion (4) that epistemic relativism is true.
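The invalidity can be checked mechanically. The following is a minimal sketch that treats the argument as a piece of propositional logic and searches all truth-value assignments for a counterexample. The propositional letters R, N and E, the helper names, and the particular bridge premise "if not R then E" are assumptions made for illustration; they are one simple way of rendering the argument, not Siegel's own formalization.

```python
from itertools import product

def implies(a, b):
    """Material conditional: 'if a then b'."""
    return (not a) or b

def is_valid(premises, conclusion, n_vars=3):
    """An argument is valid iff no assignment makes every premise true
    and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Hypothetical propositional letters (assumptions of this sketch):
# R: a non-relative resolution of the dispute is possible
# N: an appropriately neutral meta-norm is available
# E: epistemic relativism is true
premise_1 = lambda R, N, E: implies(R, N)      # (1) R only if N
premise_2 = lambda R, N, E: not N              # (2) no neutral meta-norm
step_3 = lambda R, N, E: not R                 # (3) no non-relative resolution
step_4 = lambda R, N, E: E                     # (4) relativism is true
bridge = lambda R, N, E: implies(not R, E)     # one candidate bridge premise

follows_3 = is_valid([premise_1, premise_2], step_3)
follows_4_as_stated = is_valid([premise_1, premise_2], step_4)
follows_4_with_bridge = is_valid([premise_1, premise_2, bridge], step_4)

print(follows_3, follows_4_as_stated, follows_4_with_bridge)
```

On this rendering, (3) follows from (1) and (2), but (4) follows only once a bridge premise is added. Whether any such bridge premise is plausible is a separate, philosophical question that the check does not address.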

What are the prospects of ‘bridging’ (3) and (4)? The viability of a no-neutrality therefore relativism-style argument rests importantly on this question. Steven Hales (2014) defends a version of the no-neutrality therefore relativism argument which attempts to bridge the gap between (3) and (4) via a process of elimination. Hales argues, with reference to a case involving a similarly deadlocked dispute concerning the nature of the human soul (by interlocutors who adhere to analytic philosophy of mind and the Catechism, respectively), that from their irreconcilable position the salient options for resolving the dispute are: (i) keep arguing until capitulation, (ii) compromise, (iii) locate an ambiguity or contextual factors, (iv) accept scepticism, or (v) adopt relativism (Hales 2014: 63). Relativism is defended by Hales as the most satisfactory option.

Carter (2015, Ch. 4) has criticised this strategy. For one thing, appealing to relativism’s success as a disagreement-resolution strategy doesn’t obviously help move one from (3) to (4). For example, even if both parties can easily resolve their disagreement by adopting the belief that relativism is true, relativism might just as well be false. More generally, that interlocutors’ accepting something X is efficacious in resolving a dispute is not satisfactory grounds for thinking X is true or even probably true. Furthermore, Hales’ process of elimination strategy dismisses skepticism out of hand as “throwing in the towel.” However, this just reinvites the issue of why relativism should be regarded (in the face of the no-neutrality, therefore relativism argument) as motivated over skepticism. As with Sankey’s redeployment of the Pyrrhonian argument considered in Section 2, it is not clear how this is so.

It is worth noting that the no-neutrality therefore relativism argument is but one way philosophers have attempted to motivate relativism by pointing to disagreements. Another route is to appeal to what Max Kölbel (2003) calls “faultless disagreements” (for example, apparently genuine disagreements in some discretionary area of discourse where it seems neither party to the disagreement has made a mistake). These faultless disagreement strategies which appeal to disagreements to motivate relativism, and the neutrality-based strategy considered in this section, are only superficially similar. Unlike the no-neutrality, therefore relativism argument, faultless-disagreement arguments simply do not regard properties of any particular disagreement (for example, the disagreement between Bellarmine and Galileo) as in the market for establishing epistemic relativism. Faultless disagreement-style arguments reason from semantic and pragmatic evidence about disagreement patterns, much more generally, to the conclusion that a relativist semantics (in certain domains where we find such disagreements) best explains our practices of attributing certain terms. This kind of argument is discussed in more detail in Section 5, as it is an argument strategy used by new (semantic) epistemic relativists.

4. Traditional Arguments for Epistemic Relativism: Incommensurability and Circularity

A third kind of argument which has motivated versions of epistemic relativism appeals to incommensurability and epistemic circularity. The idea is that, upon confronting radically different epistemic systems (for example, radically different Kuhnian paradigms, Wittgensteinian framework propositions or individuals who employ what Ian Hacking (1982) calls alien ‘styles of reasoning’), we are called upon to justify not just ordinary beliefs as we usually do, but rather the very epistemic system (that is, the set of epistemic principles or rules) within which our epistemic evaluations are made. However, once we begin to attempt to justify our own epistemic system, epistemic circularity threatens. Michael Williams (2007: 3-4) expresses the idea on behalf of the relativist as follows:

In determining whether a belief—any belief—is justified, we always rely, implicitly or explicitly, on an epistemic framework: some standards or procedures that separate justified from unjustified convictions. But what about the claims embodied in the framework itself: are they justified? In answering this question, we inevitably apply our own epistemic framework. So, assuming that our framework is coherent and does not undermine itself, the best we can hope for is a justification that is epistemically circular, employing our epistemic framework in support of itself. Since this procedure can be followed by anyone, whatever his epistemic framework, all such frameworks, provided they are coherent, are equally defensible (or indefensible).

There are really two ‘key moves’ in this line of thinking. The first key move contends that, in the face of epistemic systems radically different from our own, our activity of attempting to justify our own epistemic system will lead to epistemic circularity. The second key move adverts to the claim that all attempts to justify epistemic systems result in epistemic circularity, and from this claim draws the relativist-friendly conclusion that all epistemic systems are equally defensible, or on a par.

The first move, stated more carefully, seems to be that, when an individual S is in a position where S is trying to justify S’s own epistemic framework or system, X, by attempting to justify the claims that comprise the system (x1 … xn), then: (i) S must (inevitably) apply that system (X); and (ii) the application, by S, of a system X to justify the claims (x1 … xn) of that very system, X, is sufficient for leaving S’s epistemic justification for the claims of X (x1 … xn) circular.

From here, it is helpful to note three central issues which are relevant to the success of this kind of ‘pro-relativist’ strategy, in so far as the kind of epistemic circularity that is supposed to materialise via the application of a system in its own defence is itself of a sort that will leave all epistemic systems equally defensible. The first two issues concern the first key move and the third concerns the second key move.

Firstly, note that it seems in principle possible to pre-empt epistemic circularity altogether by simply rejecting that the justification of S’s epistemic framework depends on S’s ability to non-circularly justify that framework. Consider, for example, the line an externalist reliabilist might take. The process reliabilist (for example, Goldman 1979) might say that the epistemic principles constituting S’s epistemic system (X) are justified simply provided they are reliable and regardless of whether one can successfully justify or know that they are reliable. Compare here the reliabilist’s commitment to basic knowledge— that is to say, that S can know p even though S has no antecedent knowledge that the process R that produced S’s belief is reliable. Likewise, as this idea goes—at greater generality—the reliabilist is in a position to submit that any positive epistemic status which the belief that our own epistemic principles are correct has does not depend on any antecedent facts about our appreciation that they have this status. The reliabilist attempts to undercut the circularity objection then by mooting it.

Two salient replies to this line of reasoning have to do with assertion and bootstrapping, respectively. Regarding assertion: as Mikkel Gerken (2012, 379) has suggested, although some conversational contexts are ones where “S may assert something although S is unable to provide any reason for it”, other contexts may not be permissive in this way. Discursive contexts are, on Gerken’s view, ones where “interlocutors share a presupposition that an asserter must be able to back up unqualified assertions by reasons… and in which being a cooperative speaker involves being sensitive to reasons for and against what is asserted” (2012, 379). Gerken’s position is that, in such contexts, epistemically appropriate assertion must be discursively justified, where discursive justification is something S possesses only if S is able to articulate some epistemic reasons for believing that p. But if this is right, then there is a case to make that while an externalist line such as the one sketched above cuts epistemic circularity off at the pass, it does so in a way that would effectively leave one in no position to claim (in the face of a challenge from an interlocutor with a radically different epistemic system) to know that one’s own system is correct.

A second salient kind of reply to the externalist move is to suggest, in short, that even if (with reference to the Williams passage quoted above) it looks as though epistemic circularity materialises only once one uses the epistemic principles constituting one’s own epistemic system in the service of justifying it, this might be misleading. The idea here is that if one attempts to cut this kind of epistemic circularity off at the pass, by opting for the reliabilist move sketched above, then one at the same time (at least potentially) encounters what is allegedly another malignant form of epistemic circularity in the form of bootstrapping (for example, Vogel 2000): one would be in a position to acquire, via the deliverances of applying one’s own epistemic principles, track-record evidence that the application of those very principles is reliable. This point, in conjunction with the previous point about assertion, suggests that the kind of circularity problem Williams intimates can’t be simply circumvented by ‘going externalist’ without also incurring some further challenges.

So the viability of an attempt to block epistemic circularity ex ante by “going externalist” was the first of three issues to highlight as relevant to the viability of the kind of argument strategy Williams describes. The second issue concerns the nature of the epistemic circularity in question, which on this line of argument is said to materialise when one attempts to justify one’s epistemic system by appealing to it. Consider that there are in fact two very different kinds of ways in which one might apply an epistemic principle or rule in the service of justifying one’s epistemic system (where, again, the epistemic system is understood as a set of epistemic principles).

Firstly, one might apply a principle by simply following it (for example as when one might follow an inference rule in the service of justifying that inference rule or perhaps justifying the epistemic system of which the inference rule is a part). See Boghossian (2001). However, just as a judge might apply a rule (consider, the rule that ‘one must drive only with a license’) not by following the rule but by invoking its authority (for example McCallum 1966), one might also apply one’s own epistemic principle or principles not by following them but by invoking their authority. For example, one might attempt to justify inference to the best explanation (IBE) by invoking the authority of the wider system of epistemic principles within which IBE belongs: Western Science.

The overarching point here is that the kind of epistemic circularity that materialises as a function of one’s appealing to one’s epistemic system in the service of justifying it can take on different shapes—with different kinds of premise-conclusion dependence relations. Accordingly, an argument that attempts to move from epistemic circularity to relativism must thus be appropriately sensitive to these different shapes epistemic circularity can potentially take on when one applies one’s own epistemic system in the service of justifying it. This is because it is not obvious that all such shapes are equally epistemically objectionable. (For discussion on this point, see Pryor 2004 and Wright 2007).

The third issue to raise concerns the second ‘key move’ in the sequence Williams describes: the move that is supposed to get us from circularity to relativism. Even on the assumption that the kind of epistemically circular justification one is left with for one’s own epistemic principles (and more generally, one’s epistemic system) renders all epistemic principles on an ‘equal footing’, this equal-footing outcome is compatible with both scepticism and relativism. An argument successfully establishes epistemic relativism from the position described only if it provides a non-arbitrary reason to embrace relativism over scepticism.

5. New (Semantic) Epistemic Relativism: Assessment-Sensitive Semantics for ‘Knows’

One recurring objection-type to traditional arguments for epistemic relativism (of the sort surveyed in §§2-4) is that these arguments face a shared difficulty when it comes to showing why, in light of the philosophical considerations adverted to, relativism is at the end of the day a more attractive option than skepticism. New (semantic) epistemic relativism doesn’t face this kind of challenge. This is because new (semantic) relativism (hereafter, new relativism) is motivated on the basis of very different kinds of philosophical considerations than the argument strategies considered in §§2-4.

The present section is organised as follows: two preliminary points about new relativism are first noted, and then MacFarlane’s most substantial (2014) argument for an assessment-sensitive semantics for “knows” is outlined; it is an argument that depends on two key premises, and MacFarlane’s rationale for defending these premises is discussed in some depth. Note that while there are other ways of motivating semantic relativism that do not appeal explicitly to ‘contexts of assessment’ (for example, Richard 2004; Egan 2007), which is MacFarlane’s distinctive terminology, I am in what follows focusing on MacFarlane’s presentation, as it is the most developed.

That said, the first preliminary point to note concerns the relationship between epistemic contextualism and relativism. As was noted in section 1, epistemic contextualism is—by MacFarlane’s lights—not on the interesting side of the line between absolutism and relativism. The point to stress here is that while the contextualist can, no less than the relativist, recognize a ‘standards’ parameter (and in this respect can allow the extension of “knows” to vary with standards), for the contextualist, its value will be supplied by the context of use, whereas the relativist (proper) takes it to be supplied completely independently of the context of use, by the context of assessment.
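The structural difference between the two views can be put in a toy model. The sketch below is a deliberate simplification: representing an epistemic standard as an integer threshold, representing the subject's epistemic position as a single number, and the names `Context`, `contextualist_truth` and `relativist_truth` are all assumptions made for illustration, not part of MacFarlane's formal semantics.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    """A context, reduced to the one feature that matters here:
    the epistemic standard in play (higher = more demanding)."""
    standard: int

def meets(evidence_strength: int, standard: int) -> bool:
    """Toy satisfaction condition for 'knows' at a given standard."""
    return evidence_strength >= standard

def contextualist_truth(evidence_strength: int,
                        use: Context, assessment: Context) -> bool:
    # Contextualism: the standard is supplied by the context of use;
    # the context of assessment plays no semantic role.
    return meets(evidence_strength, use.standard)

def relativist_truth(evidence_strength: int,
                     use: Context, assessment: Context) -> bool:
    # Relativism (proper): the standard is supplied by the context of
    # assessment, independently of the context of use.
    return meets(evidence_strength, assessment.standard)

low, high = Context(standard=1), Context(standard=3)
keith_evidence = 2  # held fixed throughout

# One and the same use (made in a low-standards context),
# assessed from a low-standards and a high-standards context:
ctx_low = contextualist_truth(keith_evidence, use=low, assessment=low)
ctx_high = contextualist_truth(keith_evidence, use=low, assessment=high)
rel_low = relativist_truth(keith_evidence, use=low, assessment=low)
rel_high = relativist_truth(keith_evidence, use=low, assessment=high)
print(ctx_low, ctx_high, rel_low, rel_high)
```

In the model, the contextualist verdict on a fixed use is the same whoever assesses it, which is the sense in which the use gets its truth value absolutely, whereas the relativist verdict on that same use changes with the assessor's standard.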

The second preliminary remark concerns the rationale for embracing a MacFarlane-style relativist semantics for “knows”, which should be understood as differing from the kind of rationale we find in Lewis’s (1980) and Kaplan’s (1989) foundational work in semantics, according to which sentence truth was relativized to familiar parameters such as worlds, times and locations. The important point here is that while Lewis’s and Kaplan’s reasons for “proliferating” parameters were primarily based on considerations to do with intensional operators, the more contemporary reasons (for example, as appealed to by MacFarlane and other ‘new relativists’) for adding a standards parameter (that is, in the context of assessment) are often to do with respecting linguistic use data, such as disagreement data (see, for example, Baghramian and Carter 2015). For example, those who endorse truth-relativism about predicates of personal taste (for example, Lasersohn 2005; Kölbel 2003; MacFarlane 2014) take a truth-relativist semantics to better explain our patterns of using terms like “tasty” than do competing contextualist, sensitive and insensitive invariantist semantics. Accordingly, defending new relativism typically involves, for some area of discourse D, a philosophical comparison of costs and benefits of different competing semantic approaches to the relevant D expressions, replete with a case for thinking that the truth-relativist approach performs the best, all things considered. A familiar advantage claimed by a MacFarlane-style truth-relativist is that the kind of ‘subjectivity’ (for example, standards-dependence) the contextualist claims the traditional invariantist cannot explain can be captured by the relativist without (or so the relativist tells us) “losing disagreement”, where losing disagreement is a stock objection to contextualism in areas where disagreements appear genuine.

In three different places, MacFarlane (2005, 2009, 2014) has argued that knowledge attributions of the form “S knows that p” are assessment-sensitive. The focus of his presentation has varied across these three defenses of the view, but one core strand of thought resurfaces each time.

For convenience, we can call this core strand MacFarlane’s “master argument” for an assessment-sensitive semantics for knowledge attributions.

Master Argument for Assessment Sensitive Semantics for Knowledge Attributions

(1) Standard invariantism, contextualism and SSI all have advantages and weaknesses.

(2) Relativism preserves the advantages while avoiding the disadvantages.

(3) Therefore, prima facie, we should be relativists about knowledge attributions.

The remainder of this section attempts to show why MacFarlane thinks that premises (1) and (2) of the master argument are true, and thus why he thinks we should embrace a relativist treatment of “knows”. The discussion to this end draws primarily from MacFarlane’s latest presentation of his relativist treatment of “knows”, one which gives the notion of relevant alternatives a central place.

Question: Why should we think (1) is true? As MacFarlane sees things, each of the three standard views of the semantics of knowledge-attributions—standard invariantism, contextualism and subject-sensitive invariantism (SSI)—has a grain of truth to it, as well as an “Achilles heel: a residuum of facts about our use of knowledge attributions that it can explain only with special pleading” (2005, 197).

His latest way of making this point relies on a kind of sceptical “conundrum”, one which arises in light of our ordinary practices of attributing knowledge, and which he uses as a frame of reference for magnifying what he regards as the salient weaknesses of the three standard views.

MacFarlane’s Conundrum: If you ask me whether I know that I have two dollars in my pocket, I will say that I do. I remember getting two dollar bills this morning as change for my breakfast; I would have stuffed them into my pocket, and I haven’t bought anything else since. On the other hand, if you ask me whether I know that my pockets have not been picked in the last few hours, I will say that I do not. Pickpockets are stealthy; one doesn’t always notice them. But how can I know that I have two dollars in my pocket if I don’t know that my pockets haven’t been picked? After all, if my pockets were picked, then I don’t have two dollars in my pocket. It is tempting to concede that I don’t know that I have two dollars in my pocket. And this capitulation seems harmless enough. All I have to do to gain the knowledge I thought I had is check my pockets. But we can play the same game again. I see the bills I received this morning. They are right there in my pocket. But can I rule out the possibility that they are counterfeits? Surely not. I don’t have the special skills that are needed to tell counterfeit from genuine bills. How, then, can I know that I have two dollars in my pocket? After all, if the bills are counterfeit, then I don’t have two dollars in my pocket (2014: 177).

MacFarlane articulates the form of the conundrum-argument as follows:

(i) p obviously entails q. [premise]

(ii) If a knows that p, then a could come to know that q without further empirical investigation. [i, Closure]

(iii) a does not know that q and could not come to know that q without further empirical investigation. [premise]

(iv) Hence a does not know that p. [ii, iii, modus tollens]
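
The logical shape of the argument can be set out schematically. The following is only a sketch: the operators K_a (“a knows that”) and C_a (“a could come to know, without further empirical investigation, that”) are notation introduced here for illustration and are not MacFarlane’s own.

```latex
\begin{align*}
\text{(i)}\quad   & p \text{ obviously entails } q            && \text{premise} \\
\text{(ii)}\quad  & K_a\,p \rightarrow C_a\,q                 && \text{from (i), Closure} \\
\text{(iii)}\quad & \neg K_a\,q \;\wedge\; \neg C_a\,q        && \text{premise} \\
\text{(iv)}\quad  & \neg K_a\,p                               && \text{from (ii) and (iii), modus tollens}
\end{align*}
```

On this rendering, only the second conjunct of (iii) is needed for the modus tollens step; the sceptical conclusion can be blocked only by giving up (iii) or the Closure step that yields (ii).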

Standard (insensitive) invariantism, the view that the epistemic standards that must be met for “S knows p” to be true are not (in any way) context sensitive, faces two central problems, by MacFarlane’s lights. Both problems are familiar. Firstly, standard invariantism has trouble making sense of the variability of our willingness to attribute knowledge. Secondly, standard invariantism seems stuck with an unhappy choice among three options: embracing scepticism (if the invariantist simply accepts (iv)), embracing dogmatism (if the invariantist tries to avoid the sceptical conclusion (iv) by rejecting (iii)), or rejecting the closure principle which licenses the move from (i) to (ii), that is to say, the principle that (as MacFarlane states it): ‘if a knows that p, and p obviously entails q, then a could come to know q without further empirical investigation’ (2014, 177).

By contrast, contextualism offers a way to avoid each of these problems facing standard invariantism. Unlike the invariantist, whose position is in tension with data about the variability of our willingness to attribute knowledge, the contextualist has an explanation to offer for this variability: namely, our willingness to attribute knowledge varies across contexts because what is meant by “knows” is sensitive to the context in which it is used. As MacFarlane writes, “on the most natural form of this view, ‘knowing’ that p requires being able to rule out contextually relevant alternatives to p. Which alternatives are relevant depends on the context”. For instance, and with reference to MacFarlane’s Conundrum, when I’m first asked whether I know p (that I have two dollars in my pocket), ‘knowing’ that p requires only that I be able to rule out very basic alternatives (for example, that I have already spent the two dollars); I needn’t also be able to rule out that my pockets have been picked in order to count as ‘knowing’ (Ibid., p. 177). When someone asks me whether my pockets have been picked, however, ‘knowing’ requires ruling out this alternative, and if I can’t, then the standard required for ‘knowing’ in this context is not met. Contextualism not only makes sense of the variability of our willingness to attribute knowledge; it also avoids the unpalatable choice facing standard invariantism: reject closure or embrace scepticism or dogmatism. As the standard line goes, contextualists needn’t be tarred as sceptics or dogmatists because they can in fact preserve closure, at least within any one context of use. So contextualism is looking pretty good.

However, although treating “knows” like “tall”, where the meaning of “knows” depends on the context in which it is being used, offers a nice escape route (vis-à-vis MacFarlane’s Conundrum), there are other respects in which treating “knows” like “tall” raises new problems. For example, an apparent disagreement between A and B about whether Michael Jordan is tall is quickly revealed to be no disagreement at all when it is clear to both parties that A means “tall for a given person” and B means “tall for an NBA player”. However, as MacFarlane notes, things are different with “knows”. He writes:

If I say “I know that I have two dollars in my pocket,” and you later say, “You didn’t know that you had two dollars in your pocket, because you couldn’t rule out the possibility that the bills were counterfeit,” I will naturally take your claim to be a challenge to my own, which I will consider myself obliged either to defend or to withdraw. It does not seem an option for me to say, as the contextualist account would suggest I should: “Yes, you’re right, I didn’t know. Still, what I said was true, and I stick by it. I only meant that I could rule out the alternatives that were relevant then.” Similarly, the skeptic regards herself as disagreeing with ordinary knowledge claims—otherwise skepticism would not be very interesting. But if the contextualist is right, this is just a confusion (Ibid., p. 181; compare, Vogel 1990).

And here is where the special pleading comes in. The contextualist can attempt to say that our taking each other to agree or disagree in the relevant kinds of cases is just a mistake of some sort. But, as MacFarlane sees it, this is a double-edged sword: the more speaker error the contextualist must posit to explain the way we use “knows”, the less the contextualist can rely on the way we use “knows” to support contextualism. While contextualism does better than standard invariantism in that it avoids the dilemma raised for standard invariantism, standard invariantism makes better sense of disagreement.

By contrast with insensitive invariantism and contextualism, subject-sensitive invariantism (‘SSI’) might have the best offer to make yet. According to SSI, whether my utterance of “Archie knows that his car is in the parking lot” is true does depend on context, though in a different sense than it does for the contextualist: rather than depending on what alternatives I (the utterer of the sentence) can rule out (for example, whether or not I know there are no thieves lurking nearby), what matters on SSI is whether Archie, the subject of the knowledge attribution, can rule out the alternatives relevant to his practical environment. This proposal has some advantages. For one thing, the ‘SSIist’ looks well-positioned to make sense of disagreement, given that ‘knows’ is not being treated like ‘tall’. Further, the SSIist, unlike the insensitive invariantist, can make sense of variability in willingness to attribute knowledge. For SSI, the special pleading comes in with temporal and modal embedding.

The alleged problem (see, for example, Blome-Tillmann 2009) for SSIists is this: temporal and modal operators shift the circumstances of evaluation in such a way that, if SSI is true, we should expect that (in cases of temporal and modal embeddings of “know”) knowledge attributions will track whether the subject can rule out alternatives relevant in the subject’s practical environment in the (temporally or modally shifted) circumstance of evaluation. But this prediction doesn’t seem to pan out, as speakers are inclined to regard the same alternatives as relevant when evaluating non-embedded and embedded uses of “know”.

As MacFarlane sees it, I will not be inclined to say either of the following, which the SSIist predicts I should be willing to say:

Temporal embedding: I know that I had two dollars in my pocket after breakfast, but I didn’t know it this morning, when the possibility of counterfeits was relevant to my practical deliberations—even though I believed it then on the same grounds that I do now.

Modal embedding: I know that I have two dollars in my pocket, but if the possibility of counterfeiting were relevant to my practical situation, I would not know this—even if I believed it on the same grounds as now.

The moral of the story—though see Stanley (2016) for a reply on behalf of the SSIist—is supposed to be that, while each of the three leading competitor views does better than others in some respects, none of these views can make sense of our willingness to attribute knowledge without some sort of Achilles heel. And that is more or less MacFarlane’s defense of (1) in the master argument.

What about premise (2)? Premise (2) of the master argument, recall, says that:

(2) Relativism preserves the advantages while avoiding the disadvantages.

Toward the end of defending (2), MacFarlane suggests that what we want is a semantics for knowledge attributions that satisfies the following three key desiderata, desiderata such that (as he takes himself to have established in defending (1)) none of the three leading contender views can satisfy all of them:

Alternative-variation: the alternatives one must rule out to count as knowing must vary with context in some way (otherwise, the view faces the dilemma facing insensitive invariantism, with respect to MacFarlane’s conundrum).

Alternative-variation independent of context of use: the alternatives one must rule out to count as knowing must not vary with context of use (otherwise: disagreement cannot be preserved, a la contextualism).

Alternative-variation independent of circumstances of subject: the alternatives one must rule out to count as knowing must not vary with circumstances of the subject to whom knowledge is ascribed (otherwise: temporal and modal embeddings cannot be made sense of, a la SSI).

Here is where the relativist is said to come to the rescue. The first step is to preserve alternative variation by taking the relevant alternatives to be determined by the context of assessment. As MacFarlane puts it:

The resulting view would agree with contextualism in its predictions about when speakers can attribute knowledge, since when one is considering whether to make a claim, one is assessing it from one’s current context of use. So it would explain the variability data as ably as contextualism does, and offer the same way of rescuing closure from the challenge posed by the conundrum. But it would differ from contextualism in its predictions about truth assessments of knowledge claims made by other speakers, and about when knowledge claims made earlier must be retracted. Moreover … it would vindicate our judgments about disagreement between knowledge claims across contexts (MacFarlane 2014, 188).

What about the temporal and modal embedding problem that faced SSI? Relativism, he argues, dodges this problem because the set of contextually relevant alternatives is added to the index as a parameter distinct from the world and time parameters. Shifting the world and time parameters (for example, when ‘knows’ is temporally or modally embedded) therefore does not also shift the relevant-alternatives parameter (Ibid., 188). On the resulting picture:

the relation “knows” expresses does not vary with the context—there is just a single knowing relation—but the extension of that relation varies across relevant alternatives. As a result, it makes sense to ask about the extension of “knows” only relative to both a context of use (which fixes the world and time) and a context of assessment (which fixes the relevant alternatives). (Ibid., 189).
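
The division of labour described in this passage can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions, not MacFarlane’s formal semantics: the class names, the `knows` function and the data tables are all hypothetical, and “ruling out” an alternative is modelled crudely as set membership. The context of use supplies the world and time; the context of assessment supplies the relevant alternatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextOfUse:
    """Fixes the world and time at which the attribution is evaluated."""
    world: str
    time: str


@dataclass(frozen=True)
class ContextOfAssessment:
    """Fixes which alternatives count as relevant for the assessor."""
    relevant_alternatives: frozenset


# Hypothetical toy data for p = "I have two dollars in my pocket":
# whether p is true, and which alternatives the subject's evidence
# rules out, at each (world, time) pair.
P_IS_TRUE = {
    ("actual", "before checking"): True,
    ("actual", "after checking"): True,
}
RULED_OUT = {
    ("actual", "before checking"): frozenset({"already spent"}),
    ("actual", "after checking"): frozenset({"already spent", "pockets picked"}),
}


def knows(use: ContextOfUse, assessment: ContextOfAssessment) -> bool:
    """True iff p holds at the world and time of the context of use and the
    subject's evidence there rules out every alternative that is relevant
    at the context of assessment."""
    point = (use.world, use.time)
    return P_IS_TRUE[point] and assessment.relevant_alternatives <= RULED_OUT[point]


before = ContextOfUse("actual", "before checking")
after = ContextOfUse("actual", "after checking")
everyday = ContextOfAssessment(frozenset({"already spent"}))
pickpocket = ContextOfAssessment(frozenset({"already spent", "pockets picked"}))
counterfeit = ContextOfAssessment(
    frozenset({"already spent", "pockets picked", "bills counterfeit"})
)

print(knows(before, everyday))     # True
print(knows(before, pickpocket))   # False
print(knows(after, pickpocket))    # True
print(knows(after, counterfeit))   # False
```

In this sketch, moving from `before` to `after` changes the point of evaluation but leaves the relevant alternatives untouched, which mirrors the claim that temporal embedding does not shift the relevant-alternatives parameter. Holding the context of use fixed and varying only the assessment context changes the verdict, which is the sense in which the extension of “knows” is assessment-sensitive.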

MacFarlane takes the view he has proposed to be one that escapes the sceptical conundrum while threading the needle so as to avoid both the disagreement problem that faces contextualists and the temporal and modal embedding problem that faces SSI. At this stage, we can see why MacFarlane thinks his view has all the advantages and none of the disadvantages. This concludes the presentation of MacFarlane’s defense of premise (2) of the master argument. And from (1) and (2) it follows that “knows” gets a relativist treatment.

6. New (Semantic) Epistemic Relativism: Issues and Implications in Epistemology

Is MacFarlane’s argument sound? Interestingly, this is relatively new terrain. The above line of argument is from 2014, so there has yet to be substantial criticism in the literature of this new form of relativism. See, however, Carter (2016, Ch. 7) for criticisms of MacFarlane’s (2014) view to the effect that the view generates the wrong results in cases of environmental epistemic luck and normative defeaters.

In this section, however, the focus is on implications in epistemology of embracing an assessment-sensitive semantics for “knows.” MacFarlane concludes his 2009 defense of an assessment-sensitive semantics for “knows” with a section entitled “Questions for the Relativist.” One question he asks, in light of his recommendation to extend a truth-relativist semantics to “knows”, is: “are there other expressions for which a relativist treatment is needed? How does know relate to them?” (MacFarlane 2009: 16). A more specific version of this question is: if “know” gets a truth-relativist semantics, then since knowledge relates intimately with other epistemic concepts, do any other epistemic concepts need a relativist treatment? This is an important question and one which has obvious implications for the wider shape new epistemic relativism would take.

In tracing out epistemological ramifications of a relativist treatment of ‘knows’ in epistemology, it is helpful to begin with especially tight conceptual connections (between knowledge and other epistemic standings) and move outward from there. This section takes as a starting point two such connections: namely, connections between propositional knowledge and (i) evidence; and (ii) knowledge-how (for a more detailed discussion, see Carter 2017).

Firstly, evidence. Consider, as an example case, Williamson’s (2000) knowledge-evidence equivalence: E=K. Suppose, for reductio, that E=K, and further, that the truth-conditions for E are not assessment-sensitive, but the truth-conditions for K are. The resulting position would be at best untenable and at worst contradictory. While of course Williamson’s view is controversial, it seems that if Williamson is right that our evidence is what we know, and thus that S’s evidence includes E if, and only if, S knows E, then one who embraces a relativist semantics for (propositional) knowledge ascriptions should be willing to embrace the view that evidence ascriptions are assessment-sensitive.

E=K is, to repeat, a controversial position. The above point, however, was meant only to illustrate one very direct way in which a commitment to giving a relativist treatment to “knows” would have implications for epistemological theory.

Let us move from a straightforward equivalence thesis (as was E=K) to a reductivist thesis. We needn’t look further than the most standard contemporary version of intellectualism about knowledge-how. Reductivist versions of intellectualism (compare Bengson & Moffett (2011)) insist that knowing how to do something is just a species of propositional knowledge (Stanley 2010, 207). As Stanley puts it:

[…] you know how to ride a bicycle if and only if you know in what way you could ride a bicycle. But you know in what way you could ride a bicycle if and only if you possess some propositional knowledge, viz. knowing, of a certain way w which is a way in which you could ride a bicycle, that w is a way in which you could ride a bicycle (Ibid., 209).

Like Williamson’s E=K thesis, Stanley’s reduction of knowledge-how to a kind of knowledge-that is also controversial, though very much a live and increasingly popular view in contemporary epistemology. Suppose, for reductio, that knowing how to do something is (a la Stanley) just a kind of propositional knowledge, and further, that the truth-conditions for knowing how to do something (for example, as in the case of attributions of the form “Hannah knows how to ride a bike”) are not assessment-sensitive, but the truth-conditions for propositional knowledge are, such that “Hannah knows p” is assessment-sensitive, where p is a proposition specifying, of a way w which is a way in which Hannah could ride a bicycle, that w is a way in which Hannah could ride a bicycle. Again, the resulting position would be at best untenable and at worst contradictory.

What the foregoing brief consideration of evidence and knowledge-how indicates is that, at least for those with certain substantive commitments in epistemology where epistemic standings other than knowledge are either identified with or in some way reduced to (a kind of) propositional knowledge, an extension of an assessment-sensitive semantics to these standings as well looks potentially unavoidable. One interesting future direction of research will be to trace out the implications of a relativist semantics for “knows” even further, by moving outward to epistemic standings with (perhaps) looser but not insignificant conceptual connections to knowledge, such as justification, rationality, understanding and intellectual virtue. See Carter (2014; 2015, Ch. 8) for some discussion here. A further complementary direction for future research will be to consider how other notions, besides “knows”, for which a relativist semantics has been proposed might have implications in epistemology. A natural candidate expression here is “ought” (for example, Kolodny and MacFarlane 2010; MacFarlane 2014, Ch. 11). In short, if the moral “ought” gets a relativist treatment, it is hard to see how the epistemic “ought” would not likewise. However, if the epistemic “ought” is relative, then this has ramifications for epistemic normativity more generally. For example, if whether one ought to believe something is a relative matter, then plausibly, whether one is justified in believing something is a relative matter. Likewise, if epistemic oughts are relative, then presumably so will be the epistemic norms which generate epistemic oughts.

A relativist treatment of “knows” also stands to have interesting implications for epistemologists concerned with how the function played by the concept of knowledge might inform our theory of knowledge. A flourishing contemporary research program within mainstream epistemology, one which Robin McKenna (2013) has called the “functional turn” in epistemology, takes as a starting point that “a successful analysis of knowledge must also fit with an account of the distinctive function or social role that the concept plays in our community […] Call this the ‘functional turn’ in epistemology” (McKenna 2013: 335-336). Participants in the functional turn in epistemology appeal to practical explications of the concept of knowledge, on the basis of which they identify a function, where that function is regarded as generating an ex ante constraint on an analysis of knowledge (or a semantics of knowledge attributions). Henderson (2009; 2011), McKenna (2013; 2014), Pritchard (2012) and Hannon (2013; 2014; 2015) have for instance defended views about the concept of knowledge (or knowledge ascriptions) inspired by Craig’s (1990) favoured account of the function of knowledge as identifying good informants. By contrast, Kappel (2010), Kelp (2011) and Rysiew identify closure of inquiry as the relevant function and regard this rather than Craig’s tracking-good-informants function as generative of an ex ante constraint for theorizing about knowledge and its truth-conditions. For Krista Lawlor (2013) the relevant function is identified (a la Austin) as that of providing assurance.

Can “knows”, given a relativist treatment, potentially play (any of) these widely identified functional roles, that is, of identifying reliable informants, marking the closure of inquiry or providing assurance? This is an open question for future research.

Finally, and much more generally, semantic (new) relativism about “knows” raises some interesting metaepistemological issues. Mainstream epistemologists, by and large, take for granted within epistemological theory that the explanandum under the description of “knowledge” is not relative. If the ordinary concept of knowledge, however, requires a relativist treatment, then this presses the complicated issue of whether the ordinary concept of knowledge and the concept of interest to epistemologists are the same, and (even more generally) just how knowledge attributions should inform the theory of knowledge.

7. References and Further Reading

  • Baghramian, Maria. Relativism. London: Routledge, 2004.
  • Baghramian, Maria. The Many Faces of Relativism. London: Routledge, 2014.
  • Baghramian, Maria and Carter, J. Adam. “Relativism.” Stanford Encyclopedia of Philosophy, 2015. http://plato.stanford.edu/entries/relativism/
  • Blome-Tillmann, Michael. “Contextualism, Subject-Sensitive Invariantism, and the Interaction of ‘Knowledge’-Ascriptions with Modal and Temporal Operators.” Philosophy and Phenomenological Research (2009): 315-331.
  • Boghossian, Paul. “How are Objective Epistemic Reasons Possible?” Philosophical Studies 106, no. 1 (2001): 1-40.
  • Boghossian, Paul. “Epistemic Relativism.” The Routledge Companion to Epistemology, 2011. doi:10.4324/9780203839065.ch8.
  • Boghossian, Paul. Fear of Knowledge: against Relativism and Constructivism. Oxford: Clarendon Press, 2006.
  • Carter, J. Adam. Metaepistemology and Relativism. Palgrave Macmillan, 2016.
  • Carter, J. Adam. “Disagreement, Relativism and Doxastic Revision.” Erkenntnis 79, no. S1 (February 2013): 155–72. doi:10.1007/s10670-013-9450-7.
  • Carter, J. Adam. “Relativism, Knowledge and Understanding.” Episteme 11, no. 01 (April 2013): 35–52. doi:10.1017/epi.2013.45.
  • Carter, J. Adam. “Epistemological Implications of Relativism.” In J.J. Ichikawa (ed.) Routledge Handbook of Contextualism, 2017, London: Routledge.
  • Chrisman, Matthew. “From Epistemic Contextualism to Epistemic Expressivism.” Philosophical Studies 135, no. 2 (2006): 225–54. doi:10.1007/s11098-005-2012-3.
  • Cohen, Stewart. “How to Be a Fallibilist.” Philosophical Perspectives 2 (1988): 91. doi:10.2307/2214070.
  • Craig, Edward. Knowledge and the State of Nature: An Essay in Conceptual Synthesis. Oxford University Press, 1990.
  • Cuneo, Terence. The Normative Web: An Argument for Moral Realism. Oxford: Oxford University Press, 2007.
  • DeRose, Keith. The Case for Contextualism. Oxford: Oxford Univ. Press, 2009.
  • DeRose, Keith. “Contextualism and Knowledge Attributions.” Philosophy and Phenomenological Research 52, no. 4 (1992): 913. doi:10.2307/2107917.
  • Derrida, Jacques. Of Grammatology. Baltimore: Johns Hopkins University Press, 1976.
  • Egan, Andy. “Epistemic Modals, Relativism and Assertion.” Philosophical Studies 133, no. 1 (2007): 1-22.
  • Gerken, M. “Discursive Justification and Skepticism.” Synthese 189, no. 2 (2012): 373-394.
  • Gibbard, Allan. Wise Choices, Apt Feelings: A Theory of Normative Judgment, n.d.
  • Greco, John. “Reflective Knowledge and the Pyrrhonian Problematic.” Virtuous Thoughts: The Philosophy of Ernest Sosa, 2013, 179–91. doi:10.1007/978-94-007-5934-3_10.
  • Hacking, Ian. ‘Language, Truth and Reason.’ In Rationality and Relativism, 48–66. (1982).
  • Hales, Steven D. “Motivations for Relativism as a Solution to Disagreements.” Philosophy 89, no. 01 (September 2013): 63–82. doi:10.1017/s003181911300051x.
  • Hales, Steven D. Relativism and the Foundations of Philosophy. Cambridge, MA: MIT Press, 2006.
  • Harman, Gilbert. “Moral Relativism Defended.” The Philosophical Review 84, no. 1 (1975): 3. doi:10.2307/2184078.
  • Kaplan, David (1977). “Demonstratives.” In Joseph Almog, John Perry & Howard Wettstein (eds.), Themes from Kaplan. Oxford University Press 481-563.
  • Kölbel, Max. “III-Faultless Disagreement.” Proceedings of the Aristotelian Society (Hardback) 104, no. 1 (2004): 53–73. doi:10.1111/j.0066-7373.2004.00081.x.
  • Kolodny, Niko, and John MacFarlane. “Ifs and Oughts.” The Journal of philosophy 107, no. 3 (2010): 115-143.
  • Lammenranta, Markus. “The Pyrrhonian Problematic.” Oxford Handbooks Online, 2008. doi:10.1093/oxfordhb/9780195183214.003.0002.
  • Lasersohn, Peter. “Context Dependence, Disagreement, and Predicates of Personal Taste.” Linguistics and Philosophy 28, no. 6 (2005): 643–686.
  • Lewis, David. “Index, Context, and Content.” In Stig Kanger & Sven Öhman (eds.), Philosophy and Grammar: Reidel (1980): 79-100.
  • MacFarlane, John. Assessment Sensitivity: Relative Truth and Its Applications. Oxford: Oxford University Press, 2014.
  • MacFarlane, John. ‘The Assessment Sensitivity of Knowledge Attributions.’ Oxford Studies in Epistemology 1: 197– 233 (2005).
  • MacFarlane, John. “Relativism and Knowledge Attributions.” The Routledge Companion to Epistemology, 2011. doi:10.4324/9780203839065.ch49.
  • MacFarlane, John. “XIV*: Making Sense of Relative Truth.” Proceedings of the Aristotelian Society (Hardback) 105, no. 1 (2005): 305–23. doi:10.1111/j.0066-7373.2004.00116.x.
  • McKenna, Robin. “Knowledge Ascriptions, Social Roles and Semantics.” Episteme 10, no. 4 (2013): 335-350.
  • Meiland, Jack and Michael Krausz. (eds). Relativism, Cognitive and Moral. Notre Dame, Indiana: University of Notre Dame Press, 1982.
  • Olson, Jonas. “Error Theory and Reasons for Belief.” Reasons for Belief, 2011, 75–93. doi:10.1017/cbo9780511977206.006.
  • Pryor, James. “What’s Wrong with Moore’s Argument?” Philosophical Issues 14, no. 1 (2004): 349-378.
  • Richard, Mark. “Contextualism and Relativism.” Philosophical Studies 119, no. 1 (2004): 215-242.
  • Rorty, Richard. Philosophy and the Mirror of Nature. Princeton: Princeton University Press, 1979.
  • Sankey, Howard. “Scepticism, Relativism and the Argument from the Criterion.” Studies In History and Philosophy of Science Part A 43, no. 1 (2012): 182–90. doi:10.1016/j.shpsa.2011.12.026.
  • Sankey, Howard. “Witchcraft, Relativism and the Problem of the Criterion.” Erkenntnis 72, no. 1 (2009): 1–16. doi:10.1007/s10670-009-9193-7.
  • Seidel, Markus. Epistemic Relativism: A Constructive Critique. Palgrave MacMillan, 2014.
  • Seidel, Markus. “Scylla and Charybdis of the Epistemic Relativist: Why the Epistemic Relativist Still Cannot Use the Sceptic’s Strategy.” Studies in History and Philosophy of Science Part A 44, no. 1 (2013): 145–49. doi:10.1016/j.shpsa.2012.10.004.
  • Seidel, Markus. “Why The Epistemic Relativist Cannot Use the Sceptic’s Strategy. A Comment on Sankey.” Studies In History and Philosophy of Science Part A 44, no. 1 (2013): 134–39. doi:10.1016/j.shpsa.2012.06.004.
  • Stanley, Jason. “On a Case for Truth Relativism.” Philosophy and Phenomenological Research 92.1, 2016: 179-188
  • Williams, Michael. “Why (Wittgensteinian) Contextualism Is Not Relativism.” Episteme 4, no. 01 (2007): 93–114. doi:10.3366/epi.2007.4.1.93.
  • Williamson, Timothy. Knowledge and its Limits. Oxford: Oxford University Press, 2000.
  • Vogel, Jonathan. “Reliabilism Leveled.” The Journal of Philosophy 97 (2000): 602-623.
  • Vogel, Jonathan. “Are there Counterexamples to the Closure Principle?”. In Michael David Roth & Glenn Ross (eds.), Doubting: Contemporary Perspectives on Skepticism. Dordrecht: Kluwer (1990): 13-29.
  • Williams, Michael. Unnatural Doubts: Epistemological Realism and the Basis of Skepticism, Cambridge, MA: Blackwell, 1991.
  • Williams, Michael. Problems of Knowledge, Oxford and New York: Oxford University Press, 2001.
  • Wright, Crispin. “Fear of Relativism?” Philosophical Studies 141, no. 3 (2008): 379–90. doi:10.1007/s11098-008-9280-7.
  • Wright, Crispin. “The Perils of Dogmatism.” In Nuccetelli & Seay (eds.), Themes from G. E. Moore: New Essays in Epistemology. Oxford University Press
  • Wright, Crispin. “New Age Relativism and Epistemic Possibility: The Question of Evidence 1.” Philosophical Issues 17, no. 1 (2007): 262–83.

 

Author Information

J. Adam Carter
Email: jadamcarter@gmail.com
University of Edinburgh
United Kingdom

Tu Weiming (1940—)

Tu Weiming (pinyin: Du Weiming) is one of the most famous Chinese Confucian thinkers of the 20th and 21st centuries. As a prominent member of the third generation of “New Confucians,” Tu stressed the significance of religiosity within Confucianism. Inspired by his teacher Mou Zongsan as well as his decades of study and teaching at Princeton University, the University of California, and Harvard University, Tu aimed to renovate and enhance Confucianism through an encounter with Western (in particular American) social theory and Christian theology. His writings about Confucianism have served as critical links between Western philosophy and religious studies and the world of modern Confucian thought. Tu asserted that Confucianism can learn something from Western modernity without losing recognition of its own heritage. By engaging in such “civilizational dialogue,” Tu hoped that different religions and cultures can learn from each other in order to develop a global ethic. From Tu’s perspective, the Confucian ideas of ren (“humaneness” or “benevolence”) and what he calls “anthropocosmic unity” can make powerful contributions to the resolution of issues facing the contemporary world.

While Tu’s particular presentation of Confucian thought has proven to be both intelligible and popular among Westerners, his use of Western religious concepts and terminology to describe Confucianism has also generated controversy in the Chinese Confucian world. In particular, the cultural hybridity and explicit spirituality that are key elements of Tu’s Confucianism have been criticized by some other contemporary Chinese Confucian thinkers, who—like modern Chinese philosophy in general—have been more influenced by nationalism and secularism than Tu. Nonetheless, Tu’s influence on contemporary Confucian philosophy cannot be overestimated, especially where its reception in the West is concerned.

Table of Contents

  1. Biography
  2. Confucianism as Religious Humanism
    1. Continuity of Being
    2. Anthropocosmic Unity
  3. Selfhood as Creative Transformation
    1. Ren
    2. Li
    3. The Junzi or Profound Person
    4. Fiduciary Community and Filial Piety
    5. Embodied Knowing
      1. Cheng
      2. Fourfold Human Nature
  4. Confucianism and Modernity
    1. Critique of Confucian Tradition
    2. Critique of Western Enlightenment Thought
    3. Civilizational Dialogue
  5. Influence and Criticism
  6. References and Further Reading

1. Biography

Tu was born to well-educated parents in Kunming, Yunnan Province, China in 1940. He has described the nanny who helped to care for him as uneducated, yet expressive of Confucian values in her daily words and actions. Thus, although Tu did not study the Confucian classics during his childhood, he was brought up immersed within a Confucian cultural environment. In 1949, he moved with his family to Taiwan and studied at Taipei Municipal Jianguo High School. At that time, the Taiwan government was promoting national moral education, which included a heavy emphasis on Confucianism—a subject about which the young Tu became enthusiastic. Among his teachers was Zhou Wenjie, a student of the “New Confucian” philosopher Mou Zongsan. After completing high school, Tu enrolled in Taiwan’s Tunghai University, where he studied directly with Mou as well as Mou’s fellow “New Confucian” thinker, Xu Fuguan. Tu’s undergraduate studies with Mou and Xu led to his being awarded a Harvard-Yenching Institute scholarship to study at Harvard University in the United States. Here, he completed courses taught by luminaries of Western social thought, such as the sociologists Talcott Parsons and Robert N. Bellah, and the historian of religions Wilfred Cantwell Smith, earning both an M.A. (1963) and a Ph.D. (1968) in East Asian studies. Beginning in 1967, Tu served as a faculty member in a series of prestigious U.S. universities, initially at Princeton University and later at the University of California at Berkeley, from which he went on to become a professor at Harvard University (1981-2010). As of 2016, Tu held two academic positions, serving as both the Chair Professor of Humanities and Founding Director of the Institute for Advanced Humanistic Studies at Beijing University and as Research Professor and Senior Fellow of the Asia Center at Harvard University.

2. Confucianism as Religious Humanism

Tu’s understanding of selfhood is intertwined with his understanding of religiosity. Indeed, one of the distinctive features of Tu’s Confucianism is his emphasis on the religiousness of Confucianism. Many Confucian scholars stress that Confucianism is a cultural tradition or a philosophy rather than a religion. However, Tu insists that Confucianism also involves religiosity. Unlike other religions, Confucianism is not an institutional religion; like other religions, however, it has an ultimate concern, namely creative self-transformation. According to Tu, ethics (norms for behavior), aesthetics (theory of value), and religiosity (commitment to engagement with an ultimate concern) are inseparable from one another for Confucianism, and for Chinese traditional thought in general. Thus, for Tu, being religious is equivalent to learning to be fully human. It is an infinite developmental process, and the ultimate ideal is to achieve the oneness of Heaven (or Tian, the cosmic source of ethical and aesthetic values for Confucians) and humanity, in Chinese Tianren heyi. This transformational ideal of self-cultivation is at the heart of Tu’s understanding of Confucianism as a form of religious humanism.

According to Tu, Confucian self-cultivation is a long and strenuous process; it is also a process of ceaseless creative self-transformation. What is important is one’s conscious determination to undertake it. In a key early work, Centrality and Commonality: an Essay on Confucian Religiousness (1989; later republished in a bilingual Chinese/English edition as Zhongyong dongjian [An Insight into Zhongyong, 2008]), Tu highlights the distinction between “learning for the sake of the self” and “learning for the sake of others.” From a Confucian perspective, one’s motivation for self-cultivation should be for the sake of the self; that is, it should be seen as an intrinsic good, not undertaken for reasons that are purely external to oneself. As Tu states, “A decision to turn our attention inward to come to terms with our inner self, the true self, is the precondition for embarking on the spiritual journey of ultimate self-transformation.” Learning only for the sake of others is an inauthentic motivation, because it is driven by consideration of others’ opinions, rather than a genuine desire to cultivate oneself. However, Tu does not argue that Confucianism aims at cultivating a private ego. Rather, for Tu, Confucianism emphasizes the cultivation of a true self critically. Here, Tu sees himself as making a claim similar to that of the classical Confucian thinker Mencius (Mengzi), for whom the aim of knowing the true self is to recognize the inner “great body” (dati), which signifies the true self that can form a unity with Heaven, Earth and numerous other things. Thus, the ultimate aim of self-cultivation is the unity of Heaven and humans.

One important approach to recognizing the great body is to identify one’s sense of commiseration, that is, the sense of sympathy and empathy, as the unique characteristic of human nature distinguishing us from animals. To recognize the great body, as Tu understands this, also means to establish the will (lizhi), to make decisions to act in accordance with the great body. We must deliberately transcend the temptation of learning for the sake of others in order to learn for ourselves. The willingness to do this would help us to access the inexhaustible inner resources of self-transformation and to achieve a state of being truly “self-possessed” (zide) (to use Mencius’ term), that is, the authentic way of learning to be human.

Drawing on his study of Mencius and the Neo-Confucians, Tu argues that Confucianism encourages “human perfectibility through self-effort.” This means that the source of self-actualization lies in humans themselves, rather than in the mediation of some supernatural agent. It assumes that everyone possesses sufficient internal resources for ultimate self-transformation. And this assumption is further based on a Chinese cosmology that, in his Confucian Thought: Selfhood as Creative Transformation (1985), Tu calls the “continuity of being.”

a. Continuity of Being

Tu believes that the aim of unity of Heaven and humans is perceivable and realizable through self-effort because of what Confucians assume is a kind of continuity between the self, others, and Heaven and Earth, in which humans, by nature, share a certain reality of Heaven. Therefore, we are obligated to cultivate ourselves to actualize this moral ideal. Thus, Tu argues that Chinese cosmology does not reject the idea of a Creator, but that Confucianism, unlike Christianity, does reject the dichotomy of creator and creature. For Confucianism, it is inconceivable that man can be alienated from Heaven in any essential way. For this reason, Tu’s Confucian religiosity possesses no equivalent to the Christian idea of original sin and divine grace.

According to Tu, within Chinese culture the cosmos is perceived as an unfolding of all-embracing continuous creativity in which all modalities are organically connected. It consists of a dynamic energy field in which all parts of the cosmos mutually interact in a “spontaneously self-generating life process.” While the nature of the cosmos is impersonal, it is not inhuman. It is impartial to all modalities of being, and rejects a kind of anthropocentrism. While human beings are part of this, our minds and consciousness make us unique in that we are able to probe “the transcendental anchorage of our nature,” and to achieve “sympathetic accord with the myriad things in nature.” This sense of continuity of being also gives a sense of deep awe towards nature and Heaven within Chinese culture, an aspiration to sustain harmony with nature. Thus, Tu argues, self-knowledge is both a necessary and sufficient condition of knowing the Way (Dao), that is, the way of actualization of authentic human nature. As humans are endowed by Heaven with the “centrality” of the universe (its most refined quality), Heaven sends humans on the mission of transforming the cosmos into its actualization.

It is important to note that here Tu is taking liberties with the English translation of the Chinese phrase zhongyong, the title of one of the “Four Books” of classical Confucianism, often rendered as “doctrine of the mean” in scholarship prior to Tu’s work on the text. Instead, Tu understands zhong as “centrality” and yong as “commonality.”

“Centrality” denotes the most refined and irreducible quality inherent in human beings. Therefore, the way that comes from “centrality” cannot be separated from us. It is an ontological condition of being, in which one’s mind is unperturbed by external forces. According to ancient Chinese thought, human beings have embodied the centrality of Heaven and Earth. Therefore, human beings are united with Heaven and Earth by inherent centrality. Centrality is a state of the self in which emotions are not yet aroused. When our emotions are aroused to the due measure, it is called harmony. Thus, while centrality is the great foundation of the universe, harmony is an appropriate, unfolding expression of the inner self. Tu reminds us that although the great foundation is inherent in human beings, it does not guarantee that everyone can attain the harmonious state. It is a heavy burden and a long road to attain centrality and harmony; one needs the determination of self-cultivation. By “commonality,” Tu means activities that are ordinary and common, such as eating and walking, which are deployed to describe the Way in Zhongyong. Thus, Tu emphasizes, human beings must accept their embeddedness in the world as the condition of self-transcendence. As the Way (Dao) is inseparable from our ordinary life, we must realize our humanity through our ordinary daily existence. It is tantamount to fulfilling our Heavenly-ordained mission.

b. Anthropocosmic Unity

For Tu, the Heaven-human relationship is not simply that of creator and creature, but “one of mutual fidelity; and the only way for man to know Heaven is to penetrate deeply into his own ground of being.” Thus, Tu argues, the ideal of the Heaven-human relationship is neither theocentric (God-centered or, in Confucian terms, Heaven-centered) nor anthropocentric (human-centered), but “anthropocosmic unity.” By this, Tu means that through a continuous interaction between the two, “the human way necessitates a transcendent anchorage for the existence of man and an immanent confirmation for the course of Heaven.” Because of the mutuality of Heaven and humans, the Confucian Heaven-human relationship is also a kind of “transcendent as immanent.”

From the above analysis, we can see that Confucian religiousness involves three interrelated dimensions: (1) the self-transformation of the person, (2) the communal act of community and (3) the dialogical response to the transcendent. Confucianism aims at forming a kind of organismic unity between humans and Heaven. Thus, Tu stresses that Confucianism is a kind of “inclusive humanism” with “an anthropocosmic idea.” Unlike secular humanism, Tu’s Confucian inclusive humanism is very much concerned with the transcendent. Indeed, Tu claims that if Confucianism wants to be continuously developed it must develop its spiritual tradition in a religious form, so that it can continuously contribute to society. This is because Tu considers religiosity as that which provides human beings with a sense of deep awe towards Heaven and nature; without this sense of transcendence, human life would be shallow. Moreover, for Tu, although human beings by nature are earthbound, they have an aspiration to transcend themselves and to join with Heaven.

Given Tu’s ideas about the “continuity of being” that properly leads to the development of “anthropocosmic unity,” we can see why Tu’s thesis of “learning for the sake of the self” is not a kind of ego-centrism. As there is continuity of being between the self, others, and the cosmos, learning for the sake of the self can be said to be also for the sake of others, as well as the whole world. Furthermore, as the following section explains, Tu does not reject the motivation of self-cultivation driven by our sincere feeling towards our family members. Thus, what Tu really wants to reject seems to be the motivation of “learning for the recognition from others” rather than “learning for the sake of others.”

3. Selfhood as Creative Transformation

While Tu’s Confucianism emphasizes learning for the sake of the self, it does not mean that one can learn to be fully human separated from the community. Tu asserts that people generally establish a “fruitful communication with the transcendent through communal participation.” Apart from communal participation, humans must develop a “constant dialogical relationship with Heaven” and are transformed through “a faithful dialogical response to the transcendent” in self-cultivation. For Tu, implicit in Confucian thought is a “covenant” with Heaven, for our moral duty is to achieve the highest human aspirations of forming a “trinity with Heaven and Earth” through self-realization and community-perfecting. Thus, on Tu’s account, Confucian self-cultivation is a gradual process of combining all levels of the community in the process of self-transformation, from the family to the neighborhood, clan, race, nation, world, and finally to the universe and the cosmos.

Following the work of his teacher, Bellah, Tu argues that one of the perennial human problems in modernity is individualism. In order to respond to individualism, Tu investigates the possibility of “a new vision of the self which is rooted in the reality of a shared life together with other human beings and inseparable from the truth of transcendence” (CT, 8) and his answer is “Confucian selfhood as creative transformation” (CT, 7). Tu stresses that the main concern of Confucianism is how to become a sage or how to become fully realized as an authentic human being. Confucianism believes that man is perfectible by self-effort and one has to achieve self-realization through self-cultivation.

a. Ren

Tu understands the classical Confucian term ren (benevolence or humaneness) as the highest human achievement of self-cultivation and the fullest manifestation of humanity. A man of humanity (renren), who embodies love in his daily life, represents the most authentic realization of human beings from Tu’s Confucian perspective. However, in actual practice, there is a gradual process of extension of love, and ren is most exemplified in our caring toward our relatives (qinqin). This is further discussed in the following section. For Tu, ren is primarily concerned with the self, although human relations are also crucial to it. Tu conceives ren as a concept of personal morality, as a principle of the inward process of self-fulfilment. It is not simply a personal virtue, but also a metaphysical moral mind that is, at the same time, equivalent to the cosmic mind. Thus, ren is both the moral and metaphysical foundation of self-cultivation. While we are earthbound and limited, we can also participate in the communal and divine enterprise of self-transformation, that is, to enlarge one’s humanity so that humanity as a whole shared by every human being will be enriched. The driving force of such anthropological and cosmological assumptions enables Confucianism to function as an ethico-religious system in Chinese society despite its lack of institutional religious character.

b. Li

According to Tu, li (ritual) is the externalization of ren in a specific context. Li implies the existence of human relationships. For Tu as a Confucian, the self and sociality are not separable. Society is conceived as an extended self. Confucian self-transformation must be manifested in the context of sociality. And li points to a concrete manifestation by which one enters into proper relations with others. Tu conceives li as a dynamic process of humanization. It is a process of individual development from (1) cultivating personal life to (2) regulating familial relations, and then (3) ordering the social affairs, before finally (4) bringing peace to the world. It assumes forms of integration between personality, family, state, and the world. Thus, for Tu, Confucian self-cultivation is a gradual process of inclusion. Without li, there would be a lack of a concrete manifestation of ren. However, without ren as the internal basis, li could become coercive and distort human nature. While ren denotes metaphysical reality, li means the standard of this world. Thus, Tu argues, there is a creative tension about Confucian concepts of ren and li, and it is important to seek the balance between ren and li in a dynamic process.

c. The Junzi or Profound Person

Just as with the phrase zhongyong, in another case Tu uses license in translating a classical Confucian term. Junzi, which typically is rendered as “gentleman” or “superior person” in other English-language scholarship, becomes “profound person” in Tu’s usage of the term. This translation shows Tu’s particular concern regarding the inner nature of morally superior persons. Basically, junzi is the paradigmatic model of Confucian personality. For Tu, the self-knowledge of the profound person is deeper than that of average people. As the Way (Dao) is inherent in human nature, the actualization of the Way very much depends on our self-knowledge. While the Way is deeply rooted in Heaven-endowed human nature, it is manifested in ordinary life. Thus, the profound person must be sensitive to one’s interiority through which true humanity is manifested as the Way. However, following Confucian tradition, Tu claims that only a few have the inner strength to fully actualize what is inherent in them.

To be a profound person means that one can be (1) sensitive to the outside world, (2) at ease with oneself and (3) courageous. The important way of self-cultivation to achieve self-knowledge is “vigilant solitariness” (shentu) or “self-watchfulness when alone,” that is, a kind of continuous vigilance on one’s own. It is “a process toward an ever-deepening subjectivity.” On the one hand, through continuous critical self-examination, one is sensitive to the subtle manifestation of one’s inner feelings. On the other hand, by such inner penetration of the self, one is able to reach the reality underlying common humanity and to realize the true nature of human-relatedness. Tu argues that in practising vigilant solitariness, one can hear one’s authentic self as expressing the quality of Heaven-ordained nature. As our innermost core is common to all human beings, our self-understanding of Heaven-ordained nature would help us to know the “great foundation” (tapen) of the cosmos.

The uniqueness of the profound person lies in how he integrates the Way with his daily life; therefore, the way of a profound person is also understood as the common way. However, such commonness cannot be defined in terms of an abstract formula by which everyone can learn to become a profound person. Thus, while the Way is common, it is also dynamic in nature. As Tu states, “its meaning can never be fully comprehended and its potential never exhausted. There is always something ‘hidden’ in its commonness.” In the face of constantly changing social situations, a profound person can still be at peace with himself. This is because he “rectifies himself and seeks nothing from others,” and can recognize the possibilities in his fate and bring himself into harmony with the situation. Thus, it is a kind of creative transformation not only of the self, but also of the world in the spirit of Zhongyong. Therefore, in practising vigilant solitariness, one must also be courageous in order to be truthful to oneself and to resist the temptation of seeking recognition from others as a motivation for self-cultivation. One must possess an inner strength to continue the long and strenuous task of self-cultivation at one’s own pace, undisturbed by the changing environment.

d. Fiduciary Community and Filial Piety

As the Way (Dao) starts with human-relatedness, the idea of community is also an important concern for Confucianism. For Tu, the ideal Confucian society is conceived as “a fiduciary community based on mutual trust.” Thus, the goal of politics is not simply the establishment of law and social order, but also the development of a fiduciary community through moral persuasion by rectifying the ruler’s moral character as a moral leader.

The self, in the process of creative transformation, embodies the network of continuous expanding and deepening human relationships. According to Tu, the network is “a series of concentric circles” which involves different kinds of structural limitations, such as gender, race, and social background. However, through self-cultivation, the structural limitations of each circle can be transformed into an instrument of self-transcendence, and we can extend ourselves continuously to achieve unity with Heaven and Earth and other numerous things. When we have achieved unity with the most generalized commonality, we also reconfirm the centrality of the self. Tu states, “This broadening and deepening of the self can be characterized, in Mencian terminology, as the manifestation of the ‘great self’ and the concomitant dissolution of the ‘small self’.”

In Confucianism, filial piety is considered the prime virtue of the community underlying the anthropocosmic vision. Like most other Confucians, Tu defines filial piety in terms of “transmission and continuity.” He understands exercising filial piety as honoring one’s parents by transmitting the wisdom and values they exemplified and continuing their unfinished tasks. The underlying idea is that people’s development is indebted to the tradition, that is, the origin of their existence; thus they are responsible for transmitting the wisdom of the old and honoring the forefathers in their ancestral line. As Tu points out,

The centrality of filial piety in Confucian ethics is predicated on the belief that human beings become aware of themselves by responding naturally to the loving care of those around them. Such a reciprocal response, laden with rich symbolic significance for the transmission and continuity of humanity, is seen by the Confucians as the way to provide a solid basis for personal growth: filial piety and brotherly love are roots (pen) of humanity. (C&C, 140)

Tu states that Confucians reject the idea of identity formation through isolation, because our sincere feeling towards our family members can provide motivation for our self-transformation. When we have nurtured our mind/heart to be capable of regulating our family, we will then open ourselves and transcend our egocentrism. However, familialism may degenerate into nepotism and may become a kind of structural limitation. Thus, for Tu, we have to establish meaningful relationships with people outside our familial members and transcend the nepotism of familialism. Furthermore, in order to avoid being a narrow-minded nepotist, ren must go with righteousness (yi)—that is, a sense of judgment and honoring the worthy. Thus, li is not only the externalization of ren; it also signifies a structure of yi realized in the context of human relations. It is primarily concerned with establishing an authentic way of human-relatedness.

Indeed, Tu is thoroughly Confucian insofar as he considers filial piety to be the foundation of political virtue. Traditionally, a filial son is perceived as likely to be vigilant about his personal conduct, diligent about family affairs, compassionate in regard to social obligations and, therefore, qualified for political assignments. Historically, filial sons often end up becoming loyal ministers. According to Tu, Confucianism often uses family as a metaphor for country and the world, addressing the emperor as son of Heaven, and the magistrate as the “father-mother official,” because Confucianism perceives a kind of transcendent vision implicated by familial nomenclature. It implies that the self is not egoistic; rather the enrichment of the self is through cultivating one’s relationship with the family. The emphasis on family is not equivalent to nepotism; rather it is concerned with a developmental process towards organismic unity with the country and the world. Thus, Tu claims that Confucianism perceives self-cultivation and regulation of the family as the root, and governing the country and peace of the world as the branches. And “the dichotomy of root and branch conveys the sense of a dynamic transformation from self to family, to community, to state, and to the world as a whole” (C&C, 148).

Rituals and ceremonies (li), as manifestations of the ethico-religiosity of community, are also important elements of moral education and social solidarity. In particular, through ancestral worship, the old are respected and the dead are honored for their past contributions; their yearning for the forefathers also establishes their communal identity. Indeed, with the anthropocosmic vision, we are not only responsible to our ancestors, but also to future generations, for they will also transmit our values and continue our tasks.

It is well known that Confucianism is very much concerned about five cardinal human relationships. According to Tu, reciprocity (shu) is the fundamental principle of the five cardinal human relationships and of human-environmental relationships. Reciprocity helps us to harmonize social relationships, interact sympathetically with nature and establish a dialogical relationship with Heaven. “Through reciprocity, humanity becomes interfused with the cosmic transformation and thus, as a co-creator, forms a trinity with Heaven and Earth. Humanity, in this perspective, stands as the filial son and daughter of the cosmos” (C&C, 134).

Another important virtue is reverence toward Heaven. Tu states that filial piety and reverence toward Heaven are “parallel principles in the Confucian anthropocosmic worldview.” Both do not simply serve political purposes, but fundamentally bear a cosmological concern, that is, a concern of “bringing peace and harmony to the universe.” Thus, these two virtues attempt to establish “a pattern of mutual dependence and organismic unity” between Heaven and humans.

e. Embodied Knowing

For Tu, moral reasoning is a kind of “embodied knowing” (tizhi). Following Neo-Confucianism, Tu argues that Confucian ontology rejects Kantian metaphysics which assumes the objectivity of the moral will, excluding any emotional dimension. Tu’s Confucianism also rejects the study of humanity by means of the scientific method. Following Zhang Zai, Tu argues for the distinction between moral knowledge and empirical knowledge. While empirical knowledge derives ideas from our detached observations, moral knowledge cannot be grasped in a disengaged way. Rather, moral knowledge is a kind of bodily experiential knowledge based on reflection on one’s bodily practice and experience. Crucial to Tu’s understanding of embodied knowing are his interpretations of two key classical Confucian concepts, cheng (sincerity) and the fourfold division of human nature into body (shen), heart/mind (xin), soul (ling) and spirit (shen).

i. Cheng

Unlike Christianity, Tu points out, Confucian thought does not regard revelation as the foundation of perceiving the moral order, but rather looks to “our common experience.” In order to perceive this ultimate reality and to attain our true humanity, people must learn to be cheng, that is, to be sincere, true, and real, to our common experience. According to Zhongyong, cheng leads to enlightenment (ming). Cheng also allows us to fully realize ourselves and to understand that Heaven is inherent in human nature. Tu stresses that the possibility of being cheng is not because of divine grace; rather it is based on the idea of heavenly endowed human nature, by which the identification of human nature with reality of Heaven becomes possible. According to Tu, cheng not only means sincerity, but also authenticity. To be cheng does not only signify what a person should be in an ultimate sense, but also signifies a process of actualizing the ultimate reality in ordinary life. The program of self-cultivation that leads to the development of cheng is “a process toward an ever-deepening subjectivity” which involves “a deep penetration into one’s own ground of existence” and includes and embraces others as “an integral part of one’s quest for self-realization.” For this reason, says Tu, Confucianism gives enormous value to the practice of vigilant solitariness.

Ontologically speaking, the expression of the moral subject is a priori true and sincere. However, in reality, if we do not maintain the exercise of self-cultivation, we cannot be cheng enough, and the moral knowledge of the subject would finally be exhausted. This is because our moral knowledge and action are intertwined. Our bodily experiential knowledge can lead to self-transformation, that is, the enhancement of moral knowledge and the renovation of one’s disposition. Thus, for Confucianism, embodied knowing must necessarily lead to the enhancement of moral practices. The highest human achievement by moral self-cultivation is to become an authentic moral person, which is what Tu calls “renren,” that is, a man who embodies ren; such embodiment must have a concrete manifestation by the observance of rites (li).

The ultimate concern of self-cultivation is not simply about the self, but also to manifest our humanity. As human nature shares the ultimate reality of Heaven, human nature is potentially a manifestation of the reality of Heaven. The manifestation of authentic humanity implies that human beings, as co-creators, participate in the creative process of the cosmos. Tu stresses that such a process is not creation ex nihilo (out of nothing). Rather, humans are “capable of assisting the transforming and nourishing process of heaven and earth.” In short, the creative transformation of the sage is derived from human inner nature by focusing on our common human experience. As Tu states, “The way of the sage therefore is centered on the commonality of human nature.”

Moreover, Tu’s vision of Confucian embodied knowing and self-cultivation is to be realized through social practice in a complicated social network. The aim of Confucian self-cultivation is not only to establish oneself, but also to establish others. Thus, our moral knowledge is not simply derived through our own self-understanding, but also by knowing others. Knowing others is not through a kind of disengaged introspection, but rather through participating in a network of mutual trust, knowing others’ dispositions and characters through dialogue and interaction with them. Thus, embodied knowing is also a kind of empathic sensual perception; it rejects the objectification of others, things or humans. And it can accommodate and integrate everything in the world, letting all these things become something that is non-objectified in our minds. Finally, as there is continuity between the being of Heaven and that of humans, embodied knowing is also related to the framework of unity and harmony between humans and Heaven (Tianren heyi). Everyone can be connected to Heaven by embodied knowing of one’s own nature.

ii. Fourfold Human Nature

Tu’s idea of embodied knowing is based on his fourfold division of human nature, perhaps better understood as four levels of subjectivity. According to Tu, the foundation of Confucian morality is an embodied person with sensitivity and emotion. It includes body (shen), heart/mind (xin), soul (ling) and spirit (shen), that is, four different levels. Our embodied knowing is an experiential knowledge based on the integration of these four different levels.

Unlike self-cultivation in some religions, which is concerned with nurturing the human soul only, Confucian self-cultivation very much concerns the human body. Tu reminds us that Confucian self-cultivation literally means “nourishing the body” (xiushen). As our body includes five sense organs, Confucian teaching traditionally emphasizes nourishing our bodily senses through the Six Arts (liuyi), the six ancient disciplines of ritual, music, archery, charioteering, calligraphy, and mathematics, which were foundational to the classical Confucian curriculum. The aim is to aestheticize human life through the practice of rituals (li) and music (yue), to cultivate one’s disposition, to facilitate one’s thinking and emotion-controlling, and finally to achieve one’s embodiment of virtues (yishen tizhi).

While our body contains sense organs, our heart/mind is a rational faculty integrating our different sensual experiences. Here, Tu is building on the Neo-Confucian interpretation of Mencius’ anthropology as the study of heart/mind (xin). Our heart/mind includes spiritual resources endowed by Heaven as the defining feature of being human. It also provides theoretical and practical foundations of our self-cultivation. The spiritual resources are four germinations (siduan) and the power of the will for self-realization. Four germinations are four kinds of universal predispositions. They are shown in our sense of (1) commiseration, (2) shame, (3) reverence, and (4) rightness and wrongness. By cultivating these four germinations, we can acquire four Confucian virtues: benevolence (ren), righteousness (yi), observance of rites (li), and wisdom (zhi). By doing our best with our mind, we can extend our virtuous sense not only towards another person but more widely, so that it “flows abroad, above and beneath, like that of Heaven and Earth” (Mencius 7A:13). The power of will is what Mencius calls “vast, flowing qi” (haoranzhiqi). This power is great and strong; it can be forever latent, but is never totally lost. If one can nourish this power with uprightness, “it will fill the space between Heaven and earth” (Mencius 2A:2). Thus, profound persons should focus on tapping their own internal energy in the process of realizing humanity.

Apart from body and mind, there are also soul and spirit in human beings. Tu claims that the soul is the extension of the mind, a kind of awareness in the existential situation. Spirit is the state of transcendence, the ultimate goal of self-cultivation. The relation between soul and spirit is just like that of body and mind, where the former is concrete, definite, and with fixed shape, and the latter is indefinite and hardly traceable. Tu argues that we can find Confucian stages of development of self-transcendence in the Mencius. The stage from goodness to realness belongs to the ascending level from body to mind. Beauty to greatness is at the rise from the mind to the soul. Sage-spirit is about the ascending status from the soul to the spirit. Thus, one’s self-elevating process inevitably involves one’s embodiment through ritual practice, in which one can learn about one’s mind, and then be aware of one’s soul, and finally rise to the level of spirit. By such Confucian self-cultivation, we can experience a kind of union with the cosmos; therefore, the embodiment of virtue and a ritual-musical cultivated harmonious world can finally be ratified. Here and throughout Tu’s presentation of Confucianism, the role of religiosity in the Confucian humanistic project remains central.

4. Confucianism and Modernity

In Confucian China and its Modern Fate (1958), Joseph R. Levenson argues that, because the feudal society that nourished Confucianism degenerated, Confucianism inevitably faded into the background of modernity. Tu disagrees, criticizing Levenson’s analysis for failing to distinguish between “Confucian China” and “Confucian tradition.” For Tu, “Confucian China” denotes the politicized Confucian ideology of traditional Chinese feudal society. By “Confucian tradition,” in contrast, Tu means the main spirit of Confucian Chinese culture by which Chinese people traditionally have governed their lives. To a certain extent, Confucian tradition is responsible for the problem of Confucian China. Nevertheless, Tu believes that as long as we can eliminate the adverse influences of Confucian China, the Confucian tradition can be revived.

In order to refute Levenson’s claims for Confucianism’s demise, Tu explores the possibility of the “third epoch” of Confucianism. Tu considers Confucianism from earliest times through the Han dynasty (202 B.C.E.-220 C.E.) as the first epoch, Neo-Confucianism from the Song (960-1279 C.E.) through the Ming (1368-1644 C.E.) dynasties as the second epoch, and Confucian thought from the May Fourth New Cultural Movement (1919 C.E.) to the present as the third epoch. Historically, the development of Confucianism occurred through dialogue and debate with other traditions in China. For Tu, the challenges to Confucianism in its third epoch come not only from Western science, democracy, psychology, and religiosity, but also from the more universal question of how Confucianism as an inclusive humanism can address the perennial human problems of the world, that is, how Confucianism can be a new philosophical anthropology for humanity as a whole. Tu is not unique in formulating contemporary Confucianism’s intellectual and spiritual situation in such terms, as similar formulations can be found in the work of his teacher, Mou, as well as the writings of Mou’s associates, Tang Junyi and Xu Fuguan. Where Tu distinguishes himself from other “New Confucians” is in his insistence that contemporary Confucians must not only (1) reflect on the past problems of traditional Confucianism, but also (2) communicate with different civilizations so that Confucian thinkers can benefit from such dialogues and thus contribute to the global world.

a. Critique of Confucian Tradition

Regarding the controversies about traditional Confucianism as it existed prior to modernity, Tu thinks that one crucial problem was its political integration with despotic regimes. He calls this kind of politicized Confucianism “Confucian China,” by which he means a conservative political ideology used by the powerful to control and oppress people, including the internalized oppression whereby Chinese minds became accustomed to rejecting any proposals for cultural change or reform. Tu’s critique thus is very much in sympathy with the May Fourth New Cultural Movement, but he argues that some May Fourth intellectuals went too far in their criticisms of Confucianism by uncritically embracing Western Enlightenment thought, which had problems of its own. Pointing out that the West has its vices and China has its virtues, culturally and intellectually speaking, Tu suggests that Chinese interaction with Western culture should be grounded in a deep understanding of the Chinese cultural heritage and a realistic awareness of the pitfalls of Western-style modernity. Finally, Tu insists that there can be different kinds of modernity; Western modernity is not the only way, especially for China.

b. Critique of Western Enlightenment Thought

Tu greatly appreciates the values and institutions of Western modernity that the Enlightenment has brought to us, such as reason, freedom, equality, individuality, rule of law, democracy, science, and capitalism. He thinks that China can learn from these in its modernization. However, Tu criticizes the Enlightenment for inducing scientism, anthropocentrism, and individualism. Anthropocentrism and scientism bring about a kind of disenchantment and finally lead to the exploitation of nature and destruction of the environment. Anthropocentrism, together with the rise of capitalism, also leads to the marketization of the economy, politics, academia, and education, and, what Tu deems worse, the marketization of religions. Capitalist society has been dominated by instrumental reason which, for Tu, is disembodied, unsympathetic, and even cruel. Individualism has also caused the prevalence of egocentrism and the disintegration of community. Thus, the Enlightenment has finally precipitated antagonism among humans and antagonism between humans and nature. In these respects, Tu thinks that Confucian values, such as sympathy, distributive justice, a strong sense of communal responsibility, propriety, a sense of collectivism, and an emphasis on self-restraint, should also be universalized and can contribute to the modern world.

c. Civilizational Dialogue

In response to the many Western social theorists who regard Western modernity as the future trend of human development, Tu points out the fallacy of Eurocentric chauvinism present in their work. Inspired by the existentialist philosopher Karl Jaspers’ concept of the “Axial Age” (whereby modern civilizations have been shaped by different cultural and intellectual traditions that arose between the 700s and the 200s B.C.E., including Confucianism), Tu argues that the rise of East Asian industrialization has dispelled the myth of Westernization as the single model of modernization. In the case of Singapore, for example, Tu cites how Confucian virtues (such as frugality, industriousness, self-discipline, loyalty, and active participation in collective welfare) have proven to be constitutive of the success of the ethnically Chinese city-state. In Tu’s view, Singapore, and by extension all of industrialized East Asia, could not have proceeded to modernity by a route other than its own (Chinese) cultural traditions, and cannot be judged by the standards of other cultural traditions. Thus, for Tu, the development of modernity is a pluralistic phenomenon that proceeds by different cultural pathways.

Moreover, Tu identifies the twenty-first-century cultural-historical moment as a kind of new “Axial Age,” in which conditions of cultural and religious pluralism can foster constructive dialogue between traditions and civilizations. Throughout this conversation, each civilization should open itself to learning from others while maintaining a strong sense of self-recognition, so that it will develop, rather than dissipate, as a result of dialogue. Tu’s hope is that such civilizational dialogue will aid in the search for a global ethic. In Way, Learning, and Politics: Essays on the Confucian Intellectual (1993, hereafter abbreviated as WLP), Tu writes: “If the well-being of humanity is its central concern, Confucian humanism in the third epoch cannot afford to be confined to East Asian culture. A global perspective is needed to universalize its concerns. Confucians can benefit from dialogue with Jewish, Christian, and Islamic theologians, with Buddhists, with Marxists, and with Freudian and post-Freudian psychologists” (WLP, 158-9).

This emphasis on inter-cultural, inter-religious, inter-disciplinary, and inter-civilizational dialogue is one of the defining features of Tu’s Confucianism. It also represents a response to the theory of the “clash of civilizations” proposed by Samuel P. Huntington. Huntington argues that international conflict in the post-Cold War era was primarily caused by conflicts between people’s cultural and religious identities. Huntington anticipates that the conflict between the West and what he calls Confucian and Islamic civilizations will be the greatest conflict in the future. Tu does not deny the existence of a civilizational clash; nevertheless, Tu criticizes Huntington’s civilizational clash as one-sided and static in its assumptions about the structure of civilizations. In contrast, Tu finds that the development of civilizations is dynamic and mutually influential, and he argues that the trend of global development should be the promotion of civilizational dialogue rather than the anticipation of civilizational conflict. From Tu’s Confucian perspective, this posture reflects a distinctively Chinese faith that true harmony may be achieved by respecting differences, as stated in Lunyu (Analects) 13:23: “The superior man is conciliatory but does not identify himself with others.” Such civilizational dialogue is crucial for the future development of Confucianism as Tu envisions it, just as Confucianism in the past developed out of dialogue with Buddhism in China. If contemporary Confucians want their tradition to continue to grow, argues Tu, then they must face the challenge of Western civilization. Tu goes so far as to say that, without such civilizational dialogue, there can be no future for Confucianism as a living tradition.

But the value of civilizational dialogue does not lie only in its benefits to the Confucian tradition. In Tu’s mind, Confucianism is not simply for people in China or East Asia, but also for the whole world. Tu thinks that the third epoch of Confucianism must respond to four challenges from the West: (1) the spirit of scientific inquiry, (2) democracy, (3) Western religiosity and its sense of transcendence, and (4) the Freudian psychological exploration of human nature. Thus, Confucianism must open itself, reach out to the world, and learn from the West about the ideas of freedom, equality, science, democracy, human rights, and rule of law, while rejecting the Western tendencies toward either radical individualism or radical collectivism. Confucianism must reject its own past elements of authoritarianism, hierarchalism, and androcentrism (male-centeredness) while conserving the value to be found in traditional Confucian aesthetics, morality, and religiosity. Through such civilizational dialogue, Tu believes that Confucianism can both renew itself and become a valuable resource for the world.

5. Influence and Criticism

One of the important and influential contributions of Tu’s Confucianism is his ongoing dialogue with non-Confucian religions and social theories. Tu’s many years of living in the United States have made such dialogue possible and have enabled him to become the foremost spokesperson for Confucian thought in the West. However, his prolonged expatriate status also renders him vulnerable to criticisms from Confucian thinkers in China and elsewhere outside of the West.

While Tu’s particular presentation of Confucian thought has proven to be both intelligible and popular among Westerners, his use of Western religious concepts and terminology to describe Confucianism also has generated controversy in the Chinese Confucian world. In particular, the cultural hybridity and explicit spirituality that are key elements of Tu’s Confucianism have been criticized by some other contemporary Chinese Confucian thinkers, who—like modern Chinese philosophy in general—have been more influenced by nationalism and secularism than a diaspora thinker such as Tu. For example, some Chinese Confucian thinkers have pointed out that the idea of a covenant with Heaven can scarcely be found in Confucian texts, even in implicit terms. Others have questioned whether Confucianism truly possesses a concept of dialogue between human beings and Heaven, given that Confucius is recorded as having taught that Heaven does not say anything (Analects 17:19). Finally, Tu’s philosophical anthropology—particularly his distinction between soul (ling) and spirit (shen), which is not always clear in his writings—has come under fire from some other Confucian critics.

Despite such controversies, however, Tu’s enormous impact and legacy as a modernizer of Confucian thought and champion of Confucian engagement with non-Confucian traditions must be acknowledged.

6. References and Further Reading

  • Hung, Andrew T. W. “Tu Wei-Ming and Charles Taylor on Embodied Moral Reasoning.” Philosophy, Culture, and Traditions 3 (2013): 199-216.
  • Huntington, Samuel P. “The Clash of Civilizations?” Foreign Affairs 72/3 (Summer 1993): 22-49.
  • Levenson, Joseph R. Confucian China and its Modern Fate: The Problem of Intellectual Continuity, Volume 1. Berkeley: University of California Press, 1958.
  • Tu, Wei-ming. Neo-Confucian Thought in Action: Wang Yang-ming’s Youth (1472-1509). Berkeley: University of California Press, 1976.
  • Tu, Wei-ming. Humanity and Self-Cultivation: Essays in Confucian Thought. Berkeley: Asian Humanities Press, 1978.
  • Tu, Wei-ming. Confucian Ethics Today: The Singapore Challenge. Singapore: Federal Publications, 1984.
  • Tu, Wei-ming. Confucian Thought: Selfhood as Creative Transformation. Albany: State University of New York Press, 1985.
  • Tu, Wei-ming. Centrality and Commonality: An Essay on Confucian Religiousness. Albany: State University of New York Press, 1989. Later published as Zhongyong dongjian (An Insight into Zhongyong), bilingual (Chinese and English) edition, trans. Duan Dezhi (Beijing: People’s Publishing House, 2008).
  • Tu, Wei-ming. Way, Learning, and Politics: Essays on the Confucian Intellectual. Albany: State University of New York Press, 1993.
  • Tu, Wei-ming, and Mary Evelyn Tucker, eds. Confucian Spirituality. 2 vols. New York: The Crossroad Publishing Company, 2003-04.

 

Author Information

Hung Tsz Wan Andrew
Email: ccandrew@hkcc-polyu.edu.hk
Hong Kong Community College, The Hong Kong Polytechnic University
China

Thick Concepts

A term expresses a thick concept if it expresses a specific evaluative concept that is also substantially descriptive. It is a matter of debate how this rough account should be unpacked, but examples can help to convey the basic idea. Thick concepts are often illustrated with virtue concepts like courageous and generous, action concepts like murder and betray, epistemic concepts like dogmatic and wise, and aesthetic concepts like gaudy and brilliant. These concepts seem to be evaluative, unlike purely descriptive concepts such as red and water. But they also seem different from general evaluative concepts. In particular, thick concepts are typically contrasted with thin concepts like good, wrong, permissible, and ought, which are general evaluative concepts that do not seem substantially descriptive. When Jane says that Max is good, she appears to be evaluating him without providing much description, if any. Thick concepts, on the other hand, are evaluative and substantially descriptive at the same time. For instance, when Max says that Jane is courageous, he seems to be doing two things: evaluating her positively and describing her as willing to face risk. Because of their descriptiveness, thick concepts are especially good candidates for evaluative concepts that pick out properties in the world. Thus they provide an avenue for thinking about ethical claims as being about the world in the same way as descriptive claims.

Thick concepts became a focal point in ethics during the second half of the twentieth century. At that time, discussions of thick concepts began to emerge in response to certain disagreements about thin concepts. For example, in twentieth-century ethics, consequentialists and deontologists hotly debated various accounts of good and right. It was also claimed by non-cognitivists and error-theorists that these thin concepts do not correspond to any properties in the world. Dissatisfaction with these viewpoints prompted many ethicists to consider the implications of thick concepts. The notion of a thick concept was thought to provide insight into meta-ethical questions such as whether there is a fact-value distinction, whether there are ethical truths, and, if there are such truths, whether these truths are objective. Some ethicists also theorized about the role that thick concepts can play in normative ethics, such as in virtue theory. By the beginning of the twenty-first century, the interest in thick concepts had spread to other philosophical disciplines such as epistemology, aesthetics, metaphysics, moral psychology, and the philosophy of law.

Nevertheless, the emerging interest in thick concepts has sparked debates over many questions: How exactly are thick concepts evaluative? How do they combine evaluation and description? How are thick concepts related to thin concepts? And do thick concepts have the sort of significance commonly attributed to them? This article surveys various attempts at answering these questions.

Table of Contents

  1. Background and Preliminaries
  2. Significance of Thick Concepts
    a. Foot’s Argument against the Is-Ought Gap
    b. McDowell’s Disentangling Argument
    c. Williams on Ethical Truth
    d. Thick Concepts in Normative Ethics
  3. How Do Thick Concepts Combine Evaluation and Description?
    a. Reductive Views
    b. Non-Reductive Views
  4. How Do Thick and Thin Differ?
    a. In Kind: Williams’ View
    b. Only in Degree: The Continuum View
    c. In Kind: Hare’s View
  5. Are Thick Terms Truth-Conditionally Evaluative?
    a. Pragmatic View
    b. Semantic View
  6. Broader Applications
  7. References and Further Reading

1. Background and Preliminaries

Bernard Williams first introduced the phrase ‘thick concept’ in his 1985 book, Ethics and the Limits of Philosophy. Williams used this phrase to classify a number of ethical concepts that are plausibly controlled by the facts, such as treachery, brutality, and courage. But his use of the phrase was assimilated from Clifford Geertz’ notion of a thick description—an anthropologist’s tool for describing “a multiplicity of complex conceptual structures, many of them superimposed upon or knotted into one another” (1973:). Incidentally, Geertz borrowed the phrase ‘thick description’ from Gilbert Ryle, who took thick description to be a way of categorizing actions and personality traits by reference to intentions, desires, and beliefs (1971). Although Geertz’ and Ryle’s notions of thick description influenced Williams’ terminology, their notions did not necessarily involve evaluation. By contrast, Williams’ notion of a thick concept is bound up with both evaluation and description. Or, in Williams’ terms, thick concepts are both “action-guiding” and “guided by the world”. They are action-guiding in that they typically indicate the presence of reasons for action, and they are world-guided in that their correct application depends on how the world is (1985: 128, 140-41).

Although the phrase ‘thick concept’ first appeared in Williams’ Ethics and the Limits of Philosophy, there was a distinction between thick and thin that predated Williams’ 1985 book. In R.M. Hare’s The Language of Morals, published in 1952, Hare distinguished between primarily evaluative words and secondarily evaluative words (121-2). Hare later identified the former with thin terms, and the latter with thick terms (1997:54). So, the idea of a thick term was present in ethics well before Williams’ terminology.

Hare’s distinction between thick and thin is explicitly about words, and it makes no mention of concepts. But, in general, the literature on the thick speaks about both thick concepts and thick terms. Very roughly, concepts are on the level of propositions and meanings (broadly construed), whereas terms are the linguistic entities used to express these items. In this entry, expressions in single quotes, for example ‘chaste’, will be used to designate terms. Italicized expressions, for example chaste, will be used to designate concepts. Thick concepts, then, can approximately be seen as the meanings of thick terms.

Thick and thin terms are two distinct subclasses of the evaluative. However, readers should exercise caution when encountering the phrase ‘thin term’, since some theorists allow wholly descriptive terms like ‘red’, ‘grass’, and ‘green’ to count as thin (for example, Elgin 2008: 372). Their usage of ‘thin’ diverges from prevailing philosophical jargon, where the thin is seen as a subclass of the evaluative. This article uses the prevailing jargon: on this way of speaking, there are no wholly descriptive thin terms.

It is typically claimed that thick terms are in some sense evaluative and descriptive, but what do these notions mean? The evaluative and the descriptive are normally meant to distinguish between two classes of terms. Descriptive terms can be illustrated with words like ‘red’, ‘solid’, ‘small’, ‘tall’, ‘water’, ‘cat’, and ‘hydrogen’. Paradigmatic evaluative terms include thin terms like ‘good’, ‘bad’, and ‘best’, as well as normative words like ‘ought’, ‘should’, ‘right’, and ‘wrong’. Although some theorists deny that there is any substantive difference between the descriptive and the evaluative (for example, Jackson 1998:120), on the face of it there is a difference. Various attempts have been made to account for this putative difference. Two general approaches are relevant for our purposes.

One approach stems from traditional non-cognitivism. On this view, descriptive terms express beliefs and are capable of picking out properties and facts. But paradigmatic evaluative terms, such as ‘good’ and ‘right’, have neither of these features. These terms do not express beliefs and are incapable of picking out properties and facts. Instead, the function of an evaluative term is to express and induce attitudes, or to commend, condemn, and instruct. Basically, for traditional non-cognitivism, descriptive expressions are capable of representing properties and facts, whereas evaluative ones express attitudes or imperatives that cannot represent properties or facts. Since this version of the distinction denies that evaluations can be factual, we can call it the strong distinction.

The strong distinction is also known as the fact/value distinction. Thick terms are often seen as a problem for this distinction because they seem both descriptive and evaluative. Indeed, Williams holds that the world-guidedness of thick concepts “is enough to refute the simplest oppositions of fact and value” (1985:150).

There is also a weak distinction between description and evaluation, which is neutral on the question of whether evaluations can be factual. Proponents of this weak distinction may agree with the strong distinction regarding the primary function of evaluative terms. For example, they might agree that evaluative terms function to express and induce attitudes, or to commend, condemn, and instruct. However, they do not rule out the possibility that evaluations are also factual. What then distinguishes the evaluative from the descriptive? Simply put, descriptive terms are all the other predicates within a language—that is, descriptive terms just are non-evaluative. Since this distinction allows that evaluations can be factual, we can call it the weak distinction.

Thick terms are seen as significant because they straddle the above distinctions: they have something in common with both the evaluative and the descriptive. Consequently, thick terms raise interesting questions about whether there is value in the world and whether value claims can be inferred from factual ones. The main arguments and views in this vicinity come from Philippa Foot, John McDowell, and Bernard Williams, and they are discussed next.

2. Significance of Thick Concepts

a. Foot’s Argument against the Is-Ought Gap

David Hume is often interpreted as holding that one cannot derive an ‘ought’ from an ‘is’, or more generally, that one cannot derive an evaluative statement from a purely descriptive statement. This view has some intuitive appeal. The basic thought is that evaluative statements can condemn, commend, and instruct, whereas descriptive statements can do none of these things. So any inference from purely descriptive premises to an evaluative conclusion would involve a conclusion with content nowhere expressed in its premises. Hence, the inference must be invalid. If we do have a valid inference to an evaluative conclusion, then the premises must somehow involve evaluative content, perhaps covertly. This claim that there are no conceptually valid inferences from purely descriptive premises to evaluative conclusions is known as the “is-ought gap”.

The is-ought gap may seem plausible when the evaluative conclusion employs thin terms like ‘right’ and ‘wrong’. Philippa Foot rejects the is-ought gap by focusing instead on an evaluative conclusion that employs a thick term: ‘rude.’ She points out that ‘rude’ should count as evaluative, because it seems to express an attitude, or to condemn, much like ‘bad’ and ‘wrong’. But, according to Foot, this evaluation can be derived from a description. Consider the description D1: that x causes offence by indicating a lack of respect. Can one accept D1 as true, but deny that x is rude? Foot thinks this denial would be inconsistent. If she is right, then a thick evaluative claim—that x is rude—can be derived from a descriptive claim (Foot 1958).

Foot’s argument is primarily aimed at non-cognitivists, like Hare. But Hare replies by considering an analogous inference involving a racial slur. To demonstrate Hare’s point, consider a racial slur like ‘gringo’, ‘kraut’, or ‘honky’. Most of us disagree with the attitude of contempt that is expressed by the slur ‘kraut’. But, according to Hare, an analogous inference would logically require us to accept that attitude, that is, to despise Germans. And this is absurd. Consider the descriptive claim D1*: that x is a native of Germany. Is it logically consistent for one to accept D1* as true but deny that x is a kraut? If the denial in Foot’s example is inconsistent, then, according to Hare’s thinking, it should also be inconsistent in this example. So, by Foot’s reasoning, D1* should entail ‘x is a kraut’. Moreover, ‘kraut’ is a term of contempt, which means that this conclusion entails that one must despise x. By the transitivity of entailment, it follows that an acceptance of D1* requires one to despise x, which is an unintuitive result. According to Hare, the two inferences are “identical in form.” So, there must be something wrong with both inferences (1963:188).

Where do the above inferences go wrong? Hare holds that people who reject the attitude associated with ‘kraut’ will substitute it with an evaluatively-neutral expression, such as ‘German’, which does not commit them to the attitude of contempt. So, people who accept D1* are not required to use the evaluative word ‘kraut’ in expressing the conclusion of the inference. Analogous points hold for ‘rude’. Hare concedes that we rarely have evaluatively-neutral expressions corresponding to paradigmatic thick terms, but he thinks such expressions are at least possible. After all, we could use ‘rude’ with a certain tone of voice or with scare-quotes around it, thereby indicating that we mean it in a purely descriptive sense (1963:188-89).

Hare’s response to Foot assumes that slurs are evaluative in the same way as thick terms. This, however, has been taken by some to be unintuitive. But Hare could modify his reply: instead of employing slurs he could use thick terms like ‘chaste’, ‘blasphemous’, ‘perverse’, and ‘lewd’, which are often called “objectionable thick terms”. Objectionable thick terms are terms that embody values that ought to be rejected. It is, of course, a matter of debate whether these thick terms really are objectionable. But Hare could run his argument by using thick terms that are commonly regarded as objectionable. Such terms seem to be evaluative in much the same way as ‘rude’. And, much like slurs, there are many people who reject the values embodied by such terms; these people are consequently reluctant to use the term in question. Notice that arguments like Foot’s would require such people to accept the values embodied by the thick terms they regard as objectionable, and this seems equally implausible. So, Hare’s basic reply need not assume any fundamental similarity between thick terms and slurs.

However, it is unclear that Hare has shown what he needs to show—that the relevant is-ought inferences are invalid. In particular, he has not shown that it’s possible for D1 to be true while ‘x is rude’ is false, or for D1* to be true while ‘x is a kraut’ is false. The mere fact that reluctant speakers will substitute the evaluative conclusions with neutral ones does not show that the evaluative conclusions are false. For instance, you may hate the word ‘prune’ and prefer to substitute it with ‘dried plum’, but that doesn’t mean it’s false that the thing in question is a prune (Foot 1958:509).

Hare could claim that the is-ought gap only exists on the level of concepts or propositions, not on the level of terms or sentences. This makes a difference because Hare holds that the evaluations of thick terms are detachable in the sense that there could be evaluatively-neutral expressions that are propositionally equivalent to sentences involving thick terms. Detachability is the upshot of Hare’s view that ‘German’ can be substituted for ‘kraut’. If the evaluations of thick terms are detachable, then they only attach to the terms, but are not entailed by the propositions expressed by such terms. So, there is no breach of the is-ought gap on the level of propositions. From D1*, one can infer the proposition that x is German. And this proposition can also be expressed by using the term ‘kraut’. But one cannot infer a negative evaluation from this proposition; the negative evaluation is only inferable from uses of the term ‘kraut’, not from the proposition expressed by such uses.

Hare’s view that the evaluations of thick terms are detachable has led to debates over how exactly thick terms are evaluative. For example, is the evaluation merely pragmatically associated with the term in a way that would make it detachable? Or are these evaluations part of its truth-conditions? These debates are discussed in section 5.

b. McDowell’s Disentangling Argument

Even if Foot is right that there are descriptions that are sufficient for the correct application of thick terms, it need not be the case that these descriptions are necessary. John McDowell’s Disentangling Argument is believed to show that there could not be a description that is both necessary and sufficient for the correct application of a thick term. In this argument McDowell is primarily arguing against non-cognitivists, such as Hare, who accept the strong distinction between description and evaluation.

Before diving into the argument, recall that thick terms seem to straddle the strong distinction. For example, the claim that OJ committed murder seems to aim at stating a fact, which is a feature of descriptive claims (on the strong distinction). But this claim also seems evaluative; by calling it murder, rather than killing, we seem to be evaluating OJ’s action negatively. Thus, thick terms such as ‘murder’ call into doubt the strong distinction because they seem to be both descriptive and evaluative. This issue does not present any obvious challenge to those who accept only the weak distinction, since a term’s being evaluative on the weak distinction does not preclude it from being fact-stating. How do proponents of the strong distinction meet this challenge?

A.J. Ayer is one non-cognitivist who holds that thick terms like ‘hideous’, ‘beautiful’, and ‘virtue’ are solely on the evaluative side of the strong distinction—they are purely non-factual, evaluative concepts (1946:108-13). Ayer’s view is counterintuitive, and, if generalized, would oddly entail that there is no fact as to whether OJ committed murder.

Most non-cognitivists disagree with Ayer, and claim that thick terms have hybrid meanings that contain two different kinds of content: a descriptive content and an evaluative content. This kind of view is called a Reductive View because it reduces the meaning of a thick term to a descriptive content along with a more basic evaluative content (for example, a thin concept).

McDowell’s Disentangling Argument targets a specific kind of Reductive View, one that is coupled with the strong distinction between description and evaluation. We may thus call his target “the Strong Reductive View”. McDowell assumes that Strong Reductive Views must hold that the thick concept’s descriptive content completely determines the thick concept’s extension. It does so by identifying a property that completely determines what does and does not fall within the thick concept’s extension. The evaluative content plays no role in determining what property the thick concept picks out, but is instead an attitudinal or prescriptive tag that explains the concept’s evaluative perspective.

McDowell’s argument against Strong Reductive Views invites us to consider the epistemic position of an outsider who does not share the evaluative perspective associated with a given thick term. Consider, for example, someone who fails to understand the sexual mores associated with ‘chaste’. Will this person be able to anticipate what this term applies to in new cases? Initially, one might think this is possible: is this not what anthropologists are trained to do? Williams, a proponent of McDowell’s argument, says that an anthropologist must at least “grasp imaginatively” the evaluative point of ‘chaste’ (1985: 142). She must imagine that she accepts the evaluative point of this term, at least for the purposes of anticipating its usage. Even this might be a problem for Strong Reductive Views. If a Strong Reductive View is correct, then there would be no need for an outsider to grasp the evaluative content of ‘chaste’, even imaginatively. After all, the descriptive content is supposedly what drives the extension of ‘chaste’, which means that an unsympathetic outsider could master its extension just by grasping the descriptive content and observing that it applies to all and only the features that the insiders call ‘chaste’. So, the Strong Reductive View seems to predict that an unsympathetic outsider could anticipate the insiders’ usage of ‘chaste’. Many find this implausible.

McDowell’s argument has two premises: (1) If a Strong Reductive View is true of ‘chaste’, then an unsympathetic outsider could master the extension of ‘chaste’ (that is, she could know what things ‘chaste’ would apply to in new cases) without having any grasp of its evaluative content. But, (2) an unsympathetic outsider surely could not achieve this—she could not anticipate its usage if she stands completely outside the evaluative perspective of those who employ the concept. Therefore, the Strong Reductive View is not true of ‘chaste’ (1981:201-3). This sort of argument could be advanced with respect to any thick term and perhaps even thin ones.

The Disentangling Argument is sometimes thought to be a distinctive problem for all Reductive Views, not just Strong Reductive Views. But this is a mistake. Consider Reductive Views that accept only a weak distinction between description and evaluation: call such views “Weak Reductive Views”. Weak Reductive Views can allow that the evaluative content of ‘chaste’ picks out a property, and can therefore allow that this evaluative content plays a role in determining the extension of ‘chaste’. For example, if morally good is the evaluative content associated with ‘chaste’, and morally good picks out a property, then morally good can also play a role in determining the extension of ‘chaste’. This means that its extension need not be completely determined by its descriptive content. In this case, an unsympathetic outsider would be a very strange person—that is, someone who does not accept the evaluative point of morally good, even imaginatively. But such a person does not seem impossible.

To be sure, Weak Reductive Views could be vulnerable to the Disentangling Argument if they accept an additional claim, namely, that chaste is coextensive with a descriptive concept that is perhaps not encoded within the content of chaste. Consider an analogy: it is plausible that water and H2O are coextensive, even though neither concept is encoded within the other. If Weak Reductive Views hold that this situation is true of chaste and some descriptive concept D, which is not encoded in the content of chaste, then the Disentangling Argument could be run against these views. This type of Weak Reductive View predicts that an outsider could master the extension of ‘chaste’ just by grasping D and observing that insiders apply ‘chaste’ to all and only things that are D. Thus, the Disentangling Argument could be run against Weak Reductive Views if they accept the additional claim that chaste is coextensive with a descriptive concept.

However, the same problem also arises for Non-Reductive Views that accept this additional claim. Non-Reductive Views hold that thick concepts cannot be divided into distinct contents (more on this in section 3). And, strictly speaking, Non-Reductive Views are compatible with the additional claim just mentioned, namely, that chaste is coextensive with a descriptive concept. If Non-Reductivists accept this additional claim (which would be uncharacteristic, though not inconsistent), then the combined view would also be vulnerable to the Disentangling Argument.

So, the Disentangling Argument can be used to target any view, Reductive or Non-Reductive, that holds thick concepts to be coextensive with descriptive concepts. It is thus a mistake to think the Disentangling Argument is a problem for all and only Reductive Views. The reason McDowell’s argument targets Strong Reductive Views is that these views appear especially apt to accept the problematic claim that thick concepts are coextensive with descriptive concepts.

Most opponents of the Disentangling Argument reject premise (1), by showing that Strong Reductive Views can allow that an unsympathetic outsider could not master the extension of thick terms. This approach is discussed in section 3a. But Hare takes a different approach. He accepts the Strong Reductive View but rejects premise (2). Recall Hare’s way of arguing that there could be a descriptive concept that is extensionally equivalent to a thick concept. One can express this descriptive concept by muting the thick term’s evaluative content in one of two ways: either by using the thick term with a certain tone of voice or by placing scare-quotes around the term. Suppose that these methods successfully show that there is a purely descriptive concept—call it des-chaste—which is coextensive with chaste. In this case, an outsider could employ des-chaste to track the insider’s usage of ‘chaste’.

One might object that Hare’s two methods of uncovering des-chaste reveal that this concept cannot be grasped without already grasping chaste. So, the interpreter in question would not be a genuine outsider. However, although Hare’s methods of uncovering des-chaste require a grasp of chaste, there is no automatic reason to assume that there could not be another method of uncovering des-chaste without grasping chaste—for example, by learning des-chaste independently of any encounter with insiders or their value system.

Would the outsider’s grasp of des-chaste help her anticipate the insider’s use of ‘chaste’ in new cases? Hare thinks so. According to Hare, the outsider could anticipate their use in new cases because she could observe similarities between the old cases and the new cases, and infer based on those similarities that ‘chaste’ would or would not apply in new cases (1997: 61). Of course, McDowell and followers would not be convinced by this claim, since they hold that the similarities between such cases are evaluative. In other words, they accept what is known as “the shapelessness hypothesis”—that the extensions of thick terms are only unified by evaluative similarity relations.

The fundamental disagreement between Hare and McDowell concerns whether the shapelessness hypothesis is true. Is there any reason to accept shapelessness? This hypothesis is sometimes supported by the fact that it can explain why premise (2) of the Disentangling Argument is plausible—that is, it can explain why an unsympathetic outsider could not master the extension of ‘chaste’ (Roberts 2013: 680). Of course, this idea won’t convince someone like Hare who rejects premise (2). Something more should be said. In section 5b, further support for shapelessness is discussed.

It is worth noting that there is a common thread running through Hare’s replies to both Foot and McDowell. In both replies Hare claims that there could be a descriptive expression that is extensionally equivalent to a thick term. Without this claim he could not hold that des-chaste is coextensive with chaste. He also could not escape Foot’s objection to the is-ought gap by claiming that the evaluations of thick terms are detachable.

c. Williams on Ethical Truth

If successful, McDowell’s argument would show that there could not be a wholly descriptive expression coextensive with a thick term. Furthermore, if we assume that utterances involving thick terms are sometimes true, McDowell’s argument might show that there are evaluative facts—facts that can only be characterized in evaluative terms. But this is too quick. After all, sentences involving thick terms might only be true in a minimalist sense. On the minimalist theory of truth, to say that ‘lying is dishonest’ is true is equivalent to saying simply that lying is dishonest. This is all that can be significantly said about the truth of this sentence. Since nothing more can be said, its mere truth does not entail the existence of a fact to which the sentence corresponds. So, even if McDowell’s argument succeeds, the truth of sentences involving thick terms does not guarantee that there are facts that can only be characterized in evaluative terms.

To get a fuller picture of how the truth of such sentences might support the existence of evaluative facts, we must turn to Bernard Williams. According to Williams, utterances involving thick terms show promise of being more than just minimally true, whereas utterances involving thin terms do not. The main difference, according to Williams, is that thick terms bear a close connection to the concept of knowledge and to the notion of a helpful advisor.

Consider the connection between knowledge and thick concepts. There is a precedent for thinking that certain epistemic difficulties arise for thin concepts but not thick ones. For example, how exactly can one come to know that lying is sometimes wrong? This plausible truth, which involves a thin concept, seems to be neither analytic nor a posteriori. Some ethicists have thus held that it is synthetic a priori and is knowable by a special faculty of the mind, such as moral intuition. But many ethicists find this view implausible and have instead turned to thick concepts for an account of ethical knowledge. It may seem more plausible that we can know a posteriori that a thick concept applies, for example, that a certain action is cowardly. According to Mark Platts, we can know such truths “by looking and seeing,” without any special faculty, such as moral intuition (1988: 285).

Williams agrees that thick ethical knowledge is more feasible than thin ethical knowledge. His reasons, however, are different from Platts’. Williams holds that the concept of knowledge is associated with the notion of a helpful informant or advisor, and that there are only such advisors with regard to the application of thick concepts, not thin ones. According to Williams, a helpful advisor is someone who is better than others at seeing that a certain outcome, policy, or action falls under a concept. And Williams holds that there are helpful advisors with regard to thick concepts. For example, the advice that a certain action would be cowardly “can offer the person who is being advised a genuine discovery” (1993: 217). Are there helpful advisors with regard to thin concepts? Not according to Williams—“not many people are going to say ‘Well, I didn’t understand the professor’s argument for his conclusion that abortion is wrong, but since he is qualified in the subject, abortion probably is wrong’” (1995: 235). Thus, according to Williams, utterances involving thick terms show promise of being more than minimally true, given that thick terms have this association with knowledge and helpful informants.

Even though Williams holds that utterances involving thick terms can be more than minimally true, he does not think these utterances can be objectively true, that is, true independently of particular perspectives. To illustrate this, Williams asks us to compare ethics with science. Although there are disagreements in science, there is at least some chance of scientists converging on a perspective-free account of the world, and this convergence would be best explained by the correctness of that account. But Williams thinks our ethical opinions stand no chance of converging on an account of how the world really is independently of particular perspectives; at any rate, if they do converge, this will not be because these opinions have tracked how the world is independently of perspective (1985: 135-6). So, on Williams’ view, ethical opinions are true only relative to a perspective, and hence are not objectively true.

How exactly is the truth of utterances involving thick terms dependent upon perspective? Williams illustrates his view by asking us to envision a hyper-traditional society which is maximally homogeneous and minimally reflective. Williams holds that ethical reflection primarily employs thin concepts, and that this hyper-traditional society is unreflective because it only employs thick concepts. According to Williams, their utterances involving thick terms can be true in their language L, which is distinct from our language since L does not express thin concepts whereas our language does. Williams thinks it is undeniable that the thick concepts expressed in L need not be expressible in our language, which means that we may be unable to use our language to assert or deny what insiders say with their thick terms. Of course, it is possible for a sympathetic outsider, such as an anthropologist, to understand and speak L. But, according to Williams, the outsider cannot formulate an equivalent utterance in his own language because “the expressive powers of his own language are different from those of the native language precisely in the respect that the native language contains an ethical concept which his doesn’t” (1995: 239).

To explain Williams’ view further, we can borrow an example from Allan Gibbard (1992). Imagine that gopa is a positive thick concept expressible in L but not expressible in our language. Although a reflective outsider cannot assert that x is gopa in her own language, she can likely reject the proposition that x is good, which involves a thin concept. And if the local’s thick concept gopa entails good, then the outsider could reject the insider’s statement as false by denying that x is good. So, it looks like the insider’s statement can be assessed as false from an outside perspective. However, Williams does not accept that the insider’s concept gopa entails good. A judgment involving a thin concept, such as good, “is essentially the product of reflection” which comes about “when someone stands back from the practices of the society and its use of the concepts and asks… whether these are good ways in which to assess actions…” (1985:146). But this hyper-traditional society is unreflective, which means they do not employ the thin concept good. So, according to Williams, there’s no reason to assume that their concept gopa entails good, which means the outsider’s denial that x is good poses no clear threat to the truth of ‘x is gopa’.

On Williams’ view, if a person from the hyper-traditional society has knowledge that x is gopa, but later reflects and draws the conclusion that x is good, this reflection may unseat her previous knowledge by making it so that she no longer possesses the traditional concept gopa (1995: 238). In this way, “reflection can destroy knowledge,” because the one who reflects may thereby cease to possess her traditional thick concepts (1985: 148).

Williams has here outlined a possibility in which utterances involving thick terms could be true in a way that is dependent upon perspective—in particular, the perspective of a person who speaks a certain ethical language, such as L. Opposition to Williams comes from at least two fronts.

First, McDowell (1998) and Hilary Putnam (1990) have both objected to Williams’ conception of science as providing a perspective-free account of the world. They hold that science is perspective-dependent. Although this objection would destroy Williams’ contrast between science and ethics, it would not mean that ethical truth is perspective-free, but only that science and ethics are both perspective-dependent, which leaves ethics in good company.

A second source of opposition is known as Thin Centralism—the view that thin concepts are conceptually prior to, and independent of, thick concepts. If good is conceptually prior to gopa, then the locals cannot grasp gopa without also grasping good. This would mean that Williams is wrong to claim that the locals may lose the concept gopa when they draw the reflective inference from x is gopa to x is good. Furthermore, the outsider who denies that x is good is required, by way of this inference, to deny that x is gopa. So, the truth or falsity of ‘x is gopa’ would not depend on what is knowable solely from the local’s perspective, contrary to Williams’ view. It may depend partly on whether x is good, which may be discernible from the outsider’s perspective.

Williams rejects Thin Centralism, though he does not give any arguments against it (1995: 234). He is plausibly a Thick Centralist, holding that thick concepts are conceptually prior to, and independent of, thin concepts. That is, one cannot grasp a thin concept without grasping some thick concept or other, but not vice versa. This view can be understood by way of a color analogy. The concept color is a very general concept that, according to Susan Hurley, cannot be understood independently of specific color concepts, such as red, green, etc. (1989: 16). And according to Thick Centralism, thin concepts like good cannot be grasped independently of specific thick concepts like courageous, kind, and so on.

It might be true that the grasp of color requires the grasp of some specific color concept (for example, red), but is the opposite also true? Does the grasp of red require a grasp of color? If so, then the color analogy would actually support what is known as the No-Priority View—thick and thin concepts are conceptually interdependent with neither one being prior to the other (Dancy 2013). It is worth noting that the No-Priority View is not available to Williams, since this view would mean that the local’s grasp of gopa requires the grasp of a thin concept, and this presents the same problem that Thin Centralism presents for Williams’ view.

d. Thick Concepts in Normative Ethics

It is often urged that ethicists should stop focusing as much on thin concepts and should expand or shift attention towards the thick (Anscombe 1958; Williams 1985). As a result, there has been much attention paid to thick concepts within meta-ethics, primarily regarding the issues discussed above. Have thick concepts also played a substantive role in normative ethics? They have to some extent. Normative ethics is partly concerned with the question of what kind of person one should be. And the virtue and vice concepts, which are paradigmatic thick concepts, have played a significant role in these discussions.

However, normative ethics is also concerned with the question of how one should act, and in this context it is common to focus on thin concepts, like right, wrong, and good. Of course, there are some thick concepts, such as just and equitable, that figure in these discussions, but it is not immediately clear why it would matter whether these concepts are thick, rather than thin or purely descriptive. There is at least one attempt at giving thick concepts a substantive role in a theory of how to act. This comes from Rosalind Hursthouse’s virtue theory of right action.

Virtue theory is sometimes criticized for being unable to provide a theory of right action. The mere fact that virtues are character traits of persons does not mean that virtue concepts cannot be applied to actions. Actions can also be honest, courageous, patient, and so on. The problem is that these characterizations of action do not clearly tell us anything about rightness, which would be a major flaw of a normative ethical theory.

Hursthouse meets this criticism by providing a theory of right action in terms of virtue. She holds that an action is right just in case it is what a virtuous agent would characteristically do in the circumstances (1999: 28). The virtuous agent is one who has the virtuous character traits and exercises them. And a virtue is a character trait that a human being needs to flourish or live well. These particular virtues must be enumerated, but the list typically includes paradigmatic thick terms, such as ‘courage’, ‘honesty’, ‘patience’, ‘generosity’, and so on. Hursthouse explicitly claims that the virtue terms are thick (1996: 27). Does it matter for her view whether the virtue terms are thick? Hursthouse’s theory faces an objection, and it is in response to this objection that it might matter.

The objection alleges that the virtue theory of right action cannot provide clear action-guidance, whereas rival normative theories, such as deontology and utilitarianism, can provide clear action-guidance by generating rules, such as “Don’t lie” or “Maximize happiness.” According to this objection, the virtue theory of right action can only generate a very unhelpful rule: “Do what a virtuous person would do.” This rule is not likely to provide action-guidance. If you are a fully virtuous person, you will already know what to do and so would not require the rule. If you are less than fully virtuous, you may have no idea what a virtuous person would do in the circumstances, especially if you don’t know of anyone who is fully virtuous (indeed, such a person might be purely hypothetical). So, according to this objection, the virtue theory of right action cannot provide action-guidance.

In response, Hursthouse points out that every virtue generates positive instruction on how to act—do what is honest, charitable, generous, and so on. And every vice generates a prohibition—do not do what is dishonest, uncharitable, mean, and so on (1999: 36). So, one can get action-guidance without reflecting on what a hypothetical virtuous agent would do in the circumstances. According to Hursthouse, “the agent may employ her concepts of the virtues and vices directly, rather than imagining what some hypothetical exemplar would do” (1991: 227). For example, the agent may reason “I must not tell this lie, since it would be dishonest.” And since dishonesty is a vice, which no virtuous person would have, this agent will be directed towards right action.

Thus, it’s important for Hursthouse’s view that the virtue concepts are at least action-guiding. After all, imagine that the virtue concepts were wholly descriptive concepts of character traits, like slow, calm, or quiet. These descriptive concepts would not generate any prohibitions or positive instruction.

Does it matter whether the virtue concepts are thick rather than thin concepts? Hursthouse does not speak directly to this question, though she does claim that, if we are unclear on what to do in a circumstance, we can seek advice from people who are morally better than ourselves (1999: 35). And, here, Williams’ point about helpful advisors might be useful. If the virtue concepts were thin, then on Williams’ view there would be no helpful advisor with regard to whether the virtue concepts apply. But such advice is possible if the virtue concepts are thick. In short, it is important for Hursthouse that the virtue concepts are action-guiding. And, if Williams is right, it may also matter whether the virtue concepts are thick.

One potential challenge to Hursthouse’s reply might contest the traditional list of virtues, and claim that there is no reason to think this list, when properly enumerated, will contain thick action-guiding concepts. For example, why should we think that courageous will be on the list of virtues rather than a similar concept that rarely generates positive instruction (for example, gutsy)? In considering this objection, readers are advised to consult Hursthouse’s approach to enumerating the virtues (1999: Ch. 8).

Another potential challenge may come from Thin Centralism. Suppose that right is conceptually prior to, and independent of, courageous. In this case, it might be argued that the positive instruction generated by courage (for example, “Do what is courageous”) is wholly due to the action-guidingness of right. The latter is precisely what we wanted to explain, which means that Hursthouse’s reply might be uninformative. However, Hursthouse does not account for particular virtue concepts in terms of right. Furthermore, even if Thin Centralism is true, it could still be claimed that some other thin concept, such as good, is conceptually prior to thick virtue concepts. So, Hursthouse’s account cannot be deemed uninformative merely on the basis of Thin Centralism.

3. How Do Thick Concepts Combine Evaluation and Description?

Thin Centralists typically accept Reductive Views of the thick, which aim to analyze the meanings of thick terms by citing more fundamental concepts (for example, thin concepts and descriptive concepts). Proponents of these Reductive Views often aim at escaping the Disentangling Argument. In particular, they aim to reject premise (1) of that argument by showing that Reductive Views can consistently claim that an outsider could not grasp the extension of a thick term. This strategy proceeds by providing different versions of the Reductive View, which shall be discussed below.

It is worth noting that Reductive Views are typically neutral on whether the weak or strong distinction ought to be accepted. They also tend to be neutral on whether cognitivism or non-cognitivism is true. To be sure, Reductivism is often associated with non-cognitivists, like Hare, but there are some traditional cognitivists, like Henry Sidgwick and G.E. Moore, who hold Reductive Views of the thick (Hurka 2011: 7).

Those who reject Thin Centralism and accept the Disentangling Argument normally accept Non-Reductive Views, holding that the meanings of thick terms are evaluative and descriptive in some sense, though they cannot be divided into distinct contents. The basic disagreement between Reductive and Non-Reductive Views is on whether thick concepts are fundamental evaluative concepts or are complexes built up from more fundamental concepts (for example, thin concepts). These two approaches are compared in the following sections.

a. Reductive Views

In general, Reductive Views understand the meaning of a thick term as the combination of a descriptive content with an evaluative content. Different Reductive Views can be distinguished based on how they specify this general account. There are three main types of Reductive Views: (i) some views specify the sort of descriptive content within the analysis; (ii) some views specify the relation between evaluative and descriptive contents; and (iii) other views specify what the evaluative content is. There are also various ways of combining (i)-(iii).

Consider type (i) first. Daniel Elstein and Thomas Hurka provide two patterns of analysis that explain the descriptive content of a thick term in two different ways. On their first pattern of analysis, the descriptive content of a thick term is not fully specified within the meaning of the thick term. The meaning of the thick term may only specify that there are some good-making descriptive properties of a general type, without specifying exactly what these good-making properties are. For example, on their view, ‘x is just’ means ‘x is good, and there are properties XYZ (not specified) that distributions have as distributions, such that x has XYZ and XYZ make any distribution that has them good’. Elstein and Hurka hold that this kind of Reductive View is not a Strong Reductive View, because the thick concept does not have a fully specified descriptive content that determines the thick concept’s extension. Still, their view is available to non-cognitivists who accept the strong distinction. Most importantly, Elstein and Hurka believe their view allows non-cognitivists to claim that an outsider could not grasp the extension of ‘just’. Grasping that extension requires determining which properties of the general type are the good-making ones, and doing this requires evaluative judgments that the outsider is not equipped to make (2009: 521-2).

Elstein and Hurka’s second pattern of analysis involves an additional evaluation, which is embedded within the descriptive content. Many virtue and vice concepts are supposed to fit into this second pattern of analysis. For example, on their view, ‘an act x is courageous’ means roughly ‘x is good, and x involves an agent’s accepting risk of harm for himself for the sake of goods greater than the evil of that harm, where this property makes any act that has it good’ (2009: 527). The reference to goods is an embedded evaluation, and it is impossible to determine the extension of ‘courageous’ without determining what can count as goods—but determining this requires an evaluation which the outsider is not equipped to make (2009: 526).

Stephen Burton offers an account of type (ii) by clarifying the relationship between descriptive and evaluative contents of a thick concept. A simple way of expressing the relationship between a thick term’s evaluative and descriptive contents is as follows: ‘x is D and therefore x is E’, where D is a description and E is an evaluation that follows from that description. The trouble is that this simple formula entails that D is coextensive with the thick term itself, and this makes the simple formula vulnerable to the Disentangling Argument. So, Burton modifies the account so that the thick term is not coextensive with D. Burton proposes that a thick term’s meaning can be analyzed as follows: ‘x is E in virtue of some particular instance of D’. For example, ‘courageous’ means ‘(pro tanto) good in virtue of some particular instance of sticking to one’s guns despite great personal risk’. Here, the thick term only groups together those cases in which a thing is E in virtue of some particular instance of D. But D does not entail E, and so is not coextensive with the thick term. Thus, an outsider’s ability to track D will not be enough for her to track the insider’s use of the thick term. But what does it mean for E to depend upon a particular instance of D? For Burton, this means that E “depends on the various different characteristics and contexts” of D, and so D alone is not sufficient for E. Various different characteristics and contexts, which are not encoded in the meaning of the thick term, also need to obtain (1992: 31).

Now consider a view of type (iii). Most Reductive Views hold that thick concepts inherit their evaluative-ness from a constituent thin concept. However, Christine Tappolet proposes that they are instead evaluative on account of specific affective concepts, like admirable, pleasant, desirable, and amusing. These concepts are not thin concepts, but Tappolet holds that they are the basic evaluative constituents of thick concepts, like courageous and generous. For example, Tappolet’s analysis of courageous goes like this: ‘x is courageous’ means ‘x is D and x is admirable in virtue of this particular instance of D’, where D is a description. Essentially, Tappolet accepts Burton’s account of the relation between descriptive and evaluative contents, but modifies the account so that it incorporates affective concepts instead of thin concepts. In doing so, she parts company with other Reductivists by rejecting Thin Centralism. She rejects Thin Centralism because she holds that understanding a thin concept, such as good, requires an understanding of certain specific concepts such as pleasant and admirable (2004: 216).

An objection may arise: affective concepts are also thick concepts, but they do not fit into Tappolet’s analyses of thick concepts. This is because one affective concept, such as admirable, cannot be defined in terms of another, such as pleasant. How then should we account for these affective concepts? Tappolet’s answer is that affective concepts are to be treated differently from other thick concepts, like courageous. In particular, she treats positive affective concepts as determinates of the determinable good, and she holds that determinates cannot be analyzed in terms of their determinables. Roughly, the determinable/determinate relation is a relation of general concepts to more specific ones, where the general determinables are common to each specific determinate, but there is nothing distinguishing the determinates from each other except for the determinates themselves—for example, the only thing that distinguishes red from other colors is redness itself.

Edward Harcourt and Alan Thomas (2013) have pointed to a tension between Tappolet’s treatment of affective concepts and her treatment of other thick concepts. What reason is there to think courageous is analyzable but not admirable? Tappolet holds that admirable is unanalyzable because there is no way of stating the relevant descriptive content associated with admirable (2004: 217). In response, Harcourt and Thomas claim that this is just as much a problem for her analyses of other thick concepts. For example, it is far from clear what should be substituted for ‘D’ within Tappolet’s analysis of courageous. This objection leads Harcourt and Thomas to a Non-Reductive View, according to which all thick concepts are treated as determinates of thin concepts like good and bad (2013: 25-9).

One problem is that there is reason to think that both parties to this dispute are mistaken in claiming that affective concepts cannot be analyzed. There is a simple Reductive account of the meaning of ‘admirable’, which is not represented by any of the above views—‘admirable’ just means ‘worthy of admiration’. Similar accounts can be given for other affective concepts. If this simple analysis is correct, then Tappolet and Harcourt and Thomas are mistaken about the unanalyzability of thick affective concepts.

Some of the analyses provided above may not withstand potential counterexamples. But it is worth pointing out that our inability to state an adequate analysis for a given thick term does not show that its meaning is unanalyzable. Analyses can only be attempted by using a language, and it is possible that our language’s vocabulary does not contain the expressions needed for providing an adequate analysis of the thick term’s meaning. Reductive Views are only committed to the view that the meanings of thick terms involve appropriately related evaluative and descriptive contents; they are not committed to there being any actual language that can express these contents in a way that counts as a satisfactory analysis.

What then is the point in providing these patterns of analysis? The point is to illustrate the general ways in which descriptive and evaluative contents can be combined within the meanings of thick terms. Typically, Reductive Views only commit to the possibility of there being a certain general type of analysis and do not commit to the particular details of their sample analyses (for example, Elstein and Hurka, 2009: 531).

Are there any advantages to Reductivism about the thick? According to Hurka, Reductivism allows cognitivists to explain the difference between virtues and their cognate vices (2011: 7). For example, both courage and foolhardiness involve a willingness to face risk for a cause. What then differentiates courage from foolhardiness? It is plausible that courage requires that the cause be good enough to justify the risk, whereas foolhardiness does not require this. This explanation appeals to a thin concept—good—that many Reductivists are perfectly willing to cite as a constituent of courage. However, there is nothing forbidding Non-Reductivists from also claiming that courage requires a good enough cause, provided they do not take this content to be a constituent of courage. So, this may be no clear advantage for Reductivism.

Another potential advantage is that Reductivism allows us to explain a wide variety of evaluative concepts by recognizing only a few basic ones, such as ought or good. Moreover, if a successful analysis can be achieved, then Non-Reductivists are committed to positing two meanings where Reductivists can posit only one. For example, if the meaning of ‘admirable’ can be analyzed with ‘worthy of admiration’, then Reductivists can claim that the meanings of these two expressions are identical, whereas Non-Reductivists must hold that these meanings are distinct. Lastly, Reductive Views can explain how a thick term is both evaluative and descriptive, since the evaluative-ness of a thick term’s meaning is inherited from a constituent content that is paradigmatically evaluative (for example, a thin concept); and the descriptiveness of its meaning is inherited from a constituent descriptive content. In the next section, we shall examine whether Non-Reductivists can provide a comparable explanation.

b. Non-Reductive Views

Non-Reductive Views hold that the meanings of thick terms are both descriptive and evaluative, although these features are not due to constituent contents within the meanings of thick terms. In slogan form, thick concepts are irreducibly thick. For example, the thick term ‘brutal’ expresses a sui generis evaluative concept, which is not a combination of bad or wrong along with some descriptive content. As noted, Reductive Views explain how the meanings of thick terms are both evaluative and descriptive in terms of constituent contents. The challenge is for Non-Reductive Views to explain this without appealing to constituent contents.

This challenge should be weakened in light of the fact that our notions of the descriptive and the evaluative are theoretically-loaded. Non-Reductive theorists do not accept the strong distinction between description and evaluation, because they hold that thick terms are both evaluative and capable of picking out properties. The strong distinction precludes this possibility, unless the content of the thick term is built up from constituents, which Non-Reductivists reject. Non-Reductivists typically accept some version of the weak distinction, but the present challenge cannot be framed in terms of this distinction. On the weak distinction, the descriptive is identical to the non-evaluative. This means that Non-Reductivists are being asked to explain how the meanings of thick terms are both evaluative and not evaluative, which is plainly contradictory. How then are we to understand the challenge faced by Non-Reductive Views?

The challenge can be framed in a two-fold way: (I) Non-Reductive Views need to explain what the meanings of thick terms have in common with the meanings of thin terms—this would explain the evaluative-ness of the thick term’s meaning. And (II) they also need to explain what the meanings of thick terms have in common with the meanings of paradigmatic descriptive terms—this would explain the descriptiveness of the thick term’s meaning.

Starting with (I), Jonathan Dancy holds that both thick and thin terms express concepts that have “practical relevance,” a feature that descriptive concepts lack. To see what he means, consider how thick and thin concepts differ from descriptive concepts like water. The latter can make a practical difference in some circumstances: water may be something to seek when stranded in a desert. But in this case, we must explain the practical relevance of water by citing other properties in the particular situation, such as being thirsty, in a desert, and so forth. By contrast, there is nothing to be explained when a thick or thin concept makes a practical difference, since their practical relevance “is to be expected.” For example, it is expected that courage is something to aspire to and admire, and this does not require explanation by citing other concepts. Dancy expands upon this by claiming that competence with a thick concept requires not only an ability to determine when the concept applies, but also an ability to determine what practical relevance its application has in the circumstances. Competence with a descriptive concept requires only the former, not the latter (2013: 56).

At this point, Reductive theorists may emphasize a potential benefit of their view—they have a simple explanation for why competence with a thick concept requires an ability to determine its practical relevance. In particular, competence with a thick concept requires an ability to determine its practical relevance because its constituent thin concept is practically relevant. But Dancy and other Non-Reductivists cannot appeal to this explanation. How then can they explain the practical relevance of thick concepts?

Conceptual competence can surely be explained without appealing to constituent concepts; otherwise, competence with a simple concept would be inexplicable. One potential explanation, which does not appeal to constituent concepts, comes from Harcourt and Thomas (2013: 24-7). Harcourt and Thomas hold that thick concepts are related to good and bad analogously to how red is related to colored. On their view, colored is not a constituent of red, since there are no other concepts that can be combined with colored to yield red. Instead, red is a determinate of the determinable colored. Similarly, the thin concept bad is not a constituent of the thick concept brutal: according to Harcourt and Thomas, there is no other concept that can be combined with bad to yield brutal. Instead, brutal is a determinate of the determinable bad. Moreover, given that brutal is a determinate of bad, it can be claimed that the practical relevance of brutal is inherited from the practical relevance of bad, even though the latter is not a constituent of the former.

Debbie Roberts provides another explanation of what the meanings of thick terms have in common with thin terms. Many ethicists claim that thick and thin terms express and induce attitudes, or condemn, commend, and instruct. Roberts takes a different approach. On her view, a concept is evaluative in virtue of ascribing an evaluative property. A concept ascribes a property if and only if the real definition of the property it refers to is given by the content of that concept. What then is an evaluative property? According to Roberts, a property P is evaluative if (i) P is intrinsically linked to human concerns and purposes; (ii) there are various lower-level properties that can each make it the case that P is instantiated (that is, P is multiply-realizable); but (iii) these lower-level properties do not necessitate that P is instantiated (that is, other features must also obtain). Roberts holds that both thick and thin concepts ascribe properties that satisfy (i)-(iii) (2013).

One potential problem is that there might be some paradigmatically descriptive properties that satisfy (i)-(iii). Consider a particular mental state with moral content, such as the belief that lying is wrong. The property of being in this state is intrinsically linked to human concerns and purposes, since it is a moral belief. And if belief-states are multiply realizable, then this property will satisfy (ii) as well. And finally, if there are lower-level brain states that make it the case that someone has this belief, without necessitating it, then (iii) will be satisfied as well. Thus, certain mental properties may satisfy (i)-(iii), even though they seem descriptive. Roberts could reply by holding that the above-mentioned moral belief is not linked to human concerns and purposes in the right sort of way.

Turning to (II): What do the meanings of thick terms have in common with paradigmatic descriptive terms? Recall that a key point about paradigmatic descriptive terms is that these terms are capable of representing properties. Non-Reductive theorists can point out that thick terms also seem capable of representing properties. This, in fact, was the fundamental motivation for focusing on thick terms to begin with. And nearly all ethicists (except for Ayer) would agree that this is true. It plainly seems true that ‘courage’ is capable of picking out a property, and in this way ‘courage’ shares something in common with paradigmatic descriptive terms like ‘red’ and ‘water’.

Another key point about descriptive terms is that they are intuitively different from thin terms like ‘wrong’ and ‘good’. Indeed, a central motivation for classifying terms as descriptive is to exclude thin terms like ‘good’ and ‘wrong’ from paradigmatically descriptive expressions. How then do thick terms share this feature with the descriptive—that of being different from thin terms? There are two general answers that Non-Reductivists provide. On one approach, thick and thin differ in kind. On the other, thick and thin differ only in degree but not in kind. These general approaches are discussed in the next section. Reductivist theories are also discussed under each approach.

4. How Do Thick and Thin Differ?

a. In Kind: Williams’ View

Williams is a Non-Reductive theorist who holds that thick and thin differ in kind. On his view, thick terms are both world-guided and action-guiding. For Williams, a world-guided term is one whose usage is “controlled by the facts”—that is, there are conditions for its correct application and competent users can largely agree that it does or does not apply in new situations. An action-guiding term is one that is “characteristically related to reasons for action” (1985: 140-1). For Williams, thick terms are both world-guided and action-guiding, whereas thin terms are action-guiding but “do not display world-guidedness” (1985: 152).

There are some potential problems for Williams’ distinction. First, Williams’ claim that thin terms “do not display world-guidedness” seems to commit him to something controversial—namely, that non-cognitivism is true of thin terms. Other Non-Reductivists accept Williams’ characterization of thick terms, but hold that thin terms are also world-guided and action-guiding (Dancy 2013: 56). If they are right, then Williams’ distinction between thick and thin is compromised.

Nevertheless, there is a straightforward way of distinguishing between thick and thin, which does not assume non-cognitivism about thin terms. On this view, thin terms express wholly evaluative concepts, whereas thick terms express concepts that are partly evaluative and partly descriptive. This straightforward distinction gives us a difference in kind between thick and thin. The trouble is that it too appears to be theoretically loaded (much like Williams’ distinction). This straightforward distinction presupposes a Reductive View, since it holds that thick concepts are built up from evaluative and descriptive components. Another potential problem is that it is not clear whether thin concepts are wholly evaluative. For example, it looks as though the thin concept ought implies the descriptive concept can, assuming the ought-implies-can principle (Väyrynen 2013: 7). Of course, as Dancy points out, the mere fact that one concept entails another does not mean that the latter is a constituent of the former—cow entails not-a-horse, but neither is a constituent of the other (2013: 49).

A second potential problem for Williams’ view, and the straightforward view just mentioned, comes from Samuel Scheffler. Scheffler points out that there are many evaluative terms that are hard to classify as either thick or thin. Consider ‘just’, ‘fair’, ‘impartial’, ‘rights’, ‘autonomy’, and ‘consent’. Upon reflecting on such concepts, Scheffler suggests that world-guidedness is a matter of degree and that a division of ethical concepts into thick and thin is a “considerable oversimplification” (1987: 417-8).

In a later essay, Williams replies to Scheffler by agreeing that thickness comes in degrees, and that “there is an important class of concepts that lie between the thick and the thin” (1995: 234). This reply, however, does not entail that Williams must reject his earlier view. Assume that thickness and thinness each come in degrees, and that thick and thin do not exhaust all evaluative concepts. These two claims do not entail that the difference between thick and thin is merely a matter of degree. This can be seen via analogy: belief that P and disbelief that P are exclusive categories that each come in degrees, and which do not exhaust all doxastic states since suspension of judgment is also possible. But the difference between belief that P and disbelief that P is not merely a matter of degree. These states are different in kind, assuming the former is about the affirmative proposition P while the latter is about the negation ¬P. Similarly, thick and thin could also differ in kind, even if they are exclusive degree categories that do not exhaust all evaluative concepts. Thus, Scheffler’s considerations and Williams’ concessions do not entail that Williams’ earlier view is false.

b. Only in Degree: The Continuum View

Still, many theorists have seized upon Scheffler’s point and have claimed that thick and thin differ only in degree, not in kind. Some consider this to be the standard view (Väyrynen 2008: 391). On this view, thin and thick lie on opposite ends of a continuum of evaluative concepts, with no sharp dividing line between them. For example, good and bad might lie on one end of the continuum, with kind, compassionate, and cruel on the other end. There are at least two gradable notions that can serve to distinguish the ends of this continuum—degrees of specificity or amounts of descriptive content. Greater specificity, or greater amounts of descriptive content, provides a thicker concept with a narrower range of application. Non-Reductive theorists typically focus on the greater specificity of thick terms. Reductive theorists can choose either path; indeed, they can explain the greater specificity of a thick concept in terms of how much descriptive content it has as a constituent. In general, a concept must have enough specificity, or enough descriptive content, for it to reside on the thicker end of the continuum.

Support for the Continuum View may come from several considerations. First, consider that some thin concepts have narrower ranges of application than other thin concepts. For example, good can apply to actions, people, food, cars, and so on, whereas right cannot apply to all these things. This may suggest that there are degrees of thinness. Second, as already noted, some thin concepts have descriptive entailments; for instance, the thin concept ought entails the descriptive concept can. Even if can is not a constituent of ought, this entailment at least narrows down the range of application for ought, which could bring it closer to the thick end of the spectrum, even if it is still fairly thin. Third, there seems to be a vague area between thick and thin. For example, it is not clear whether just has enough specificity or enough descriptive content for it to count as thick, but it is also hard to classify this concept as thin. So, perhaps just is a borderline case between thick and thin.

Given these considerations, one may be tempted to hold that thick and thin do not differ in kind. But the above considerations do not strictly entail this. Again, analogous considerations hold for both belief and disbelief—some beliefs have narrower ranges of application than other beliefs (for example, rabbits cannot have complex mathematical beliefs though they can have perceptual beliefs). There is also a vague area between belief and disbelief, yet these two doxastic states differ in kind. So, these considerations only seem to support the Continuum View if there is no way of drawing a distinction in kind. But Hare has provided a distinction in kind that has largely escaped notice.

c. In Kind: Hare’s View

Hare is a Reductivist who holds that thick and thin are distinct in kind, not merely in degree. He holds that thick terms have both descriptive and evaluative meanings associated with them. Interestingly, Hare holds that this is also true of thin terms. Thus, for Hare, thin terms are not wholly evaluative, contrary to the straightforward view mentioned in 4a.

What then is the difference between thick and thin? The difference has to do with the relationship that the two meanings bear to the term in question. A thin term is one whose evaluative meaning is “more firmly attached” to it than its descriptive meaning. And a thick term is one whose descriptive meaning is “more firmly attached” than its evaluative meaning (1963: 24-5). Although Hare agrees that being firmly attached is “only a matter of probability and degree” (1989: 125), this does not mean that the distinction between thick and thin is only a matter of degree. Indeed, Hare’s phrase “more firmly attached” actually marks out a difference in kind. Consider an analogy: a child who is more firmly attached to her mother than to her father is different in kind from a child who is more firmly attached to her father than to her mother. Both are different in kind from a child who is equally attached to both parents. So, the language that Hare uses actually suggests three possible categories of evaluative terms—thick, thin, and neither. Although Hare never mentions the third category, it is at least a potential category for Scheffler’s examples of the neither thick nor thin.

What does Hare mean by “more firmly attached”? For Hare, the more firmly attached meaning is the one that is less likely to change when language users alter their usage of the term. For example, it is less likely that ‘right’ will eventually be used to evaluate actions negatively (or neutrally) than that it will be used to describe lying, promise-breaking, killing, torture, and so forth. The reason is that, if we start using ‘right’ to evaluate actions negatively (or neutrally), there is a great chance that we will be misunderstood or accused of misusing the word. In this sense, the evaluative meaning of ‘right’ is more firmly attached than its descriptive meaning. But just the opposite is the case for thick terms like ‘generous’. If we start using ‘generous’ to evaluate actions negatively, we will not be misunderstood (for example, Ebenezer Scrooge could use ‘generous’ negatively and we would still understand him). Yet, if we start using ‘generous’ to describe selfish acts, for example, then we will be misunderstood or accused of misusing the term. In this sense, the descriptive meaning of ‘generous’ is more firmly attached than its evaluative meaning (1989: 125).

Hare frames his distinction in terms of descriptive and evaluative meanings, which assumes a Reductive View. But his distinction and thought experiment can be formulated without assuming a Reductive View. Rather than talking about descriptive and evaluative meanings, we could instead speak of two different speech acts—describing and evaluating—that are commonly performed through ordinary uses of the terms. Hare’s thought experiment can be formulated by changing the speech acts that we typically perform with the term. For example, although we ordinarily use ‘generous’ to perform a speech act of positive evaluation, a speaker who uses it to evaluate negatively would still be understood.

At the outset it was said that thick concepts are evaluative concepts that are substantially descriptive. Thin concepts, by contrast, are not substantially descriptive. Exactly what ‘substantially descriptive’ means can now be clarified, depending on which of the above three views is accepted. On Williams’ view, being substantially descriptive is a matter of being world-guided. On the Continuum View, being substantially descriptive is a matter of having enough specificity or enough descriptive content. On Hare’s view, being substantially descriptive is a matter of having a descriptive meaning that is more firmly attached than its evaluative meaning.

5. Are Thick Terms Truth-Conditionally Evaluative?

The putative significance of the thick depends upon a crucial assumption about how thick terms are evaluative. Several of the arguments and hypotheses discussed in 2.a-c assume that thick terms are evaluative as a matter of truth-conditions—that is, the conditions that must obtain for utterances involving thick terms to express true propositions.

To see how this assumption is made, first recall Foot’s argument. If ‘x is rude’ were not evaluative as a matter of truth-conditions, then its truth would not require anything evaluative, and there would not be anything evaluative following from the purely descriptive claim that x causes offense by indicating lack of respect. Hare’s response, that the evaluation of ‘rude’ is detachable, is a denial of the assumption that ‘rude’ is evaluative in its truth-conditions. Next, consider McDowell’s premise (2) of the Disentangling Argument. It is often assumed that the only reason an outsider could not master the extension of ‘chaste’ must be that the truth-conditions associated with ‘chaste’ incorporate something evaluative, which the outsider cannot track. Moreover, the shapelessness hypothesis states that the extensions of thick terms are only unified by evaluative similarity relations. This suggests that something evaluative must obtain for utterances involving thick terms to express true propositions.

Nevertheless, it is controversial that thick terms are evaluative as a matter of truth-conditions. Generally, ethicists agree that thick terms are somehow associated with evaluative contents, but not all agree that these contents are part of the truth-conditions of utterances involving thick terms. How else can a thick term be associated with evaluative content, if not by way of truth-conditions?

Our use of language can communicate lots of information that is not part of the truth-conditions of what we say. In each of the following cases, a speaker B communicates a proposition that is not part of the truth-conditions of B’s utterance. In this first example, the proposition is communicated by way of presupposition:

B: “I don’t regret going to the party.”

Presupposition: that B went to the party.

Plausibly, B’s utterance could express a true proposition even if its presupposition is false: one way to have no regrets about going to a party is by simply not going. This presupposition can plausibly be a part of the background of the conversation at hand, but not part of the truth-conditions of B’s utterance.

Now consider a slightly different example, involving a phone conversation between A and B. In this case, B communicates a proposition by way of conversational implicature:

A: “Is Bob there?”

B: “He’s in the shower.”

Conversational Implicature: that Bob cannot talk on the phone right now.

This proposition is not part of the truth-conditions of B’s utterance—it is obviously possible that Bob can talk on the phone while in the shower. Instead, this proposition is inferred from B’s utterance by relying on conversational maxims and observations from context (for example, that A and B are having a phone conversation, and that B would not provide irrelevant information about Bob’s showering unless he is trying to convey that Bob cannot talk).

Now consider a third example, where B communicates a proposition by way of conventional implicature:

B: “Sue is British but brave.”

Conventional Implicature: that Sue’s bravery is unexpected given that she is British.

The proposition communicated in this example is not part of the truth-conditions of B’s utterance. One way to see this is by comparing B’s utterance with “Sue is British and brave.” These two utterances would seem to be true in all the same circumstances. But the latter does not communicate the implicature in question. This implicature is detachable, in the sense that a truth-conditionally equivalent statement need not have the implicature in question. Although Hare does not mention conventional implicature, his view about the detachability of a thick term’s evaluation could be explained in terms of conventional implicature.

In short, there are many ways to communicate information without it being part of the truth-conditions of the utterance. There are three widely-discussed pragmatic mechanisms—presupposition, conversational implicature, and conventional implicature—and there are others as well. Some hold that utterances involving thick terms do not convey evaluations that are part of their truth-conditions, but instead convey them via some pragmatic mechanism. This view is known as the Pragmatic View. It should be noted that some proponents of the Pragmatic View write as though their view entails that there are no thick concepts (Blackburn 1992). In other words, if the only evaluation associated with courage is pragmatically associated with it, then these philosophers will say that courage is not really a thick concept. Still, others who accept the Pragmatic View are happy to talk of these concepts as thick (Väyrynen 2013). This article does so as well.

The traditional view, however, is that thick terms are evaluative as a matter of their truth-conditions; this view is known as the Semantic View. The following two sections discuss the Pragmatic and Semantic Views, respectively.

a. Pragmatic View

Sometimes the Pragmatic View is supported by the idea that thick terms are variable in what evaluations they express. Typically, a given thick term conveys a particular evaluation that is either positive or negative, but not both. It is natural to assume that the term conveys this evaluation (whichever it is) in all assertive contexts. But it turns out that many paradigmatic thick terms can be used to evaluate something negatively in some contexts while positively in others. There are two ways of illustrating this variability.

The first involves combining a thick term with comparative constructions, such as ‘too’ or ‘not…enough’. For example, ‘lewd’ is typically negative, but it appears to convey something positive in the following quote: “this year’s carnival was not lewd enough” (Blackburn 1992: 296). Similarly, ‘tidy’ is typically positive, but can be used to convey something negative if one is criticized as “too tidy” (Hare 1952: 121). These considerations may be taken to show that thick terms are not evaluative as a matter of truth-conditions—if these thick terms have an evaluation as part of their truth-conditions, one might think ‘not lewd enough’ and ‘too tidy’ should be semantically awkward, but they are not.

Opponents of this argument may claim that the atypical evaluation in each case can be explained solely by reference to ‘too’ and ‘not…enough’, without claiming that ‘lewd’ and ‘tidy’ express an atypical evaluation. Consider that ‘too F’ and ‘not F enough’ are evaluative even when F is a wholly non-evaluative expression; for example, one might say that a color sample is too red, which seems to characterize the sample negatively. Here, the negative evaluation is solely because of ‘too’, and so it should be no surprise if this word generates a negative evaluation when combined with ‘tidy’. Furthermore, there are contexts where F is clearly seen as a positive quality even when it is combined with ‘too’. Borrowing an example from Väyrynen, a military commander could count a soldier as too courageous to waste on a simple mission, and instead select him for a more formidable mission where his courage would be needed (2011: 7). Thus, it appears the atypical evaluation can be attributed to the modifiers ‘too’ and ‘not…enough’ rather than the thick term itself.

The second sort of example pertains to utterances that convey an atypical evaluation without employing a comparative construction. For example, even though ‘cruel’ typically conveys a negative evaluation, it might be that the cruelty of an action was “just what made it such fun” (Hare 1981: 73). Or, even though ‘frugal’ typically conveys a positive evaluation, a person could be condemned as frugal if his “main job is dispensing hospitality” (Blackburn 1992: 286).

A worry associated with examples of this second sort is that the atypical evaluation can be explained in ways that are consistent with the Semantic View. For instance, it might be claimed that the examples involve non-literal uses of the thick term, or that they only convey the alternative evaluation by way of speaker meaning, and not word meaning. Alternatively, one could hold that thick terms are context sensitive, and that there are several different evaluations conveyed by the thick term depending on the context of utterance. In this case, the thick term would be evaluative as a matter of truth-conditions—it is just that those truth-conditions incorporate different evaluations in different contexts (Väyrynen 2011: 8-14).

Another argument for the Pragmatic View comes from Pekka Väyrynen (2013), who focuses on objectionable thick terms. Recall that objectionable thick terms embody values that ought to be rejected. Potential examples include ‘lewd’, ‘perverse’, ‘blasphemous’, and ‘chaste’. The last of these terms seems to embody the view that a certain kind of sexual restraint is praiseworthy. Those who reject this view regard ‘chaste’ as objectionable—one can refer to such individuals as chastity-objectors. Chastity-objectors tend to exhibit interesting linguistic behavior. They would obviously be reluctant to assert that, say, John is chaste; but they are also reluctant to utter non-affirmative sentences like the following:

(a) John is not chaste.

(b) Is John chaste?

(c) Possibly, John is chaste.

(d) If John is chaste, then so is Mary.

None of these utterances imply that the truth-conditions of ‘chaste’ are satisfied. So, if chastity-objectors are reluctant to utter (a-d), this leads us to expect that the evaluation projects outside of the truth-conditions of ‘chaste’. It is worth noting that this argument is not restricted to examples like ‘chaste’, ‘perverse’, ‘lewd’, and ‘blasphemous’. According to Väyrynen, virtually any thick term could be regarded as objectionable, at least in principle, which means that his argument should extend even to examples like ‘courageous’ and ‘murder’.

In addition to arguing against the Semantic View, proponents of the Pragmatic View need to explain what pragmatic mechanism is responsible for the evaluations of thick terms. The three pragmatic mechanisms cited above can provide potential explanations, but Väyrynen rejects these explanations in favor of an alternative view. He proposes that the evaluative implications of paradigmatic thick terms are “not-at-issue” in normal contexts. Roughly, an implication is at-issue if it is part of the main point of the conversation at hand, and it is not-at-issue if it is part of the background (2013: ch. 5). Väyrynen takes this pragmatic view to be “superior to its rivals by standard methodological principles” from the philosophy of language and linguistics (2013: 10).

Proponents of the Semantic View can provide at least two lines of response to Väyrynen’s argument involving objectionable thick terms. For the first, it is important to note that Väyrynen explains the objector’s reluctance by holding that (a-d) all project the same evaluation beyond the truth-conditions of ‘chaste’. But one might hold that there is no single evaluation projected by all of (a-d)—instead, there are at least two different claims implied throughout (a-d). For example, just as ‘not happy’ conversationally implicates ‘unhappy’, it is equally plausible that ‘not chaste’ conversationally implicates ‘unchaste’. Since chastity-objectors clearly do not want to imply that John is unchaste, they are reluctant to assert (a). Moreover, (b-d) conversationally imply (or assert) that John might be chaste. If chastity-objectors believe it is impossible for anyone to be chaste, then they will be reluctant to assert (b-d). This piecemeal approach calls into doubt Väyrynen’s claim that a single evaluation projects beyond the truth-conditions of ‘chaste’. Moreover, these ways of explaining the reluctance of chastity-objectors are perfectly consistent with the Semantic View (Kyle 2013a: 13-19).

For a second response, it can be pointed out that even Väyrynen’s preferred explanation is consistent with the Semantic View (Kyle 2015). The mere fact that an evaluation projects outside of the truth-conditions of ‘chaste’ does not entail that there is no evaluation within those truth-conditions. For example, it is possible that ‘John is chaste’ conveys two evaluations, one that is part of its truth-conditions, and another that projects outside of them. This possibility can be illustrated with the affirmative sentence ‘It is good that Sue is moral’, which has an evaluative content within its truth-conditions and also projects one outside those truth-conditions. Consider the corresponding non-affirmative sentences:

(a′) It is not good that Sue is moral.

(b′) Is it good that Sue is moral?

(c′) Possibly, it’s good that Sue is moral.

(d′) If it is good that Sue is moral, then we should applaud her.

An evaluative content—that Sue is moral—is implied by each of (a′-d′), as well as the affirmative sentence. So this evaluation projects much like the evaluation that Väyrynen thinks is projected by ‘chaste’. But none of this precludes the affirmative statement from having a different evaluation as part of its truth-conditions, namely the evaluation associated with ‘good’. Of course, this doubling of evaluation will have no purchase unless there is reason to think there is an evaluation within the truth-conditions of ‘chaste.’

b. Semantic View

What reason is there to think the evaluations of thick terms might be part of their truth-conditions? One potential reason stems from considering additional linguistic data. Notice that the following claim seems highly awkward:

(e) Sue is generous and not good in any way.

Similar statements can be provided using negative thick terms and ‘not bad in any way’. The Semantic View provides a straightforward explanation of the awkwardness of (e). This view can claim that (e) is a contradiction, assuming goodness-in-a-way is a part of the truth-conditions associated with ‘generous’ (Kyle 2013a). This, of course, is only one potential explanation—there may be other ways of explaining the awkwardness of (e), for example, by claiming that ‘generous’ presupposes or conventionally implicates an evaluation that incorporates goodness-in-a-way. Just as before, the issue must be decided by figuring out which view is the best explanation of this and other linguistic data. The matter is up for debate.

Still, one might object that the Semantic View is ill-suited to explain the oddity of (e), since this view mistakenly predicts that (e) would sound odd in every context, yet there are some unusual contexts in which (e) would not sound odd. For example, imagine Ebenezer Scrooge uttering (e) in a context where generosity is seen as a bad thing. (e) might not seem awkward in this context. However, it is a mistake to think that the Semantic View predicts that (e) is awkward in all contexts. Consider that its second conjunct involves a quantifier expression—‘any’—and quantifiers are notoriously context-sensitive. In one context, it might be true to say ‘O.J. Simpson is not good in any way’; but in other contexts, where being a good athlete is relevant to discussion, an utterance of the same sentence could be false. Similarly, the second conjunct in (e) is true or false relative to context. And it is only in contexts where generosity is a relevant way of being good that (e) should sound contradictory. In contexts where generosity is not a relevant way of being good—such as Scrooge’s context—(e) should not sound contradictory. In those contexts, the first part of (e) could be true while the second part is false (Kyle 2013a).

It is worth emphasizing that the linguistic data about thick terms does not strictly entail the Semantic View, or the Pragmatic View. Rather, the proponents of such views only claim that their respective view is part of the best explanation of a wide body of linguistic data involving thick terms. This matter has only been explored in recent years, with proponents on each side (Kyle 2013a; Väyrynen 2013).

Another way of supporting the Semantic View stems from the shapelessness hypothesis, that the extensions of thick terms are only unified by evaluative similarity relations. If the shapelessness hypothesis is true, then the truth-conditions associated with thick terms must be at least partly evaluative. But what reason is there to accept the shapelessness hypothesis?

The main support for shapelessness comes from the idea that thick terms seem to “outrun” any descriptive characterizations we can give to the items in their extensions (Kirchin 2010). Consider the various types of action that can be considered kind—shoveling snow for a neighbor, giving chocolate to a child, adopting a stray cat, standing up to someone’s bully, paying a compliment to a friend, and so on. Furthermore, there are some actions that would be considered kind in some circumstances even though the opposite action would be considered kind in other circumstances—for example, telling the truth is sometimes kind but so is telling a white lie. Can these various actions be descriptively classified in a way that allows us to correctly characterize kind actions in new cases? The descriptive classification might be a long disjunction of unrelated features, a shapeless classification that would be unhelpful in confronting new cases.

One way of opposing this argument is to show that the various actions mentioned above can be unified under a shapely descriptive classification. For example, each of the above actions seems to benefit others by treating them as ends in themselves. This shapely classification helps us classify at least some new cases of kind action (for example, giving food to a homeless person). And benefit can be understood in purely descriptive terms—for example, as the increasing of happiness. So, it is not obvious that shapelessness is actually supported by the outrunning data given above, although other data could perhaps be provided.

Suppose that thick terms do outrun our ability to provide wholly descriptive characterizations of their extensions. Väyrynen argues that this does not support the Semantic View, because our inability to give a descriptive classification for a thick term can be explained even if the Semantic View is false. Consider that, for some terms T, the extension of T cannot be unified under relations that are expressible in independently intelligible T-free terms (that is, terms that can be understood independently of T). For example, it might be that the extension of ‘pain’ across all sentient animals cannot be unified without employing the word ‘pain’ itself. Now return to the question of what explains our inability to give a wholly descriptive characterization of the extension of ‘kind’. It might be that the only way to characterize its extension is by employing the word ‘kind’ itself. But this would not be a descriptive classification, if the Pragmatic View were true of ‘kind’. It is just that the evaluative-ness of ‘kind’ would be explained by a pragmatic mechanism, rather than its truth-conditions. So, if ‘kind’ is a term like T, then one could not give a wholly descriptive classification of the extension of ‘kind’, even if the Semantic View is false (2013: 193-201).

6. Broader Applications

Thick concepts have been an interest primarily among ethicists, although these concepts have made an entrance into discussions in other areas, such as aesthetics, metaphysics, philosophy of law, moral psychology, and epistemology. This section focuses mainly on epistemology’s recent discussions on thick concepts, since these discussions have been the most extensive (outside of ethics). But the discussions from the first four areas shall be briefly summarized.

In aesthetics, there is much discussion about thick aesthetic concepts, like gaudy, elegant, delicate, and brilliant. However, many of these discussions are centered on the question of whether there are any thick aesthetic concepts at all. In this context, it is often assumed that an aesthetic concept is not thick if it is only pragmatically associated with evaluative content (recall that this assumption is sometimes made in ethics as well). So, the discussions over whether there are any thick aesthetic concepts often mirror the discussions in ethics on whether thick concepts are only pragmatically evaluative, or whether they are evaluative as a matter of truth-conditions (Bronzon 2009; Zangwill 2001). It is worth noting that Burton’s Reductive View (discussed in 3a) is explicitly aimed at accounting for thick aesthetic concepts, as well as ethical ones.

In metaphysics, Gideon Yaffe criticizes two competing views on the nature of freedom of will—one that equates freedom of will with self-expression and one that equates it with self-transcendence. Yaffe then holds that the debates between these approaches have proceeded from the (at least implicit) assumption that freedom of will is a descriptive concept. He argues that there are facts about freedom of will that are best explained if freedom of will is instead assumed to be thick. According to Yaffe, the descriptive content of this concept would correspond to the features that make the agent either self-expressive or self-transcendent. But, according to Yaffe, this is not enough to account for freedom of will. We must also determine whether “the agent has choices that come about through worthwhile processes, processes possessing a certain kind of value” (2000: 219-20). And it is this kind of value that corresponds to the evaluative content of freedom of will.

In the philosophy of law, David Enoch and Kevin Toh (2013) point out that legal statements often straddle the divide between the descriptive and the evaluative. They put forth the hypothesis that many legal statements express thick concepts. Potential examples of such thick concepts may include crime, constitutional, inheritance, and infringement, though Enoch and Toh focus primarily on the concept legal, which they argue is a thick concept. The descriptive content of legal consists in its representation of certain social facts, and its evaluative content is a kind of endorsement. They do not assert that the thickness of legal can by itself settle debates over the nature of law. But they hold that its classification as thick can situate these debates within a broader philosophical context analogous to that of ethics, and can introduce new options for thinking about the nature of law.

Thick concepts also play a role in Gabriel Abend’s critique of current moral psychology and neuroscience. Abend argues that moral psychologists and neuroscientists unwarrantedly restrict their research to thin ethical concepts, but ignore thick ones. In particular, these scientists attempt to understand the psychological or neural bases for moral judgments, and they do so by testing various subjects’ judgments involving thin concepts like right and permissible. But judgments involving thick ethical concepts, like cruel and courageous, have scarcely been featured in these experiments. This is no small oversight, given “that thick concepts appear in some or much of people’s moral lives” (2011: 150). Abend also argues that this problem cannot be fixed merely by expanding out psychological and neurological research to include thick ethical judgments, because thick concepts “challenge the conception of a hardwired and universal moral capacity in a way that thin concepts do not” (2011: 145-6). In advancing this last point, Abend relies on the Disentangling Argument and the shapelessness hypothesis, as well as the claim that thick concepts presuppose institutional and cultural facts that do not hold universally.

Outside of ethics, the most extensive discussions on thick concepts occur in epistemology. In 2008, a special issue of the journal Philosophical Papers (vol. 37 no. 3) was devoted to thick concepts in epistemology. Examples of thick epistemic concepts include concepts like intellectual curiosity, truthfulness, open-mindedness, and dogmatic; these are contrasted with thin epistemic concepts, which are typically illustrated with concepts like justification, rationality, and knowledge. The editors of this issue hold that “traditional epistemology has tended towards using the thin concepts in theorizing,” but “these thin epistemic concepts are far less prevalent in everyday discourse than the thick epistemic” (2008: 342). The overarching question of this collection is whether epistemology would benefit from substantive investigations of thick epistemic concepts.

One way of addressing this question is simply to do a substantive investigation of a particular thick epistemic concept, and to show that epistemological theories are enhanced by the investigation. Two contributors to the Philosophical Papers collection take this approach. Catherine Elgin focuses on the concept trustworthy. She argues that trustworthiness does not reduce to justified or reliable true belief, but can help to explain why justified or reliable true beliefs are valuable (2008: 371-87). Harvey Siegel considers whether education is an epistemic virtue concept, and whether it makes sense to classify it as thick. Siegel is skeptical about the helpfulness of the thick/thin distinction, but, to the extent that this distinction is viable, he maintains that education is “more thick than thin.” He also seeks to clarify the relationship between education and virtue epistemology (2008: 467).

The other contributors focus on general issues concerning thick epistemic concepts. Heather Battaly’s contribution seeks to address an objection advanced by Simon Blackburn against Non-Reductive Views. According to Blackburn, Non-Reductive Views mistakenly imply that the differences in how we respond to, say, lewdness would not count as genuine disagreements, because the disputants would be employing different concepts and therefore talking past one another. For example, a person who thinks the carnival was not lewd enough might be employing a different concept from someone who disvalues lewdness, because there would be “no detachable description, no ‘semantic anchor,’ that they can share” (1992: 297-99). (This is an expanded version of the variability objection discussed in section 5a). Battaly responds by arguing that certain thick epistemic concepts, such as open-minded, are subject to combinatorial vagueness—these concepts have several independent conditions of application, but there is no sharp distinction between the conditions that are necessary or sufficient and those conditions that are neither. Battaly holds that disputants can share the same vague concept, while disagreeing over which conditions are necessary and/or sufficient. Battaly maintains that this allows them to have genuine disagreements about whether the concept refers to an epistemic virtue. According to Battaly, this at least shows that virtue epistemologists can disagree about the epistemic virtues without talking past one another (2008: 435-54).

Väyrynen’s contribution focuses on whether thick and thin epistemic concepts can be distinguished in ways comparable to thick and thin ethical concepts, and on whether a focus on thick epistemic concepts can lead to a preferable epistemology. Regarding the first issue, he argues that the way thick and thin concepts are typically distinguished in ethics provides no straightforward distinction between thick and thin epistemic concepts (2008: 390-95). Regarding the second, he argues that neither semantics nor substantive epistemological theory provides a basis for assigning thick epistemic concepts theoretical priority over thin epistemic concepts (2008: 395-408). Väyrynen concludes by claiming that we so far lack good reasons for taking a theoretical turn to a thick epistemology.

Despite Väyrynen’s final conclusion, the Philosophical Papers collection contains an explicit defense of the view that epistemology should expand its focus to thick epistemic concepts. Bernard Williams pushed for a similar expansion in the ethical sphere. And the comparison here brings up an important question, which is the main question of Alan Thomas’ contribution: Can Williams’ treatment of thick ethical concepts be applied analogously in the epistemic sphere? Recall that Williams holds that utterances involving thick ethical concepts cannot be objectively true. Thus, if epistemologists want to model their theory of thick epistemic concepts after Williams’ view in ethics, it appears they will not be able to claim that there are objectively true claims involving thick epistemic concepts. Thomas, however, points out that Williams’ non-objectivism in ethics “is based on the assumption that there are a variety of social worlds, structured by plural sets of thick ethical concepts.” But Williams’ view of thick epistemic concepts, such as truthfulness, allows for the possibility of “only one epistemic world.” According to Thomas, truthfulness is “such a central need of human life that it can be abstractly modeled in a way that… [is] culturally invariant” (2008: 368).

Guy Axtell and J. Adam Carter focus on outlining a positive account of how thick epistemic concepts could play a central role in epistemological theory. The account begins by claiming that epistemic value should be a central focus in epistemology, and that not all epistemic values can be reduced to the value of truth (or to some other single epistemic good). Other values, such as open-mindedness, “can be useful in articulating our epistemic aims,” even if they cannot be so reduced (2008: 418). Axtell and Carter also reject Thin Centralism in the epistemic sphere—the view that “general concepts like ‘justified’ and ‘ought’ are logically prior to and independent of specific reason-giving thick epistemic concepts of virtue and vice” (2008: 418). In the epistemic sphere, ‘justified’ and ‘ought’ are primarily used to evaluate beliefs, but the rejection of Thin Centralism allows that there could be fundamental ways of evaluating agents with virtue and vice concepts, and that these evaluations may not be reducible to belief evaluations. The above tenets open up the possibility of what Axtell and Carter call a “second-wave of virtue epistemology.” This contrasts with the “first-wave,” which takes thick epistemic concepts to be theoretically important primarily because they play a role in analyzing knowledge. On the second wave of virtue epistemology, thick epistemic concepts are “a subject for research in their own right, apart from whatever role they might have in explaining knowledge” (2008: 427).

Many authors in the Philosophical Papers collection take knowledge to be a paradigmatic example of a thin epistemic concept (Battaly 2008: 435; Axtell and Carter 2008: 427; Thomas 2008: 363; Väyrynen 2008: 392; Kotzee and Wanderer 2008: 339). It is not immediately clear whether their arguments rely substantively on this assumption, but the assumption has been contested in a separate context. Brent Kyle argues that knowledge is actually a thick concept. According to Kyle, knowledge is best accounted for as a close relation between a descriptive content—true belief—and an evaluative content—justification. If successful, this argument would establish that traditional epistemology has already focused on at least one thick concept—namely knowledge. But Kyle’s main goal is not to defend the traditional focus of epistemological theories. Instead, he aims to argue that the thickness of knowledge can explain why the Gettier Problem arises. He does so by arguing that the Gettier Problem is a specific instance of a general problem about analyzing thick concepts. It is worth noting that his argument takes no stand on whether thick concepts can be analyzed, or on whether the Gettier Problem is resolvable (Kyle 2013b).

Generally speaking, thick concepts have become a source of optimism for many philosophers who find traditional research within normative disciplines to be myopic, stagnant, or misdirected. Nevertheless, it is still a matter of debate whether a plausible theory of thick concepts actually has the implications typically hoped for. In particular, the literature on thick concepts still contains lively debates regarding fundamental issues such as the Disentangling Argument, the shapelessness hypothesis, non-reductivism, and the Semantic View. And if proponents of the significance of thick concepts make assumptions regarding these controversial issues, then their views will be met with significant opposition, at least until these issues are resolved. But, on the flip side, if opposing theorists account for all normativity with thin concepts, and take these concepts to be non-factual, they too will meet significant opposition. The recent debates about thick concepts are largely responsible for this. Ultimately, whatever approach one takes to these fundamental issues, it is clear that theories of value and normativity cannot be complete unless they give some attention to the thick.

7. References and Further Reading

  • Abend, G.  2011, “Thick Concepts and the Moral Brain,” European Journal of Sociology 52, 143-72.
  • Anscombe, E.  1958, “Modern Moral Philosophy,” Philosophy 33, 1-19.
  • Axtell, G. and A. Carter, 2008, “Just the Right Thickness: A Defense of Second-Wave Virtue Epistemology,” Philosophical Papers 37, 413–434.
  • Ayer, A.J.  1946, Language, Truth, and Logic, Dover, New York.
  • Battaly, H.  2008, “Metaethics Meets Virtue Epistemology: Salvaging Disagreement about the Epistemically Thick,” Philosophical Papers 37, 435–454.
  • Blackburn, S.  1992, “Through Thick and Thin,” Proceedings of the Aristotelian Society, suppl. vol. 66, 285-99.
  • Bronzon, R.  2009, “Thick Aesthetic Concepts,” The Journal of Aesthetics and Art Criticism 67:2, 191-99.
  • Burton, S.  1992, “‘Thick’ Concepts Revisited,” Analysis 52, 28-32.
  • Dancy, J.  2013, “Practical Concepts,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Dancy, J.  1995, “In Defense of Thick Concepts,” Midwest Studies in Philosophy 20, 263-79.
  • Elgin, C.  2008, “Trustworthiness,” Philosophical Papers 37, 371-87.
  • Elstein, D. and T. Hurka, 2009, “From Thick to Thin: Two Moral Reduction Plans,” The Canadian Journal of Philosophy 39, 515-36.
  • Enoch, D. and K. Toh 2013, “Legal as a Thick Concept,” in W.J. Waluchow and S. Sciaraffa (eds.), Philosophical Foundations of the Nature of Law, Oxford University Press, Oxford.
  • Foot, P.  1958, “Moral Arguments,” Mind 67, 502-13.
  • Geertz, C.  1973, “Thick Description: Toward an Interpretive Theory of Culture,” The Interpretation of Cultures: Selected Essays, Basic Books, New York, 3-30.
  • Gibbard, A.  1992, “Thick Concepts and Warrant for Feelings,” Proceedings of the Aristotelian Society, suppl. vol. 66, 267-83.
  • Harcourt E. and A. Thomas, 2013, “Thick Concepts, Analysis, and Reductionism,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Hare, R.M.  1952, The Language of Morals, Oxford University Press, Oxford.
  • Hare, R.M.  1963, Freedom and Reason, Oxford University Press, Oxford.
  • Hare, R.M.  1989, Essays in Ethical Theory, Oxford University Press, Oxford.
  • Hare, R.M.  1997, Sorting Out Ethics, Clarendon Press, Oxford.
  • Hurka, T.  2011, “Common Themes from Sidgwick to Ewing,” in T. Hurka (ed.) Underivative Duty, Oxford University Press, Oxford, 6-25.
  • Hurley, S.  1989, Natural Reasons, Oxford: Oxford University Press.
  • Hursthouse, R. 1991, “Virtue Theory and Abortion,” Philosophy and Public Affairs 20, 223-46.
  • Hursthouse, R.  1996, “Normative Virtue Ethics,” in Roger Crisp (ed.) How Should One Live?  Oxford University Press, Oxford, 19-36.
  • Hursthouse, R. 1999, On Virtue Ethics, Oxford University Press, Oxford.
  • Jackson, F.  1998, From Metaphysics to Ethics: A Defence of Conceptual Analysis, Oxford University Press, Oxford.
  • Kirchin, S.  2010, “The Shapelessness Hypothesis,” Philosophers’ Imprint 10, 1-28.
  • Kotzee, B. and J. Wanderer, 2008, “Introduction: A Thicker Epistemology?” Philosophical Papers 37, 337-343.
  • Kyle, B.  2013a, “How Are Thick Terms Evaluative?” Philosophers’ Imprint 13, 1-20.
  • Kyle, B.  2013b, “Knowledge as a Thick Concept: Explaining Why the Gettier Problem Arises,” Philosophical Studies 165, 1-27.
  • Kyle, B. 2015, “Review of ‘The Lewd, the Rude, and the Nasty: A Study of Thick Concepts in Ethics’ by Pekka Väyrynen,” The Philosophical Quarterly 65, 576-82.
  • McDowell, J.  1981, “Non-Cognitivism and Rule-Following,” in S. Holtzman and C. Leich (eds.) Wittgenstein: To Follow a Rule, Routledge, London.
  • McDowell, J. 1998, “Aesthetic Value, Objectivity, and the Fabric of the World,” in Mind, Value, and Reality, Harvard University Press, Cambridge, Mass.
  • Platts, M.  1988, “Moral Reality,” in G. Sayre-McCord (ed.) Essays on Moral Realism, Cornell University Press, Ithaca, NY.
  • Putnam, H. 1990, “Objectivity and the Science/Ethics Distinction,” in J. Conant (ed.) Realism with a Human Face, Harvard University Press, Cambridge, Mass.
  • Roberts, D.  2011, “Shapelessness and the Thick,” Ethics 121, 489-520.
  • Roberts, D.  2013, “It’s Evaluation, Only Thicker,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Ryle, G.  1971, “The Thinking of Thoughts: What is ‘Le Penseur’ Doing?” in Collected Papers 2, Routledge, London, 480-83.
  • Scheffler, S.  1987, “Morality through Thick and Thin: A Critical Notice of Ethics and the Limits of Philosophy,” Philosophical Review 96, 411-34.
  • Siegel, H.  2008, “Is ‘Education’ a Thick Epistemic Concept?”  Philosophical Papers 37, 455-469.
  • Tappolet, C.  2004, “Through Thick and Thin: Good and Its Determinates,” Dialectica 58, 207-21.
  • Thomas, A.  2008, “The Genealogy of Epistemic Virtue Concepts,” Philosophical Papers 37, 345–369.
  • Väyrynen, P.  2008, “Slim Epistemology with a Thick Skin,” Philosophical Papers 37, 389–412.
  • Väyrynen, P. 2011, “Thick Concepts and Variability,” Philosophers’ Imprint 11, 1-17.
  • Väyrynen, P.  2013, The Lewd, the Rude, and the Nasty: A Study of Thick Concepts in Ethics, Oxford University Press, Oxford.
  • Williams, B.  1985, Ethics and the Limits of Philosophy, Harvard University Press, Cambridge, Mass.
  • Williams, B.  1993, “Who Needs Ethical Knowledge?” in A Griffiths, Royal Institute of Philosophy, suppl. vol. 35, 213-22.
  • Williams, B.  1995, “Truth in Ethics,” Ratio 8(3), 227-42.
  • Yaffe, G.  2000, “Free Will and Agency at Its Best,” Philosophical Perspectives 14, 203-29.
  • Zangwill, N.  2001, “The Beautiful, the Dainty, and the Dumpy,” in Metaphysics of Beauty, Cornell University Press, Ithaca, NY.

 

Author Information

Brent G. Kyle
Email: brent.kyle@usafa.edu
United States Air Force Academy
U. S. A.

Desiderius Erasmus (1468?—1536)

Desiderius Erasmus was one of the leading activists and thinkers of the European Renaissance. His main activity was to write letters to the leading statesmen, humanists, printers, and theologians of the first three and a half decades of the sixteenth century. Erasmus was an indefatigable correspondent, controversialist, self-publicist, satirist, translator, commentator, editor, and provocateur of Renaissance culture. He was perhaps above all renowned and repudiated for his work on the Christian New Testament. He was not a systematic thinker, and he did not found a system or school of philosophy. In fact, his profound contempt for the scholastic philosophers of the Middle Ages and Renaissance puts him at odds with the institution of philosophy. Perhaps Erasmus’ most important role in the history of philosophy is to have challenged and expanded the disciplinary boundaries of the field. He did so by propounding his philosophy of Christ, which displays some affinities for prior traditions including Platonism and Epicureanism, but which depends primarily on the understanding that philosophy is not an exclusive university discipline, but rather a moral obligation incumbent upon all believers. In this context he founded an ethics of speech to guide himself and others to what he regarded as the true love of wisdom.

Table of Contents

  1. Life and Works
  2. Erasmus and Philosophy
    1. Humanism
    2. Platonism
    3. Philosopher Kings
    4. Paradox
    5. Epicureanism
  3. The Word
    1. Humanist Theology
    2. The Ethics of Speech
  4. A Controversial Legacy
    1. Canon Formation
    2. Censorship
    3. Scholarship
  5. References and Further Reading
    1. Editions
    2. Studies

1. Life and Works

Erasmus was born in the city of Rotterdam in the late 1460s and was educated by the Brethren of the Common Life, first at Deventer and then at ’s-Hertogenbosch. Orphaned at an early age, he took monastic vows and entered the Augustinian monastery at Steyn in 1486. In 1492 he was ordained a priest and in 1493 he entered the service of Hendrik van Bergen, the Bishop of Cambrai, who had just been named chancellor of the order of the Golden Fleece by the court of Burgundy. Service as secretary to an ambitious prelate delivered Erasmus from the tedium of monastic life and offered the prospect of travel and advancement. When the bishop’s career stalled, Erasmus left to study theology at the University of Paris in 1495, where he remained long enough to contract a lifelong aversion to the professional study of theology and its addiction to dialectic.

It was in Paris that Erasmus became attached to his first important patron, William Blount, Lord Mountjoy, whom he accompanied to England as tutor in 1499. This first English sojourn, though brief, proved crucial to Erasmus’ subsequent career since it was during this visit that he became acquainted with Thomas More and John Colet, founder of St. Paul’s School in London. Erasmus returned to Paris in 1500 to publish his first collection of proverbs, the Adagiorum Collectanea, whose dedicatory epistle, addressed to Mountjoy, remains a crucial statement of Erasmian poetics. After further itineracy in France and the Low Countries, he returned to England, where he was the guest of Thomas More, with whom he collaborated on a translation of selected dialogues by Lucian of Samosata. He embarked in 1506 on a long-awaited voyage to Italy. In Venice, Erasmus worked with the humanist printer Aldus Manutius to publish the first great collection of adages, the Adagiorum Chiliades, in 1508. It was completed with the generous collaboration of numerous Italian humanists, as gratefully recorded in the adage Festina lente. From Italy, he went back to England, where he stayed long enough to compose the Praise of Folly (1511) and several educational writings including the De ratione studii of 1511, a preliminary version of his manual on letter writing De conscribendis epistolis, which was not published until 1522, and the completed version of De copia or On abundance in style (1512).

Having returned to the European continent in 1514, Erasmus began his association with the Swiss printer Johann Froben, for whom he prepared an expanded version of the adages in 1515. The following year brought forth from the Froben press (of Basel, Switzerland) the two works which Erasmus regarded as the twin masterpieces of his career. First came the Novum Instrumentum consisting of the Greek text and Erasmus’ own Latin translation of the New Testament, with Erasmus’ annotations keyed to the Latin text of the Vulgate. Next came, in nine volumes, the complete works of Saint Jerome, including four volumes of Jerome’s letters, edited by Erasmus himself and dedicated to William Warham, the Archbishop of Canterbury. In the beginning of Volume One stands Erasmus’ Life of Jerome. In 1517 Erasmus took up residence in Louvain. There he quickly became embroiled in a controversy with the faculty of theology at the university, over the role of the three languages–Greek, Latin, and Hebrew–in the study of theology. This was upon the immediate occasion of the foundation of a trilingual college in Louvain from the bequest of Erasmus’ friend Jérôme de Busleyden. Erasmus championed humanist theology, based on study of ancient languages, against the reactionary stance of the Louvain theologians who were intent on preserving their professional prerogatives. At the same time Erasmus launched another important scholarly venture, the Paraphrases on the New Testament, starting with the Epistle to the Romans in 1517.

In 1521, Erasmus moved to Basel where he collaborated closely with the Froben press on a succession of expanded editions of the Adages while continuing the Paraphrases on the New Testament. As the decade wore on Erasmus became involved in a reluctant and debilitating quarrel with Martin Luther over the competing doctrines of free will and predestination. Erasmus published his Diatribe on Free Will in 1524, to which Luther answered in 1525 with his treatise The Enslaved Will, which elicited from Erasmus the Hyperaspistes or Shieldbearer issued in two parts in 1526 and 1527. From this quarrel, Richard Popkin dated the advent of modern skepticism in his authoritative History of Scepticism. Having alienated many Catholic clerics with his trenchant criticism of Church hierarchy and Catholic devotion, Erasmus refused to join the Protestant reformers and found himself increasingly isolated as an advocate of Church unity through conciliation rather than persecution or reform. Towards the end of the decade, in the wake of the 1527 Sack of Rome, the Spanish Inquisition convened a conference in the city of Valladolid in order to deliberate on suspect passages in Erasmus’ work and to determine whether his books should be banned in Spain. Though the plague interrupted the Conference of Valladolid in August 1527 before it could reach a verdict, this did not deter Erasmus from composing a lengthy Apology addressed to the Spanish monks who had challenged his orthodoxy. The following year, 1528, Erasmus published his dialogue Ciceronianus, attacking the pagan instincts of Cicero’s strictest humanist disciples, thereby ensuring himself continued notoriety among European men of letters. Erasmus finally left Basel in 1529 when the city officially declared its allegiance to the Reform, and took up residence in the Catholic city of Freiburg.
In the few moments of leisure left to him by his interminable polemics and his voluminous correspondence, Erasmus composed his last masterpiece, his treatise on the rhetoric of preaching, entitled Ecclesiastes, which he completed and published in 1535. Erasmus’ travels came to an end on July 12, 1536 in Basel, where he had stopped on his way back to the Low Countries.

2. Erasmus and Philosophy

Scholarship has long recognized Erasmus’ problematic standing in the history of philosophy. For Craig Thompson, Erasmus cannot be called philosopher in the technical sense, since he disdained formal logic and metaphysics and cared only for moral philosophy. Similarly, John Monfasani reminds us that Erasmus never claimed to be a philosopher, was not trained as a philosopher, and wrote no explicit works of philosophy, although he repeatedly engaged in controversies that crossed the boundary from philosophy to theology. His relation to philosophy bears further scrutiny.

a. Humanism

To evaluate Erasmus’ relationship to philosophy, we have to understand his identity as a humanist. One of his earliest works, begun in his monastic youth, though not published until 1520, was the Antibarbari. It proposes a defense of the humanities, then essentially the study of classical languages and literature, against detractors who were scorned as barbarians. One of the key themes of the work is the vital role of classical culture in a Christian society, and this theme entails a redefinition of philosophy, in contrast to the prevailing university discipline of philosophy. Everything of value in pagan culture, insists the main interlocutor in the dialogue, was intended by Christ to enrich Christian society, and this includes philosophy, since Christ himself was “the very father of philosophy” (CWE 23:102; ASD I-1:121). The philosophy he fathered is the philosophy that Erasmus professed throughout his life and work, the philosophia Christi. This philosophy is not so much a set of dogmas as it is a way of life or an ethical commitment.

b. Platonism

One of Erasmus’ first published works and one of his most enduringly popular ones was the Enchiridion militis Christiani, or Manual of the Christian Soldier, of 1503. It was written for an anonymous friend at court who had asked Erasmus to compose for him a guide to life, or ratio vivendi, that would lead him to a state of mind worthy of Christ. The Enchiridion gained immediate notoriety for its repudiation of monasticism and its insistence that true piety consists not in outward ceremonies but inward conversion. In the course of events, these themes would become associated with the Protestant Reformation. The Enchiridion espouses a philosophy of duality, the duality of body and soul, letter and spirit, that is explicitly modeled on Platonism. The author deplores the fact that professional philosophers, obsessed with Aristotle, have banished Platonists and Pythagoreans from the classroom, and he cites approvingly St. Augustine’s preference of Plato over Aristotle. Of the two classical philosophers, Plato’s doctrines are closer to Christianity and his allegorical style is better suited to the exposition of scripture.

When the Enchiridion defines philosophy, it invokes a Socratic precedent. Socrates, in Plato’s dialogue Phaedo, said that philosophy was nothing other than meditation on death because it gradually alienates us from the material and corporeal concerns of life. Philosophy takes us out of the world, the mundus, and into Christ. This withdrawal from the world is not just for religious professionals, such as priests or monks, but for everyone in every walk of life. Philosophy is a spiritual state rather than a professional identity.

c. Philosopher Kings

Among the many Platonic topoi that appealed to Erasmus, none features more consistently in his work than the ideal of the philosopher king, first introduced in Plato’s Republic. Often this serves as an epideictic topos, as when Erasmus hails Charles V’s brother Ferdinand as a philosopher king in the conclusion to his treatise on Christian concord, De sarcienda ecclesiae concordia (ASD V-3:313), or when he similarly acclaims King Sigismund of Poland in a letter to a Polish correspondent (Ep. 2533). Yet, the topic also sponsors some provocative thinking on philosophy. When Erasmus published his Education of a Christian Prince in 1516, he dedicated the treatise to Prince Charles, King of Spain, who was soon to succeed his grandfather Maximilian as Holy Roman Emperor, under the title Charles V. The dedicatory epistle evokes the familiar Platonic claim that no republic will be fortunate until philosophers are kings or kings embrace philosophy. By philosophy, Erasmus understands not the Aristotelian physics and metaphysics that dominated the university curriculum, but rather that kind of philosophy that frees our mind from errors and vices, and demonstrates correct government on the model of the eternal power. This model, or aeterni numinis exemplar (ASD IV-1:133), is Christ, and the new philosopher king is a disciple of Christ.

Erasmus returns to the philosopher king in the text of his treatise on political philosophy when he again qualifies the meaning of “philosophy.” To be a philosopher does not mean to be skilled at dialectic or physics, but rather to prefer truth to illusion. In sum, to be a philosopher is the same as to be a Christian: “idem esse philosophum et esse Christianum” (ASD IV-1:145). This is the most compact and emphatic statement of Erasmus’ philosophy of Christ. His most mature and complete statement can be found in the Paraclesis, one of the forewords or prefaces he composed for his edition of the New Testament, also from 1516. In its opening lines, the Paraclesis exhorts all mortals to the holiest and most salubrious study of Christian philosophy, insisting that this type of wisdom can be learned from fewer books and with less effort than the arcane doctrines of Aristotle. The philosophy of Christ is a straight road open to all who are endowed with pure and simple faith, and not an exclusive discipline reserved for specialists. In this context, Erasmus adds a controversial endorsement of vernacular translations of the Bible, so that everyone can share in the message of Christ.

d. Paradox

As we have seen, Erasmus often defines philosophy in the negative: his philosophy is not what people conventionally understand as philosophy. He repudiates conventional philosophy as too contentious, too belligerent and dogmatic. He prefers instead to experiment with a non-assertive form of philosophy that relies on paradox and on the neutralizing force of opposing arguments. His best known experiment in extended paradox, and his best claim to permanence in school curriculum, is the Praise of Folly, first published in Paris in 1511, and accompanied in subsequent editions by a commentary attributed to Gerhard Lister but thought to have been dictated by Erasmus himself. Folly, or Moria, delivers her own encomium, proudly invoking the inspiration of the ancient Greek sophists and seemingly disqualifying her every claim, except perhaps for her satire of the clergy and the learned professions; this is followed by a deceptively earnest exposition of Pauline spirituality, where we detect the same stylistic devices and profusion of proverbs as in the rest of the text. Erasmus labeled his text a declamation, in the sense of a thesis meant to provoke a counter-thesis rather than to assert a dogma. After all, the speaker is Folly, a notoriously unreliable authority on all matters secular and religious and from whom it is no dishonor to differ in views. Rather, we should be ashamed to agree with her. By resorting to this subterfuge, rather than positively asserting his beliefs in his own name, Erasmus was able to intervene in a number of intellectual, political, and spiritual debates in contemporary Christendom without affirming or denying anything.

Though voiced by Folly herself, the critique of church and clergy contained in the Praise of Folly provoked a bitter resentment among doctors of theology, as we know from correspondence between Erasmus and the Louvain theologian Martin Dorp. In a letter dated September 1514, Dorp testifies to the hostile reaction of the professional theologians to the Moriae Encomium and their apprehension of Erasmus’ new project to edit the New Testament, of which word had already begun to circulate largely due to Erasmus’ own public relations campaign. Erasmus’ response to Dorp in epistle 337 recapitulates many of the themes of his life-long tirade against scholasticism. One of the key themes here is the stark contrast drawn between the recent style of theology exemplified by the scholastics, and the old style of theology associated with the Church Fathers, many of whose works Erasmus edited. The passage from old to new has hardly been an improvement. The upstarts or recentiores are so engrossed in their factional disputes and dialectical quibbles that they do not have time to read the Bible. Erasmus makes his case most succinctly when he asks, “What does Aristotle have to do with Christ?” In effect, he accuses the schoolmen of idolatry in their substitution of pagan for Christian authority. The letter to Dorp, which was revised for publication with the Praise of Folly beginning with the Basel edition of 1516, offers a powerful and insidious repudiation of university theology and philosophy.

e. Epicureanism

Erasmus thus had little enthusiasm for the various philosophic orthodoxies prevailing among the ancients or the moderns. However, fairly late in life, and only belatedly acknowledged by criticism, he did turn in sympathy to one of the Hellenistic schools of philosophy, namely Epicureanism. Erasmus never espoused Epicureanism as a comprehensive system of thought, and he could not endorse the central tenet of the mortality of the human soul. He was, however, attracted to the Epicurean ideal of peace of mind through the retreat from worldly cares and the cultivation of a clear moral conscience. As Reinier Leushuis writes, Erasmus’ very last familiar colloquy, The Epicurean, published in 1533, contains the startling claim that no one better deserves the name “Epicurean” than the revered prince of Christian philosophy, Christ himself (CWE 40:1086; ASD I-3:731). Christ teaches his followers how to attain a state of complete tranquility, and freedom from the torments of a guilty conscience, that corresponds to the Epicurean ideal of ataraxia. This ideal, it is worth noting, is not the same as Stoic apathy, which Erasmus carefully disassociates from Christianity in various places including the Ecclesiastes. He points out that apathy would defeat the purpose of the Christian preacher who tries to arouse the emotions of the audience. The Christian ought to combine compassion with a clear conscience in order to achieve tranquility. This understanding of Christianity has little in common with the sterner tenor of Protestant thought, and may explain why Martin Luther labeled Erasmus an Epicurean in the vulgar sense of an atheist or unbeliever. Finally, it may not be out of place here to point out that, through his mature, non-dogmatic embrace of Epicureanism, Erasmus shows some affinity for the late Renaissance prose writer Michel de Montaigne.

3. The Word

Speech, for Erasmus, is not only a defining attribute of humanity but also a key to the relation between humanity and divinity, which was a central preoccupation of his thought. The Gospel of John declares that in the beginning was the logos, which Erasmus famously, or infamously, retranslated as sermo in preference to the reading of the Vulgate, verbum. Erasmus felt justified in changing the reading of the Vulgate, but only in the second edition of his New Testament published in 1519, because the received text of scripture is not divine. The words are human, or rather, a mediation between the human and the divine. The Bible is speech, and as such it must be read, interpreted, and understood according to the arts of speech. Moreover, the arts of speech are the only means we have of approaching divinity. Speech brings man close to God.

a. Humanist Theology

It is not entirely clear under what circumstances Erasmus first took an interest in biblical scholarship. His meeting with John Colet in England is known to have kindled his interest. Another factor, and perhaps an independent factor, was his discovery, in the summer of 1504 at the Praemonstratensian abbey of Parc just outside the walls of Louvain, of the manuscript of Lorenzo Valla’s Annotations on the New Testament, completed fifty years earlier. Erasmus prepared the editio princeps in Paris in 1505 with a prefatory epistle addressed to Christopher Fisher, papal protonotary and doctor of canon law. This preface in defense of Valla can be read as a sort of paradoxical encomium, or praise of invective, since Valla had a controversial reputation as a harsh critic and bitter polemicist. Erasmus compares Valla to Zoilus, a proverbial figure of odious slander for his presumptuous critique of Homeric epic, as we are reminded in the Adages; but here, in the preface, Zoilus carries a positive connotation as a heroic censor. In his masterpiece on the elegance of the Latin language, Valla rescued literature from barbarity by administering the harsh medicine of criticism to the inveterate disease of scholastic Latin. Erasmus published an epitome of Valla’s work in 1531 based on an abstract he had composed many years earlier in the monastery at Steyn. Valla had collated the Vulgate with the Greek text of the New Testament and emended the translation according to grammatical criteria, including the criterion of elegance. Surely, Erasmus asks, translation, whether of sacred or profane texts, is the purview of grammar. A translator is not a prophet. The prophet requires the gift of the Holy Spirit, but the translator needs grammar and rhetoric, the arts of language. In effect, since the word of God reaches us through human intermediaries using human language, theology cannot dispense with skill in language or linguarum peritia. This is the program of humanist theology.

The fervor with which the letter to Fisher defends Valla’s biblical philology suggests that Erasmus had already begun a similar enterprise himself. Indeed, the text that Erasmus edited in 1505 was the model and impulse for his own Annotations on the New Testament first published as part of the Novum Instrumentum in 1516. Jacques Chomarat has demonstrated convincingly how much Erasmus owed to Valla’s precedent and how little he acknowledged it, often expressing severe criticism of Valla’s choices, which can be taken as a sort of mimetic tribute to Valla, the modern Momus or god of criticism. Even the substitution of sermo for verbum, which embroiled Erasmus in such endless quarrels, and earned him a denunciation from the pulpit of St. Paul’s Cathedral in London, was anticipated by Valla in his notes on the Gospel of John. Often the Erasmian Annotations revisits the themes first broached in the prefatory epistle to Fisher, including the notion that the Bible, like all human language, is immersed in historical time.

Chomarat draws our attention to Erasmus’ annotation on Acts 10.38, where the apostle Peter is preaching to the Roman soldier Cornelius. Peter tells Cornelius that God anointed Jesus, which the Vulgate renders with a turn of phrase that elicits a long commentary from Erasmus, lengthened in successive editions. First of all, he observes, though the New Testament is written in Greek, it is not clear whether Peter was speaking Greek or Hebrew, that is to say the local dialect of Hebrew. If he spoke Greek, he spoke it as a foreign language inflected by his own vernacular, which accounts for stylistic irregularities. After all, the apostles were only human, subject to error and ignorance like the rest of us. This claim got Erasmus into a great deal of trouble. He rounds off his note with a disclaimer: “I’m not the oracle; you can take my opinion or leave it.” This pronouncement coincides with similar disclaimers throughout his biblical scholarship and his apologies. To his implacable foe Edward Lee, Erasmus insists that his opinions are not dogmas to be taken on faith but only ideas to be debated: excutienda propono non sequenda (ASD IX-4:46). In humanist theology, both the text and the exegete are fallible.

b. The Ethics of Speech

Erasmus’ main statement on human speech comes at the beginning of his Paraphrase on the Gospel of John, published in 1523 between the third and fourth editions of his Annotations. Speaking in the voice of the evangelist, he acknowledges that the nature of God surpasses human understanding, and exceeds all of our powers of representation. Consequently, it is better to believe in God than to try to understand him through human reason. Christian philosophy is a kind of fideism, and not a speculative theology. But, in order to convey some understanding of things that are neither intelligible to us nor explainable by us, we have to use names of things that are familiar to our senses, though there is nothing in reality that is truly analogous to God. Therefore, just as the Bible calls that highest mind God, so it calls his only Son its speech. For the Son, though not identical to the Father, nevertheless resembles him through perfect similitude. In the same way, speech is the true mirror of the mind. In this sense, John’s logos is the paradigm of human speech. There is something miraculous about human speech which, arising from the inner recesses of the mind and passing through the ears of the listener, by a kind of occult energy transfers the soul of the speaker to the soul of the listener. This “occult energy” is not divine but nevertheless insinuates the proximity of the human and the divine. Christ is called the logos because God wanted to make himself known to us through him, so that we might be saved. Speech is the gateway to eternity.

The topos of the mirror of speech runs throughout Erasmus’ work as a guiding thread of his ethical and religious thought. It is the figure of speech that animates his philosophy. We find the mirror in the Praise of Folly, in several adages, and in the Lingua. In the latter, it is associated with a saying of Socrates, “Speak that I may see you,” that reappears in the Apophthegmata, whose prefatory epistle to William of Cleves cites Plutarch’s claim that, more than any deeds, sayings are the true mirror of the mind. Erasmus’ last work, the Ecclesiastes, devotes a rather important development to the exalted role of speech as the true mirror and image of the soul. In all these instances, we should recognize the expression not of a linguistic theory but rather of a moral imperative. Our society and our salvation depend, according to Erasmus’ favorite figure of speech, on the coincidence of our words and thoughts.

Erasmus brings out this dimension of his theme best in the Lingua of 1525, which identifies the tongue as the source of our greatest benefits and ills. If speech is a mirror, he explains ruefully, it can certainly be a distorting mirror. Christ wanted to be called the logos and the truth, so that we would be ashamed of lying, but now even Christians have become so accustomed to lying that they don’t even realize they are lying. This sad state of affairs is the occasion for a very solemn admonition: “Once lying becomes acceptable, then we can have no more trust, and without trust we lose all human society” (CWE 29:316; ASD IV-1A:83). True speech is the foundation of society, and once this foundation is cracked, the social edifice collapses. At the end of the sixteenth century Michel de Montaigne will make the very same claim in the same prophetic tones, first in his essay “On liars” and again in the essay “On giving the lie.” Both writers experienced their age as a crisis of truth and language. If we want to do justice to the prolific and multifarious achievement of Desiderius Erasmus, we might say that he devoted his life to the ethics of speech.

4. A Controversial Legacy

a. Canon Formation

Since his death in 1536, Erasmus can hardly be said to have rested in peace. His adversaries and detractors were unappeased by his earthly disappearance, and his partisans and disciples shared some of his enthusiasm for polemic. One way to assess Erasmus’ legacy is to trace the publication history of his work, or what we might call the making of his canon. Erasmus himself was thinking about an edition of his complete works as early as 1523, as we know from a letter he sent to Johann von Botzheim in which he drew up a preliminary catalogue of his works to be edited in nine separate tomes with a tenth tome pending. The first tome was to include everything that concerns literature and education including all his translations from Lucian, Euripides, and Libanius. Tome two was reserved for the adages, and tome three for his correspondence. Volume four would be devoted to moral philosophy, including his various translations from Plutarch’s Moralia, and his original works such as the Praise of Folly, the Education of a Christian Prince, and the Complaint of Peace. Volume five was to handle works of religious instruction such as the Enchiridion, his psalm commentaries, and all the prefaces to his New Testament. Eventually, his Ecclesiastes would be included in this group. Volume six would consist of the New Testament and the Annotations, while volume seven was for the Paraphrases. Volume eight was supposed, on a conservative estimate, to hold all the Apologies, or polemical writings. Volume nine was for the Letters of St. Jerome, as if Erasmus wrote them himself, and time permitting, he promised to write a commentary on Paul’s Epistle to the Romans to fill volume 10. Erasmus revised his plan in a letter addressed to the Scottish historian Hector Boece in 1530, partly to accommodate all he had written in the intervening seven years. 
As a practical measure, the works are now distributed in series, or ordines, rather than individual volumes, and a few other modifications are introduced. The Paraphrases join the New Testament in ordo six to make room for all of Erasmus’ translations of the Greek Fathers in ordo seven. The ninth series now includes several Latin Fathers in addition to Jerome, and the plans for a tenth volume have been shelved. When the posthumous Opera omnia finally issued from the Froben press of Basel between 1538 and 1540, the publishers followed Erasmus’ plan, but once again separated New Testament and Paraphrases into orders six and seven, while the projected volume of Latin Fathers was canceled. The Apologies occupy the ninth and final order. This is substantially the same organization adopted by the French Protestant refugee Jean Le Clerc in his ten-volume edition of the Opera omnia published in Leiden during the first decade of the eighteenth century. This edition is known as LB from the Latin name for the place of publication, Lugduni Batavorum. Le Clerc had to add a tenth volume to accommodate all the Apologies. As a further innovation, he even added an Index Expurgatorius, or repertory of all the passages in Erasmus’ work that were marked out for expurgation by the Spanish and the Roman censors, keyed to the pagination of the Basel edition. From this index it appears that the censors were particularly interested in the correspondence, and that only ordo eight, consisting of Latin translations of the Greek Fathers, escaped their vigilance.

b. Censorship

Le Clerc’s index is an opportune reminder that during the century and a half intervening between the Basel edition and LB, the history of the reception of Erasmus is largely a history of censorship. The Counter-Reformation devoted a fair amount of its institutional effort to the suppression of Erasmus’ legacy in Italy. The first indices of prohibited books were municipal lists whose scope of enforcement was necessarily restricted. In 1555 the Congregation of the Index promulgated the first papal Index of Prohibited Books, which included several titles of Erasmus such as the Annotations on the New Testament, the Colloquies, the Praise of Folly, the Enchiridion, and some writings on prayer and on celibacy. This index was suspended shortly after its promulgation but it did deter Italian printers from publishing new editions of Erasmus’ work. Then in 1559 the papacy promulgated a new index which inscribed Erasmus among the first class of heretics whose works were banned in entirety. This index in turn provoked a major episode of book burning in the capital of European printing, Venice. Pope Pius IV issued a revised index in the wake of the Council of Trent, the Tridentine Index of 1564, which relaxed the comprehensive ban on Erasmus’ work. However, it also included new provisions for enforcement that resulted in the confiscation and destruction of a considerable number of Erasmus editions in the stock of Italian booksellers. The Tridentine Index also included a new provision for the expurgation of those works by Erasmus that were not banned outright. This provision yielded, after some delay, one of the more important printing and publishing ventures of the Counter-Reformation, namely the official expurgated edition of Erasmus’ Adages. This was commissioned by the Council of Trent, prepared by Paolo Manuzio, prefaced by his son Aldo, and published in Florence by the Giunti in 1575 bearing the imprimatur of Pope Gregory XIII.
This was the censored edition of the Adages, purged of anything that could give offense to pious ears, and from which all traces of its author, Desiderius Erasmus, were so scrupulously removed that the work is often catalogued as the Adages of Paolo Manuzio. It’s a blessing that readers beyond Italy had access to the unexpurgated edition of the Adages, for if Montaigne had had to rely on Paolo Manuzio’s version, he never would have invented the essay form. Subsequent revisions of the Index of Prohibited Books gradually abandoned the project of further expurgation. The index of Pope Clement VIII in 1596 turned over the task of expurgation to individual readers relying on their own conscience or someone else’s directions. Eventually the papacy resigned itself to the presence of some copies of Erasmus’ literary and educational writings in private libraries in Italy.

c. Scholarship

Just as the era preceding the LB was an era of censorship, so the succeeding era has been, for the study of Erasmus, an era of burgeoning scholarship. The twentieth century in particular initiated several monuments of modern scholarship. From 1906 to 1958, P. S. Allen and his collaborators published in twelve volumes, with the Oxford University Press, the complete correspondence of Erasmus, the Opus epistolarum Desiderii Erasmi Roterodami. It has completely superseded the third ordo of his Opera omnia. In 1933, Wallace Ferguson published, as a supplement to the LB, his Erasmi opuscula, including Erasmus’ biography of St. Jerome. In the same year Hajo Holborn published an important edition of several key texts including what remains today the standard edition of the Enchiridion. In 1969 an international committee based in the Netherlands began publishing the critical edition of the complete works of Erasmus known as ASD. Finally, in 1974, the University of Toronto Press launched its complete works of Erasmus in English translation known as CWE, for Collected Works of Erasmus. Both of those projects are still under way in the spirit of that royal adage Festina lente. Every year, under the auspices of the Erasmus of Rotterdam Society (which also publishes the journal Erasmus Studies, formerly the Erasmus of Rotterdam Society Yearbook), the Margaret Mann Phillips Lecture is delivered in the Spring and the Birthday Lecture in the Fall in order to commemorate and conserve the cultural and ethical legacy of Erasmus. The Society has created an Erasmus website at erasmussociety.org.

5. References and Further Reading

a. Editions

  • Opus epistolarum Desiderii Erasmi Roterodami. Ed. P.S. Allen et al. 12 vols. Oxford: Clarendon Press, 1906-1958.
  • Opera omnia Desiderii Erasmi Roterodami. Amsterdam, 1969–. (ASD)
  • Desiderii Erasmi Roterodami opera omnia. Ed. Jean Le Clerc. 10 vols. Leiden: P. van der Aa, 1703-1706. (LB)
  • Ausgewählte Werke. Ed. Hajo Holborn. Munich: C.H. Beck’sche, 1933.
  • Erasmi opuscula. Ed. Wallace Ferguson. The Hague: Martin Nijhoff, 1933.
  • Collected Works of Erasmus. Toronto: University of Toronto Press, 1974-. (CWE)

b. Studies

  • Boyle, Marjorie O’Rourke. Erasmus on Language and Method in Theology. Toronto: University of Toronto Press, 1977.
  • Chomarat, Jacques. “Les Annotations de Valla, celles d’Erasme et la grammaire.” In Histoire de l’exégèse au XVIe siècle. Geneva: Droz, 1978.
  • Grendler, Paul. “The Survival of Erasmus in Italy.” Erasmus in English 8 (1976). 2-22.
  • Leushuis, Reinier. “The Paradox of Christian Epicureanism in Dialogue: Erasmus’ Colloquy The Epicurean.” Erasmus Studies 35 (2015). 113-136.
  • Meer, Tineke ter. “A True Mirror of the Mind: Some Observations on the Apophthegmata of Erasmus.” Erasmus of Rotterdam Society Yearbook 23 (2003). 67-93.
  • Monfasani, John. “Erasmus and the Philosophers.” Erasmus of Rotterdam Society Yearbook 32 (2012). 47-68.
  • Popkin, Richard. The History of Scepticism from Erasmus to Descartes. New York: Harper and Row, 1968.
  • Thompson, Craig. “Introduction.” In Erasmus, Literary and Educational Writings. Vol. 1. Toronto: University of Toronto Press, 1978. CWE 23.

Author Information

Eric MacPhail
Email: macphai@indiana.edu
Indiana University
U. S. A.

Parmenides of Elea (Late 6th cn.—Mid 5th cn. B.C.E.)

Parmenides of Elea was a Presocratic Greek philosopher. As the first philosopher to inquire into the nature of existence itself, he is incontrovertibly credited as the “Father of Metaphysics.” As the first to employ deductive, a priori arguments to justify his claims, he competes with Aristotle for the title “Father of Logic.” He is also commonly thought of as the founder of the “Eleatic School” of thought—a philosophical label ascribed to Presocratics who purportedly argued that reality is in some sense a unified and unchanging singular entity. This has often been understood to mean there is just one thing in all of existence. In light of this questionable interpretation, Parmenides has traditionally been viewed as a pivotal figure in the history of philosophy: one who challenged the physical systems of his predecessors and set forth for his successors the metaphysical criteria any successful system must meet. Other thinkers, also commonly thought of as Eleatics, include: Zeno of Elea, Melissus of Samos, and (more controversially) Xenophanes of Colophon.

Parmenides’ only written work is a poem entitled, supposedly, but likely erroneously, On Nature. Only a limited number of “fragments” (more precisely, quotations by later authors) of his poem are still in existence, which have traditionally been assigned to three main sections—Proem, Reality (Alétheia), and Opinion (Doxa). The Proem (prelude) features a young man on a cosmic (perhaps spiritual) journey in search of enlightenment, expressed in traditional Greek religious motifs and geography. This is followed by the central, most philosophically-oriented section (Reality). Here, Parmenides positively endorses certain epistemic guidelines for inquiry, which he then uses to argue for his famous metaphysical claims—that “what is” (whatever is referred to by the word “this”) cannot be in motion, change, come-to-be, perish, lack uniformity, and so forth. The final section (Opinion) concludes the poem with a theogonical and cosmogonical account of the world, which paradoxically employs the very phenomena (motion, change, and so forth) that Reality seems to have denied. Furthermore, despite making apparently true claims (for example, the moon gets its light from the sun), the account offered in Opinion is supposed to be representative of the mistaken “opinions of mortals,” and thus is to be rejected on some level.

All three sections of the poem seem particularly contrived to yield a cohesive and unified thesis. However, discerning exactly what that thesis is supposed to be has proven a vexing, perennial problem since ancient times. Even Plato expressed reservations as to whether Parmenides’ “noble depth” could be understood at all—and Plato possessed Parmenides’ entire poem, a blessing denied to modern scholars. Although there are many important philological and philosophical questions surrounding Parmenides’ poem, the central question for Parmenidean studies is addressing how the positively-endorsed, radical conclusions of Reality can be adequately reconciled with the seemingly contradictory cosmological account Parmenides rejects in Opinion. The primary focus of this article is to provide the reader with sufficient background to appreciate this interpretative problem and the difficulties with its proposed solutions.

Table of Contents

  1. Life
  2. Parmenides’ Poem
    1. The Proem
    2. Reality
    3. Opinion
    4. Positive Aletheia. Negative Opinion?
  3. Interpretative Treatments
    1. Reception of the Proem
    2. The A-D Paradox: Select Interpretative Strategies and their Difficulties
      1. Strict Monism and Worthless Opinion
      2. Two-World (or Aspectual) Views
      3. Essentialist (or Meta-Principle) Views
      4. Modal Views
  4. Parmenides’ Place in the Historical Narrative
    1. Influential Predecessors?
      1. Anaximander/Milesians
      2. Aminias/Pythagoreanism
      3. Xenophanes
      4. Heraclitus
    2. Parmenides’ Influence on Select Successors
      1. Eleatics: Zeno and Melissus of Samos
      2. The Pluralists and the Atomists
      3. Plato
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

As with other ancient figures, little can be said about Parmenides’ life with much confidence. It is certain that his hometown was Elea (Latin: Velia)—a Greek settlement along the Tyrrhenian coast of the Apennine Peninsula, just south of the Bay of Salerno, now located in the modern municipality (comune) of Ascea, Italy. Herodotus reports that members of the Phocaean tribe established this settlement ca. 540-535 B.C.E., and thus Parmenides was of Ionian stock (1.167.3). Parmenides’ father, a wealthy aristocrat named Pyres, was probably one of the original colonizers (Coxon Test. 40-41a, 96, 106).

When exactly Parmenides was born is far more controversial. There are two competing methods for dating Parmenides’ birth, to either 540 (Diogenes Laertius) or 515 (Plato) B.C.E. Neither account is clearly convincing in-itself, and scholars are divided on their reliability and veracity.

The earlier birthdate (c. 540 B.C.E.) is based upon a relatively late (3rd cn. C.E.) doxographical account by Diogenes, who relied on Apollodorus’ (2nd cn. B.C.E.) writings. This account claims Parmenides “flourished”—a euphemism conventionally understood to correspond with having reached forty years of age, and/or the height of one’s intellectual career—during the sixty-ninth Olympiad (504-500 B.C.E.). The account is considered reliable because of the historical focus (as opposed to any philosophical agenda) of these authors. However, the lateness of the account can be considered a weakness, and the “flourishing” system of dating is quite artificial, vague, and imprecise.

The later birthdate (515 B.C.E.) is based upon the opening of Plato’s Parmenides (127a5-c5). The narrative setting describes a young Socrates (about 20) conversing with Parmenides, who is explicitly described as being “about 65.” Since Socrates was born c. 470 B.C.E., the conversation would be set c. 450 B.C.E., and counting back 65 years from that date yields a birthdate for Parmenides c. 515 B.C.E. Some have taken Plato’s precise mention of Parmenides’ age as indicative of veracity. However, Plato is also known for including other entirely fictitious, clearly anachronistic yet precise details in his dialogues. In fact, the very conversation reported in the dialogue would have been impossible, as it depends upon views Plato developed late in his life, which are certainly not “Socratic” at all. Plato is not necessarily a reliable historical source.

Choosing between these accounts can have significant historical implications regarding Parmenides’ possible relationship to other thinkers, particularly Heraclitus. For instance, if one accepts Plato’s later date, this would seem to require denying that Parmenides influenced Heraclitus (540-480 B.C.E., also based upon Diogenes’ reports) as Plato suggests (Sophist 242d-e). On the other hand, if one accepts the earlier dating by Diogenes, it makes it very unlikely Heraclitus’ work could have influenced Parmenides, as there would not have been sufficient time for his writings to become known and travel across the Greek world from Ephesus, Ionia.

Whenever Parmenides was born, he must have remained a lifelong citizen and permanent resident of Elea—even if he traveled late in life, as Plato’s accounts in Parmenides and Theaetetus suggest. This is first indicated by the evident notoriety he gained for contributions to his community. Several sources attest that he established a set of laws for Elea, which remained in effect and sworn to for centuries after his death (Coxon Test. 16, 116). A 1st cn. C.E. pedestal discovered in Elea is dedicated to him, with an inscription crediting him not only as a “natural philosopher,” but as a member (“priest”) of a local healing cult/school (Coxon 41; Test. 106). Thus, he likely contributed to the healing arts as a patron and/or practitioner. Finally, if Parmenides really was a personal teacher of Zeno of Elea (490-430 B.C.E.), Parmenides must have been present in Elea well into the mid-fifth century B.C.E. Ultimately, however, when and where Parmenides died is entirely unattested.

2. Parmenides’ Poem

Ancient tradition holds that Parmenides produced only one written work, which was supposedly entitled On Nature (Coxon Test. 136). This title is suspect, as it had become common even by Sextus’ time to attribute this generic title to all Presocratic works (Coxon 269-70; Test. 126). No copy of the original work has survived, in any part. Instead, scholars have collected purported quotations (or testimonia) from a number of ancient authors and attempted to reconstruct the poem by arranging these fragments according to internal and external (testimonia) evidence. The result is a rather fragmentary text, constituted by approximately 154 dactylic-hexameter lines (some are only partial lines, or even only one word). This reconstructed arrangement has then been traditionally divided into three distinct parts: an introductory section known as the Proem; a central section of epistemological guidelines and metaphysical arguments (Aletheia, Reality); and a concluding “cosmology” (Doxa, or Opinion).

The linear order of the three main extant sections is certain, and the assignment of particular fragments (and internal lines) to each section is generally well-supported. However, it must be admitted that confidence in the connectedness, completeness, and internal ordering of the fragments in each section decreases significantly as one proceeds through the poem linearly: Proem-Reality-Opinion. Furthermore, many philological difficulties persist throughout the reconstruction. There are conflicting transmissions regarding which Greek word to read, variant punctuation possibilities, concerns surrounding adequate translation, ambiguities in the poetical form, and so forth. Given all of this, any serious engagement with Parmenides’ work should begin by acknowledging the incomplete status of the text and recognizing that interpretative certainty is generally not to be found.

a. The Proem

The Proem (C/DK 1.1-32) is by far the most complete section available of Parmenides’ poem. This is due entirely to Sextus Empiricus, who quoted Lines 1-30 of the Proem (C1) as a whole and explicitly reported that they began the poem (Coxon Test. 136). Not only are the bulk of these lines (1.1-28) not quoted by any other ancient source, but their content is not even mentioned in passing. In short, modern scholars would have no idea the Proem ever existed were it not for Sextus. Simplicius quotes lines C/DK 1.31-32 immediately after quoting lines very similar to Sextus’ 1.28-30, and thus these are traditionally taken to end the Proem.

Nevertheless, there is some controversy regarding the proper ending of the Proem. While Lines 1.28-30 are reported by several additional sources (Diogenes Laertius, Plutarch, Clement, and Proclus), Simplicius alone quotes lines 1.31-32. In contrast, Sextus continued his block quotation of the Proem after line 1.30 with the lines currently assigned to C/DK 7.2-7, as if these immediately followed. Diels-Kranz separated Sextus’ quotation into distinct fragments (1 and 7) and added Simplicius’ lines to the end of C/DK 1. The vast majority of interpreters have followed both these moves. However, there may be good reasons to challenge this reconstruction (compare Bicknell 1968; Kurfess 2012, 2014).

The Proem opens mid-action, with a first-person account of an unnamed youth (generally taken to be Parmenides himself) traveling along a divine path to meet a didactic (also unnamed) goddess. The youth describes himself riding in a chariot with fire-blazing wheels turning on pipe-whistling axles, which seems to be traversing the heavens. The chariot is drawn by mares, steered by the Daughters of the Sun (the Heliades), who began their journey at the House of Night. The party eventually arrives at two tightly-locked, bronze-fitted gates—the Gates of Night and Day. In order to pass through these “aethereal” gates, the Heliades must persuade Justice with soft words to unlock the doors. After successfully passing through this portal and driving into the yawning maw beyond, the youth is finally welcomed by the unnamed goddess, and the youth’s first-person account ends.

Many have thought the chariot journey involved an ascent into the heavens/light as a metaphor for achieving enlightenment/knowledge and for escaping from darkness/ignorance. However, it would seem that any chariot journey directed by sun goddesses is best understood as following the ecliptic path of the sun and Day (also, that of the moon and Night). This is further confirmed given the two geographical locations explicitly named (the “House of Night” and the “Gates of Night and Day”), both of which are traditionally located in the underworld by Homer and Hesiod. Thus, the chariot journey is ultimately circular, ending where it began (compare C2/DK5). From the House of Night—far below the center of the Earth—the Heliades would follow an ascending arc to the eastern edge of the Earth, where the sun/moon rise. The journey would then continue following the ecliptic pathway upwards across the heavens to apogee, and then descend towards sunset in the West. At some point along this route over the Earth they would collect their mortal charge. Following this circular path, the troupe would eventually arrive back in the underworld at the Gates of Night and Day. Not only are these gates traditionally located immediately in front of the House of Night, but the mention of the chasm that lies beyond them is an apt poetical description of the completely dark House of Night. On this reading, rather than a metaphorical ascent towards enlightenment, the youth’s journey is actually a didactic katabasis (a descent into the underworld). It also suggests a possible identification of the anonymous spokes-goddess—Night (compare Palmer 2009).

The rest of the poem consists of a narration from the perspective of the unnamed goddess, who begins by offering a programmatic outline of what she will teach and what the youth must learn (1.28b-30):

…And it is necessary for you to learn all things,

Both the still heart of persuasive reality

And the opinions of mortals, in which there is no genuine reliability.

That the youth is supposed to learn some truth about “reality” (aletheia) is uncontroversial and universally understood to be satisfied by the second major section of the poem, Reality (C 2-8.50). It is also uncontroversial that the “opinions of mortals” will be taught in Opinion (C 8.51-C 20) and that this account will be inferior to the account of Aletheia in some way—certainly epistemically and perhaps also ontologically. The standard reconstruction of the Proem then concludes with the two most difficult and controversial lines in Parmenides’ poem (C 1.31-32):

ἀλλ’ ἔμπης καὶ ταῦτα μαθήσεαι ὡς τὰ δοκεῦντα

χρῆν δοκίμως εἶναι διὰ παντὸς πάντα περῶντα [περ ὄντα].

The suspicion that these lines might help shed light on the crucial relationship between Reality and Opinion is well-warranted. However, there are numerous possible readings (both in the Greek transmission and in the English translation) and selecting a translation for these lines requires extensive philological considerations, as well as an interpretative lens through which to understand the overall poem—the lines themselves are simply too ambiguous to make any determination. Thus, it is quite difficult to offer a translation or summary here that does not strongly favor one interpretation of Parmenides over another. The following is an imperfect attempt at doing so, while remaining as interpretatively uncommitted as possible.

But nevertheless, you shall also learn “these things,” how the “accepted/seeming things” should/would have had (to be) to be acceptably, passing through [just being] all things, altogether/in every way.

Commentators have tended to understand these lines in several general ways. First, Parmenides might be offering an explanation for why it is important to learn about mortal opinions if they are so untrustworthy/unreliable, as line 1.30 argues. Another common view is that Parmenides might be telling the youth he will learn counterfactually how the opinions of mortals (or the objects of such opinions) would or could have been correct (even though they were not and are not now). Alternatively, Parmenides might be pointing to some distinct, third thing for the youth to learn, beyond just Reality and Opinion. This third thing could be, but is not limited to, the relationship between the two sections, which does not seem to have been explicitly outlined in the poem (at least, not in the extant fragments). In any case, these lines are probably best dealt with once one already has settled upon an interpretative stance for the overall poem given the rest of the evidence. If nothing else, whether a selected interpretation can be coherently and convincingly conjoined with these lines can provide a sort of final “test” for that view.

b. Reality

Immediately following the Proem (C/DK 1), the poem moves into its central philosophical section: Reality (C 2-C 8.49). In this section, Parmenides’ positively endorsed epistemic and metaphysical claims are outlined. Though lengthy quotations strongly suggest a certain internal structure, there is certainly some room for debate with respect to proper placement, in particular amongst the shorter fragments that do not share any common content/themes with the others. In any case, due to the overall relative completeness of the section and its clearly novel philosophical content—as opposed to the more mythical and cosmological content found in the other sections—these lines have received far more attention from philosophically-minded readers, in both ancient and modern times.

In Reality, the unnamed youth is first informed that there are only two logically possible “routes of inquiry” one might embark upon in order to understand “reality” (C 3/DK 2). Parmenides’ goddess endorses the first route, which recognizes that “what-is” is, and that it must be (it is not to not be), on the grounds that it is completely trustworthy and persuasive. On the other hand, the goddess warns the youth away from the route which posits “what-is-not and necessarily cannot be,” as it is a path that can neither be known nor spoken of. The reasoning seems to be that along this latter route, there is no concept to conceive of, no subject there to refer to, and no properties that can be predicated of—“nothingness.”

Arguably, a third possible “route of inquiry” may be identified in C 5/DK 6. Here, the goddess seems to warn the youth from following the path which holds being and not-being (or becoming and not-becoming) to be both the same and not the same. This is the path that mortals are said to wander “without judgment,” on a “backwards-turning journey.” Not only is confusing “what is” and “what is not” different from positing necessary being and non-being (C 3/DK 2), and thus a distinct “route,” but this description also seems to correspond far better to the cosmogony of opposites found in Opinion than to the route of “what is not and necessarily cannot be.”

The order not to follow the path that posits only “what is not” is further complicated by the fragmentary report that there is some sort of close relationship between thinking (or knowing) and being (what exists, or can exist, or necessarily exists): “…for thinking and being are the same thing,” or “…for the same thing is for thinking as is for being” (C 4/DK 3). Scholars are divided as to what the exact meaning of this relationship is supposed to be, leading to numerous mutually exclusive interpretative models. Does Parmenides really mean to make an identity claim between the two—that thinking really is numerically one and the same as being, and vice-versa? Or, is it that there is some shared property(-ies) between the two? Is Parmenides making the rather problematic claim that whatever can be thought, exists (compare Gorgias “On Nature, or What-is-Not”)? Or, more charitably, only that whatever does exist can in principle be thought of without contradiction, and thus is understandable by reason—unlike “nothingness”? Perhaps both? Most commonly, Parmenides has been understood here as anticipating Russellian concerns with language and how meaning and reference must be coextensive with, and even preceded by, ontology (Owen 1960).

In any case, given these epistemic considerations, the goddess’ arguments in C/DK 8 are supposed to follow with certainty from deductive, a priori reasoning. By studiously avoiding thinking in any way which entails thinking about “what-is-not,” via reductio, the subject of Reality is concluded to be: truly eternal—ungenerated and imperishable (8.5-21), a continuous whole (8.21-25), unmoved and unique (8.21-33), perfect and uniform (8.42-49). For instance, since coming-to-be involves positing “not-being” in the past, and mutatis mutandis for perishing, and since “not-being” cannot be conceived of, “what is” cannot have either property. In a similar vein, spatial motion includes “not-being” at a current location in the past, and thus motion is also denied. This line of reasoning can be readily advanced to deny any sort of change at all.

In the end, what is certain about Reality (whatever the subject, scope, or number of this “reality” is supposed to be) is that there is purportedly at least one thing (or perhaps one kind of thing) that must possess all the aforementioned “perfect” properties, and that these properties are supposed to follow from some problem with thinking about “what is not.” It has been commonly inferred from this that Parmenides advocated that there is actually just one thing in the entire world (that is, strict monism), and that this entity necessarily possesses the aforementioned properties.

c. Opinion

Opinion has traditionally been estimated to be far longer than the previous two sections combined. Diels even estimated that 9/10 of Reality, but only 1/10 of Opinion, are extant, which would have the poem spanning some 800-1000 lines. This degree of precision is highly speculative, to say the least. Opinion has been estimated to be so much larger because of the fragmentary nature of the section (only 44 verses, largely disjointed or incomplete, are attested) and the apparently wide array of different topics treated, which would seem to require a great deal of exposition to properly flesh out.

The belief that Opinion would have required a lengthy explication in order to adequately address its myriad of disparate topics may be overstated. As Kurfess has recently argued, there is nothing in the testimonia indicating any significant additional content belonging to the Opinion beyond that which is explicitly mentioned in the extant fragments (2012). Thus, though Opinion would still be far longer than the quite limited sampling that has been transmitted, it need not have been anywhere near as extensive as has been traditionally supposed, or all that much longer than Reality. Regardless of its original length, the incompleteness of this section allows for substantially less confidence regarding its arrangement and even less clarity concerning the overall meaning of the section. As a result, the assignment of certain fragments to this section has faced more opposition (compare Cordero 2010 for a recent example). Nevertheless, the internal evidence and testimonia provide good reasons to accept the traditional assignment of fragments to this section, as well as their general arrangement.

Following the arguments of Aletheia, the goddess explicitly ends her “trustworthy account and thought about reality” and commands the youth from there on, “hearing the deceptive arrangement of her words,” to learn mortal opinions (C 8.50-52).

The range of content in this section includes: metaphysical critiques of how mortals err in “naming” things, particularly in terms of a Light/Night duality (C 8.51-61, 9, 20); programmatic passages promising a detailed account of the origin of celestial bodies (C 10, 11); a theogonical account of a goddess who rules the cosmos and creates other deities, beginning with Love (C 12, 13); cosmogonical and astronomical descriptions of the moon and its relationship to the sun (C 14, 15), along with an apparent description of the foundations of the earth (C 16); some consideration of the relationship between the mind and body (C 17); and even accounts related to animal/human procreation (C 18-19).

The error of mortals is grounded in their “naming” (that is, providing definite descriptions and predications) the subject of Reality in ways contrary to the conclusions previously established about that very subject. As a result, mortals have grounded their views on an oppositional duality of two forms—Light/Fire and Night—when in fact it is not right to do so (8.53-54). Admittedly, the Greek is ambiguous about what exactly it is not right for mortals to do. It is common amongst scholars to read these passages as claiming it is either wrong for mortals to name both Light and Night, or that naming just one of these opposites is wrong and the other acceptable. This reading tends to suggest that Parmenides is either denying the existence of the duality completely, or accepting that only one of them properly exists. “Naming” only one opposite (for example, Light) seems to require thinking of it in terms of its opposite (for example, “Light” is “not-dark”), which is contrary to the path of only thinking of “what is,” and never “what is not” (compare Mourelatos 1979). The same holds if only Night is named. Thus, it would not seem appropriate to name only one of these forms. This problem is only doubled if both forms are named. Thus, it would seem that mortals should not name either form, and thus both Light and Night are denied as proper objects of thought. The Greek can also be read as indicating that it is the confusion of thinking both “what is” and “what is not” that results in this “naming error,” and that thinking both of these judgments (“what is” and “what is not”) simultaneously is the true error, not “naming” in-itself.

Mortal “naming” is treated as problematic overall in other passages as well. This universal denigration is first introduced at C 8.34-41 on the traditional reconstruction (For a proposal to relocate these lines to Opinion, see Palmer’s 2009 discussion of “Ebert’s Proposal”). Here, the goddess dismisses anything mortals erroneously think to be real, but which violate the perfect predicates of Reality, as “names.” C 11 expounds upon this “naming error,” arguing that Light and Night have been named and the relevant powers of each have been granted to their objects, which have also been named accordingly. C 20 appears to be a concluding passage for both Opinion and the poem overall, stating that only according to (presumably mistaken) belief, things came-to-be in the past, currently exist, and will ultimately perish and that men have given a name to each of these things (and/or states of existence). If this is truly a concluding passage, the apparently disparate content of Opinion is unified as a treatment of mortal errors in naming, which the section uncontroversially began with. From these grounds, the other fragments traditionally assigned to Opinion can be linked (directly or indirectly) to this section, based upon parallels in content/imagery and/or through contextual clues in the ancient testimonia.

Both C 9/DK 10 and C 10/DK 11 variably promise that the youth will learn about the generation/origins of the aether, along with many of its components (sun, moon, stars, and so forth). C/DK 12 and C/DK 13 then deliver on this cosmogonical promise by developing a theogony:

C 12: For the narrowest rings became filled with unmixed fire,

The outer ones with night, along which spews forth a portion of flame.

And in the middle of these is a goddess, who governs all things.

For in every way she engenders hateful birth and intercourse

Sending female to mix with male, and again in turn,

male to mix with the more feminine.

C 13: [and she] contrived Love first, of all the gods.

C 14 and C 15 then describe the cosmology that results from the theogonical arrangement, expounding the properties of the moon as, respectively, “an alien, night-shining light, wandering around the Earth,” which is “always looking towards the rays of the sun.” Similarly, C 16 is a single word (ὑδατόριζον), meaning “rooted in water,” and the testimonia explicitly claim this is grounded in the Earth.

In many ways, the theogonical cosmology presented so far is quite reminiscent of Hesiod’s own Theogony, and certain Milesian cosmologies at times. However, C 17-19 are more novel, focusing on the relationship between the mind and body (C 17/DK 16), as well as sexual reproduction in animals—which side of the uterus different sexes are implanted on (C 18/DK 17) and the necessary conditions for a viable, healthy fetus (C 19/DK 18). These passages can be tied to the previous fragments in that they are an extension of the theogonical/cosmogonical account, which has moved on to offer an account of earthly matters—the origin of animals and their mental activity—which would still be under the direction of the “goddess who governs all things” (C 12). This is clearly the case with respect to C 18-19, as the governing goddess is explicitly said to direct male-female intercourse in C 12.

d. Positive Aletheia. Negative Opinion?

Given the overall reconstruction of the poem as it stands, there appears to be a counter-intuitive account of “reality” offered in the central section (Reality)—one which describes some entity (or class of such) with specific predicational perfections: eternal—ungenerated, imperishable, a continuous whole, unmoving, unique, perfect, and uniform. This is then followed by a more intuitive cosmogony, suffused with traditional mythopoetical elements (Opinion)—a world full of generation, perishing, motion, and so forth, which seems incommensurable with the account in Reality. It is uncontroversial that Reality is positively endorsed, and it is equally clear that Opinion is negatively presented in relation to Aletheia. However, there is significant uncertainty regarding the ultimate status of Opinion, with questions remaining such as whether it is supposed to have any value at all and, if so, what sort of value.

While most passages in the poem are consistent with a completely worthless Opinion, they do not necessitate that valuation; even the most obvious denigrations of Opinion itself (or mortals and their views) are not entirely clear regarding the exact type or extent of its failings. Even more troubling, there are two passages which might suggest some degree of positive value for Opinion, though the lines are notoriously difficult to understand. How the passages outlined below are read and interpreted largely determines what degree or kind (if any) of positive value should be ascribed to Opinion. Thus, it is helpful to examine more closely the passages where the relationship between the sections is most directly treated.

Consider the goddess’ programmatic outline for the rest of the poem at the end of the Proem:

C 1:   …And it is necessary for you to learn all things, (28b)
Both the still-heart of persuasive reality,
And the opinions of mortals, in which there is no trustworthy persuasion. (30)

From the very beginning of her speech, the goddess presents the opinions of mortals (that is, Opinion) negatively in relation to Reality. However, it does not necessarily follow from these lines that Opinion is entirely false or valueless. At most, all that seems entailed here is a comparative lack of epistemic certainty in relation to Reality. However, the transition from Reality to Opinion (C/DK 8.50-52), when the goddess ends her “trustworthy account and thought about reality,” and in contrast, charges the youth to “learn about the opinions of mortals, hearing the deceptive arrangement of my words,” implies falsity. This deceptive arrangement could be understood to apply only to the goddess’ presentation of the account. However, as Aletheia is described as a “trustworthy account,” and there seems to be no doubt that it is the content (as well as the presentation) that is trustworthy, the parallel should hold for Opinion as well. Accepting that it is the content of Opinion that is deceptive, one of the most difficult interpretative questions regarding Opinion remains. Is the extent of the deception supposed to apply to: a) every proposition within Opinion (for example, Parmenides wants to say it is actually false that the moon reflects sunlight), or b) only some significant aspects of its content (for example, basing an account on opposites like Light/Night)? Either way, C/DK 1.30 and 8.50-52 make it clear that Opinion and the “opinions of mortals” are lacking in both veracity and epistemic certainty—at least to some extent.

Mortal beliefs are also unequivocally derided in between these bookends to Reality, though in slightly different terms. At C 5, the goddess warns the youth from the path of inquiry upon which “…mortals with no understanding stray two-headed…by whom this has been accepted as both being and not being the same” (Coxon’s translation). C 5 not only claims mortal views are in error, it identifies the source of their error—confusing being and non-being. C/DK 7 then further identifies the reason mortals tend to fall into this confusion—by relying upon their senses, rather than rational accounts. “But do keep your thought from this way of enquiry. And let not habit do violence to you on the empirical way of exercising an unseeing eye and a noisy ear and tongue, but decide by discourse the controversial test enjoined by me” (Coxon’s translation).

Finally, the goddess’ criticism of the “naming error” of mortals—which seems to be the primary criticism offered in Opinion—furthers the case for Opinion’s apparently complete lack of veracity. At the first mention of the “naming error” on the traditional arrangement, the goddess says, “…To these things all will be a name, which mortals establish, having been persuaded they are real: to come to be and to perish, to be and not to be, and to change location and exchange bright color throughout” (C/DK 8.38b-41; Coxon’s translation). The persuasion of mortals to believe in the reality of the objects and phenomena they “name” clearly implies here that the counterfactual is supposed to be true, and that any such phenomena which do not correspond with the properties advocated in Reality are not real. Immediately following the goddess’ transition to her “deceptive account,” C/DK 8.53-56 makes it clear that the activity of “naming” and “distinguishing things in opposition,” contrary to the unity of Reality, is the initial mistake of mortals: “For they established two forms to describe (“name”) their two judgments (of which one should not be—the one by which mortals are and have been misled), and they distinguished two opposites in substance, and established predicates separating each from one another.”

Given the passages outlined so far in this section, there appears to be quite a substantial case for taking Opinion to be entirely false and lacking any value whatsoever. Nevertheless, this may not be the entire story. It is important to stress that while these passages seem to strongly suggest (and one may argue that they even entail) Opinion is false, the goddess never actually says it is “false” (ψευδής). Furthermore, there is at least some textual evidence that might be understood to suggest Opinion should not be treated as negatively as the passages considered so far would suggest.

As noted in the summary of the Proem above, there are two particularly difficult lines (C 1.31-32) which may be understood as suggesting some positive value for Opinion, despite its lacking in comparison to Reality. However, even if the Greek is read along these lines, it remains to be determined whether this value is based upon some substantial value in the account itself (there is some sense or perspective in which it is true), or merely some pragmatic and/or instructive value (for example, it is worthwhile to know what is wrong and why, so as to avoid falling into such errors).

In any case, even if there is some positive reason for learning Opinion provided in these lines, this could hardly contradict the epistemic inferiority (“no trustworthy persuasion”) just asserted at C/DK 1.30, just as it is quite difficult to deny the falsity implied from lines C/DK 8.50-52. At most, these lines could only soften the negative treatment of mortal views. Nonetheless, the possibility should be admitted that upon certain variant readings of C/DK 1.31-32, the status of Opinion and its value could be more complex and ambivalent than other passages suggest.

Only one further extant passage remains which might offer some reason to think Opinion maintains some positive value, and this is the passage most commonly appealed to for this purpose. At C/DK 8.60-61, the goddess seems to offer an explicit rationale for providing the youth with her “deceptive” account: “I declare to you this entirely ‘likely’ (ἐοικότα) arrangement so that you shall never be surpassed by any judgment of mortals.” The key word here is the apparently positive participle ἐοικότα, which does not obviously reconcile with the otherwise negative treatment of Opinion. The participle ἐοικότα can have the general sense of “likely,” in the sense of “probable,” as well as “fitting/seemly” in the sense of “appropriate.” Either translation could suggest at least the possibility of veracity and/or value in Opinion. That is, the account in Opinion could “likely” be true, though it is epistemically uncertain whether it is or not. Or, the account could be “fitting,” given the type of account it is—one which seeks to explain the world as it appears to the senses, which is still worth knowing, even if it is not consistent with the way the world truly is. On either of these readings, Opinion could be “deceptive,” yet still be worthwhile in-itself. On this view, though Opinion is inferior to Aletheia in some regards, it can be positively endorsed in its own right, as Parmenides’ own version of “mortal-style” accounts. If it is then understood that Parmenides’ “cosmology” is superior to all other possible mortal accounts of this kind, the goddess’ promise to the youth that learning this account will ensure he is never surpassed by any other mortal judgments can be explained (C/DK 8.60b-1).

On the other hand, it is just as easy to understand Opinion being “likely” in the sense that it is indicative of the sort of account a deranged mortal relying upon their senses might be prone (“likely”) to offer, which is hardly an endorsement. Since mortals are incorrect in their accounts, the particular account offered in Opinion is representative of such accounts, and is presented didactically—as an example of the sorts of accounts that should not be accepted. If the youth can learn to recognize what is fundamentally mistaken in this representative account (Opinion), any alternative or derivative account offered by mortals which includes the same fundamental errors can be recognized and resisted. This seems to be far more consistent given the treatment of Opinion overall, and C/DK 8.60b-1 arguably better fits with this interpretation. Furthermore, it is quite difficult to defend a fundamental premise required for the alternative, more positive view outlined above—that the cosmology offered in Parmenides’ Opinion is intended to be superior to all other mortal views. Not only is this in tension with the clear negative treatment of Opinion throughout the text, it is implausible on more general grounds, as any account grounded upon fundamentally incorrect assumptions cannot be “superior” in any substantial sense. While some have attempted to claim that Opinion satisfies this on account of its dualistic nature, which is second-best to Reality’s monistic claims, this approach fails to account for how Opinion could possibly be superior to any other dualistic account.

Given all of this, it is undeniable that Opinion is lacking in comparison to Aletheia, and certainly treated negatively in comparison. It should also be taken as well-founded that Opinion is epistemically inferior. That Opinion is also inferior in terms of veracity seems most likely—though again, it is not certain whether this means Opinion is entirely lacking in value, and the extent of its deceptiveness (all content, or its fundamental premises and assumptions) is still an open question. Navigating the Scylla and Charybdis of: a) taking the negative, yet often ambiguous and/or ambivalent, treatment of Opinion in the text seriously, while b) avoiding apparently absurd interpretative outcomes, is what makes understanding its relationship to Reality, and thus developing an acceptable interpretation of the poem overall, so very difficult.

3. Interpretative Treatments

This section provides a brief overview of: (a) some of the most common and/or influential interpretative approaches to the Proem itself, as well as (b) the relationship between Reality and Opinion, and/or the poem overall. The purpose is to provide the reader with a head-start on how scholars have tended to think about these aspects of the poem, and some of the difficulties and objections these views have faced. The treatment is not meant to be at all exhaustive, nor advocate any particular view in favor of another.

a. Reception of the Proem

The only ancient response to the content of the Proem is from the Pyrrhonian Skeptic Sextus Empiricus (2nd cn. C.E.). In an attempt to demonstrate how Parmenides rejected opinions based upon sensory evidence in favor of infallible reason, Sextus set forth a detailed allegorical account in which most details described in the Proem are supposed to possess a particular metaphorical meaning relating to this epistemological preference. Sextus describes the chariot ride as a journey towards knowledge of all things, with Parmenides’ irrational desires and appetites represented as mares, and the path of the goddess upon which he travels as representative of the guidance provided by philosophical reasoning. Sextus also identifies the charioteer-maidens with Parmenides’ sense organs. However, he then strangely associates the wheels of the chariot with Parmenides’ ears/hearing, and even more strangely, the Daughters of the Sun with his eyes/sight—as if Sextus failed to recognize the numerical identity between the “charioteer-maidens” and the “Daughters of the Sun.” Similarly, he identifies Justice as “intelligence” and then erroneously seems to think that Justice is the very same goddess which Parmenides is subsequently greeted by and learns from, when the journey clearly leaves Justice behind to meet with a new goddess. In his attempt to make nearly every aspect of the story fit a particular metaphorical model, Sextus clearly overreaches all evidence and falls into obvious mistakes. Even more evidently problematic, the division of the soul into these distinct parts and accompanying metaphorical identifications is clearly anachronistic, borrowing directly from the chariot journey described in Plato’s Phaedrus. For these reasons, no modern scholar takes Sextus’ particular account seriously.

Modern allegorical treatments of the Proem have generally persisted in understanding the cosmic journey as an “allegory of enlightenment” (for a recent representative example, see Thanassas 2007). This treatment is possible no matter what one takes the geometry/geography of the chariot ride to be—whether an ascension “into the light” as a metaphor for knowledge as opposed to ignorance/darkness, or a circular journey resulting in a chthonic katabasis along Orphic lines. Though the particular details will vary from one allegorical account to the next, they tend to face objections similar to that of Sextus’ treatment. The metaphorical associations are often strained at best, if not far beyond any reasonable speculation, particularly when one attempts to find metaphorical representations in every minor detail. More theoretically problematic, determining some aspects to be allegorical while other details are not would seem to require some non-arbitrary methodology, which is not readily forthcoming. Due to ambiguity in, and variant possible readings of the text, there is room for many variants of allegorical interpretations—all equally “plausible,” as it seems none will be convincing on the evidence of the Proem alone. Recognition of this has led some to claim that while the Proem is certainly allegorical, we are so far distant from the cultural context as to have no hope of reliably accessing its metaphorical meanings (for example, Curd 1998). Finally, the allegorical accounts available tend to offer little if any substantive guidance or interpretative weight for reading the poem overall. For these reasons, allegorical treatment has become less common (for extensive criticism of Sextus’ account and allegorical treatments of the Proem in general, see Taran 1965; Palmer 2009).

With the decline of allegorical treatments, an interest in parsing the Proem in terms of possible shared historical, cultural, and mythical themes has ascended. For instance, a fair amount has been written on the parallels between the chariot’s path and Babylonian Sun-mythology, as well as how the Proem supposedly contains Orphic and/or Shamanistic themes. However, while Greek sun mythology may well have ancient Babylonian roots, the cultural origins do not seem at all relevant to Parmenides’ own cultural understanding at his time, nor that of any likely listener or reader of his work. Shamanistic influences are more suspect as influences and can be easily dismissed as a literary device designed to get the reader’s attention (Lombardo 2010). Purported Orphic parallels turn on Orphism’s revelatory journeys to the underworld, as well as initiations led by Night, and such influences are far more likely to have been relevantly parallel. Unfortunately, little is known about this mystery tradition overall, particularly at Parmenides’ time. Thus, it is overly speculative to hang very much on this purported influence with any confidence. Also, the theme of knowledge gained via chthonic journey, while consistent with Orphism, would not seem to be unique to that tradition, and the kind of “revelations” Parmenides’ youth undergoes are very different. The youth does not learn about any topics Orphism itself focuses on: moral truths, the nature of the soul itself, or what the afterlife was like. Furthermore, Parmenides’ unnamed youth learns a rational account based upon argumentation that can (and should be) tested and applied (compare C/DK 7.5-6), which is very different from the more “revelatory” nature of Orphism.

Overall, the Proem has far more commonly been minimized, dismissed as irrelevant, and/or entirely ignored by ancients and moderns alike, probably because they saw no immediately obvious philosophical content or guidance for understanding the rest of the poem within it. A select few advocate that the reader is merely supposed to recognize that Parmenides is here indicating that his insights were the product of an actual spiritual experience he underwent. However, there is no real evidence for this, and some against. The verbal forms (optative mood and imperfect tense) suggest ongoing, indefinite action—a journey that is repeated over and over, or at least repeatable—which cuts against a description of a one-off event that would be characteristic of a “spiritual awakening.” Even more problematic, the rationalistic account/argumentation of the goddess—which she demands the listener/reader to judge by reason (logos)—would thus be superfluous, if not undermined (C/DK 7.5-6). The same objection holds for attempts to dismiss the Proem as a mere nod to tradition, whereby epic poets traditionally invoke divine agents (usually, the Muses) as a source of inspiration and/or revelatory authority. It has also been common to reduce the Proem to a mere literary device, introducing nothing of relevance except the “unnamed Goddess” as the poem’s primary speaker.

While the Proem may be enigmatic, any summary dismissal which suggests that the Proem is entirely irrelevant to understanding Parmenides’ philosophical views is likely too hasty. There are very close similarities between the imagery and thematic elements in the Proem and those found throughout the rest of the poem, especially Opinion. For instance, the Proem clearly contrasts light/fire/day imagery with darkness/night, just as the two fundamental opposing principles underlying the cosmogony/cosmology in Opinion are also Fire/Light and Night. Both the Proem and the theogonical cosmology in Opinion introduce an anonymous goddess. In fact, in contrast to Reality, both sections have extensive mythological content, which scholars have regularly overlooked. The obvious pervasive female presence in the Proem (and the rest of the poem), particularly in relation to divinity, can also hardly be a coincidence, though its importance remains unclear. Once considered at greater length, the parallels between the Proem and Opinion seem far too numerous and carefully contrived to be coincidental and unimportant. This suggests a stronger relationship between the Proem and Opinion than has commonly been recognized and the need for a much more holistic interpretative approach to the poem overall, in contrast to the more compartmentalized analyses that have been so pervasive. Further scholarly consideration along these lines would likely prove quite fruitful.

b. The A-D Paradox: Select Interpretative Strategies and their Difficulties

The central issue for understanding the poem’s overall meaning seems to require reconciling the paradoxical accounts offered in Reality and Opinion. That is, how to reconcile: a) the positively endorsed metaphysical arguments of Reality, which describe some unified, unchanging, motionless, and eternal “reality,” with b) the ambiguously negative (or perhaps, ambivalent) treatment of the ensuing “cosmology” in Opinion, which incorporates the very principles Reality denies. Based upon the Greek terms for these respective sections (Aletheia and Doxa), I will refer to this as the “A-D Paradox.”

In this section, some of the more common and/or influential approaches by modern scholars to address this paradox are considered, along with general objections to each strategy. This approach provides a more universal appreciation of the A-D Paradox than taking on any selection of authors as foils, allowing the reader a broad appreciation for why various interpretative approaches to the poem have yet to yield a convincing resolution to this problem.

i. Strict Monism and Worthless Opinion

The most persistent approach to understanding the poem is to accept that for some reason—perhaps merely following where logic led him, no matter how counterintuitive the results—Parmenides has concluded that all of reality is really quite different than it appears to our senses. On this view, when Parmenides talks about “what is,” he is referring to what exists, in a universal sense (that is, all of reality), and making a cosmological conclusion on metaphysical grounds—that all that exists is truly a single, unchanging, unified whole. This conclusion is arrived at through a priori logical deduction rather than empirical or scientific evidence, and is thus certain, following necessarily from avoiding the nonsensical positing of “what is not.” Any description of the world that is inconsistent with this account defies reason, and is thus false. That mortals erroneously believe otherwise is a result of relying on their fallible senses instead of reason. Thus, the account in Opinion lacks any intrinsic value and its inclusion in the poem must be explained in some practical way. It can be explained dialectically, as an exercise in explicating opposing views (Owen 1960). It can also be explained didactically, as an example of the sort of views that are mistaken and should be rejected (Taran 1965). This strict monism has been the most common way of understanding Parmenides’ thesis, from early times into the mid-twentieth century.

This reading is certainly understandable. The text repeatedly sets forth its claims in seemingly universal and/or exhaustive contexts (for example, “It is necessary for you to learn all things…” C/DK 1.28b, “And only one story of the way remains, that it-is…,” C/DK 8.1a-2). The arguments of C/DK 8 all describe a singular subject, in a way that naturally suggests there is only one thing that can possibly exist. There is even one passage which is commonly translated and interpreted in such a way that all other existence is explicitly denied (“for nothing else either is or will be except what is…” C/DK 8.36b); however, the broader context surrounding this line undercuts this interpretation, on either selection of the variant Greek transmission. The broad range of topics in Opinion seems to be intended as an exhaustive (though mistaken) account of the world, which the abstract and singular subject of Reality stands in corrective contrast to. Perhaps the most significant driving force for understanding Parmenides’ subject in this way is Plato’s ascription to him of the thesis that “all is one” and Aristotle’s subsequent similar treatment.

While this view is pervasive and perhaps even defensible, many have found it hard to accept given its radical and absurd entailments. Not only is the external world experienced by mortal senses denied reality, the very beings who are supposed to be misled by their senses are also denied existence, including Parmenides himself! Thus, this view results in the “mad,” self-denying position of rejecting the one thing Descartes would later famously show we could never deny as thinkers—our own existence. If there is to be any didactic purpose to the poem overall—that is, the youth is to learn how to not fall into the errors of other mortals—the existence of mortals must be a given; since this view entails they do not exist, the poem’s apparent purpose is entirely undercut. Surely this blatant contradiction could not have escaped Parmenides’ notice.

It is also difficult to reconcile the apparent length and detailed specificity characteristic of the account offered in Opinion (as well as the Proem), if it is supposed to be entirely lacking in veracity. Providing such a detailed exposition of mortal views in a traditional cosmology just to dismiss it entirely, rather than continue to argue against mortal views by deductively demonstrating their principles to be incorrect, would be counterintuitive. If the purpose is didactic, the latter approach would certainly be sufficient and far more succinct. The view that Parmenides went to such lengths to provide a dialectical opposition to his central thesis seems weak: a convenient ad hoc motivation which denies any substantial purpose for Opinion, implying a lack of unity to the overall poem.

Though the strict monist view remains pervasive in introductory texts, contemporary scholars have tended to abandon it on account of these worrisome entailments. Yet, there seems to be no way to avoid these entailments if Parmenides’ subject is understood as: i) making a universal existential claim, and if ii) the account offered in Opinion is treated as inherently worthless. Thus, alternative accounts tend to challenge one or both of these assumptions.

ii. Two-World (or Aspectual) Views

If the problems of strict monism are to be avoided while maintaining the apparent universal, existential subject (that is, “all of reality”), it makes sense to seek some redemptive value for Opinion so that Parmenides neither: a) denies the existence of the world as mortals know it, nor b) provides an extensively detailed account of that world just to dismiss it as entirely worthless. The primary strategy for redeeming Opinion’s value has been to emphasize the epistemic inferiority of Opinion, while denying its complete lack of veracity. Such approaches also tend to simultaneously downplay any ontological/existential claims made in the poem.

Emphasizing the epistemic distinctions, it can be pointed out that the conclusions offered in Reality are reached through a priori, deductive reasoning—a methodology which can provide certainty of the conclusion, given the premises. The Greeks tended to associate such knowledge with divinity, and thus the conclusions in Reality can also be understood as “divine” (note that it is narratively achieved via divine assistance, the poem’s spokes-goddess). On the other hand, there is no “true trust” or reliability to mortal accounts (C 1.30), either in the traditional divine v. mortal distinction or in Parmenides’ poem. Parmenides attributes this failing to the fact that mortals rely entirely upon fallible, a posteriori sense experience. However, while mortal accounts may be fallible, as well as epistemically inferior to divine (or deductive) knowledge, such accounts may still be true. By passing along the goddess’ logos via his poem, Parmenides has shown how mortals can overcome the traditional division between divine (certain) and mortal (fallible) knowledge. If it is just that Opinion is uncertain, and not completely false, then it can have intrinsic value. The account in Opinion could thus be “likely” in the sense that it is the best account that can be offered, even though the mortal approach does not yield certainty like divine methodology does. It is for these reasons that Parmenides provides his own, purportedly superior, cosmology.

Emphasizing the epistemological differences between these sections is not altogether wrong, as the explicit epistemic contrasts between these accounts in the poem are undeniable. However, holding the sole failing of Opinion to be its lack of epistemic certainty can hardly be the entire story. The conclusions offered in Reality remain irreconcilable with the account in Opinion, and the entailment that mortals still do not really exist to learn from Parmenides’ poem if the divine account is true, persists. Furthermore, other aspects of the poem are not adequately addressed at all. How is Opinion a “deceptive” account, if its only failing is that fallible senses might mislead us (though it might also be true!) and we just cannot be certain? How do mortals err by accepting being and not-being to both be actual, and by “naming opposites”? Even if it is granted that reliance on senses can result in these errors, it seems that any lack of error on these points would once again lead back to strict monism (if “what is” remains existential and universal) and its world-denying problems.

Attempts to resolve these issues have tended to rely upon positing an ontological hierarchy to complement the epistemic hierarchy. The account revealed by the divine methodology of logical deduction in Reality reveals what the world, or at least Being, must fundamentally be like. However, the world as it appears also exists in some ontologically inferior manner. Though any account of it cannot be truly correct, since mortals actually live in this lower ontological level, learning the best account of reality at that level remains important. In short, such views trade upon a distinction between: a) an unexperienced though genuine reality, which corresponds with divine epistemic certainty (Reality), in contrast to b) a lower-level of “reality,” accounts of which are epistemically uncertain, as well as deceptive in that they tend to obscure deeper ontological truths (especially if they are taken to describe all that there is).

A number of objections can be raised to this interpretative approach. However, they tend to boil down to worries about anachronism, in particular the “Platonization” of Parmenides by Plato and his successors, even down to the Neoplatonist Simplicius. The ontological gradations posited on this view (in addition to anachronistic translations of Parmenides’ Greek along such lines) would suggest that Parmenides very closely anticipated the ontological and epistemological distinctions normally taken to be first developed in Plato’s Theory of Forms. While Parmenides certainly made some very basic yet pioneering advances in epistemic distinctions—advances which very likely in turn influenced Plato—the far more refined distinctions and conceptions required for this interpretation of Parmenides are almost certainly the result of interpreters reading Platonic distinctions back into Parmenides (as Plato himself seems to have done), rather than the distinctions genuinely being present in Parmenides’ own thought. The pervasiveness of such “two-world” interpretative accounts likely says far more about Plato’s extensive influence, as well as the importance of finding some way out of the world-denying entailments, than it does about Parmenides’ own novelty.

It is also quite difficult to offer a convincing explanation for what possible grounds Parmenides could have for ascribing superiority to his own account of the apparent world offered in Opinion, in comparison to any other mortal offering of his time. The content certainly doesn’t appear to be superior. The echoes of other accounts, such as Anaximander’s and Hesiod’s, are rather obvious and not at all novel. While his cosmological claims may contain some novel truths (the moon gets its light from the sun, etc.), these claims are still cast in a deceptive framework: the “naming error” of mortals. The defense that Parmenides’ own account is superior on the grounds that Opinion is the simplest account possible, relying upon a dualism of conflicting opposites, fails to explain how it would be superior to any similar dualistic account. Furthermore, the methodology does not appear to be superior in any way: Parmenides abandons his pioneering deduction in Reality, resorting to a traditional mythopoetic approach in Opinion.

iii. Essentialist (or Meta-Principle) Views

A promising suggestion by some recent commentators is that, rather than drawing ontological conclusions about the entirety of existence, Parmenides was instead focused on more abstract metaphysical considerations. Such approaches impute a primarily predicative (rather than existential) usage of the Greek word “to be” by Parmenides, particularly with respect to the deductive argumentation found in Reality. Such approaches result in C/DK 8.1-50 revealing the nature, or essence, of what any fundamental or genuine entity must be like.

Mourelatos was the first to advocate that Parmenides employed the Greek verb “to be” in a particular predicative sense—“the ‘is’ of speculative predication.” Mourelatos takes Parmenides to be attempting an exhaustive account of the necessary and essential properties for any fundamental ontological entity. That is, to say “X is Y” in this way is to predicate of X all the properties that necessarily belong to X, given the sort of thing X is (Mourelatos 1970, 56-67). Nehamas (1981) and Curd (1998) have both developed more recent proposals along similar lines.

A common upshot of Essentialist views is that, while it remains true that every fundamental entity that exists must be eternal, motionless, a unified whole, etc., this is consistent with the existence of a plurality of such fundamental beings. Parmenides’ view would thus not be quite so radical as it appears under the ontological, strict monist approaches. Furthermore, this view can have welcome implications for the narrative of how Parmenides was received by his immediate successors (that is, Anaxagoras, Empedocles, and the early Atomists). Rather than directly rejecting Parmenides’ strict monism in developing their pluralistic systems, they would be able to freely accept his conclusions regarding the nature of fundamental entities and move on to develop pluralistic systems that respected this nature while simultaneously explaining our perception of the world. Such a change in narrative is an improvement if, as Curd argues, one thinks the lack of any explicit argumentation against Parmenides’ strict monism by his successors is problematic, as it would entail later thinkers were guilty of “begging the question” against Parmenides.

Whatever the merits of this more limited and abstract thesis of Reality, such interpretations continue to face very similar, if not the same, problematic entailments and worries related to the value of Opinion. First, there is a substantial objection particular to such accounts. If Parmenides were truly providing an account of what any genuine being should be like, and this in turn outlines the requirements any acceptable cosmology must meet, it would be expected that Parmenides’ own cosmology (Opinion) would make use of these very principles. At the very least, one should expect some hint at how such an essentialist account of being could be consistent with mortal accounts. However, there is not even a hint of such in Opinion. Furthermore, though the arguments in Reality are now consistent with a plurality of fundamental perfect beings, there seems to be no way such entirely motionless and changeless entities could be consistent with, or productive of, the contrary phenomena found in the world of mortal experience. Thus, it remains difficult to see how Opinion could be true in any way, and the existence of mortals and Parmenides is still under threat, along with the implications that follow. The purpose of the poem is frustrated if mortals and Parmenides cannot exist. If Opinion is still entirely worthless, then the objections concerning its length and specificity also remain. Any attempts to introduce a “two-world” distinction still face charges of anachronism, and attempts to explain Opinion away as Parmenides’ own “best account” of the world (even though it is false) continue to be lacking in justification.

iv. Modal Views

While the presence of modal language in Parmenides’ works has long been recognized, this fact has largely escaped scholarly attention other than to evaluate whether any fallacies have been committed (compare Lewis 2009; Owen 1960, 94 fn. 2). Only recently has its presence been taken seriously enough to warrant a full-fledged interpretative account that addresses the relationship between Reality and Opinion (Palmer 2009). This approach is quite similar in some ways to the Essentialist approach. The account in Reality is still intended to provide a thorough analysis of the essential properties of some kind of being. However, the kind of being is more narrowly prescribed. Rather than an account of what any fundamental entity must be like, Parmenides is taken to explicate in Reality what any necessary being must necessarily be like, qua necessary being.

The inspiration for this approach is found in C 3/DK 2, where Parmenides introduces the initial two paths of inquiry. The first is the way “that is, and that is not to not be.” Something that is “not to not be” is equivalent to “must be.” Therefore, the first path concerns that which necessarily exists, or necessary being. The second path is clearly the contrary—“that it is not and must not be,” or necessary non-being. Though Parmenides does not use this exact formulation later in the poem, on the reasonable hypothesis that this construction is awkward (even in prose, let alone poetry), it is posited that “what is” and “what is not” are to be taken as shorthand for referring to these modes of being.

Adopting this understanding provides new and compelling perspectives on a number of issues in Reality. At the end of C 3/DK 2, the path that follows “what is not” is dismissed as one that can neither be apprehended nor spoken of. Rather than importing the likely anachronistic parallels to modern philosophy of language, particularly Russellian concerns with negative existential statements, the difficulty can be taken to be the impossibility of conceiving of necessarily non-existent things (for example, square-circles), which is a far more likely problem to have been recognized given the historical context. It is also readily understood why knowledge along these lines is entirely trustworthy, as any necessary entity must have certain essential properties given the sort of thing it is and its mode of existence.

This view also offers a very different perspective on the third way of inquiry introduced in C 5/DK 6. This is the “mixed” path of mortals, who, knowing nothing and depending entirely upon their senses, erroneously think “to be and not to be are the same and not the same.” If Parmenides’ central thesis is to explicate the essential characteristics of necessary being (and reject necessary non-being as that which cannot be conceived at all), it is fitting for him to recognize that there are other beings as well: contingent beings. Mortals who have not used reason to conceive of what necessary being must be like are stuck only contemplating and believing in contingent beings, which can and often do change their aspects and existential status (or are at least perceived to do so in certain contexts), leading to a “wandering” understanding that lacks the unchanging knowledge inherent in understanding necessary being qua necessary being.

Perhaps most compelling, the properties deduced in C/DK 8 as necessarily characteristic of “what is” make far more sense on the modal approach. Clearly, the provisions against coming-to-be and perishing are far more intuitive on this model than they are on models which simply disallow “what is not” to be part of its conception. More telling, while it is still certainly possible to justify some of these properties on the grounds that thinking “what is not” is not allowed in the conception, others are far more problematic. For instance, “what is” is argued to be “limited” in spatial extent and uniform throughout (C/DK 8.42-49). Thinking of something as motionless and limited in spatial extent, and uniform throughout itself, seems to require thinking of it as “not being” in other places than it actually is and of its own properties not existing beyond; that is, thinking “not being” in both instances. On the other hand, if Parmenides’ thesis is to explicate what a necessary being must be like on account of its modality alone, it is perfectly acceptable to think of a (spatially extended, material) necessary being as a discrete entity, which must possess its modal nature uniformly throughout. In short, were we even now to construct a list of properties essential for any necessary (spatially extended) being, such a list would closely match Parmenides’ own.

Though the modal view seems compelling in many ways with respect to Reality, the same might be said of other views considered above. The real question is whether it can resolve the “A-D Paradox,” while providing a compelling and meaningful answer for the inclusion of Opinion. Since Reality explicates the nature of necessary being, and this is a very different sort of thing from the contingent beings described in Opinion, the tension between these accounts has already been largely eliminated. Thus, the modal view generally succeeds in resolving the “A-D Paradox,” as it restricts Reality to such a narrow scope that Opinion can no longer be about the same entities. However, there can still be better and worse explanations for what Parmenides’ intention is with Opinion, and this still involves resolving whether its status is ultimately positive or negative, and in what sense. While Palmer has offered a very insightful and important contribution to Parmenidean studies, it is not beyond reproach or objection.

Palmer’s own view on Opinion is quite positive. Palmer takes the error of mortals to be thinking that contingent beings are all there is in the world, by relying solely upon their senses. It is not that the objects in Opinion do not exist, it is that they do not share the same unwavering epistemic account as necessary being does, as the contingent objects and phenomena found in Opinion are in a certain way, and then they are not—as they change, move, come to be, perish, and so forth. Thus, mortal knowledge remains “wandering,” while the (divine) knowledge of necessary being that Parmenides imparts is certain and unchanging. Nevertheless, the contingent world does exist, so there is value in knowing what one can about it. Thus, Palmer avers that Opinion is Parmenides’ own best attempt to explain the world of contingent being, which does not admit knowledge via the deductive methodology used in Reality. In this way, Palmer has succeeded in developing an interpretation that requires only an epistemic hierarchy between Reality and Opinion, without the additional ontological hierarchy of Two-World views and the anachronistic worries that accompany them.

While the modal view does allow the existence of contingent beings, and thus an account of them would be valuable in itself, it does not necessarily follow that this is what Parmenides was attempting in Opinion. Such a positive treatment still seems to be in tension with the overarching negative treatment of Opinion throughout the poem. Furthermore, Palmer’s attempts to portray passages about Opinion in a more positive light are far less compelling than his modal treatment of Reality.

One of the more problematic attempts to cast Opinion in a more positive light is Palmer’s corrective emendation of the missing verb at the end of line C 5.3/DK 6.3, changing the goddess’ words from warning the youth away from the “wandering path” of mortals (εἴργω) to a programmatic promise (ἄρχω) to begin an explication of that path later on (Palmer 2009, 65-67). The problem with this emendation is that it is a common rule in Greek for the active verb ἄρχω to mean “rule”; the verb normally only carries the meaning of “begin” in its middle form (ἄρχομαι). This objection is not decisive, however, as Palmer’s overall view does not require this emendation.

What is fundamentally damning is Palmer’s view that Opinion is Parmenides’ best account of contingent being. First, Palmer faces the challenges noted above of explaining why Parmenides would be entitled to think his own mythopoetic account in Opinion would be superior to any other mortal account. Palmer is likely entitled to the view that the cosmology is in some sense “Parmenides’ own,” in the sense that it is his own construction and not borrowed from someone else, on the grounds that it contains novel cosmological truths (moon gets light from sun). The account could even be “superior” in that it contains novel cosmological truths that past accounts failed to include. However, this would require that Parmenides really think there could be no further discoveries that would then surpass his own knowledge. More importantly, there are no grounds to support that his theogonical content is supposedly superior to Hesiod’s.

Second, and most importantly, Palmer’s positive account of Opinion fails to explain how mortals could possibly be mistaken about the subject of Reality, as the text clearly requires.

to it all things have been given as names                                             8.38b

all that mortals have established in their conviction that they are genuine,

both coming to be and perishing, both being and not

and altering place and exchanging brilliant colour.                              8.41

“What is” is the subject of 8.38b-41, which is uncontroversially the entity described in Aletheia, and thus necessary being on Palmer’s view. It is to that entity mortals have “given as names” all the attributions listed: coming to be, perishing, and so forth. That mortals “in their conviction” believe these names to pick out something “genuine” with respect to the subject of Aletheia clearly implies that mortals are in error to do so—that these phenomena do not correctly specify the nature of Aletheia’s subject. This is the central claim about mortal errors by the goddess, and it is undeniable that, in order for mortals to incorrectly specify the nature of Aletheia’s subject, mortals must have some conception of and familiarity with that relevant object (or type of entity). Palmer recognizes this himself, asking “How can mortals describe or misconceive What Is [necessary being] when they in fact have no grasp of it?” (167). The answer is, of course, that they cannot. The naming error of mortals requires an account of how mortals get the nature of Reality’s subject wrong.

According to Palmer, Parmenides’ task is to explicate the essential nature of necessary being, qua necessary being. Since mortals have only ever relied upon their sense perceptions rather than deductive logic, they have never conceived of the essential nature of any necessary entity. Thus, their failure is to have believed that all of reality consisted entirely of contingent beings. However, if mortals have never conceived of necessary being, then they certainly could not ever have been wrong about it, and incorrectly predicated motion, change, coming-to-be, perishing, and so forth of it. Palmer even realizes this tension and attempts to explain it away as follows:

Apparently because mortals are represented by the goddess as searching, along their own way of inquiry, for trustworthy thought and understanding, but they mistakenly suppose that this can have as its object something that comes to be and perishes, is and is not (what is), and so on. Again, the goddess represents mortals fixing their attention on entities that fall short of the mode of being she has indicated is required of a proper object of thought. (172)

However, this is no solution. Erroneously thinking that contingent beings can provide “trustworthy thought and understanding” may indeed be an error of mortals. Yet, this is certainly not the same error as mortals thinking that which is explicated in Aletheia can be properly described in ways contrary to its nature (that is, coming to be, perishing, and so forth), which is precisely the error the goddess insists they commit. Palmer’s view of Opinion simply cannot satisfy this textual requirement.

However, Palmer’s modal view of Reality can be readily modified to be consistent with a more negative treatment of Opinion. In fact, a more negative treatment of Opinion seems necessary in order to avoid this fatal flaw. A ready solution is available, which Palmer himself considered at some length, but ultimately rejected—identifying “what is” not only as necessary being, but divine being. This allows for mortals to have a familiar subject (divine being) which they have up until now misunderstood through the mythopoetic tradition, failing to recognize that such would have to be a necessary being, and as such could not be born, die, move, change, or even be anthropomorphic.

In explicating the essential nature of the divine qua necessary being in Reality, Parmenides can be understood as continuing the Xenophanean agenda of criticizing traditional, mythopoetic views of the divine, though he uses metaphysical and deductive argumentation, rather than the ethical appeals of his predecessor. This should not be surprising, given Parmenides’ historical context. Incorporating naturalistic elements or principles that are supposed to be divine, in contrast to anthropomorphic conceptions from the mythopoetical tradition, was otherwise pervasive amongst the Presocratics. The Milesians tended to treat their fundamental and eternal arche as divine entities. Pythagoras, perhaps more of a religious mystic in the first place, certainly included his own views on divinity. Identification of divine entities certainly does not end after Parmenides either, as the systems of the Pluralists and Atomists continue to associate their fundamental “parmenidean” entities with divinity. Most importantly, of course, Xenophanes’ conception of his supreme (and perhaps only) deity very closely parallels Parmenides’ description of “what is,” and recognizing “what is” as a necessary being would only seem to advance this metaphysical treatment of divinity even further. This should not be at all surprising given the extensive evidence for Xenophanes’ role as a strong influence, or even personal teacher, of Parmenides (compare 4.a.iii below). Thus, even if Parmenides never (at least, not in the extant fragments) refers to “what is” as a god/divine thing, that he was thinking along those lines and paralleling the properties others had ascribed to their conception of deity is hard to deny, and readily makes the modal view at least tenable, and perhaps compelling.

4. Parmenides’ Place in the Historical Narrative

a. Influential Predecessors?

No more certain than the rest of his general biography is Parmenides’ intellectual background: questions arise regarding whether he was a pupil of, or at least heavily influenced by, some particular thinker(s). If so, the question remains whether he sought to further refine or challenge such views, or perhaps both. This section broadly analyzes the evidence for ascribing particular intellectual influences and teachers to Parmenides.

i. Anaximander/Milesians

Theophrastus alone asserted that Parmenides was a “pupil” of Anaximander (Coxon Test. 41, 41a). However, this is historically impossible—even with the earliest birthdate, Anaximander was long dead before Parmenides ever engaged in philosophical contemplation. Thus, Parmenides could never have been personally instructed by Anaximander. At most, one could argue that Anaximander’s views influenced and/or provided a particular target for Parmenides to reject, as some modern scholars have suggested.

It is quite likely Parmenides would have been familiar with Anaximander’s works. At least, there is no good reason to doubt this. Also, there are certainly parallel conceptions and opposing contrasts that can be drawn between these two thinkers. Both can be read as understanding the cosmos to be operating in terms of some “necessity,” in accordance with “justice” (compare Miller 2006). Anaximander’s dualistic opposites consist of “hot” and “cold” interacting with each other in generative fashion. Similarly, Parmenides’ dualistic “cosmology” names “Light/Fire” and “Night” as the primordial opposites that are found in all other things, and Aristotle and Theophrastus both explicitly associate these Parmenidean opposites with “hot and cold” (Coxon Test. 21, 25, 26, 35, 45). Both Parmenides and Anaximander describe cosmological light as rings, or circles, of fire. They both think that there are deathless and eternal things (Parmenides’ “what is,” and Anaximander’s divine arche, the apeiron). Both can be understood as drawing upon a rudimentary “principle of sufficient reason,” concluding that if there is no sufficient reason for something to move in one direction or manner versus another, then it must necessarily be at rest (Parmenides’ “what is,” and Anaximander’s description of Earth). Perhaps most importantly, Anaximander suggests opposites arise from an “indefinite” or “boundless” (apeiron) eternal substance. Parmenides’ metaphysical deductions can be understood as a direct denial of this, either because nothing could ever arise from something so indefinite in qualities as the apeiron (as such a thing would essentially be nothing; that is, Parmenides denies creatio ex nihilo), and/or because Anaximander’s apeiron is not a proper unity (one distinct thing) if opposites with entirely different and distinct properties (hot and cold) can arise from it (Curd 1998, 77-79; Palmer 2009, 12).

However, closer inspection of these supposed parallels tends to undermine the thesis that Anaximander is particularly influential upon, or a specific target for, Parmenides. First, many of the particulars of Anaximander’s views are noticeably absent from, or distinct from, the supposed parallels in Parmenides. While Anaximander’s “necessity” is probably best understood as physical laws, Parmenides’ conception appears to rely on logical consistency. Whereas Anaximander envisions “justice” as a regulatory, ontological compensation arising from the competition for existential pervasiveness amongst opposites, this conception of natural balance via justice is entirely absent in Parmenides’ works. The Aristotelian identification of Parmenides’ Light/Night dualism with Anaximander’s Hot/Cold opposition is also highly suspect. No extant fragments of Parmenides make this connection. Furthermore, the Peripatetics mistakenly refer to Parmenides’ two primary principles in Opinion as “fire and earth” instead of “fire (or light) and night.” Anaximander’s description of the cosmic fire rings as “tubes with vent-holes” is lacking in Parmenides’ “rings of fire.” In short, none of these supposed parallels clearly identifies Anaximander as an influence/target, and they can all be understood as rather common conceptions of cosmology and physics in philosophically-oriented Greek minds during Parmenides’ time (Cordero 2004, 20; Curd 1998, 116-126). The same can be said for the apparently shared views on the existence of divine/eternal beings, appeals to some “principle of sufficient reason,” as well as the denial of creatio ex nihilo, and the impossibility of distinct pluralities arising from a properly indistinct unity.

If Parmenides is not directly targeting Anaximander in particular, it is possible that he could be understood as responding to Milesian physics and cosmology in general, but probably not. On the one hand, Parmenides seems to be engaged in a very different sort of endeavor. Whereas the Milesians sought to explain cosmology and physics by identifying the arche (“origin” or “first principle”) from which all things originated (and possibly, remained constituted by), the only section of Parmenides’ poem that could provide an alternative or competing cosmological account (Opinion) is supposed to be fundamentally and deeply flawed, and offered for rejection on some grounds. The grounds upon which this cosmology is flawed is the point of Parmenides’ overall project, which seems far broader than denying Milesian views in particular. Though Parmenides may very well be challenging fundamental assumptions made by the Milesians—that things exist; that our senses provide knowledge of existing things; that there must be a primordial, foundational element(s)—these assumptions are hardly unique to them (Cordero 2004, 20). Thus, while the Milesians should most likely be listed amongst “the mortals” whose opinions about the world Parmenides thinks are fundamentally mistaken, and while Milesian views may even be paradigmatic examples of such mistaken views, Parmenides’ criticism would seem to include the common “man on the street” as well. Parmenides’ thesis is broader, his focus more metaphysical and logically-driven, than can be explained by ascribing the more historically-based motivation of challenging the Milesians (compare Owen 1960; Palmer 2009, 8-29). He is challenging everyone’s understanding.

ii. Aminias/Pythagoreanism

Ancient sources provide very limited support for imputing a significant Pythagorean influence upon Parmenides. Sotion alone attests that Parmenides, though admittedly a student of Xenophanes, did not follow him (presumably, in his way of thinking)—but was instead urged to take up the Pythagorean “life of stillness” by Aminias (Coxon Test. 96). Though Sotion goes on to describe how Parmenides built a hero shrine to poor Aminias upon his death, nothing else is known of Aminias himself. It is also reported by Nicomachus of Gerasa that both Parmenides and Zeno attended the Pythagorean school (Coxon Test. 121). However, even if this were true, it does not necessarily follow that either adopted or sought to challenge Pythagorean views in their later thought. Finally, Iamblichus mentions Parmenides amongst a list of “known Pythagoreans,” though no defense of, nor basis for, this attribution is provided (Coxon Test. 154).

Similarly, only a relatively small minority of contemporary scholars has been committed to defending a general Pythagorean influence upon Parmenides, and even fewer are willing to grant credence to the claim that Aminias was his teacher. The vast majority of contemporary Parmenidean scholars reject the Pythagorean influence entirely, or at least hold it to not be directly or substantially significant on informing Parmenides’ own mature views. When contemporary commentators have attempted to demonstrate the presence of Pythagorean elements within Parmenides’ text itself, the attempts have been quite strained, at best—particularly given the general lack of good information about early Pythagoreanism itself. Most telling against this purported influence is the fact that even amongst modern scholars who agree that Parmenides does demonstrate Pythagorean influences, the details and purported parallels differ entirely from one commentator to the next. As a result, even those who agree that there is a Pythagorean influence cannot agree at all on what exactly that influence consists of, or what counts as evidence for it.

It would seem that the real reason for the persistence of this association is far more dependent upon geographical considerations than is often let on. In fact, this is probably the best argument for thinking any Pythagorean influence upon Parmenides is likely in the first place, as the primary Pythagorean school was founded in Croton, just over 200 miles SSE from Elea. While geographical considerations make it virtually certain that Parmenides was aware of the Pythagorean school, and even had interactions with Pythagoreans, there is simply no compelling evidence for any significant influence by this tradition in his mature work.

iii. Xenophanes

The view that Parmenides was either a direct disciple of Xenophanes, or at least heavily influenced by him in developing his own views, is pervasive amongst ancient sources. Plato is the first, claiming that there is an “Eleatic tribe,” which commonly held that “all things are one,” and that this view was first advanced by Xenophanes, and even by thinkers before him! While this offhand remark by Plato may not be intended to be taken seriously in pushing Eleaticism back beyond Xenophanes, the idea that there is some real sense in which the philosophical views of these two are closely related is suggestive. Both Aristotle and his student Theophrastus explicitly claim that Parmenides was a direct personal student of Xenophanes. This is further attested by several later doxographers: Aetius (2nd-1st cn. B.C.E.) and “Pseudo-Plutarch” (1st cn.? B.C.E.). Numerous other ancient sources from variant traditions and spanning nearly a millennium could be listed here, all of which attest to a strong intellectual relationship between these two thinkers, on several different interpretative bases. While some are skeptical of this relationship (for example, Cordero 2004), most modern scholars are willing to grant some degree of influence between these thinkers, and the overall evidence is perhaps suggestive of a far deeper relationship than is normally admitted.

It is often superficially recognized that both Xenophanes and Parmenides wrote in verse rather than prose. It is also common to point to stark differences on this point. First, it is commonly claimed that Xenophanes was a philosophically-oriented poet, in contrast to Parmenides—a “genuine philosopher” who simply used poetry as a vehicle for communicating his thoughts. This seems to be based primarily upon the fact that Xenophanes also wrote silloi (satirical poetry), which do not always have obvious or exclusively philosophical themes. Also, even when both do make use of the epic dactylic-hexameter meter, there is a difference in vocabulary and syntax; Parmenides extensively and deliberately imitates language, phrasing, and imagery from Homer and Hesiod, while Xenophanes does not.

On the other hand, it is unfair to dismiss Xenophanes’ philosophical focus based upon his elegiac poems, which often do contain philosophical elements—that is, criticism of traditional religious views, the value of philosophy for the state, and how to live correctly. Furthermore, aside from these silloi, the majority of the extant fragments appear to be part of one major extended work by Xenophanes, all of which are in the epic style. More importantly, the content and general structure of this work bear substantial similarities to Parmenides’ own poem. Xenophanes begins by explicitly challenging the teachings of Homer and Hesiod in particular and of mortals in general regarding their understanding of the gods. Parmenides’ Proem clearly opens with a journey grounded in Homeric/Hesiodic mythology, and one of the main things for the youth is to learn why the opinions of mortals (including, or perhaps even especially, Homer and Hesiod) are misguided. Xenophanes draws a distinction between divine and mortal knowledge which mortals cannot overcome; Parmenides’ poem also seems to acknowledge this distinction, though he may very well be suggesting this divide can be overcome through logical inquiry, in contrast to Xenophanes. Xenophanes claims that the misunderstanding of the gods is the result of mortals relying upon their own subjective perceptions and imputing similar qualities to divine nature. Similarly, Parmenides’ spokes-goddess ascribes the source of mortal error to reliance on sense-perception, in contrast to logical inquiry, and may also have a divine subject in mind.

Having identified his intellectual targets, Xenophanes seems to move from criticism of others to providing a positively-endorsed, corrective account of divine nature. There is one (supreme or only?) god, which is not anthropomorphic in form or thought. Though this being does have some sort of sensory perception (hearing and seeing) and thinking abilities, it is different from how mortals experience these states—if in no other way than that this supreme god sees, hears, and knows all things. This being is unchanging and motionless, though it affects things with its mind. It cannot be denied that the description of Xenophanes’ (supreme/only) god bears many of the same qualities as Parmenides’ “what is”—the only question is whether Parmenides was directly influenced in this matter by Xenophanes’ views.

There are also extant fragments from Xenophanes which seem to provide an extended cosmology and physics. It is possible these constituted the end of Xenophanes’ major epic work. If so, there would again be at least a superficial similarity in structure between his poem and Parmenides’ own. More substantially, there may be parallel passages that could suggest the cosmology/physics on offer in each is not to be trusted. Consider Xenophanes’ injunction to believe things he has described as “resembling the truth” (Xenophanes B35). It is unlikely that he would be undercutting his positively-endorsed account of his “one god” in such a way, thus this likely refers to his physics/cosmology. When Parmenides’ spokes-goddess tells us why she is providing her “deceptive” account of mortal opinions (which in-itself implies it should not be taken as correct or real), she uses the exact same Greek adjective (ἐοικότα) to describe this mortal account as one which is “entirely fitting” for the youth to learn (C/DK 8.60). The context here seems to be that by learning the particular account offered in Opinion, which shares the mistakes any mortal account might possess, and/or which makes the failure of mortal accounts most evident, the deceptive account on offer is worth learning so as to best know how to avoid the mistakes other mortals make.

Finally, if geographical proximity is grounds for imputing a likely intellectual influence, then the case for a Xenophanean influence on Parmenides is just as strong as, if not stronger than, the Pythagorean association considered above. Though Xenophanes was originally a native of Colophon, an Ionian city which lies on the opposite side of the ancient Greek world from Elea, evidence suggests he spent significant time in or at least near Elea. Xenophanes describes himself as having spent sixty-seven years traveling and sharing his teachings after leaving Colophon at the age of twenty-five. Xenophanes’ writings clearly demonstrate familiarity with Pythagoras himself, and thus imply familiarity with his school in southern Italy. Diogenes explicitly reports that Xenophanes lived at two locations in Sicily (near Elea) and that Xenophanes even wrote a poem on the founding of Elea, as well as one on his native Colophon. What would make these two cities worthy of odes, and no others? Likely, both were important to Xenophanes in the same respect: he identified with both as “home.” Finally, Aristotle explicitly reports that the citizens of Elea sought Xenophanes’ guidance about religious matters on at least one occasion (Rhetoric II.23 1400b6). On these grounds, in addition to the ancient reports that Parmenides was Xenophanes’ student and the parallels found in their major works, it is very likely Xenophanes lived near or in Elea during his philosophical maturity, likely at just the right time to influence Parmenides in his own philosophical development.

iv. Heraclitus

No ancient source attests that Heraclitus influenced Parmenides. In fact, the only ancient source to suggest any relationship between the thinkers is Plato, who would have Parmenides influencing Heraclitus instead. Plato’s claim is almost universally rejected today, especially since Heraclitus does not hesitate to criticize other thinkers, and he never mentions Parmenides. However, it was quite common throughout much of the twentieth century for modern scholars to argue that Parmenides was directly challenging Heraclitus’ views, and introductory textbooks continue to regularly draw interpretative parallels between them. The interpretative comparison generally relies upon the highly questionable ascription of a motionless, strict monism to Parmenides, in contrast with understanding Heraclitus as the “philosopher of flux,” who advocated a pluralistic universe constantly undergoing motion and change. I leave consideration of the interpretative adequacy of these views aside here. While ahistorical interpretative comparisons can certainly be worthwhile exercises in themselves, the question here is whether there are actually any good historical grounds to think Parmenides was directly challenging Heraclitus. The thesis that Heraclitus influenced Parmenides faces serious chronological challenges. All evidence suggests Heraclitus wrote his major work near the end of his sixty-year lifetime. Evidence also suggests Parmenides could not have written much after Heraclitus’ own death. This leaves little time in between, if any, for Parmenides to become aware of or be inspired to challenge Heracliteanism.

That Heraclitus wrote late in his lifetime is evident from his explicit criticism of other thinkers. In one passage, Heraclitus criticizes the Ephesians for exiling his friend Hermodorus, which would have occurred at the very end of the sixth century (B121). In another passage, he denigrates Hesiod, Pythagoras, Xenophanes, and Hecataeus as failing to understand anything, despite their studiousness (B40). This list of names is in chronological order, and Heraclitus’ use of the past tense may be taken as indicating they are all dead by the time of his writing. If so, as Hecataeus’ lifetime is estimated as c. 550-485 B.C.E., Heraclitus would have to have completed his work in the last few years of his life. Admittedly, Heraclitus’ use of the past tense here is not decisive, as it certainly does not require all those named be dead. It only requires that the named persons have, on Heraclitus’ view, demonstrated their lack of understanding in the past (likely through written works). Nevertheless, this passage still supports a late composition date. Though Pythagoras established his school in Croton early in Heraclitus’ life (c. 530 B.C.E.), it would seem to require a significant amount of time for the arcane teachings of Pythagoreanism to have made their way to Ephesus. In addition, since Pythagoras himself did not write anything, any written works in the Pythagorean tradition that were disseminated must have been written by his followers, again probably after Pythagoras’ own death (post-500 B.C.E.). Finally, even if Hecataeus was still alive at the time Heraclitus wrote, Hecataeus almost certainly wrote his own works late in his lifetime, after his travels (that is, post-500 B.C.E.).

Given this evidence, it is reasonable to estimate the earliest composition of Heraclitus’ book at c. 490 B.C.E., plus or minus five years. If Plato’s claims are accepted that Parmenides was the personal teacher of a young Zeno (490-430 B.C.E.), and that Zeno wrote his own book in defense of his master’s while very young, then Parmenides must have written prior to 470 B.C.E. This estimate is reasonable even if the details of Plato’s Parmenides are not reliable and one accepts Diogenes’ account instead, as Parmenides would be seventy by this point. This leaves a rather short window, less than twenty years, for Heraclitus’ views to spread across the Greek world to Elea and inspire Parmenides. Though not impossible, this is unlikely.

Furthermore, when further biographical details are considered, the window arguably vanishes completely. Diogenes reports that upon completion of his book, Heraclitus deposited it (apparently the only copy) in the Temple of Artemis (the Artemisium). Access to the work in the temple’s storeroom would almost certainly have been limited and available only to particularly privileged persons. Tradition further holds that Heraclitus himself did not have any students, but that a following eventually arose amongst those who studied his book and named themselves Heracliteans. Under these circumstances, in conjunction with Heraclitus’ deliberate obscurity, the time required to study, discuss, teach, and disseminate Heraclitus’ views into the rest of the Greek world would be substantial: not years, but decades. This inference seems supported by the lack of any records of Heracliteans in the early fifth century. In fact, the very earliest evidence of a Heraclitean outside of Ephesus is of Cratylus, a thinker to whom Plato dedicated an eponymous dialogue and whom Aristotle reports as the first teacher whose views Plato adopted (Metaphysics i.6, 987a29-35). If Cratylus was the first to spread Heracliteanism beyond Ionia, and if he indeed taught Plato before Socrates did (pre-410 B.C.E.), Heracliteanism would only have first arrived in Athens (let alone Elea) sometime after 450 B.C.E. This is far too late to influence Parmenides’ work (pre-470), and almost certainly after his death.

All this may still be objected to, and the unlikely possibility apparently confirmed, on the grounds of supposedly clear textual evidence. Parmenides is commonly thought to have made a clear allusion to Heraclitus, describing mortals with no understanding as simultaneously accepting that “things both are and are not, are the same and not the same” (C5/DK6). This can be understood as referring directly to Heraclitus’ paradoxical aphorisms, which describe things like rivers and roads as being both simultaneously the same, but yet not (B39; B60). Proponents may even go on to point out that Parmenides describes those who hold such mistaken beliefs as being on a “backward-turning (παλίντροπός) journey,” which is the same adjective Heraclitus uses to describe the unity of opposites that mortals fail to appreciate. While this may initially seem compelling, closer examination of these textual claims reveals their inadequacy. Furthermore, a broader examination of the texts reveals that the apparent attractiveness largely depends upon selective cherry-picking.

First, it is not even clear if Heraclitus wrote παλίντροπός; some sources report παλίντονος (“backward-stretching”) instead, and there is no scholarly consensus on which is correct. Even if both did write παλίντροπός, imputing an intellectual influence from this is rather weak. It could simply be a coincidental usage of a relatively common term or idiom. In any case, the apparently similar philosophical usage, in relation to mortal cognitive failures, holds only on the surface. It is not that mortals themselves are, in their cognitive failures, “backwards-turning,” in either case. Instead, Parmenides is using it metaphorically to describe a way of inquiring that leads to contradiction. He is using this image to describe a way in which mortals should not think about things. On the other hand, while Heraclitus’ use is also metaphorical, he is advocating for his view of how opposites should be thought of. Here, nature itself is “backwards-turning,” and the failure by mortals is the failure to recognize this fortunate confluence of opposites, which can result in the complementary unity and harmony found in, for example, the bent sapling and taut string that form a bow. Parmenides is not here denying the more limited claim by Heraclitus that opposites can work together to produce some new harmony. Rather, he seems to be claiming that even thinking of opposites requires thinking in terms of a more fundamental distinction, that between “what is” and “what is not,” and this inevitably leads to contradiction. While Parmenides’ claims would certainly refute Heraclitus, his view is aimed at an error that is found in all prior philosophical systems, and even the most common mortal beliefs. Thus, to think of this passage, and thereby Parmenides’ overall poem, as intentionally designed to directly challenge Heraclitus in particular is to risk missing Parmenides’ larger project for what it is.

Perhaps more telling, the apparent direct challenges and differences between these thinkers are belied by the similarities. Heraclitus describes the divine Logos as eternal and unchanging, much as Parmenides describes “what is.” Properly understanding the Logos is supposed to lead to the conclusion that “all is one,” and Parmenides has often been thought to be advocating similar monistic conclusions regarding “what is.” Similarly, both can be read as advocating that there is no distinction between night and day: the two are one, and both are also divine. Both are critical of common mortal views, and both seem to acknowledge a distinction between mortal and divine knowledge.

In the end, these similarities should no more be taken as indicative of direct influence than the apparent critical differences—the chronology makes both problematic. How then might the apparent interaction/influence between these thinkers be explained, when they were almost certainly writing in isolation and ignorance of each other on opposite sides of the Greek world? The better explanation here is to seek a common influence which would explain the similarities in doctrine and critical themes and which would have been widely spread by the end of the sixth century. The most obvious common influence in this context is Xenophanes.

b. Parmenides’ Influence on Select Successors

As this article has set out to demonstrate, understanding the meaning Parmenides intended in his poem is quite difficult, if not impossible. Given that any historical narrative of Parmenides’ legacy is directly determined by how the poem is supposed to be understood, there are almost as many plausible accounts of the former as the latter. Such considerations are further complicated by the tendency of ancient philosophical authors to use prior views to serve their own interests and purposes, with relatively little regard for historical accuracy. Even if ancient authors did conscientiously attempt to portray earlier thinkers faithfully, there is no guarantee they properly understood the original author’s intended meaning, and this is particularly the case given Parmenides’ enigmatic style. Thus, the question “how did Parmenides’ own views influence later thinkers” may in many cases be a mistake, as the more relevant question could be “how did each of Parmenides’ successors understand Parmenides and/or choose to make use of his work in their own.” Given these considerations, it is especially difficult to speak about Parmenides’ influence upon his successors in any detail without first adopting a particular interpretative outlook. However, there are some general observations that can be advanced which are, at least, highly suggestive. Some of these are briefly sketched out below.

i. Eleatics: Zeno and Melissus of Samos

Though it is highly questionable as to whether Parmenides himself argued for strict monism, the views found in the writings of his immediate followers can be taken as advocating such. It is important to realize that this does not show Parmenides was in agreement on this point—it could be that this is the way they developed his thoughts and logical method further. Whatever Parmenides himself held, however, it is clear that his writings did lead some to adopt this view.

Tradition holds that Zeno of Elea was a student of Parmenides from a young age. He is famous for writing a short book of “paradoxes,” which are designed to demonstrate the absurdity of positing a plurality of beings, as well as the associated conceptions of change and motion. Zeno certainly adopts and improves upon Parmenides’ deductive, reductio ad absurdum argumentative style in his prose. However, whether the denial of pluralism was Zeno’s own addition to his teacher’s views, or if he is truly and faithfully defending Parmenides’ own account, as Plato represents him to be (Parmenides 128c-d), is not clear.

It is uncontroversial that there was at least one Greek thinker who did adopt and defend the radical metaphysical hypothesis of strict monism: Melissus of Samos. Melissus clearly (largely because he wrote in prose) adopts Parmenides’ own language and argumentative styles, especially from C/DK 8, and expands upon them. Most qualities of Parmenides’ “what is” remain the same in Melissus’ account: it is eternal, motionless, a uniform and complete entity, and so forth. However, Melissus describes his sole being as unlimited in extension, rather than limited. He further adds that “what is” cannot undergo psychological changes (such as pain, distress, or health) and explicitly denies the existence of void. He also expressly denies the existence of the things mortals believe in, yet fails to realize the entailment that mortals, including himself, thus also would not exist.

ii. The Pluralists and the Atomists

The Pluralists include Anaxagoras and Empedocles. Each posited very different cosmological accounts, based upon very different fundamental entities and processes. Anaxagoras posits an extensive number of fundamental and eternal seeds, every kind of which is found in even the smallest portion of matter, and which give rise to objects of perception according to whichever kind of seed dominates the mixture at a particular spatio-temporal location, in accordance with the will of Nous.

Empedocles (who may have been a student of Parmenides) makes the four fundamental Greek “elements” (earth, air, fire, water), as well as two basic forces (Love and Strife), his fundamental entities. The “elements,” in conjunction with his basic forces, are continuously mixed together to become one and subsequently separated entirely in the eternal cyclical nature of the cosmos.

Despite the radical surface differences, the Pluralists share some basic ideas. Both hold that the world as it appears to mortals is an outcome of the mixture and separation of fundamental entities, which satisfy the description of “what is” set forth by Parmenides in Reality, except for the motion they undergo. This is almost certainly no accident, and generally indicative of Parmenides’ influence on Greek thought overall. However, it is again not clear whether the Pluralists are best understood as agreeing with, or rejecting, Parmenides’ own views. Even if passages ascribed to these thinkers are seen to be rejecting Eleaticism, the rejection may need to be taken as directed against Melissus, not Parmenides himself.

The situation is very similar with respect to the early Atomists, Leucippus and Democritus. The Atomists posit two fundamental entities, one that corresponds to “what is” (atoms), and one that corresponds to “what is not” (void). The void is simply the absence of “what is,” and is necessary for motion. The Atomists do provide arguments for the existence of void, which can seem to be a direct challenge to Parmenides’ claim that “what is not” necessarily cannot be. However, the rejection is likely more indirect, as only Melissus explicitly argued against the existence of void. (Since Leucippus and Melissus were contemporaries, Melissus could instead be responding to Leucippus, though this is unlikely.) The atoms are infinite in number and kind, indivisible, uncuttable, whole, eternal, and unchanging with respect to themselves. As with the Pluralists, the macro-objects found in human perceptions are formed out of combining these fundamental micro-objects in particular arrangements. Overall, the fundamental entities of atomism again closely correspond in many ways to Parmenides’ description of “what is,” with the primary exception being motion. Also, though it is clear that the Atomists are aware of Eleaticism, whether they are best described as agreeing with and/or rejecting Parmenides’ own views is unclear, given the clear target Melissus has provided in explicitly adopting strict monism and denying the existence of void.

iii. Plato

If there is any Presocratic whom Plato consistently treats with deep reverence, it is Parmenides, whom he describes as possessing a “noble depth,” and being “venerable and awesome.” The deep influence of Eleatic thought upon Plato is clear in his regular use of a main character known as the “Eleatic Stranger” in several late dialogues, as well as his eponymous dialogue, Parmenides, in which the coherence of the Theory of Forms is examined. Within Plato’s Theory of Forms, and in his accompanying epistemological and ontological hierarchies, Parmenides’ influence can be most readily seen. Similar to “what is” in Parmenides, each Form is an eternal, unchanging, complete, perfect, unique, and uniform whole. The Forms can only be understood via reason, and understanding them is the highest epistemic level attainable, wherein one possesses certain knowledge. The Forms are also the most fundamental ontological level, and are always true and real in all contexts. In contrast, the account in Opinion can be understood as the mistake that mortals make due to their senses: thinking that being and not-being are both the same and different, and erroneously thinking they have thus grasped the way the world truly is. So understood, it provides a parallel to Plato’s description of the “world of appearances.” For Plato, this corresponds to a typical epistemological level at which human beings, relying upon their senses, can only have opinions and not knowledge. It is also a typical ontological level, at which the objects and phenomena perceived simultaneously “are and are not,” as they are imperfect imitations of the more fundamental reality found in the Forms. Beyond the Theory of Forms, there are also interesting epistemic and allegorical comparisons and contrasts to be drawn between Parmenides’ poem (particularly the Proem) and Plato’s “Allegory of the Cave.”

5. References and Further Reading

This article uses the revised edition (2009) of Coxon’s seminal monograph as the standard reference for the study of Parmenides in English. For ease of reference, references to fragments of Parmenides’ poem list, first, Coxon’s numbering (C) and then, Diels-Kranz’s (DK). Thus, the same fragment is indicated by (C 2/DK 5). References to all ancient testimonia regarding Parmenides are based on Coxon’s arrangement and numbering and are listed with “Test.” preceding the relevant number (for example, Coxon Test. 1). References to ancient sources concerning Presocratics other than Parmenides are based on DK’s arrangement.

a. Primary Sources

  • Austin, Scott. Parmenides: Being, Bounds, and Logic. New Haven: Yale, 1986.
    • A work focused solely on explaining the logical aspects of Reality.
  • Cordero, Néstor-Luis. By Being, It Is. Las Vegas: Parmenides, 2004.
    • Cordero provides a new perspective on Parmenides’ reasoning and method by focusing on Parmenides’ use of the verb “to be” and its enigmatic subject, in addition to the number and meaning of the “paths of inquiry” considered in the poem.
  • Coxon, A. H. The Fragments of Parmenides: Revised and Expanded Edition. Ed. and Trans. Richard McKirahan. Las Vegas: Parmenides, 2009.
    • The sixth edition of Diels-Kranz’s (DK) Die Fragmente der Vorsokratiker (1952) has long been the standard text for referencing Presocratic fragments in a numbered arrangement, along with related testimonia. However, it is somewhat dated, has long been out of print, the German commentary is relatively brief, it does not contain all the available or pertinent testimonia, and no translations of the testimonia are offered. This arrangement of Parmenides by Coxon is far more accessible to most readers of this article (in English, and easily available in print), and it provides a more comprehensive list of testimonia, with English translations. The recent revisions by McKirahan have also kept it up to date with recent advances in scholarship. For these reasons, Coxon’s is now, or should be, the current standard text for Parmenidean studies. All references in this article to fragments of Parmenides’ poem and relevant ancient testimonia follow Coxon’s arrangement.
  • Diels, Hermann, and Walther Kranz. Die Fragmente der Vorsokratiker. 6th ed. Berlin: Weidmann, 1952.
  • Kirk, G.S., Raven, J.E., Schofield, M. The Presocratic Philosophers. 2nd ed. Cambridge: Cambridge, 2011.
    • A valuable introductory work on the Presocratics which provides all fragments of Parmenides’ poem in Greek with their English translation, in the midst of a running interpretative commentary. Select testimonia related to Parmenides are also provided in both Greek and English.
  • Palmer, John A. Parmenides and Presocratic Philosophy. Oxford: Oxford, 2009.
    • In perhaps the most comprehensive contemporary monograph on Parmenides’ poem, Palmer’s most novel contribution consists of fully developing a modal perspective for understanding Reality and Opinion, as well as the relationship between both sections.
  • Sider, David, and Henry W. Johnstone, Jr. The Fragments of Parmenides. Bryn Mawr Commentaries. Bryn Mawr: Bryn Mawr College, 1986.
    • An essential resource for students who want to study Parmenides in the original Greek.
  • Tarán, Leonardo. Parmenides. New Jersey: Princeton, 1965.
    • One of the seminal works in the field advocating Parmenides’ strict monism.

b. Secondary Sources

  • Baird, Forrest E., and Walter Kaufmann, eds. Ancient Philosophy. 4th ed. Philosophy Classics. Vol. 1. New Jersey: Prentice Hall, 2003.
    • An introductory text with an incomplete translation of the poem, which paints Parmenides as a strict monist and contrasts his position as radically opposite to Heraclitus’ “philosophy of flux.”
  • Bicknell, P. J. “A New Arrangement of Some Parmenidean Verses,” Symbolae Osloenses 42.1 (1968): 44-50.
    • Challenges the arrangement of Diels-Kranz. Bicknell suggests a new arrangement of some Parmenidean fragments, primarily based upon Sextus’ report that the lines which began the poem (C/DK 1) were followed by lines currently assigned to fragments C/DK 7 and 8.
  • Bowra, C. M. “The Proem of Parmenides.” Classical Philology 32.2 (1937): 97-112.
    • One of the earliest and most influential treatments of Parmenides’ Proem, particularly focusing on its similarities to Pindar.
  • Cherubin, Rose. “Light, Night, and the Opinions of Mortals: Parmenides B8.51-61 and B9.” Ancient Philosophy 25.1 (2005): 1-23.
    • An insightful discussion of the dualistic principles found in Opinion.
  • Cohen, Marc, Patricia Curd, and C.D.C. Reeve, eds. Readings in Ancient Greek Philosophy: From Thales to Aristotle. 4th ed. Indianapolis: Hackett, 2011.
    • An excellent, if limited, introductory text with a full translation of Parmenides’ poem. In the brief introduction to Parmenides, the likelihood of a Xenophanean influence is stressed, the difficulty in reconciling Reality and Opinion is raised, and Parmenides’ ultimate position is left open.
  • Cordero, Néstor-Luis. “The ‘Opinion of Parmenides’ Dismantled.” Ancient Philosophy 30.2 (2010): 231-246.
    • Cordero advances several theses in this paper. First, he avers that the standard arrangement of the fragments is based solely upon perceived content, which ultimately depends upon imputing anachronistic, Platonic distinctions to Parmenides. Having challenged this status quo, he goes on to advocate a new arrangement for the poem, moving some passages which make true cosmological claims out of Opinion, and into Reality.
  • Cordero, Néstor-Luis. Ed. Parmenides, Venerable and Awesome. Proc. of International Symposium, Buenos Aires, 10/29-11/2/2007. Las Vegas: Parmenides, 2011.
    • A collection of scholarly essays, many of which engage with each other, presented at the first international conference for Parmenidean studies.
  • Curd, Patricia. The Legacy of Parmenides. New Jersey: Princeton, 1998.
    • Curd argues that Parmenides intended to argue for “predicational monism,” the view that whatever exists as a genuine entity must be one specific, basic kind of thing. Curd’s primary argument is that none of Parmenides’ immediate successors offers any argument for the possibility of metaphysical pluralism. Thus, if Parmenides had held the strict-monist view, later thinkers would be begging the question against him, and Curd thinks this fallacious move unlikely. Since predicational monism allows for a plurality of entities, there would be no reason for his successors to argue for the possibility of pluralism, and thus their failure to do so is no longer fallacious.
  • DeLong, Jeremy. “Rearranging Parmenides: B 1.31-32 and a Case for an Entirely Negative Opinion (Opinion).” Southwest Philosophy Review 31.1 (2015): 177-186.
    • In addition to considering the meaning of C 1.31-32 and how Opinion should be taken negatively, this article directly takes on and challenges Cordero’s proposed rearrangement of the fragments.
  • Diels, Hermann. Parmenides Lehrgedicht: Griechisch und Deutsch. Berlin: Georg Reimer, 1897.
  • Granger, Herbert. “Parmenides of Elea: Rationalist or Dogmatist?” Ancient Philosophy 30.1 (2010): 15-38.
    • An article that helpfully sets interpretative approaches to Parmenides in the context of commentators’ background assumptions regarding Parmenides’ “rationalism.”
  • Hermann, Arnold. To Think Like a God: Pythagoras and Parmenides—The Origins of Philosophy. Las Vegas: Parmenides, 2004.
    • While ultimately denying any significant historical influence by Pythagoreans upon Parmenides, Hermann traces how the ancient conceptual distinction between divine and mortal knowledge led to the development of these diametrically opposed views. Whereas this distinction essentially led the Pythagoreans to develop a religious cult, it inspired Parmenides (stepping up to Xenophanes’ challenges) to become the first true philosopher, relying upon logic and reasoning to arrive at metaphysical conclusions, and thus achieving a sort of divine knowledge as a mortal.
  • Kingsley, Peter. Reality. Inverness: Golden Sufi Center, 2003.
    • Kingsley advocates rejecting that Parmenides attempted to communicate any epistemic or metaphysical truths in his poem—at least, not in any rationalistic sense. Rather, Parmenides is a mystic who has found divine truth through ritual and spiritual experiences. His poem recounts these experiences in the Proem, and what follows is designed to open the reader’s mind to similar experiences, via losing oneself in elenchus, and facing death metaphorically, if not literally.
  • Kurfess, Christopher John. “Restoring Parmenides’ Poem: Essays Toward a New Arrangement of the Fragments Based upon a Reassessment of the Original Sources.” Diss. U. of Pittsburgh, 2012.
  • Kurfess, Christopher. “Verity’s Intrepid Heart: The Variants in Parmenides, DK B. 1.29 (and 8.4).” Apeiron 47.1 (2014): 81-93.
  • Lewis, Frank A. “Parmenides’ Modal Fallacy,” Phronesis 54 (2009): 1-8.
  • Lombardo, Stanley. Parmenides and Empedocles. Eugene: Wipf & Stock, 2010.
    • For those interested in a translation that attempts to capture Parmenides’ poetical style.
  • Miller, Mitchell. “Ambiguity and Transport: Reflections on the Proem to Parmenides’ Poem.” Oxford Studies in Ancient Philosophy 30 (2006): 1-47.
  • McKirahan, Richard D. Philosophy Before Socrates: An Introduction with Texts and Commentaries. Indianapolis: Hackett, 1994.
    • An exceptionally solid and detailed introduction to the Presocratics overall. Suitable for both beginning and more advanced readers. Provides an overview of the interpretative problems for Parmenides and various perspectives, in addition to McKirahan’s own perspective on them (which differs markedly from Coxon’s, whose work he recently edited and revised).
  • Mourelatos, Alexander P. D. “Parmenides on Mortal Belief.” Journal of the History of Philosophy 15.3 (1979): 253-65.
  • Mourelatos, Alexander P. D. The Route of Parmenides: Revised and Expanded Edition. Las Vegas: Parmenides, 2008.
    • Perhaps the most influential modern work on Parmenides in the twentieth century, the contributions and insights offered by Mourelatos in this monograph are invaluable and extensive. He is perhaps best known for demonstrating the extensive and inventive ways in which Parmenides invokes and plays off of Homeric/Hesiodic meter and language, as well as being the first to posit an Essentialist interpretation of Reality.
  • Nehamas, Alexander. “On Parmenides’ Three Ways of Inquiry.” Deucalion 33/34 (1981): 97-111.
  • Owen, G. E. L. “Eleatic Questions.” The Classical Quarterly 10.1 (1960): 84-102.
    • In addition to focusing on the problematic lines 1.31-32, Owen provides one of the most influential interpretations of Parmenides. He claims that Parmenides was led to adopt a strict monism on logical and Russellian grounds, and explains how Opinion can be viewed negatively without contradiction as a mere dialectical exercise.
  • Reeve, C.D.C., and Patrick Lee Miller, eds. Introductory Readings in Ancient Greek and Roman Philosophy. Indianapolis: Hackett, 2006.
    • An example of an introductory text with a full translation of the poem, which straightforwardly casts Parmenides as advocating strict monism.
  • Stumpf, Samuel Enoch. Socrates to Sartre and Beyond: A History of Philosophy. 7th ed. Boston: McGraw-Hill, 2003.
    • An introductory text with no translation of Parmenides’ poem. Provides several pages of interpretative summary, casting Parmenides as a strict monist, and in opposition to Heraclitus.
  • Thanassas, Panagiotis. Parmenides, Cosmos, and Being: A Philosophical Interpretation. Milwaukee: Marquette, 2007.
    • Thanassas emphasizes epistemological reliability as grounding the distinction between Reality and Opinion, concluding that Parmenides viewed true knowledge (and the philosophical methodology that leads to it) as divine. He also offers a novel view concerning the content of Opinion, which he believes needs to be further divided up into several distinct sections with variant polemical ends.

 

Author Information

Jeremy C. DeLong
Email: jeremydelong@sbcglobal.net
University of Kansas
U. S. A.

Religious Pluralism

Religious pluralism, broadly construed, is a response to the diversity of religious beliefs, practices, and traditions that exist both in the contemporary world and throughout history. The terms “pluralism” and “pluralist” can, depending on context or intended use, signify anything from the mere fact of religious diversity to a particular kind of philosophical or theological approach to such diversity, one usually characterized by humility regarding the level of truth and effectiveness of one’s own religion, as well as the goals of respectful dialogue and mutual understanding with other traditions. The term “diversity” refers here to the phenomenal fact of the variety of religious beliefs, practices, and traditions. The terms “pluralism” and “pluralist” refer to one form of response to such diversity.

Philosophical and theological treatments of religious diversity have generally adopted different attitudes and different methods insofar as their respective disciplinary commitments differ. Since theological accounts tend explicitly to be grounded in the faith commitments that characterize particular religious traditions (or at least larger sets of traditions, such as Christianity or the “Abrahamic” religions), they often explore how members of a given faith ought to regard the beliefs and practices of other traditions. Philosophical accounts, by contrast, often tend to adopt a more or less disinterested attitude and instead evaluate, for example, the epistemological or ethical issues raised by religious diversity; this is especially true within the analytic tradition, which raises questions about the justification of conflicting religious beliefs that have received much attention in analytic literature. Although this article’s examination largely focuses on philosophical positions, theological and philosophical accounts inform and influence each other despite their methodological and perspectival differences.

As with many other philosophical topics, there are also significant differences between the way religious diversity is treated as a topic by analytic and continental philosophers. In general, it has been taken up more directly and explicitly in analytic philosophy since the 1980s, though religious diversity has also featured as an important, if secondary, theme in much continental philosophy of religion (as well as continental social, political, and ethical philosophy). Therefore, after introducing some basic terminology and exploring treatments of religious diversity in the history of modern philosophy, this article explores analytic and continental approaches in separate sections. Significant feminist discussions of religious diversity have emerged in both analytic and continental philosophy; these will be treated in their own section. In addition, sections are devoted to contributions from both process philosophy and liberation theology.

The phenomenon of religious diversity occurs not only between particular religious traditions but also within them. Approaches to inter-religious and intra-religious difference are therefore explicitly treated as distinct by some philosophers of religion, while others argue that they should not be treated differently. The present article opts for the latter position, except in cases where there is an obvious reason to do otherwise.

Table of Contents

  1. Categories of Responses to Diversity
  2. Historical Influences
    1. Kant
    2. Schleiermacher
    3. Hegel
    4. James
  3. Analytic Approaches
    1. Epistemic Conflict
    2. Religious Language and Apologetics
    3. John Hick: the Pluralistic Hypothesis
    4. Criticisms of Hick
  4. Continental Approaches
    1. Hermeneutics and Religious Truth
    2. Hospitality
  5. Contributions from Feminism
  6. Process Philosophy
  7. Liberation Perspectives
  8. Conclusion
  9. References and Further Reading

1. Categories of Responses to Diversity

There are a number of different ways that philosophers and theologians have grouped various accounts of religious diversity. One of the most commonly adopted strategies – and the one that will be used in the following discussion – is the threefold division first introduced by Alan Race (1983): exclusivist, inclusivist, and pluralist. Exclusivist positions maintain that only one set of belief claims or practices can ultimately be true or correct (in most cases, those of the one holding the position). A Christian exclusivist would therefore hold that the beliefs of non-Christians (and perhaps even Christians of other denominations) are in some way flawed, if not wholly false; or that non-Christian religious practices are not ultimately efficacious – at least, to the extent that non-Christian beliefs and practices depart from or conflict with those defended by the Christian exclusivist.

Pluralist positions, in contrast, argue that more than one set of beliefs or practices can be, at least partially and perhaps wholly, true or correct simultaneously – or, that all beliefs intended to be understood in a realist fashion are false. Inclusivist positions occupy a middle ground between exclusivism and pluralism, insofar as they recognize the possibility that more than one religious tradition can contain elements that are true or efficacious, while at the same time holding that only one tradition expresses ultimate religious truth most completely. As McKim (2012) expresses it, inclusivists grant that many (perhaps all) religious traditions do well in regards to truth or salvation, but that one tradition does better than others by more accurately describing objects of belief or mechanisms of salvation. A Christian inclusivist might claim that those who live good lives but remain non-Christians may still achieve salvation, but that such salvation is nevertheless still achieved through Jesus Christ. Inclusivism thus may be understood as a more charitable variety of exclusivism, though exclusivists can also treat it as pluralism by another name. In addition, it is worth noting that exclusivist, inclusivist, or pluralist arguments about beliefs are sometimes presented separately from those about salvific practice and that consequently one approach to diversity of belief does not necessarily imply the same approach to diversity of practice, or vice versa.

There remain a few other possible positions outside of exclusivism, inclusivism, and pluralism that are worth mentioning briefly, though two of these are not commonly treated as sophisticated options in philosophical discussions of religious diversity. The first of these two is relativism: the view that the truth of beliefs or the efficacy of practices is wholly dependent on the perspective of the religious individual and her cultural environment. In contrast to the pluralist position, that of the relativist seems necessarily to imply an anti-realist theory of religious truth, which would deflate the significance of religious pluralism as a philosophical and theological issue since religious truth claims could only be upheld or defeated within the context of their own traditions. The second, which would have similar consequences, is the position that no positive religious beliefs are true in any sense, even the relativist ones, and no religious practices are efficacious (at least not according to their own terms). In this case, which may be termed strong anti-realism, religious diversity remains at most a sociological, psychological, or historical topic. It is perhaps not surprising, then, that serious philosophical approaches to religious diversity tend not to adopt either of these positions but rather to treat diverse religious traditions as at least possibly having some positive relationship to an ultimate reality. Another position, that of skepticism, seems to entertain this possibility insofar as it concedes that some set of religious claims may be true. However, since this position contends that, given the extent of religious diversity and disagreement, no one is ever justified in making such claims, this position will not be considered in detail here (for a contemporary defense of the skeptical position, see Feldman 2007).

2. Historical Influences

What is today called philosophy of religion appears early in the history of Western philosophy – arguably as far back as Plato’s Euthyphro, wherein Socrates questions the title character as to whether the nature of piety follows on the will of the gods or vice versa. However, most early accounts of religion either ignore religious diversity or do not treat it as an issue worthy of genuine consideration. Pre-modern sources that do treat religious diversity seriously tend to adopt exclusivist positions, though applying this label to such accounts is somewhat anachronistic. Even the use of “religion” as a term that signifies one particular tradition of beliefs and practices among others was not common before modernity.

Once religions began to be considered as such alongside each other, though, positions approaching pluralism soon appeared. In the seventeenth and eighteenth centuries, works by figures such as Spinoza, Locke, Hume, and Voltaire presented rational and naturalistic interpretations of religion and argued for religious toleration. Though these accounts focused largely on Christianity (and to a lesser extent Judaism), they laid the foundation for future pluralist approaches to religion. Though not the only important precursors of pluralism in the post-Enlightenment age, four influential approaches worth examining here are those of Immanuel Kant, Friedrich Schleiermacher, G.W.F. Hegel, and William James. In each, the diversity of traditions is seen not simply as contingent but as in some sense unavoidable, either because of humanity’s inherent limitations or because of the nature of history.

a. Kant

Immanuel Kant’s philosophy of religion serves as an important touchstone in the development of religious pluralism if for no other reason than for its strong influence on John Hick, whose pluralist position is one of the most significant of contemporary responses to religious diversity (see below). Kant does not offer an argument for religious pluralism especially, but such a position emerges as a consequence of his account of rational religion and the distinction he makes between it and particular religious traditions. In his Religion within the Bounds of Bare Reason (1793), he argues that authentic religion is purely rational; specifically, it is grounded in a human being’s moral capacity. Humans ultimately cannot know God’s nature, or even whether or not God exists (as he argues in the Critique of Pure Reason), but the idea of God can and should nevertheless serve as a morally regulative ideal. Similarly, Kant argues that while the existing religions and their respective sets of particular beliefs and practices can often be beneficial for the moral progress of their communities, the rational ideal is an “invisible church” made up of persons able to live autonomously moral lives. Thus, Kant makes the following claim: “There is only one (true) religion; but there can be many kinds of faith.—One may say, further, that in the various churches, set apart from each other because of the difference in their kinds of faith, one and the same true religion may nonetheless be found” (2009: 118). According to this view, multiple religious traditions (or “churches”) may exhibit, to greater or lesser degrees, the purported true essence of religion as long as they promote a morality that agrees with the dictates of practical reason. However, Kant also argues that in the progress of humanity toward greater enlightenment, these various traditions will be discarded as the purely rational religion Kant advocates becomes more fully realized. 
It is worth noting that he privileges Christianity (specifically, Protestant Christianity) as the paradigmatic historical faith, the tradition which, according to his argument, most fully manifests the moral core of religion.

b. Schleiermacher

Schleiermacher is in agreement with Kant in the argument that the essential core of religion lies deeper than the diverse forms of belief and practice that make up the various religious traditions. Neither Kant nor Schleiermacher accepts the idea that religion is fundamentally a matter of adherence to a particular set of dogmatic claims, or that religion, properly so-called, even puts forward any metaphysical claims at all. Schleiermacher goes further, though, asserting that morality is no more a part of the essence of religion than metaphysics. In his Speeches on Religion, Schleiermacher explains that religion is primarily a matter of intuition or feeling, with the entire universe as its object. Later, in his systematic theological work, he famously identifies the feeling that characterizes religion as one of “absolute dependence” – that is, the feeling that one’s very existence depends on something that, in some way or other, transcends oneself. For Schleiermacher, then, religion is first and foremost a matter for the individual, and he does not discount the possibility that one could live a highly religious life without belonging to a particular religious tradition. However, he also concedes that consciousness of religious feeling is, in most cases, best developed within a community, and that within such communities the religious intuition is always accompanied by metaphysical claims, moral prescriptions, ritual practices, and so forth. These outward forms of religion, in turn, serve to shape the way that individuals understand and articulate their own religious feelings. Given that the historical forms of religion emerge from individuals’ subjective experiences – and given that Schleiermacher holds the object of these experiences to be both all-encompassing and infinite – a wide variety of religious traditions is inevitable. Indeed, he goes so far as to argue that the diversity of traditions is necessary for the task of representing the infinite within the limitations of forms comprehensible to humans. Thus, while he also accords a privileged place to Christianity among other religions, Schleiermacher provides the framework for a type of pluralism grounded in the idea of a universal form of intuition that discounts the importance of metaphysical and moral claims in the foundation of religion.

c. Hegel

Hegel goes much further than Kant or Schleiermacher toward constructing a comprehensive and systematic philosophy of religion, particularly in offering both a general concept of religion and a detailed account of how diverse historical traditions manifest the basic religious essence. Hegel criticizes Kant for relegating God to the status of a regulative ideal, and he criticizes Schleiermacher for resting his concept of religion on subjective feeling. The main significance of Hegel’s approach, though, is its contention that the rational concept of religion and the historical phenomena of religious traditions cannot be understood separately; the latter give being to the former and make it intelligible. Indeed, he argues that the diverse concrete forms of religion are neither unavoidable deviations from an ideal form nor contingent responses to the infinite. The different forms religion has taken throughout history are determinate steps along the path of the self-manifestation of both the concept of religion and its object (that is, God). On the one hand, this means that the details of diverse forms of religion are of not only historical but also conceptual significance, considering that each religion gives humans insight into absolute spirit in its own way. On the other hand, Hegel’s insistence on the unfolding of spirit throughout the history of religions as a progressive movement leads him to a hierarchical model in which older forms are superseded by newer ones, with Christianity taking the privileged place of the “consummate” religion. In addition, Hegel maintains that religion as such can only offer a representation of the truth of absolute spirit, whereas it is philosophy’s task to proceed from this representation to complete knowledge. 
Much like Schleiermacher’s, Hegel’s attention to religious phenomena in their historical particularity is an important precursor to pluralist approaches, but the privileged status he reserves for Christianity as the most fully developed religion in terms of both belief content and practice evinces a degree of exclusivism in his approach.

d. James

James’s most direct and significant contributions to philosophy of religion come from his “The Will to Believe” and The Varieties of Religious Experience – the second of which has had the greater influence on the development of religious pluralism during the twentieth century. In the series of lectures that make up this book, James largely brackets out any considerations as to the truth content of religious claims regarding the ultimate nature of reality or as to the various rituals, practices, and institutions that constitute the diverse religious traditions. Instead, as the title suggests, he focuses on the religious experiences of individual humans. James examines these experiences according to only a few basic categories, the most important distinction being that between “the healthy-minded” and “the sick soul.” Throughout his examination, James maintains that the individual’s personal experience lies at the core of religion and that rituals, doctrines, and traditions are thus secondary or inessential. His approach bears some similarity to Schleiermacher’s to the extent that it focuses on personal feeling, but it departs from the earlier approach in that it rejects the idea that there is a single “religious sentiment.” James also explicitly states that the concept of religion that he puts to use is one he has pragmatically adopted to suit the needs of his present study, and that it does not necessarily represent the actuality of religious phenomena in a comprehensive way. The few speculations James offers about the nature of ultimate reality toward which religious experiences point suggest a relatively pluralist attitude toward the diversity of religions, according to which none could claim a monopoly on genuine experience of the divine nor be excluded from it. Together with his later writings on pragmatism and pluralistic metaphysics, James serves as an important touchstone for later theories of religious pluralism – especially those emerging from process philosophy (see below).

3. Analytic Approaches

Accounts of religious pluralism within the broadly construed analytic philosophy of religion tend to focus on the diversity of conflicting belief claims as the primary issue at stake. This emphasis has led some philosophers to concern themselves with religious diversity solely in terms of conflicting claims (for example, Basinger [2002] explicitly addresses diversity as “epistemic peer conflict”), while others also include treatments of variety in practice or moral norms. The present examination will address first the perspective in which the epistemology of religious belief claims is central, and second that which focuses on the semantics of belief claims. In both cases, the literature is extensive and contains sometimes highly nuanced discussions of specific issues. What follows are, therefore, only brief surveys of major trends, roughly sketched, of work in these two general areas. This section will then proceed to a more detailed examination of John Hick’s influential pluralist position, as well as some of the criticisms this position has received.

a. Epistemic Conflict

It is an undeniable fact of religious diversity that many traditions offer various accounts of the nature of both mundane and supramundane reality, of the ultimate ends of human beings, and of the ways to achieve those ends. The claims offered by different traditions, as well as the sometimes unarticulated assumptions about matters of ultimate concern, can (and do) seem to conflict with each other, and one possible approach to religious diversity is to treat this conflict as real, genuine, and significant. Treating epistemic conflict as real means not assuming that it only seems to arise because of lack of clarity or universality in the terminology of various traditions; treating it as genuine means admitting that equally sincere and knowledgeable adherents of different traditions may uphold conflicting claims, while treating it as significant means that such conflict constitutes a problem – perhaps the central problem – that any philosophical response to religious diversity must address in order not to ignore or misrepresent the challenge of religious diversity itself.

Possible responses to the epistemological issues raised by religious diversity include exclusivist and inclusivist as well as pluralist perspectives. Exclusivist arguments maintain that, in the case of conflicting claims about, for instance, the intrinsic nature of divine reality, no more than one non-contradictory set of claims can be correct. Thus, the exclusivist may hold that all the claims made by her own tradition are true and consequently that all conflicting claims made by other traditions are false. Inclusivist approaches argue that while only one set of claims is wholly true (or one set of claims is more true than others), claims made by other traditions may be true partially, selectively, or to a lesser extent. The distinction between this perspective and that of exclusivism often has to do only with the degree to which multiple religions’ claims can be conceded at least partly as true – hence the suggestion that inclusivism may be more properly named “soft exclusivism” (Basinger 2002: 5).

The exclusivist, in order to maintain her belief claims, may simply choose not to recognize or respond to the fact of religious diversity at all. However, many philosophers of religion agree that anyone who is sincerely interested in maximizing the truth and minimizing the error of her belief claims is prima facie obligated to address significant epistemic conflict, at least in part by assessing the strength or justification of her own beliefs. It remains a debated point whether or not the existence of multiple conflicting belief claims necessarily decreases such justification; positions on this question range from outright denial of the possibility of justified religious belief in cases of epistemic conflict (cf. Schellenberg 1994) to the argument that justification in such instances is at most only slightly diminished (cf. Alston 1988). In most cases of diverse religious beliefs, it seems that there is no objective evidence or set of criteria that would allow for straightforward adjudication between conflicting belief claims. If this is the case, then the exclusivist would need to provide other grounds for justifying her commitment to the correctness of her own beliefs or to the incorrectness of others’ beliefs (and some defenders of religious exclusivism argue that this is possible). Given that the exclusivist already holds the beliefs that she does, though, epistemic conflicts with others’ beliefs do not necessarily provide sufficient reason for her to give up or modify her beliefs. Alvin Plantinga argues that the assessment and consequent modification or abandonment of some or all of one’s beliefs would be required in cases where conflicting religious beliefs are on epistemic par with one’s own. However, he maintains that such epistemic parity need never obtain – that is, a Christian may reasonably believe that others’ beliefs with which her own beliefs conflict do not achieve significant epistemic parity (Plantinga 1995, pp. 203-5; Plantinga’s argument is discussed at more length below).

It is also unclear that every belief one holds may reasonably be assessed in light of the conflicting claims of others. Some beliefs may serve a foundational role in the total belief system of an individual or community, in which case their epistemic status is somewhat different from that of other beliefs. Especially if it is granted that religious beliefs are often not subject to evidential justification, it may be the case that beliefs cannot be assessed except on the basis of other, more foundational beliefs. There would thus be a set of beliefs in any belief system that cannot meaningfully be subject to doubt or assessment in light of conflicting claims belonging to other systems. Jerome Gellman calls these “rock bottom beliefs,” arguing that their foundational status excuses those who hold them from the obligation to assess them even when faced with others’ conflicting beliefs (Gellman 1993). He goes on to argue that even if rock bottom beliefs are subject to assessment, a believer is obligated to assess them only when she loses confidence in them in the face of other, competing beliefs (Gellman 2000).

William Alston, similarly to Plantinga, also accepts that certain religious beliefs can be foundational, yet he does not agree that this renders them immune from assessment or revision. On the contrary, Alston takes religious diversity as one of the most serious challenges to the justification of a particular religious tradition’s beliefs. Nevertheless, he maintains that exclusivism with regard to one’s belief remains rational in practice precisely because of the lack of common ground between believers of different traditions. He argues that the absence of proof for the superiority of a Christian’s beliefs (or their grounds) over others’ need not diminish the justification of those beliefs, since one would have no way of knowing what such proof would entail even in principle (Alston 1988, p. 443). In fact, Alston goes on to claim, “In the absence of any external reason for supposing that one of the competing practices is more accurate than my own, the only rational course for me to take is to sit tight with the practice of which I am a master and which serves me so well in guiding my activity in the world” (1988, p. 444). Philip Quinn agrees with Alston that “sitting tight” is a rational option (and that practical self-support counts toward this option being rational), but contends that it is not the only rational option. Quinn holds that internal revision of one’s core beliefs in ways that would make them more reliable “if some refined pluralistic hypothesis were true” is also a rational option (Quinn 2006, p. 298).

Naturally, pluralists advocate a significantly different approach to the issue of epistemic assessment of beliefs. If there exists real conflict between beliefs held by different religious traditions, conflict which cannot be resolved by appeal to evidence or to a universal set of justificatory criteria, then the pluralist may conclude that no one set of beliefs can reasonably be held as preferable to others. A strong form of this position is simply the skeptical argument, which maintains that judgment must be suspended on belief claims that remain unresolved in this way such that an exclusivist would be required to give up her commitment to them (cf. Feldman 2007; Kitcher 2011). A less extreme version of the position, however, may admit that the lack of criteria by which to resolve conflicting belief claims does put the beliefs of different traditions on equal epistemological footing, but that this does not necessarily require that believers suspend or give up their beliefs. What it requires is rather that beliefs that are in conflict with others’ (equally justified) beliefs be held with less confidence. Such a pluralist response to epistemic conflict emphasizes awareness of the fact of diversity, humility regarding the level of justification of one’s own beliefs, and openness to the possibility that beliefs other than one’s own may turn out to be correct. While an exclusivist may also acknowledge that epistemic conflict may lead her to reduce her confidence in her beliefs, the pluralist position goes further in asserting that no one tradition’s beliefs are more or less justified than other traditions’ beliefs. It acknowledges that most religious believers are likely to hold on to the set of beliefs (and to engage in the practices) with which they are most familiar given their historical and cultural backgrounds, and it posits that, given the absence of strong arguments to the contrary, they are justified in doing so. The difference between this position and that of the exclusivist (for example, Alston) is that this justification does not extend to the second-order belief that one’s own beliefs are rationally preferable to other conflicting beliefs. This is the direction in which Quinn (2006) gestures when he argues that there is more than one rational option for the believer in the face of a variety of conflicting claims.

b. Religious Language and Apologetics

Arguments for the assessment of religious belief claims in instances of epistemic conflict tend to presuppose that the meaning of such claims as well as the evaluation criteria applied to them are to some extent mutually understandable – even if it is conceded that there are no strictly universal criteria by which to resolve conflicts between belief claims. However, another position advanced by some philosophers of religion is that belief claims from different religious traditions cannot be assessed for their relative strength because particular claims cannot retain their meaning outside the context of the whole system of beliefs in which they belong. This position can be understood as a version of Wittgenstein’s approach to statements about religious belief, in which he suggests that individuals holding conflicting belief claims may be operating within different cultural-linguistic frameworks, exemplified by the idea of “language games”. In this case, they would have no shared frame of reference according to which the conflict could be resolved. It may not even be accurate to say that an epistemic conflict actually exists, rather than simply a difference in linguistic practice. What is called for, then, is not assessment and justification of belief claims in epistemic terms – where different claims’ truth, likelihood, or reasonableness is evaluated – but merely examination of the use and practical consequences of statements about belief in the lives of those who offer them.

If one accepts this approach to religious language, there are significant consequences. One is that it would be meaningless to speak of the truth or falsity of an entire belief system, because the truth of belief claims could only be assessed according to the internal grammar of this or that system. Another is that an individual’s commitment to one system of belief rather than another would be more or less arbitrary – in most cases, it would be wholly dependent on accidents of birth. It also seems that adopting this approach may commit one to adopting one of the positions that were set aside at the beginning of this article: that is to say, religious relativism, strong anti-realism, or religious skepticism. The relativist would conclude that, if belief claims are justified only based on the semantic criteria provided by their particular religious cultural-linguistic frameworks, then claims from different frameworks simply cannot be meaningfully compared. The strong anti-realist would go a step further, arguing that religious belief claims thus cannot refer to content beyond their cultural discursive context, while the skeptic would hold that the cultural-linguistic delimitation of the meaning of belief claims prevents us from knowing whether or not such claims accurately refer to extra-linguistic reality. Each of these positions exhibits the tendency of the cultural-linguistic approach to religious language, which is to minimize the importance of whatever cognitive content there may be in such language.

Critics of this approach argue that it does not accurately represent the attitudes of religious individuals toward the meaning and status of the particular claims they make or the ways in which they use religious language generally. Statements like “Jesus Christ is divine” or “Atman is Brahman” seem in most cases not only to fit into a larger system of religious linguistic use but also to describe a state of affairs the reality of which the utterer of the statement believes in. Even if one were to take such statements merely as expressive of certain subjective attitudes or existential commitments on the part of the person who utters them, one could argue that the adoption of such attitudes or commitments on the part of the speaker presupposes that she also believes the statements to be cognitively true. Otherwise, the attitude that she expresses in them would be arbitrary or unwarranted.

If one maintains, contrary or in addition to the cultural-linguistic model, that at least some religious statements do express actual truth claims and that some of these claims conflict with each other, then the question returns as to whether such conflicts need to be resolved and, if so, how. One of the most significant responses to this issue made in terms of the logic of religious language is that of Paul J. Griffiths, who argues that the incompatibility of certain doctrinal statements belonging to diverse religious traditions creates the obligation that representatives of these traditions engage in inter-religious apologetics (1991, p. 3). Griffiths calls this the principle of the necessity of inter-religious apologetics (NOIA). It proceeds from the initial presuppositions that statements that seem to make claims about ultimate religious truth do so, that arguments for believing such claims can intelligibly be made to those who do not believe them, and that to admit that others’ beliefs conflict with mine and subsequently to try to persuade others to accept the truth of my claims is to show respect for the sincerity and rationality of others. Griffiths divides the enterprise of inter-religious apologetics into two categories: “negative apologetics” aims to defend the plausibility of belief claims against outside criticism, while the goal of “positive apologetics” is to argue for the superiority of one tradition’s set of belief claims over that of others. He maintains that the NOIA principle enjoins representatives of religious communities to engage in both forms of apologetics, insofar as the interest in maximizing truth that is evident in making statements of doctrine entails both internal self-evaluation and external correction.
Griffiths argues that the imperative of the NOIA principle is both epistemic, insofar as religious communities have a responsibility to justify and defend the content of their claims against challenges posed by conflicting claims, and ethical, insofar as assent to certain belief claims is often understood by religious communities to be a necessary condition of salvation (1991, p. 15).

To a certain extent, Griffiths’ advocacy of apologetics commits him to exclusivism. Adherence to the NOIA principle implies not only that there is a real referent to which some belief claims point but also that, when claims conflict, one claim is likely to be more accurate in its description of reality than others. If this were not the case, it would be difficult to defend the necessity of positive apologetics, and perhaps negative apologetics as well. However, a commitment to the importance of apologetic dialogue as Griffiths proposes it is not necessarily equivalent to strict exclusivism, especially the variety that would allow the exclusivist to retain confidence in the superiority of her own position without assessing it in a religiously diverse context. That Griffiths advocates not only recognizing the fact of religious diversity but indeed engaging with a variety of conflicting claims via respectful argument – thereby entertaining the possibility that such claims might be partly or wholly true – suggests his position may be better understood as a form of inclusivism.

c. John Hick: the Pluralistic Hypothesis

The pluralist position advanced by John Hick has been and continues to be one of the most, if not the single most, significant and influential philosophical approaches to religious pluralism. It has garnered both considerable praise and considerable criticism from a variety of fronts. After outlining the main features of Hick’s arguments, we will also examine a few of the most substantial criticisms they have received.

Hick begins from the position that the world as it appears is ambiguous with regard to religion – that is, there is no inherent epistemological obstacle to experiencing and interpreting the world from the point of view of one religion rather than another, or indeed from a non-religious perspective (2004, p. 124). From here, he proceeds to his central claim that diverse religious traditions have emerged as various finite, historical responses to a single transcendent, ultimate, divine reality. The diversity of traditions (and the belief claims they contain) is a product of the diversity of religious experiences among individuals and groups throughout history, and the various interpretations given to these experiences. Hick’s claim that diverse interpretations are responses to a single transcendent Real draws on the Kantian distinction between noumenon and phenomenon; that is, the Real is epistemologically unavailable to human beings in itself, but we can nevertheless experience it “as the range of gods and absolutes which the phenomenology of religion reports” (2004, p. 242). Thus, Hick argues that no single description or set of descriptions applied to the Real from within the realm of human experience can apply literally to the Real (2004, p. 246). Nevertheless, the Real remains the final referent of the ontological claims made by the different religious traditions, even though such claims can at best only partially approximate the real truth of divine reality. Hick understands the multiple claims from diverse religious traditions that the object of their respective beliefs is ultimately unsayable and incomprehensible as supportive of his argument. As for the content of particular belief claims, Hick understands the personal deities of those traditions that posit them (for example, Yahweh, Viṣṇu, Amida) as personae of the Real, explicitly invoking the connotation of a theatrical mask in the Latin word persona.
Alongside these, he recognizes impersonae of the Real: concepts such as Brahman, nirvana, and Tao that represent ultimate reality in a non-personal way. Since the ideas in both of these categories arise from our experience of phenomenal reality, none of them can adequately describe the Real as it is in itself. The only way that humans can describe the Real directly is by using formal language such as “ultimate truth” or “ground of experience” (2004, p. 246).

In light of his epistemological arguments, Hick claims that all religious understandings of the Real are on equal footing insofar as they can only offer limited, phenomenal representations of transcendent truth. This position, which he calls the “pluralistic hypothesis,” brings together elements of several philosophical perspectives on religion into a complex whole. At the phenomenal, historical level in which humans live and religious traditions emerge, Hick advances the view that meaning is constituted largely by practice (linguistic and otherwise) within the contexts of particular cultures. Thus, the justification of belief claims rests on the relation of various practical commitments within a certain tradition or culture, and evaluation of or comparison between claims of distinct cultures appears problematic at best. This aspect of Hick’s position seems to rely heavily on the cultural-linguistic model of religion that places little importance on any cognitive content that belief claims may carry. However, Hick also maintains that the content of religious belief claims necessarily implies the believer’s sincerity in supposing that such claims actually refer to the transcendent reality that they purport to describe. That is, Hick maintains that the belief claims of the various historical religions have traditionally been articulated in ways that imply a realist perspective (2004, p. 176). Hick’s response to this issue is to posit that the referent of religious belief claims is ultimately real, but that any claims or knowledge about it must necessarily be historically and culturally mediated. If this is the case, then, as the pluralist hypothesis maintains, each religious tradition has some grounds for holding to its own beliefs and practices while no one tradition has grounds for claiming an exclusive or privileged status.

Hick claims that the pluralist hypothesis is more reasonable than either anti-realism or exclusivism given the broad extent and wide diversity of religious experiences and traditions. In response to religious anti-realism, Hick argues it is at least no less plausible to postulate a real, transcendent referent for religious belief claims than it is to reject such a reality in favor of a purely naturalistic explanation. Furthermore, he argues (drawing on William James) that a religious individual’s basic trust in her own experience is rationally justified, given the fundamental ambiguity of the world as it appears (2004, p. 228). In response to exclusivism, Hick maintains that adopting an exclusivist stance toward the justifiability of beliefs is not rationally defensible (2004, p. 235). Even if it were the case that only one religious tradition correctly represented the Real, it would not be possible for humans to know this with any certainty. However, Hick is clear that this is not a case of merely epistemological uncertainty; because the Real positively transcends all human description, no one way of describing can even possibly be true (2004, p. xx).

The moral and soteriological content of diverse religious traditions is also an important focus of Hick’s argument. He posits that, in various ways, all the major religious traditions that emerged from the “axial age” understand the salvation from or transformation of the present world as a central aspect of the human relation to the Real (2004, p. 300). The ability to bring about such a transformation, as well as to promote a generally moral way of life, is perhaps the only common method by which one can evaluate diverse religious traditions. Thus, despite the various concrete paths to such an end proposed by the world’s major religious traditions, Hick affirms that the soteriological process at work in them is essentially the same. Furthermore, he points to what he sees as a broad consensus regarding basic moral claims among religious traditions to advocate the equal validity of diverse traditions with respect to their soteriological claims. Similarly to his argument regarding the ontological claims of various traditions, Hick does not ignore the fact that the details of both moral and soteriological prescriptions of these traditions often not only differ but conflict with each other. Nevertheless, he maintains that their overall moral themes generally agree and that their soteriological visions depict in various ways a path of transformation from self-centeredness to “Reality-centeredness.” The logic of this aspect of his position is also similar to the ontological-epistemological aspect insofar as no phenomenal experience can provide humans with certainty about the true effectiveness of any one soteriological path (2004, p. 337). In principle, such knowledge can only be attained when salvation is achieved; according to Hick, such a point would be the proverbial mountain peak at which the various upward paths converge.

d. Criticisms of Hick

Hick’s version of pluralism has garnered a wide variety of critical responses since it was first proposed; nearly every aspect of his argument has encountered some kind of challenge. This discussion will cover some of the most significant criticisms leveled by others in the analytic tradition (broadly construed), but other criticisms will be mentioned briefly in subsequent sections where appropriate.

Some of Hick’s critics argue that his claim that the same process of transformation from self-centeredness to Reality-centeredness is at work in each of the world’s major religious traditions is not a valid interpretation of the different forms of diverse religions’ soteriological paths. While, from the point of view of philosophical or sociological studies, there may be structural similarities between the ultimate aims of different religious traditions, it is more difficult to take the further step (to which Hick seems committed) of positing that these aims are essentially the same (cf. Hick 2004, p. xxvi). Furthermore, it seems a crucial part of many religions’ self-understanding that their beliefs and practices are uniquely suited to achieving a true salvation not offered by other traditions. This is an issue, of course, with which any pluralist approach must contend, but Hick’s critics maintain that his argument does not address it more clearly than other possible perspectives (for example, Plantinga 2000, p. 56ff.). In order to maintain the identity of soteriological paths across religions, Hick seems forced to minimize or ignore differences while translating important concepts out of their native terms into ones that members of particular traditions may not accept.

Among the criticisms of the soteriological aspect of Hick’s perspective, one of the most radical is that offered by S. Mark Heim. Heim argues that Hick’s position rests on two basic assumptions: the unity of the Real and the convergence of soteriological aims toward a single religious end (2006, p. 23). The first, says Heim, rests largely on Hick’s interpretation and acceptance of a Kantian distinction between the noumenal and phenomenal – a distinction that, Heim points out, is not always easily reconcilable with particular religious claims. The second assumption, argues Heim, undermines Hick’s concession that religious diversity does truly exist, since the unity of the soteriological process would render non-soteriological (that is, purely dogmatic) differences ultimately insignificant unless they inhibit this process for the religious individual. However, Heim explains that Hick seems to deny that such impediment is even possible in principle, so that any salvific human effort (even of a non-religious variety) participates in the same soteriological process (2006, p. 28).

Heim’s alternative – what he calls a “more pluralistic hypothesis” – combines a modified inclusivism with the concession of the possibility of a real plurality of ultimate ends. The former part of his argument holds that the concrete claims of diverse traditions regarding avenues for salvation ought to be acknowledged in their particularity and on their own terms, and that the sincerity of their adherents’ commitments to such claims is significant. The latter part of Heim’s argument, however, rests on an epistemological humility regarding one’s possible knowledge of eschatological reality and acceptance of the notion that such reality may in itself be plural. This puts Heim’s position in close proximity to that of process philosophical approaches (see below).

Criticisms of the epistemological and ontological aspects of Hick’s argument proceed in a vein similar to those of the soteriological aspects. Some claim that Hick’s hypothesis does not provide adequate support for the claim that the world is religiously ambiguous and that one is thus justified not only in treating one’s religious experience or belief as valid but also, perhaps, in treating it as superior to conflicting beliefs. Plantinga, for instance, does not concede that awareness of religious diversity necessarily calls for alteration of one’s previously held beliefs, though it might invite new reflection on them. Against Hick (and others), he defends a version of exclusivism, which he defines minimally as holding that “the tenets or some of the tenets of one religion—Christianity, let’s say—are in fact true” and that “any proposition, including other religious beliefs, that are incompatible with those tenets are false” (Plantinga 1995, p. 194). However, this minimal definition of exclusivism is not necessarily that which stands in need of defense since it could include non-culpable ignorance of others’ actual conflicting beliefs, so Plantinga narrows his version of exclusivism to include sticking to one’s beliefs despite awareness of other religions, acknowledgment that they contain examples of genuine piety, and belief that one does not have a conclusive rational argument that proves the truth of one’s own beliefs (1995, p. 196). He recognizes that challenges to this version of exclusivism are made on moral and epistemological grounds, and attempts to defend it against both by showing that, in one way or another, these challenges ultimately undermine themselves. If the moral argument is that exclusivism is arrogant, then Plantinga claims that either this argument itself is also arrogant (that is, that one ought to withhold belief from beliefs that contradict those of others) or that it lacks grounds (1995, p. 200). 
Plantinga breaks down the epistemological challenge to exclusivism into those centered on justification, on rationality, and on warrant. His defense in each case remains largely the same: either the pluralist position succumbs to the same criticism that it levels against exclusivism, or it cannot provide sufficient grounds for its own reasoning. Ultimately, he contends that religious diversity itself can prove neither the falsity nor the lack of warrant of particular beliefs – though it may tend to reduce confidence in belief (1995, p. 214). At the same time, though, Plantinga suggests that awareness of and reflection on diversity can also serve to strengthen the exclusivist’s conviction.

Plantinga also levels a challenge to the logical consistency of Hick’s position, contending that it would be nonsensical to suggest that the Real cannot have either of two strictly contrary properties (similar to the criticism made by Rowe [1999, p. 146]). As Plantinga puts it, the Real “could hardly be neither a tricycle nor a non-tricycle, nor do I think that Hick would want to suggest that it could” (2000, p. 45). However, Hick replies that he does indeed mean to suggest that the Real is “neither a tricycle nor a non-tricycle,” neither green nor non-green, and so forth, both because these sorts of concepts cannot apply either positively or negatively to the Real and because they are not religiously relevant. Hick finds Plantinga’s argument unacceptable because, when translated from an irrelevant concept (tricycle or non-tricycle) to a religiously relevant one (for example, personal or non-personal), it forces an exclusivist choice between different conceptions of the Real – a choice that seems to have no stronger support than personal preference or cultural background (Hick 2004, p. xxii).

Another, more recent epistemological criticism of Hick’s position takes aim at precisely this point: namely, his contention that to a significant extent our religious beliefs are contingent products of factors such as where and when we were born, and that this contingency poses a problem for claims to religious knowledge and, particularly, exclusivism (Hick 2004, p. 2). Tomas Bogardus argues that the inferences from contingency to pluralism or skepticism in this argument are invalid. He specifies, first, that the contingency of belief is a significant problem only if the objection targets “only unreflective religious belief, belief formed genuinely, for example, on the basis of passive receipt of testimony during childhood” (Bogardus 2013, p. 378). Reflective belief must escape this problem, since otherwise pluralism (which is itself a reflective position) would be subject to the contingency objection. Bogardus then further specifies that Hick’s (and others’) contingency-based objections seem to target the “safety” of beliefs – that is, if a person had been born elsewhere than she was and used the same method to form her beliefs as she actually has, then (in light of her actual beliefs) she might have believed falsely. Bogardus reads the contingency objection as inferring from this statement that religious beliefs are, in fact, not formed safely, and therefore they do not constitute knowledge (2013, p. 380). Yet, he maintains that these inferences are invalid because, in the first place, “[t]he fact that something might have happened which would have rendered my faculties unsafe does not entail that my faculties are actually unsafe,” and, in the second place, because safety is not necessary for knowledge (2013, pp. 381-2).
After offering a similar criticism of the version that makes the contingency objection a question not of safety but of accidentality, Bogardus departs from his criticism of Hick to consider the issue of epistemic symmetry between conflicting beliefs held for contingent reasons (cf. Kitcher 2011). However, he maintains that this version of the contingency objection is either self-defeating or else excuses reflective belief. He concludes with the suggestion that even the unreflective believers specifically targeted by the narrow version of the contingency objection that he has considered may be excused due to non-culpable ignorance (2013, p. 391).

These epistemological challenges are distinct from, but related to, the issue of Hick’s ontology: the central claim that there is one ultimate, noumenal Real to which the different religions respond in a variety of ways. Critics charge that it is difficult to maintain that all religious representations of ultimate reality refer to the same Real given not simply the wide variety of different actual representations but particularly those elements of them that seem to be incompatible or contradictory (cf. Rowe 1999). On this point, it is not obvious why Hick maintains the unity of the Real rather than positing a plurality of ultimate referents to match the plurality of ways it is signified; Mavrodes (1995) goes so far as to describe Hick’s position as actually polytheistic. In addition, if the Real is as epistemologically inaccessible as Hick maintains, then explaining how religious concepts can refer to it at all becomes problematic (cf. Plantinga 2000). Hick’s responses to these and other related criticisms are by and large pragmatic, appealing to a principle of global religious irenicism or claiming that his hypothesis is offered as a “‘best explanation,’ not an iron dogma” (Hick 2004, pp. xxii, xxvii).

4. Continental Approaches

Treatments of religious pluralism in continental philosophy of religion tend not to focus on epistemological or ontological issues, but rather on the ethical and political aspects of diversity. To the degree that the potential truth of religious claims is discussed in these contexts, it tends to be in hermeneutic perspectives that focus on the history of texts in various traditions – or on various traditions approached as texts – and that aim to bring these traditions into constructive dialogues with each other. In addition, the concept of hospitality has become a prominent theme in discussions of both textual interpretation and interreligious ethics. The present discussion will examine the hermeneutic model and the theme of hospitality separately, though it will become clear that there is significant overlap between these two.

a. Hermeneutics and Religious Truth

Hermeneutic approaches to religious diversity take seriously both the deep divergences between different religious traditions and the idea that diverse religions’ claims to truth should be taken seriously as such. Drawing on the work of Heidegger, Ricoeur, and especially Gadamer, advocates of what may be termed “hermeneutical pluralism” attempt to understand interreligious encounters according to the models provided by textual interpretation and interpersonal conversation. Gadamer argues that a text to be interpreted makes a demand on the interpreter in that it presents itself as holding claims to truth that can only be adequately judged, affirmed, or denied after a careful reading in which the interpreter, to some extent, adopts the same questions as those that the text itself addresses (2002, p. 369). Similarly, one must acknowledge in another person the possibility of her knowing and speaking truth if one is to be able to engage in any meaningful conversation. In both cases, the relationship between the interpreter and the interpreted is asymmetrical: the truth claims of the person or text to be interpreted (and the questions to which they relate) should be understood as the measure against which one’s interpretations are judged. It is this focus on interpretation as a model for understanding that gives this approach the designation “hermeneutical.” And it is the claim that an effective understanding of a text, person, or tradition depends on one being open to the possibility of truth outside one’s own familiar tradition or worldview and on one being able to place one’s own position not only on par with but even “below” that of the other (insofar as the other, which takes the place of the text to be interpreted, serves as the standard against which interpretations are measured) that makes this approach a pluralist one.

Gadamer uses the term “horizon” to designate the position from which the interpreter approaches the text (or other object of interpretation). One’s horizon is constituted by one’s present phenomenal and conceptual environment as well as the history that has shaped this present and the particular ways in which one is open toward future possibilities. Thus, one’s horizon includes all presuppositions that one brings into new encounters, presuppositions that one can recognize but never completely escape (2002, p. 302). The work of interpretation involves engaging with an other whose horizon is different, sometimes drastically so, from one’s own. Openness to others does not rest on attaining an objective point of view free from prejudices; this sort of abandonment of one’s own horizon would be impossible. Instead, the openness that Gadamer advocates is a willingness to let our presuppositions and perspectives alter as a result of an interpretive or dialogical encounter, a process he calls the fusion of horizons.

In the application of hermeneutic philosophy to religious diversity, religious traditions or their respective exemplary texts are taken as representative of different horizons. The issue, then, is how to join these horizons effectively, and what the resulting new horizon will look like. First, one can make the argument that if one has even rudimentary knowledge of a tradition besides one’s own, then the fusion of horizons has already begun. Any relation between two or more religious traditions means that at some point the respective horizons of the different traditions overlap. The task would then be to map out these overlaps by way of discourse with members of different traditions or interpretation of their religious texts. For example, insofar as Jewish, Christian, and Muslim traditions share a common set of basic theological concepts (for example, strict monotheism) or scriptural figures (for example, Adam, Noah, and Abraham), these commonalities could be and often are used as frameworks for interreligious dialogue in the present. In the process of interpretation and discourse, relations could be extended and new points of contact developed, so that a more extensive fusion of religious horizons can be achieved.

Since the term “horizon” signifies one’s present perspective and particularly since it includes the presuppositions on the basis of which one begins an interpretive encounter, to a certain extent one’s criteria for the judgment of truth claims (including one’s own) are shaped by one’s current horizon. Therefore, openness to an encounter with another tradition from which a new horizon might emerge entails openness not only to the revision of specific presuppositions but also to the possibility of reassessing one’s criteria for judgment. Interpretation or dialogue across different religious traditions would then include openness both to the possibility that the other’s traditions contain elements of truth not available within one’s own tradition and to the possibility that engagement with such truth might affect one’s understanding of truth in general. Depending on the hermeneutical approach taken and the arguments on which it is based, perspectives that advocate such dialogue are often correctly characterized as pluralist (though more inclusivist positions are not ruled out).

Advocacy of this openness to the revisability of basic epistemological criteria is based on the assumption, which many pluralist accounts of religion share with hermeneutics generally, that present standards of judgment are always subject to adjustment or augmentation in the future. This assumption goes along with the claim that there is in principle no end to the hermeneutic process; interpretation is continually producing new understandings of its objects and new horizons (that is, new contexts of understanding) that can subsequently become starting points for future attempts at dialogue and expansions of mutual understanding between different perspectives. Since hermeneutics denies the possibility of objective access to truth, or truth that is free from a particular perspective, every mutual understanding achieved by diverse religious traditions will necessarily be provisional and subject to later reinvestigation. This is, however, not to be considered an unfortunate limitation but rather an opportunity for further constructive engagement. Only in particular historical encounters can new horizons of understanding and experience be forged; the hermeneutical philosopher of religious pluralism seeks out new possibilities for dialogue through which to partly reappraise the merit of former interpretations.

Regarding the question of whether or not diverse claims and interpretations point beyond themselves to a higher universal truth, the hermeneutical approach to pluralism may lead to different conclusions. One possibility would affirm that there is such a universal truth, resulting in pluralism not unlike that of Hick. This seems to be the approach that follows most closely from Gadamer’s claims about the nature of language, insofar as he maintains that the diversity of languages nevertheless expresses a basic universal relationship to the world. One could then say that all of the various religious traditions, and the diversity of representations of reality found within them, express the same ultimate truth from their different horizons. On the other hand, it is also possible to argue on Gadamerian grounds that what is productive and valuable in interreligious discourse is precisely the encounter with the other as other, and that appeals to a higher unity undermine this. According to this view, when beliefs and practices belonging to different traditions seem to refer to different ultimate realities, it may be preferable either to hold that they really do so or to avoid the question of a possible transcendent unity of religious truth entirely. The former position makes the issue of ongoing interpretation and judgment of conflicting truth claims particularly significant, while the latter is sometimes more pragmatic but also potentially less productive. Merold Westphal, for example, argues against something like the latter view, claiming that it precludes meaningful dialogue with religious others by eliminating even the possibility of cognitive conflict (1999, p. 3). His defense of a modest form of theological exclusivism that has been uncoupled from political and cultural imperialism may serve as a helpful point of contact between analytic and continental approaches to religious diversity, especially insofar as he advocates the consideration of epistemological issues and of the cognitive content of belief claims in addressing the ethical concerns at the root of many forms of pluralism.

b. Hospitality

Hospitality to others has become a central theme in continental philosophy of religion, as well as in continental ethical and political philosophy more broadly. Understood as an ethical demand, the idea of hospitality prescribes that one should offer one’s home, food, and other resources to meet the needs of others – particularly those who do not share one’s nationality, culture, language, or religion. This idea carries with it a demand to respect the other in her difference, and thus has been taken up as a productive concept in terms of which to articulate a pluralist approach to religious diversity. In addition to its conceptual resources, it has also traditionally been recognized as an important virtue in many different religions and cultures, thus making it a helpful starting point for dialogue. Philosophers who discuss the ethics of hospitality also recognize certain inherent tensions in the concept that make it difficult to translate into a concrete practice.

As discussed in the work of Ricoeur (2010) and Derrida (2002), among others, hospitality appears to be something that in principle can never be fully offered. Being hospitable in the most complete sense means anticipating the appearance of the other, understanding her needs and wants, and ultimately being able to turn over anything and everything within one’s control to the other. This raises a few problems, one of which is that the ability to successfully anticipate the appearance and needs of the other seems to undermine her very otherness – what marks the other as other is surely, at least in part, one’s lack of knowledge about her. If one attempts to make up for this lack of knowledge by encountering the other not on her own terms but on one’s own, however, this also undermines the otherness of the other by not paying sufficient attention to her difference. Another problem is that the other to whom one is called to be hospitable can appear as a threat insofar as one’s hospitality to the other, when carried to its furthest point, displaces one from one’s own position of control and puts one at the mercy of the other. The host, in her obligation to the needs of the guest, becomes in a sense the guest of the guest – and the possibility is always present that the other will treat one inhospitably or even violently.

Pluralist accounts that think in terms of hospitality must thus deal with at least two related issues: how to address the seeming impossibility of adequately fulfilling the requirement that one be hospitable to the religious other as other, and how to be respectful and welcoming of the religious other even while recognizing the risks inherent in doing so. Specifically, religious communities must recognize both an obligation toward engagement with those from different religious traditions and the impossibility of achieving a complete understanding of religious others. In addition, hospitality demands that religious communities open themselves up to religious others even if their own communities or beliefs are threatened in the process.

Ricoeur connects hospitality to a hermeneutic theory of translation, arguing that the impossibility of being perfectly hospitable mirrors the impossibility of translating a text from one language to another without any loss or distortion of meaning. In each case, he argues that this impossibility suggests that we should not aim for the ideal of perfection but rather accept that every instance of hospitality, just like every translation, will be risky, limited, and contingent (2010, p. 38). Accepting such a fact may be difficult, especially insofar as it seems to suggest failure to achieve the goals toward which hospitality is oriented. However, Ricoeur maintains that fixation on perfect, complete hospitality is counterproductive and that achieving it, if it were possible, would actually result in the erasure of the difference between oneself and the other (cf. Kearney & Taylor 2011, p. 17). Instead, the limitations inherent in any attempt at hospitality produce a situation in which creative exchange between two individuals, communities, texts, or traditions happens without either subsuming one into another or synthesizing both into a third that erases the initial identities. Because there is no absolute perspective against which to measure the particular perspectives of the various religious traditions, their particularities must be respected as fundamental – that is, there is no prior “common ground” against which the relative value or truth of particular differences can be measured. Interreligious encounter offers the opportunity to engage in a process of translation in which the parties involved learn both about others and about themselves by constructing a context in which their differences and limitations are emphasized.

While his account of hospitality agrees with Ricoeur’s in a number of respects, Derrida does not see the ideal of perfect hospitality as something that should be abandoned because of its impossibility. Drawing on the ethical philosophy of Levinas, Derrida (2000) argues that the other makes an infinite ethical demand that one is never able to answer adequately, but that this inability is precisely what drives the continual concrete effort of hospitable practice. Making a conceptual distinction between the “conditional hospitality” that dictates practical action and the “absolute hospitality” that this action can never achieve, he claims that the inescapable gap between the two motivates us not only to continue to act in conditionally hospitable ways but also to strive toward making our actions and dispositions ever more hospitable. In the context of religious pluralism, this means that one must recognize not only that every attempt to understand and welcome the religious other is necessarily inadequate but also that one has a responsibility to try to be more understanding, more welcoming, and more accepting of religious difference as such (cf. Derrida 2002). Importantly for Derrida, it also entails a recognition that one stands in a similar relation to the other as one does to the divine. Recalling again the thought of Levinas (1989), Derrida stresses that “every other is wholly other,” and the religious individual or community bears the same obligation to the religious other as it does to God (Derrida 2008, p. 78).

5. Contributions from Feminism

Feminist theology and philosophy of religion have, perhaps surprisingly, turned their attention to religious pluralism only recently. Feminist scholars have long emphasized the need for greater diversity in both analytic and continental discussions, but this has often meant gender, ethnic, and socio-economic diversity rather than religious diversity. To the extent that feminist perspectives have been applied specifically to religious pluralism, though, arguments have emphasized the degree to which particular religions have been taken as homogeneous traditions without internal diversity (Gross 2002, p. 60). A similar criticism has been leveled against representations of femininity in certain pluralist discussions of religion – for instance, in Hick’s discussions of feminist theological critiques of traditional Christianity (Hick 2004, p. 52; cf. xxxviii). In either case, one can observe a skepticism concerning understandings of both religions and women as possessing relatively stable or universal identities, as such conceptions provide adequate pictures neither of historical realities nor of the full range of belief claims (and their implications). Indeed, they can often serve to conceal the roles, concerns, beliefs, and practices of not only women but also other power minorities within traditions. Insofar as they share a critical stance toward male-dominated traditions of thinking about religion generally and diversity specifically, feminist approaches to religious diversity may serve as points of contact between analytic and continental discussions.

The arguments that have emerged recently for explicitly feminist varieties of religious pluralism – or, conversely, for explicitly pluralist approaches to feminist theology – offer a variety of reasons to support the claim that feminism and religious pluralism are natural allies. For instance, many feminist philosophers and theologians argue that special attention needs to be paid not only to the experiences of people with diverse gender, ethnic, national, and economic identities, but also particularly to those whose experiences have traditionally been ignored or underrepresented within Western philosophical and theological traditions (cf. Gross 2002; Kwok 2002). If emphasizing the importance of such diversity is already a feminist value, then it stands to reason that inclusion of experiences from members of diverse religious traditions should also be valued. Furthermore, concepts, practices, and experiences arising from non-Western traditions may deserve special attention since they have traditionally been given less consideration in Western philosophical and theological discourse.

In addition, many feminist theologians take it as a central aim to find alternative scriptural sources or minority practices that can be used to critique, augment, or replace traditions that have historically excluded or undervalued women. This same inquiry into alternative sources and interpretations leads feminist theology toward interreligious dialogue, out of both the spirit of openness to difference and that of solidarity (Ruether 1987, p. 147). By extending the scope of critical feminist investigations across diverse religious traditions, the possibilities for finding constructive resources may be broadened (Gross 2002, p. 63). For example, many non-Christian traditions include devotions to goddesses, or (as discussed above) non-personal representations of the divine. Such practices and concepts can be helpful in critiquing the traditional masculine bias in Christianity, provided that the Christian theologian adopts a pluralist attitude toward other religious ideas. Of course, this is not to say that non-Christian traditions do not also contain patriarchal elements, and it is likewise the task of the feminist pluralist to analyze these critically (cf. Ruether 1987). Nevertheless, the main goal of much feminist theology and philosophy of religion is criticism and reinterpretation of one’s own tradition. In the context of religious diversity, this criticism would include not only fostering greater appreciation of the concepts and practices of other traditions as such, but, particularly, adopting a charitable attitude when approaching elements of other traditions that may at first glance seem anti-feminist. For example, norms of female dress that may seem excessively modest or the practice of arranged marriage may appear to limit women’s individual freedom, but upon further study could be seen as protecting women from sexual objectification (Gross 2002, p. 69).

Another important feminist contribution to religious pluralism is the critique of conceptions of particular religious traditions as possessing single, uniform identities. Perhaps the most direct and substantial of such critiques is offered by Jeannine Hill Fletcher (2005), who argues that describing religious traditions according to their specific differences and then identifying their members according to such distinctions is misleading in several ways. For one, this approach ignores the internal diversity of religious traditions; members of the same religion differ from one another in gender, ethnicity, profession, nationality, wealth, and so forth, and these differences may lead to drastically different experiences of the “same” tradition. Also, the religious identity of a single individual or community always intersects various other identities, all of which are informed by social, cultural, and geographical locations and particular experiences and behaviors. Perhaps most importantly for her understanding of religious pluralism, Fletcher also contends that such intersectional religious identities are always hybrid, by which she means that all identities are formed in relation with multiple other identities. Because of this, no identity is absolutely distinct and no difference completely precludes any communication and understanding. Dialogue and mutual understanding between members of diverse religious traditions is possible because, in the complex mesh of relationships out of which different religious identities emerge, possibilities already exist for building solidarity with one another. Furthermore, the particular points of contact and the character of the mutual understanding and solidarity that result from interreligious discourse cannot be determined beforehand, because one cannot know before actually engaging with the other what similarities and differences one will encounter. Thus, from Fletcher’s perspective, feminist approaches to religious pluralism must be characterized by attention to the details of concrete engagements between individuals or communities, rather than abstract conceptions of religious identities.

6. Process Philosophy

One important approach to religious pluralism that is not covered in the discussions of analytic and continental perspectives above is that of process philosophy. Drawing primarily on the work of Alfred North Whitehead and Charles Hartshorne, process philosophies of religion highlight the potential for novelty and creativity in the world. Since it is constitutive of process philosophy to hold that becoming and change are ontologically fundamental, process philosophers of religion tend to reinterpret, downplay, or deny the idea of an immutable divine reality. Process approaches also emphasize complexity and difference as inherent aspects of all levels of reality, so a pluralist approach to religious diversity would seem to follow naturally from their metaphysical commitments.

Another commitment that serves a central role for many process philosophers is the commitment to the naturalistic worldview of modern science. This is not to be equated with a strict positivism in which hard scientific inquiry is the only route to knowledge, but rather with the attitude that supernatural explanations that run counter to the presuppositions and conclusions of scientific knowledge – that is, explanations that depend on the interruption of natural causal processes by a deity or other supernatural force – should be abandoned as untenable. Griffin argues that although such naturalism is often combined with atheism and ontological materialism, it is not a necessary relation (2005, p. 13). Accepting such naturalism as part of the basis for a pluralist approach to religious diversity may create difficulties for process positions, though, insofar as one cannot presume it to be a shared assumption of various contemporary religious perspectives. On the other hand, it provides an ontologically minimalist and relatively objective framework from which members of different religious traditions can, at least in principle, begin dialogue. It can also serve as the basis for an argument for epistemic humility in a similar vein to that of Hick – though with crucially different ontological claims.

In both Hick’s pluralistic hypothesis and process pluralism, ontological claims, religious experience, and devotional practices are understood to have a real referent. The first distinction between these two positions, though, is that in the former the Real is posited as a single ultimate referent while in the latter different claims or representations can be taken to have distinct ultimate referents. John Cobb, for instance, argues in particular that personal and impersonal representations of the divine are difficult to reconcile as simply different concepts of the same noumenal reality, and that attempting such a reconciliation would end up revealing very little about either tradition from which the concepts emerge or the Real to which they point (cf. Griffin 2005, p. 47). Instead, Cobb posits that ultimate reality must in itself be unimaginably complex. The various perspectives on this reality found among diverse religious traditions are then not simply different ways of representing the same ultimate truth, but indeed distinct ways of representing different aspects of the complex totality of this truth. Impersonal representations signify one basic aspect of the Real, while personal representations signify a fundamentally different basic aspect.

Cobb’s claim rests on a broader metaphysical hypothesis that involves a plurality of ultimate realities, or at least ultimate aspects of reality: a personal deity or supreme being to which personal representations of the ultimate refer, an impersonal creativity to which impersonal representations refer, and the universe itself understood as the totality of all finite beings, to which nature-centered religious traditions refer. Given this basic plurality, the different concepts and experiences found in different religious traditions can be taken to be equally valid while retaining even those differences that appear to be mutually contradictory. In order to reconcile such contradictions, conflicting claims can be understood essentially as answers to different questions about the nature of reality (Griffin 2005, p. 48). If this is the case, then claims that at first seem conflicting can be reinterpreted as complementary. The work of pluralistic dialogue would be synthetic, placing claims from different traditions alongside each other to attempt a deeper understanding of the multiform nature of the ultimate.

The aim of such a synthetic approach, though, would not be to construct a new perspective that would incorporate claims from various religious traditions into a larger system, but rather to provide a context in which members of one tradition can both learn about and appreciate the value of other traditions and meaningfully reflect on their own beliefs. Between religions that focus on the same ultimate reality (for instance, the Abrahamic traditions, theistic Indian traditions, and others that posit personal deities), pluralistic dialogue may primarily motivate reflection on and purification of one’s own concepts. Between religions that focus on different ultimates (as in Buddhist-Christian dialogue, the example to which much of Cobb’s work is devoted), dialogue may serve to broaden and enrich the perspective of each participant.

While process approaches tend toward the kind of deep pluralism examined so far, there are exceptions. The differences have to do in large part with the way that one’s ontological commitments are articulated. Cobb’s position is grounded in a Whiteheadian metaphysics that affirms plurality and complexity all the way down; however, Schubert Ogden proposes a position that, ontologically speaking, more closely resembles Hick’s in that Ogden affirms that ultimate reality must be conceived as singular. Following Hartshorne more closely than Whitehead, Ogden maintains that ultimate reality must in itself have a single structure, and he identifies this primarily with the Christian concept of God (1992, p. 47). Proceeding from this basis, it is still possible to affirm the possibility of a plurality of valid religious perspectives, insofar as complete knowledge of the ultimate structure of reality lies beyond the grasp of human experience. However, it seems to be more difficult to proceed to affirmation of the actuality of such pluralism, unless that affirmation takes the general form of Hick’s hypothesis. Alternatively – and this is the position that Ogden seems to support – a process philosophy that posits a singular ultimate reality may support an epistemological pluralism (that is, a rejection of the idea that one can know that there is not more than one valid religious perspective) joined with a broader religious inclusivism in which only that within others’ traditions which is “substantially” consonant with one’s own religious commitments is recognized as true (Ogden 1992, p. 102).

7. Liberation Perspectives

Liberation theology, which advocates a religious duty to aid those who are poor or suffering other forms of inequality and oppression, has had a significant influence on recent discussions of pluralism. The struggle against oppression can be seen as providing an enterprise in which members of diverse religious traditions can come together in solidarity. Paul F. Knitter, whose work serves as a prominent theological synthesis of liberation and pluralist perspectives, argues that engaging in interreligious dialogue is part and parcel of the ethical responsibility at the heart of liberation theology. He maintains not only that any liberation theology ought to be pluralistic, but also that any adequate theory of religious pluralism ought to include an ethical dimension oriented toward the goal of resisting injustice and oppression.

Knitter claims that, if members of diverse religions are interested (as they should be) in encountering each other in dialogue and resolving their conflicts, this can only be done on the basis of some common ground. Yet in contrast to a position such as Hick’s, which posits a common noumenal reality toward which different religious traditions are oriented but which none can definitively represent, Knitter thinks this common ground needs to be neither transcendent nor already existing. In fact, the most meaningful interreligious encounters can spring from constructing shared responses to particular situations. What is necessary is that such responses react to experiences or phenomena that are more or less universal, and suffering is just such a universal phenomenon. It provides a common cause with which diverse religious traditions are concerned and towards which they can come together to craft a common agenda. Particular instances of suffering will, of course, differ from each other in their causes and effects; likewise, the practical details of work to alleviate suffering will almost necessarily be fleshed out differently by different religions, at different times and in different places. Nevertheless, Knitter maintains that suffering itself is a cross-cultural and universal phenomenon and should thus serve as the reference point for a practical religious pluralism. Confronting suffering will naturally give rise to solidarity, and pluralist respect and understanding can emerge from there.

Knitter does not limit his argument only to confrontations with suffering that fall within the scope of human ethics or politics. He extends his claim to encompass the entire earth, insofar as this is the shared horizon within which not only humans but all creatures coexist. Earth not only serves as a common physical location for all religious traditions, but it also provides these traditions with what Knitter calls a “common cosmological story” (1995, p. 119). That is, the earth is the focal point for our modern scientific knowledge of the cosmos, and this knowledge has become so all-encompassing that it now unavoidably places the entirety of creation alongside humanity in its narrative. On this basis, Knitter makes a case that different religious traditions share an ecological responsibility and that awareness of this shared responsibility, as it continues to emerge, can also serve as a basis for mutual understanding.

A few possible criticisms of Knitter’s liberationist view are worth mentioning. First, while it may be difficult to dispute that suffering in some form or other is actually a universal phenomenon, it is not apparent that the particular forms of suffering that arise in particular circumstances bear enough commonality to ground the kind of deep, cross-cultural, and interreligious solidarity that Knitter maintains they will. At least, it seems to be going too far to claim that such solidarity will arise automatically. Similarly, while it may be reasonable to presuppose that religious communities will respond to suffering in solidarity with those who suffer (or to prescribe that they should do so), this certainly is not and has not been the case universally. One can all too easily point out the many instances of one religious community suffering at the hands of another. Even in cases where members of multiple religions agree that a particular instance of suffering ought to be alleviated, it is conceivable that different religions will respond to the same instance of suffering in different, conflicting ways. While some form of justice is a central value in most religious traditions, the ways in which such a value is understood and practiced can vary considerably. By basing a pluralist approach on solidarity in response to suffering, one runs the risk of not giving sufficient attention to the diversity of actual forms justice can take. As Heim points out, treating others as we would like to be treated does not necessarily equate to treating others as they would like to be treated, and it is not obvious that there is an objective standard by which one’s response to suffering can be evaluated (2006, p. 96). It would then perhaps benefit advocates of the liberationist version of pluralism to posit that even justice is a concept capable of critical examination and reinterpretation in the context of interreligious dialogue (Suchocki 1987, p. 159).

With regard to Knitter’s appeal to a common cosmological story, it is unclear to what degree such a narrative is actually held in common, especially among diverse religions. While cosmological narratives grounded in modern scientific knowledge are certainly widely accepted, there just as certainly remain communities of those who reject such narratives – in many cases, precisely for reasons having to do with religious belief and practice. If it is still the case that some religious traditions do tell largely different stories about the world, it seems problematic to take it as given that “the earth” or “the universe” forms the basis for a common cosmological narrative. Knitter’s optimism regarding an emerging awareness of a shared ecological responsibility might thus be premature.

One possible way of addressing these concerns is to build a liberationist approach to pluralism solely on the basis of particular, concrete struggles rather than the idea of a universal struggle against suffering in general. In a particular situation, the needs of an oppressed group can be addressed on their own terms, and the responses offered by religious communities already involved in the situation can serve as a starting point for fighting injustice and working toward liberation. Of course, this is only a starting point, since the responses of communities already engaged may be found wanting either by those who are suffering or by other communities who enter the struggle later. A pluralist approach would maintain that such involvement by religious outsiders be held open as a possibility; this is the function of the liberationist appeal to solidarity in struggle. The religious outsider may be motivated to work on behalf of the oppressed by commitments that differ from those of the oppressed, though, and the pluralist would hold that these differences ought to be respected. At the same time, the situation of the particular struggle provides a concrete context within which members of different religious communities can achieve better understanding of each other in their difference from each other. Any solidarity and mutual respect achieved would, in this account, be contingent and perhaps not easily transposable to other particular contexts – though this remains a possibility.

8. Conclusion

Religious pluralism, understood as a broad category of philosophical and theological responses to religious diversity, aims to account for this diversity as a positive phenomenon and to articulate ways that religious differences can be celebrated and conflicts mitigated, explained, or at least reasonably discussed. Pluralist positions can vary according to one’s understanding of religion (for example, whether it is taken primarily to consist of epistemic content, culturally constructed discursive practices, or salvation-oriented behavior), as well as according to one’s ultimate goal in articulating a position (for example, clarifying philosophical concepts of religion, or effecting social and political change such as in liberation theology). While there are significant differences in pluralist approaches evident in analytic and continental philosophy, there is also significant overlap in the content of arguments belonging to these traditions.

Major nineteenth- and early twentieth-century accounts of religion provide important precursors to religious pluralism, though they are largely not pluralist according to the strict sense of the term but rather exclusivist or inclusivist. Religious pluralism as a distinct philosophical and theological position has emerged more recently, and in its various forms it both draws on and is critical of these earlier accounts. Pluralism, of course, continues to be debated. It faces external challenges from exclusivists and inclusivists as well as religious anti-realists and relativists, and its various arguments are contested internally by those who argue that it concedes too much or that it has not yet become pluralist enough.

9. References and Further Reading

  • Alston, William. “Religious Diversity and the Perceptual Knowledge of God.” Faith and Philosophy 5.4 (1988): 433-448.
    • Offers an exclusivist argument that it is not irrational to continue to form and hold socially established religious beliefs (that is, those belonging to this or that religious tradition), even though the fact of religious diversity (particularly epistemic conflict) may decrease confidence in one’s beliefs.
  • Alston, William. Perceiving God: the Epistemology of Religious Belief. Ithaca, NY: Cornell UP, 1993.
    • A detailed (and significant) examination of the perception of God and the way in which such perception can provide grounds for religious belief. Alston’s response to the problem of diversity as laid out in his earlier article is included here, in the context of a more thorough discussion of its epistemological presuppositions.
  • Basinger, David. Religious Diversity: A Philosophical Assessment. Burlington, VT: Ashgate, 2002.
    • An overview and critical analysis of a variety of issues and perspectives regarding religious diversity understood primarily as epistemic conflict.
  • Bogardus, Tomas. “The Problem of Contingency for Religious Belief.” Faith and Philosophy 30.4 (2013): 371-392.
    • A criticism of the argument (particularly as advanced by John Hick and Philip Kitcher) that many religious beliefs are held due to contingent factors irrelevant to the truth or warrant of the beliefs. Bogardus attempts to show that several common forms of this argument rely on invalid logical inferences.
  • Byrne, Peter. Prolegomena to Religious Pluralism: Reference and Realism in Religion. New York: St. Martin’s, 1995.
    • Covers ontological, epistemological, and semantic aspects of religious pluralism, arguing that pluralism is plausible given a realist account of religious language.
  • Dean, Thomas, ed. Religious Pluralism and Truth: Essays on Cross-Cultural Philosophy of Religion. Albany: SUNY, 1995.
    • Contains essays addressing religious diversity and pluralism from a variety of perspectives, discussing questions of truth criteria, dialogue, and interpretation. Several essays take up the hermeneutic model of pluralism.
  • Derrida, Jacques. “Hospitality.” Acts of Religion. New York: Routledge, 2002.
    • Provides a sustained deconstructive account of hospitality as it relates to religion, with a lengthy analysis of the inter-religious work of Louis Massignon.
  • Derrida, Jacques. The Gift of Death. 2nd ed. Chicago: U of Chicago P, 2008.
  • Derrida, Jacques and Anne Dufourmantelle. Of Hospitality. Stanford, CA: Stanford UP, 2000.
    • Contains a deconstructive account of hospitality broadly construed, as well as more specifically in political and legal contexts, with reference to Kant.
  • Esack, Farid. Qur’an, Liberation, and Pluralism. Rockport, MA: Oneworld Publications, 1997.
    • A liberation-theological account of pluralism from a Muslim point of view, set in the context of the South African struggle against apartheid.
  • Feldman, Richard. “Reasonable Religious Disagreements.” Philosophers without God. Ed. Louise M. Antony. New York: Oxford UP, 2007.
    • A defense of the skeptical view that, in light of the diversity of conflicting religious belief claims, no one is justified in holding any set of such beliefs as true.
  • Fletcher, Jeannine Hill. Monopoly on Salvation? A Feminist Approach to Religious Pluralism. New York: Continuum, 2005.
    • A feminist theological perspective on religious pluralism, emphasizing the intersectionality and hybridity of religious identities.
  • Gadamer, Hans-Georg. Truth and Method. 2nd revised ed. New York: Continuum, 2002.
    • The classic foundational text for twentieth-century hermeneutic philosophy.
  • Gellman, Jerome. “Religious Diversity and the Epistemic Justification of Religious Belief.” Faith and Philosophy 10.3 (1993): 345-364.
    • A defense of an exclusivist position concerning “evidence-free” belief in the face of religious diversity, on the basis of the foundational nature of at least some religious beliefs.
  • Gellman, Jerome. “In Defence of a Contented Religious Exclusivism.” Religious Studies 36.4 (2000): 401-417.
    • An extension and further defense of Gellman’s exclusivist position against more recent criticisms.
  • Griffin, David Ray, ed. Deep Religious Pluralism. Louisville, KY: Westminster John Knox, 2005.
    • A collection of essays articulating process approaches to religious pluralism, including contributions by and about John Cobb.
  • Griffiths, Paul J. An Apology for Apologetics. Maryknoll, NY: Orbis, 1991.
    • An argument that religious pluralism implies obligations for both “negative” and “positive” apologetic arguments concerning one’s own religious commitments.
  • Gross, Rita M. “Feminist Theology as Theology of Religions.” The Cambridge Companion to Feminist Theology. Ed. Susan Frank Parsons. New York: Cambridge UP, 2002.
    • An argument that feminist theology ought to incorporate more religious diversity and that pluralist philosophy and theology ought to engage more with feminist perspectives.
  • Hegel, G. W. F. Lectures on the Philosophy of Religion. Ed. Peter C. Hodgson. 3 vols. New York: Oxford UP, 2007.
    • Hegel’s monumental treatment of the concept and history of religion; this edition gathers three versions of his lectures, given in 1824, 1827, and 1831. Volume one focuses on the concept of religion, volume two on the various forms it has taken historically, and volume three on what Hegel calls the “Consummate Religion” (that is, Christianity).
  • Heim, S. Mark. Salvations: Truth and Difference in Religion. Maryknoll, NY: Orbis, 2006.
    • A criticism of the versions of pluralism offered by Hick, Knitter, and Wilfred Cantwell Smith, and a pluralist account that posits a diversity of ultimate religious ends.
  • Hick, John. An Interpretation of Religion. 2nd ed. New Haven, CT: Yale UP, 2004.
    • Hick’s classic account of the “pluralistic hypothesis”; his introduction to the second edition contains responses to some of the criticisms his arguments have received.
  • Hick, John and Paul F. Knitter, eds. The Myth of Christian Uniqueness. Maryknoll, NY: Orbis, 1987.
    • An important early collection of essays by Christian theologians advocating pluralistic approaches to religious diversity.
  • James, William. The Varieties of Religious Experience. New York: Routledge, 2002.
    • The influential account of religion in terms of diverse personal experiences and emotions.
  • Kant, Immanuel. Religion within the Bounds of Bare Reason. Trans. Werner S. Pluhar. Indianapolis: Hackett, 2009.
    • Kant’s classic formulation of religion as properly concerning human morality, positing religious diversity as historically inevitable but nevertheless inessential.
  • Kearney, Richard and James Taylor, eds. Hosting the Stranger: Between Religions. New York: Continuum, 2011.
    • A collection of essays approaching religious pluralism in terms of hospitality, from a variety of religious traditions.
  • Kitcher, Philip. “Challenges for Secularism.” The Joys of Secularism. Ed. George Levine. Princeton, NJ: Princeton UP, 2011.
    • An argument for the necessity of developing secularism as a positive doctrine and way of life, which posits that the primary challenge secularism makes against religious belief is the argument from the epistemic symmetry of conflicting beliefs.
  • Knitter, Paul F. One Earth, Many Religions. Maryknoll, NY: Orbis, 1995.
    • A significant liberation-theological approach to religious pluralism from a Christian perspective.
  • Kwok, Pui-Lan. “Feminist Theology as Intercultural Discourse.” The Cambridge Companion to Feminist Theology. Ed. Susan Frank Parsons. New York: Cambridge UP, 2002.
    • A review of the ways that feminist theology has been influenced both by the particular experiences of women in diverse cultures and by dialogue among these cultures. The chapter also offers a critique of Eurocentric tendencies in feminist theology and a plea for greater attention to injustices exacerbated by globalization.
  • Levinas, Emmanuel. “Ethics as First Philosophy.” The Levinas Reader. Ed. Seán Hand. Cambridge, MA: Blackwell, 1989.
    • A summary of Levinas’s central philosophical argument, which aims to reorient earlier phenomenological and hermeneutic positions (particularly those of Edmund Husserl and Martin Heidegger) towards greater concern for ethics.
  • Lindbeck, George A. The Nature of Doctrine. Louisville, KY: Westminster John Knox, 1984.
    • An influential account of religion and theology that articulates two distinct models: the experiential-expressive and the cultural-linguistic. Chapter three deals particularly with the issue of religious diversity.
  • Mavrodes, George I. “Polytheism.” The Rationality of Belief and the Plurality of Faith: Essays in Honor of William P. Alston. Ed. Thomas D. Senor. Ithaca, NY: Cornell UP, 1995.
    • A critical reading of Hick’s “pluralistic hypothesis” that argues that this hypothesis demonstrates a “descriptive polytheism,” despite Hick’s positing the unity of the Real.
  • McKim, Robert. On Religious Diversity. New York: Oxford UP, 2012.
    • An examination and critical appraisal of a variety of approaches to religious diversity, focusing on epistemological and soteriological concerns and ultimately advocating a version of inclusivism.
  • Ogden, Schubert M. Is There Only One True Religion or Are There Many? Dallas: SMU P, 1992.
    • A process-theological account of pluralism that challenges the idea, put forward by other process thinkers, that ultimate reality must be plural in order for religious pluralism to be a tenable position.
  • Plantinga, Alvin. “Pluralism: A Defense of Religious Exclusivism.” The Rationality of Belief and the Plurality of Faith: Essays in Honor of William P. Alston. Ed. Thomas D. Senor. Ithaca, NY: Cornell UP, 1995.
    • As the title suggests, a defense of a version of exclusivism against charges that it is either epistemologically or morally objectionable.
  • Plantinga, Alvin. Warranted Christian Belief. New York: Oxford UP, 2000.
    • An important and extensive epistemological examination of the idea of warrant: that is, that quality of a belief that is accorded to it due to the proper function of cognitive faculties. The book includes consideration of possible “defeaters” of theistic (and specifically Christian) belief, including religious diversity. Plantinga maintains that a version of exclusivism remains tenable even given such diversity.
  • Quinn, Philip L. “Toward Thinner Theologies: Hick and Alston on Religious Diversity.” Essays in the Philosophy of Religion. New York: Oxford UP, 2006.
    • Provides critical but mostly positive accounts of Hick’s and Alston’s respective positions, arguing that a revised version of Hick’s hypothesis ought to be considered as rational within the framework of Alston’s approach.
  • Quinn, Philip L. and Kevin Meeker, eds. The Philosophical Challenge of Religious Diversity. New York: Oxford UP, 2000.
    • A wide-ranging collection of (mostly analytic-philosophical) essays that presents a variety of exclusivist, inclusivist, and pluralist accounts of religious diversity.
  • Race, Alan. Christians and Religious Pluralism. London: SCM P, 1983.
    • The Christian theological account of pluralism that introduced the categories of exclusivism, inclusivism, and pluralism.
  • Ricoeur, Paul. “Religious Belief: the Difficult Path of the Religious.” Passion for the Possible. Eds. Brian Treanor and Henry Isaac Venema. New York: Fordham UP, 2010.
    • One version of Ricoeur’s account of interreligious dialogue as translation.
  • Rowe, William. “Religious Pluralism.” Religious Studies 35.2 (1999): 139-150.
    • A criticism of Hick’s “pluralistic hypothesis” focused on Hick’s claim that, given pairs of contradictory properties, the Real in itself does not possess either one.
  • Ruether, Rosemary Radford. “Feminism and Jewish-Christian Dialogue.” The Myth of Christian Uniqueness. Ed. John Hick and Paul F. Knitter. Maryknoll, NY: Orbis, 1987.
    • An examination of possibilities for feminist pluralism in the context of, on the one hand, Jewish-Christian dialogue and, on the other hand, particular feminist criticisms of each of these two traditions.
  • Runzo, Joseph. “God, Commitment, and Other Faiths: Pluralism vs. Relativism.” Faith and Philosophy 5.4 (1988): 343-364.
    • An articulation and defense of religious relativism as an alternative to pluralism.
  • Schellenberg, J. L. “Religious Experience and Religious Diversity: a Reply to Alston.” Religious Studies 30.2 (1994): 151-159.
    • An argument that Alston’s response to the problem of diversity is unsuccessful and that, in cases of significant epistemic conflict, justification for religious belief is removed.
  • Schleiermacher, Friedrich. On Religion: Speeches to its Cultured Despisers. Ed. Richard Crouter. New York: Cambridge UP, 1996.
    • The influential account of religion that distinguishes it from both metaphysics and morality, and grounds it in an individual intuition of the absolute. The fifth and last speech concerns religious diversity; Schleiermacher argues that such diversity is necessary.
  • Suchocki, Marjorie Hewitt. “In Search of Justice: Religious Pluralism from a Feminist Perspective.” The Myth of Christian Uniqueness. Ed. John Hick and Paul F. Knitter. Maryknoll, NY: Orbis, 1987.
    • An argument that feminist theology, out of its commitment to work against oppression, must affirm a pluralist perspective. The author draws on liberation theology to posit justice as the foundation on which interreligious dialogue can happen.
  • Tracy, David. Dialogue with the Other: the Inter-Religious Dialogue. Grand Rapids, MI: W. B. Eerdmans, 1991.
    • A hermeneutical approach to religious pluralism primarily from a Christian point of view, though it incorporates perspectives from other traditions.
  • Westphal, Merold. “The Politics of Religious Pluralism.” The Proceedings of the Twentieth World Congress of Philosophy 4 (1999): 1-8.
    • A brief critique of the pluralist approach to religious diversity, which argues that ethical and political norms found in religious traditions – particularly the commitment to non-violence – can help dissociate the cognitive content of religious belief claims from “religiously sanctioned” violence.
  • Wittgenstein, Ludwig. Lectures and Conversations on Aesthetics, Psychology, and Religious Belief. Berkeley, CA: U of California P, 1966.
    • Wittgenstein’s lectures on religious belief have been influential on positions that argue that religious diversity is not primarily an epistemic or cognitive matter, especially the cultural-linguistic model favored by Lindbeck and the aspect of Hick’s argument that focuses on practice rather than belief.

 

Author Information

Michael Barnes Norton
Email: mbnorton@ualr.edu
University of Arkansas at Little Rock
U. S. A.

Philosophy of Biology

Philosophy of biology is the branch of philosophy of science that deals with biological knowledge. It can be practiced not only by philosophers, but also by scientists who reflect on their own work. The distinctive mark of philosophy of biology is the effort to achieve generalizations about biology, to varying degrees. For instance, philosophy of biology makes biology relevant to classic issues in philosophy of science such as causation and explanation, chance, progress, history, and reductionism. It also works to characterize how knowledge is acquired and modified in different areas of biology, and sometimes to clarify the criteria that demarcate science from non-science.

Philosophy also performs constructive criticism of biology. For example, it has an important role in analyzing cases of “naturalization,” in which science becomes able to study issues that traditionally were the exclusive domain of philosophy. The life sciences and their objects are changing and growing exponentially. A challenge for philosophy of biology is thus to keep pace, not only with new knowledge modifying long-standing ideas (for example, the “Tree of Life”), but also with new scientific practices and unprecedented kinds of data. Accordingly, philosophy of biology is constantly provoked into shifting its own methods and attention. In some cases, philosophy of biology can help the life sciences reach their goals by means of conceptual analysis, linguistic analysis, and epistemological analysis.

Hybridizations and intersections between scientific fields are particularly conducive to philosophical considerations. Contemporary examples are ‘EvoDevo’ (the recent integration between development and evolution) and ‘cultural evolution’ (an approach to cultural change inspired by evolutionary biology). Theses and analyses of philosophy of biology are often entwined with history of biology and with the history of evolution. Finally, philosophy of biology can elaborate messages and general views out of biology, and has a crucial role in caring for how science is publicly interrogated and communicated.

Table of Contents

  1. Introduction
  2. General Issues in Philosophy of Biology
    1. General Problems in Philosophy of Science, as Seen in Biology
    2. A General Picture of Biology
    3. A General Picture of Science
    4. Generalization as a Possible Distinctive Feature of Philosophy with Respect to Biology
  3. Philosophy Flanking Biology
    1. Clarifying Taxonomy, Classification, Systematics, Phylogeny, Homology
    2. Formulating Natural Selection
  4. Who Can Do Philosophy of Biology?
    1. Philosophical Biologists
      1. Mayr and Population Thinking
      2. Gould and Adaptationism
    2. Philosophical Issues Naturalized
      1. An Example: The Biology of Morality
      2. Philosophy Versus Naturalization?
  5. Philosophy Bringing the Life Sciences out of Their Research Context
    1. Philosophy of Biology at Intersections
    2. Biology’s Critical Friend
    3. Developing Messages from Biology
  6. Scientifically Up-to-Date Philosophy
    1. Questioning Influential Ideas
    2. Understanding New Scientific Practices
    3. Rethinking the Philosophical Approach from New Ways of Doing Science
  7. History and Philosophy of Biology
  8. Conclusion
  9. References and Further Reading
    1. Cited Examples
    2. Classics
      1. First Generation
      2. Second Generation
    3. Contemporary
      1. Reviews
      2. Some Monographs
    4. Anthologies and Textbooks
    5. Journals
      1. Dedicated
      2. Generalist
    6. Organizations
    7. Online Resources

1. Introduction

According to several reconstructions of the history of philosophy of biology, the field emerged gradually in the 1960s with a first generation of self-identified philosophers of biology, especially Morton Beckner, David Lee Hull, Marjorie Grene, Kenneth Schaffner, Michael Ruse, and William C. Wimsatt. Some philosophers explain this branching of philosophy of science by the decline of logical positivism in the 1960s and 1970s. For others, logical positivism did not actually decline, and in any case it had never suppressed philosophy of biology (Callebaut 1993). At times, the ‘official’ chronology is questioned. For Byron (2007), a bibliometric analysis shows that philosophy of biology proper was already present in early philosophy of science, from the 1930s onward. The most quoted philosopher in this article is David Lee Hull (1935-2010), an uncontroversially important figure in the founding generation of philosophers of biology. His meta-reflective papers “What philosophy of biology is not” (1969) and “Recent philosophy of biology” (2002) are particularly useful.

Philosophy of biology became a professional subdiscipline from the mid-1970s, with a ‘second generation’ of philosophers, the most cited being Ronald Amundson, John Beatty, Robert N. Brandon, Richard Burian, Lindley Darden, David J. Depew, John Dupré, James R. Griesemer, Philip Kitcher, Elisabeth A. Lloyd, Alexander Rosenberg, Elliott Sober, and Bruce H. Weber. Some of them were experienced philosophers who progressively shifted to biological issues. The first journal partially devoted to philosophy of biology, History and Philosophy of the Life Sciences, began publication in 1979, and in the mid-1980s the discipline was fully established. Specialized journals flourished. In the early 2000s, a growing number of scholars, institutions, and journals specialized in philosophy of biology, and the discipline gained more and more room in scientific books, journals, and conferences (see the resources at the end of the article).

As we shall see, philosophy of biology provides accounts of biological knowledge, asking: how are explanation, causation, evidence, and other epistemological primitives elaborated in the explanations that are typical of biology, such as natural selection, genetic drift, and homology? Does biology differ from other sciences? How? And how do we understand the epistemological diversity across different branches of the biological sciences? Philosophy of biology also considers whether biology may contribute to redefining classical demarcations of science from other forms of knowledge and human creation.

Philosophy of biology can be seen as a possible aid for scientific advancement in the life sciences. Contributions of philosophers were widely appreciated by scientists, for example, in the areas of classification, taxonomy, and related activities, and in the abstract formulation of natural selection in the development of biology after Darwin. Scientists themselves may reflect philosophically on their own field of research, justifying and correcting their practices, or denouncing biases and transformations in their own community. Concepts, such as ‘adaptation’ or ‘species,’ are underlain by complex, inferential structures that can be revealed and sometimes criticized by philosophical analysis. Multiple and conflicting meanings may be uncovered and systematized to help the progress of science and to develop more general messages.

The phenomena studied by biology make this science particularly sensitive and interesting for philosophy. Humans are organisms, and quite a few fields of biology have potential or direct implications for our self-understanding. Since the 1970s, for example, interesting philosophical debates have stemmed from the provocative proposal of a ‘sociobiological synthesis’; such a synthesis claims to provide evolutionary explanations for human prosocial (and anti-social) behaviors that were traditionally covered by ethics. Philosophy overcame merely self-defensive attitudes, and its important role lay in epistemological analysis and in deep reflections on the limits and conditions of naturalization, which may be understood as the transition of a problem into the domain of empirical science. Neurobiology offers a particularly fertile ground for reflections about how human phenomena can be related to, or even explained by, biology. And how should a philosophical field like moral philosophy take biology into account? (For more on the topic of the naturalization of morality, for example, see ethics.)

Philosophy of biology may study and support the interaction among different life sciences, as in the case of evolutionary developmental biology, where workers claim to be reuniting genetics and evolution with embryology, recomposing a historical divide in biology. How do different research traditions integrate or replace each other? This question sheds new light on classic issues such as progress and scientific change. Philosophy of biology also monitors the natural hybridization of biology with extra-biological fields, such as cultural transmission, and enriches the debate among scientists, where extreme positions often emerge: does biology offer more rigorous methods to replace the failing methods of the social sciences? Are we facing, instead, a case of mutual inspiration? Or methodological integration? Which reciprocal prejudices are well-grounded? And how can they be overcome for fruitful scientific collaborations?

Philosophy of biology also has a mandatory critical role towards biology. For example, it can unveil the progressionist, anthropomorphic, and anthropocentric biases that affect scientists as human beings who live immersed in a society and in a cultural environment. Critical attention must be particularly high when scientific classifications of humans (for example, through measures such as IQ or ethnicity) may be used to justify and increase social discrimination, segregation, or oppression.

Philosophy of biology may also develop ways of thinking out of biological research, providing an inspiring and readable encompassing view of the living world that will hardly be found in any standard scientific publication. Furthermore, philosophy of biology is called upon to work on the interface between science and society, helping to identify both the common misunderstandings and the best strategies for citizens to become conscious and informed, as they are called to decide what kind of research and intervention will be allowed or actively pursued by society.

It is hard for philosophy of biology to keep pace with the fast development of biological knowledge. But the effort of following the moving frontier of knowledge allows philosophy of biology to study the fall of influential ideas, such as the universal Tree of Life, and the rise of new scientific practices, such as intensive computer modeling. Philosophy also has the unsettling opportunity to constantly rethink its own approach, so as to avoid drifting too far from scientific practice and becoming detached. In this dynamic, philosophy of biology is also well integrated with history of science, so that it is often hard to distinguish between the two. An analysis of the relationship between molecular biology and Mendelian genetics, for example, is intertwined with the historical account of the birth and early development of molecular biology in the 1950s. In turn, the philosophical framing of genetics and developmental biology as either ontology-based disciplines or research styles radically transforms the way in which the history of the two fields is told.

Philosophy of biology belongs to philosophy; therefore, no fixed procedure or protocol constrains its research (what is philosophy?). Philosophy of biology consists in free and critical, although rigorous and informed, thought on biological knowledge as the latter develops through time. However, as a mature and recognized field with its own interconnected practicing community, philosophy of biology seems to feature some methodological principles:

  • Philosophy of biology is supposed to be scientifically informed and up-to-date, capturing how recent research modifies established knowledge and creates new scientific practices. In turn, these novelties transform philosophy’s problems and approaches, especially in the current explosive growth of biology.
  • Philosophy and biology are not always clearly distinct. Scientific work can routinely require, for example, conceptual or epistemological analysis. On the other hand, philosophy can turn out to be effective in setting up scientific research projects. However, philosophy can be characterized by its leaning towards generalities about biology, namely general philosophical problems, general characterizations of fields and approaches within biology, or conceptualizations of biology as a whole or even of science as a whole.
  • Philosophy of biology should try to be understandable and possibly useful to biologists. Its tools — conceptual analysis, epistemology, traditions of thinking and debates — should be put to use for improving scientific research.
  • Biologists can do philosophy of biology. This happens, for example, when they become interested in general features of biology and try to contribute with principles derived from their work or when they think about the inferential patterns employed by themselves and their colleagues. Also, scientists can do philosophy and speak to philosophy when particular objects of philosophical study, such as human morality, get naturalized (see below).
  • Philosophy of biology cares for working across disciplinary contexts. For example, it studies novel contacts between previously separated fields, develops general views of the living world from some aspect of the life sciences, or reveals complex connections between science and the socio-cultural context in which it is carried out. It also takes advantage of its knowledge for monitoring and assisting how science is publicly communicated and interrogated.
  • Philosophy of biology is increasingly seen as one piece with history of biology, since philosophical and historical theses are mutually necessary, and their results reverberate reciprocally.

These six methodological principles are usually tacit, but sometimes they are made explicit by philosophers of biology, who may also disagree on some of them. The principles will be presented here by means of exemplar studies. Any set of examples is inevitably partial and biased, since philosophy of biology is a huge field full of fascinating topics, growing exponentially along with biology. For a more complete picture, the interested reader will have to navigate the resources listed at the end of the article, such as philosophy of biology journals or programs of conferences such as the biennial meetings of the International Society for the History, Philosophy, and Social Studies of Biology, the main reference society of the field. A number of textbooks in philosophy of biology are available, often in the form of anthologies; these are also listed at the end for further reading.

Given the vastness of the philosophy of biology literature, this article can only indicate some of the main topics and the richness of discussions. The examples in this article are mainly focused on evolutionary biology. Evolutionists such as Ernst Mayr (1904-2005) and Stephen Jay Gould (1941-2002), two of the most influential authors, are extensively treated in this article though they are not universally representative. The predominance of evolution can be justified not only by the author’s specialization, but also by the fact that, as many philosophers of biology have critically stressed in recent times, evolutionary theory has long been the main target of philosophy of biology. Only in the last few decades has this situation changed radically (Müller-Wille 2007, Pradeu 2009). Philosophy of biology is already tackling an enormous range of topics in the most disparate fields, from biomedicine to community ecology, from neurobiology to microbiology and microbial ecology, and from chemistry and biochemistry to exobiology. To look for some specific area, the interested reader is, once again, encouraged to venture into the journals and online resources.

2. General Issues in Philosophy of Biology

Philosophers of science (though not always under this fairly recent name) have reflected for centuries on explanation, causation, correlation, chance, and many other general topics concerning science or knowledge in general. Important philosophers contributed to concepts like reduction vs. multiple realizability, and provided theories of explanation that describe what a scientific explanation is. During the second half of the 20th century, philosophy of science adopted a pluralistic strategy, considering the diversity of scientific disciplines and methods and striving to understand their differences along with their common aspects. With this pluralization, the complex task remained of finding a satisfying description of science as an endeavor which—unitary or not—is distinct from other forms of knowledge.

The study of living beings offers a universe of occasions for philosophy of science to advance the reflection. For instance, explanations by natural selection and by drift (see below, 2.a) can be seen as instances of causal explanations that nonetheless bring about new reflections and conceptual puzzles on the classic issues of causation, randomness, and ontology of processes. Homology explanations, another typical feature of biology, explain the properties or the variability of a biological character by citing the ancestral character or organ, and the causal factors that historically modify the descendants of that ancestral organ. In trying to account for biological sciences, philosophy of biology may take concepts from philosophy of science, such as causal explanation or reduction, and find new putative cases of them in the life sciences or locate failures of reduction in biology. Other times, philosophy of biology may need to tailor new concepts to accommodate biology. In fact, some kinds of explanation seem peculiar to biology or to historical sciences. While chasing the peculiarities of biology, philosophy of biology also has some general research goals among its aims:

  1. Often, philosophy of biology scours biology in search of new insights on general philosophical problems about science, such as the “problem of induction,” or the realism vs. instrumentalism dichotomy. Additionally, new general problems arise from the particular forms that explanation, causation, or reduction take in biology.
  2. Sometimes philosophy of biology seeks general characterizations of particular fields, practices, or ways of thinking within the life sciences. Other times, the goal seems to be a general picture of biology, especially by contrast to other sciences, such as classical mechanics.
  3. Sometimes philosophy of biology suggests general views of science: descriptions and characterizations of science with all the complexity, differentiation, and plurality that it exhibits in the contemporary world. In fact, the classical task of demarcating true science from other forms of knowledge has lost importance under the effect of philosophy of ‘special’ sciences like biology (Fodor 1974).

a. General Problems in Philosophy of Science, as Seen in Biology

Natural selection is a major biological explanation for the features of organisms. Inherited traits, originated by cumulative retention of random variation, are there because of the positive contribution they have brought to their bearers in past generations, in terms of survival and reproduction. Yet, the explanatory structure of natural selection is very complex, and implies a reflection on concepts like causality and randomness. In a classic book, Sober (1984) pointed out that only a few of the traits that get selected are in fact explanatory, specifically those traits that are selected for. Other traits are free riders that are somehow correlated with traits that are selected for and thus are preserved in the population without actively contributing to fitness. Thus, there is selection of such free rider traits that are not causally relevant to survival and reproduction. Hearts are positively selected; heartthrob is also selected, but not selected for; efficiency in pumping blood is selected for; the existence of heartthrob is thus explained by natural selection, but heartthrob is not explanatory per se; it undergoes selection of, not selection for. The idea of free riders on selection was already considered by Darwin, but philosophers of biology spelled out its consequences for the explanatory structure of natural selection. Causal relationships are the core of some theories of explanation. Sober proposed rethinking the idea of causality in light of evolutionary biology, and this is an example of how classic philosophical categories can be modified in their application to biology: “We must show that by considering evolutionary theory, old problems can be transformed and new problems brought into being. It remains to be seen, I think, how radically the philosophy of science will be reinterpreted” (Sober 1984: 7; see Matthen and Ariew 2009, Ramsey 2013).
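
The difference between selection of and selection for can be illustrated with a toy simulation. The following Python sketch is not taken from Sober; it assumes a hypothetical population in which a causally inert trait (`makes_sound`) always occurs together with a fitness-relevant trait (`pumps_well`), and the fitness values and starting frequencies are invented purely for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    pumps_well: bool    # hypothetical trait that affects fitness (selected for)
    makes_sound: bool   # hypothetical free rider: correlated, but causally inert

def fitness(v: Variant) -> float:
    # Fitness depends only on pumping efficiency; the sound is never consulted.
    return 1.2 if v.pumps_well else 1.0

def next_generation(freqs: dict) -> dict:
    # Deterministic update: each variant's share is reweighted by its fitness.
    mean_w = sum(p * fitness(v) for v, p in freqs.items())
    return {v: p * fitness(v) / mean_w for v, p in freqs.items()}

def trait_frequency(freqs: dict, trait: str) -> float:
    # Total frequency of all variants that carry the named trait.
    return sum(p for v, p in freqs.items() if getattr(v, trait))

def simulate(generations: int = 50) -> dict:
    # Assumption: in this toy population the two traits always occur together.
    freqs = {Variant(True, True): 0.1, Variant(False, False): 0.9}
    for _ in range(generations):
        freqs = next_generation(freqs)
    return freqs
```

Under these assumptions the frequency of `makes_sound` rises exactly as fast as that of `pumps_well`, even though `fitness` never reads it. There is selection of the sound, but selection for pumping efficiency.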

Randomness and chance are very important in biology. Natural selection is not random: “it requires randomness as its ‘input’…but the ‘output’ of natural selection is decidedly nonrandom, the differential survival and reproduction of the variants that are better adapted” (Rosenberg and McShea 2008: 21). Philosophers worked, for example, on the meaning of “random mutation,” a concept considered essential to Darwinism as opposed to Lamarckism (Merlin 2010). Random does not necessarily mean lacking deterministic causes. Rather, random mutation points to the fact that the usefulness of a trait in the environment where it appears is not among the causes of its appearance. The source of variation is thus more properly contingent with respect to fitness. An interesting line of reasoning and evidence points out that “evolutionary divergence is sometimes due to differences in the order of appearance of chance variations, and not to differences in the direction of selection” (Beatty 2010: 39).

Genetic drift is the random change of the frequencies of traits that are not under selection. The absence of selection makes the dynamic depend only on the reproduction mechanism, and although the fate of individual traits is not predictable, the overall landscape of frequencies is. Genetic drift has long been known in formal models, studied in the field, and used for evolutionary reconstruction: it is not only necessary, but also causally relevant to evolution. Yet, a lively philosophical debate exists on the ontology of drift and selection (Millstein 2002; Walsh et al. 2002). A major disagreement concerns the epistemic status of mathematical models: granted that drift is a necessary feature of mathematical models, what can be legitimately inferred about the existence of a process in the world to be called drift? A statistical interpretation sees both drift and selection as mathematical features of aggregates of individuals (Walsh 2007, Matthen 2009, 2010). Another point of view considers them as causal physical processes (Millstein 2006, Millstein and Skipper 2009). The notion of evidence as a way of choosing between alternative explanatory hypotheses, selection versus drift, for example, is another object of philosophical study. Some philosophers think scientists are more qualified to evaluate and weigh evidence (Hull 1969: 169). Others point out that it is up to philosophy to probe the explanatory limits, current and constitutional, of biology (Rosenberg and McShea 2008: 2). One successful approach to such a task is the technical account of evidence based on Bayesianism (Sober 2008). Bayes’ theorem belongs to the mathematics of probability theory. It relates the prior probability, the probability of a particular statement before the observation, to the posterior probability, the probability of the statement in the light of the observation. Bayes’ theorem is used by some schools of philosophers of biology in explicating various issues connected with evidence and confirmation.
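As a rough illustration of how prior and posterior probabilities are related, Bayes' theorem can be written P(H|O) = P(O|H) P(H) / P(O), where H is a hypothesis and O an observation. The sketch below applies it to a deliberately simplified choice between drift and selection. Everything specific in it is an assumption made for illustration and is not drawn from the works cited: a one-generation binomial sampling model, a hypothetical selection coefficient `s`, and the function names.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def posterior(priors, likelihoods):
    """Bayes' theorem over a set of mutually exclusive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(O), the normalizing constant
    return {h: joint[h] / evidence for h in joint}

def drift_vs_selection(k, n, p, s, prior_selection=0.5):
    """Posterior for 'drift' vs 'selection' after observing k copies of a
    variant among n offspring, given parental frequency p.

    Assumed toy model: under drift, offspring are sampled with probability p;
    under selection, the variant has relative fitness 1 + s.
    """
    p_selected = p * (1 + s) / (1 + p * s)
    likelihoods = {
        "drift": binomial_pmf(k, n, p),
        "selection": binomial_pmf(k, n, p_selected),
    }
    priors = {"drift": 1 - prior_selection, "selection": prior_selection}
    return posterior(priors, likelihoods)

# An observation at the drift expectation, and one well above it.
at_expectation = drift_vs_selection(k=50, n=100, p=0.5, s=0.2)
above_expectation = drift_vs_selection(k=60, n=100, p=0.5, s=0.2)
```

With equal priors, the observation that matches the drift expectation leaves drift more probable, while the larger count shifts the posterior towards selection. Whether such a model licenses any conclusion about real causal processes is exactly what the debate described above is about.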

Homology explanations explain the properties or the variability of a biological character, the form of a wing or the range of different wing shapes across different groups of organisms, for example, by citing the ancestral character or organ, and the modification factors that affect the descendants of that ancestral organ. Like many modes of explanation in biology, homology explanations are historical (see also 3.a). The philosopher Marc Ereshefsky (2012) compares homology explanations with analogy explanations, which instead explain a character by citing the contribution of that character to a function. Ereshefsky points out that homology explanations are more detailed and offer a better account of observed differences. Homology explanations can also be turned into strong historical explanations, as opposed to weak ones that only cite the ancestral, initial condition. This happens, for example, when detailed molecular studies of the development of the target character shed light on precise events, such as gene duplications, that must have been crucial along the historical path. The study of genetic and developmental pathways also gives access to hierarchical disconnect which, for Ereshefsky, relates homology explanations to classical topics of philosophy of science: multiple realizability and reductionism. Hierarchical disconnect happens when “a homologue at one level of biological organization is caused by non-homologous developmental factors at lower levels of organization” (p. 385). Along the historical path, for example, a morphological structure can remain stable while its underlying developmental modules change (Griffiths and Brigandt 2007). For Ereshefsky, this is a biological example of multiple realizability, that is, the fact that a kind at one level of organization cannot be reduced to a kind at a lower level. This in turn, for Ereshefsky, counters ideas by Alex Rosenberg about reductionism in biology (Rosenberg 2006). For Rosenberg, Ereshefsky says, the history of homologous characters should be reducible to the history of their physical substrata, but Ereshefsky argues that hierarchical disconnect shows decoupled histories and multiple realizability.

b. A General Picture of Biology

In characterizing ways of thinking and kinds of explanations and evidence, philosophy of biology formulates generalizations about biology. These generalizations become particularly encompassing when philosophy of biology tries to characterize biology explicitly and comprehensively as distinct from other sciences. Biology is generally considered a ‘special science’, a term inherited from logical positivist philosophy of science that doesn’t preclude, in the long run, a reduction to physics (Fodor 1974, Rosenberg and McShea 2008). Among the most noticed and studied features of biology is the apparent absence of scientific laws, for which physical laws serve, so to speak, as the blueprint (but see Waters 1998). There have been many attempts to describe biology as a science without relying on laws, but also without relegating it to a mere collection and description of singular events where exceptions are the law.

For Ernst Mayr (1982, 2004), biology is based on concepts or principles, which are more flexible than laws: biology is a unique science by virtue of concepts that allow for biological explanation, including inheritance, program, population, variation, emergence, organism, individual, species, selection, fitness, and so on. Biology is also characterized by population thinking, introduced by Darwin, which differentiates biology from mechanics or chemistry whose thinking is, for Mayr, essentialist or typological (for more on this see 4.a.i).

For paleontologist Niles Eldredge biology is based on patterns. Patterns are law-like regularities, consisting of repeated schemes of events. This notion characterizes biology as a historical science, while reducing the gap that, in other views, separates biology from other natural sciences like physics. The pattern of inclusive hierarchical similarity in the biological world was seen by Linnaeus and captured in his binomial nomenclature. Darwin saw more patterns, for example in the geographical distribution of species and varieties. He then discovered a subset of the grand complex of repeated events, or regular processes, that give rise to biodiversity on Earth—the pattern of evolution. Mendel caught some patterns as well, in his observations of inheritance. Patterns have a double nature, ontological and epistemological: “Patterns in the natural world are extremely important.… They pose both the questions and the answers that scientists formulate as they seek to describe the world…. Science is a search for resonance between mind and natural pattern as we try to answer these questions” (Eldredge 1999: 4-5).

Other scholars think of biology as a science of mechanisms. A growing philosophical movement called “the New Mechanism” (Machamer et al. 2000) uses the concept to revise classic ideas like causation, discovery, and explanation. In this view, biologists aim to discover and represent mechanisms with their schemas, sketches, and theoretical models. The characterization of natural selection as a mechanism, for instance, has been proposed, but the question is yet to be resolved (Skipper and Millstein 2005).

Model-based accounts have acquired particular importance in philosophy of biology (Schaffner 1993, 1998). The semantic view of scientific theories in the 1980s (see 2.c) was a first ambitious endeavor to characterize biology as a model-based science. Downes (1992) pointed out the limits of the semantic view in its original formulation, but proposed that philosophers of biology keep some of its central claims, among which are the centrality of various kinds of models in biology and their promise of accounting for all scientific theorizing.

At a lower degree of generality, philosophy of biology offers a proliferation of ideas about how schools of scientists, fields, or approaches perform fundamental activities of science like explaining, describing, understanding, or predicting. Some include the aforementioned and conceptually challenging explanations like natural selection, drift, and historical homology explanations. Others include the practices of ecological modeling and model organisms (6.b), particular inferential patterns such as adaptationism (4.a.ii), and the difference between geneticists and developmental biologists (7). These are all typicalities, or modest generalizations, of sub-parts of biology. Population genetics is a thoroughly studied field from this point of view. By studying population genetics, philosophers have raised wide-ranging topics of reflection, such as the degree of idealization of models and experiments (Plutynski 2005, Morrison 2006). The concept of the possible has played a role in accounting for how biological idealizations can be explanatory: if biological models cannot predict or demonstrate necessity, they can at least restrict the field of what is possible, yielding so-called “how possibly” answers or explanations. For Rosenberg and McShea (2008) it is a “usual scientific” strategy that “scientists try to characterize a range of possible causes of evolution, and then to determine which of these possibilities actually obtained. The actual is first understood by first embedding it in the possible” (p. 13).

Even the tightest case studies usually produce generalizations to a certain degree. Grote and O’Malley (2011), for example, claim that their historical reconstruction of research on microbial rhodopsins

offers a novel perspective on the history of the molecular life sciences…, sheds light on the dynamic connections between basic and applied science, and hypothesis-driven and data-driven approaches [and] provides a rich example of how science works over longer time periods, especially with regard to the transfer of materials, methods and concepts between different research fields. (Grote and O’Malley 2011, p. 1082)

Sometimes generalizations come in negative form. Case studies are counted as evidence that science may not have features that are commonly thought of as necessary requirements. Grote and O’Malley (cit.) propose their case against those philosophical descriptions that interpret scientific advancement in terms of general, overarching, and unifying theories. They observe that “productive interactions between different fields of science occur not only through the adaptation of theories, concepts, or models, but very much at the level of materials or experimental systems” (p. 1094, emphasis added).

c. A General Picture of Science

What is science in and of itself? How does it differ from other forms of knowledge? These two sides of the coin have been a major motivating question for philosophy of science since its very beginning. Philosophers of biology sometimes take the methodological lessons they learn from biology and tentatively postulate their broader applicability to science (see Griesemer in section 7). However, a general view of science is often far from the explicit goals of philosophy of biology. There are historical reasons for this. In a paper on general philosophy of science, Psillos (2012) traces the problem back to Aristotle, through landmark thinkers like Immanuel Kant and Pierre Duhem, to the Modern era. The problem of demarcation was surely central in logical empiricism, but in the 1970s the idea spread that “individual sciences are not similar enough to be lumped together under the mould of a grand unified scheme of how science works” (Psillos, cit.: 100). After the 1950s, and certainly in the early 1980s, when philosophy of biology became an academic field, there was a consensus on the basic fact that science employs a variety of methods. Indeed, biology, together with other sciences like the human and the social sciences, had shaken the Modern idea of the scientific method.

Nevertheless, some important works have been moved by the need to find new descriptions of science that would fit biology better than received ones, in order to legitimately include biology among the natural sciences. Political and intellectual movements, such as Intelligent Design, contribute to urging philosophy of biology to tackle the demarcation of science: their strategy includes casting doubts on the scientific status of official biological sciences and presenting alternative, religiously inspired views as scientific theories (Sober 2007, Boudry et al. 2010). The creationist movement has been present on the philosophers’ mental map worldwide since at least 1981, when philosopher of biology Michael Ruse was a key witness for the plaintiff in the trial McLean vs. Arkansas, a challenge to the state law permitting the teaching of creation science in the Arkansas school system. The federal judge ruled that the state law was unconstitutional. A significant part of the written opinion (see http://www.talkorigins.org/faqs/mclean-v-arkansas.html) was devoted to philosophical considerations on the epistemological characteristics of science (Claramonte Sanz 2011). Intelligent Design is a new form of scientific creationism, characterized by the wedge strategy, or public insistence on supposed flaws in the established biological sciences combined with supposed scientific demonstrations of an Intelligent Designer of the Universe. While there are different positions about whether philosophers and scientists should directly appear in public debates with creationists, Intelligent Design undoubtedly engages philosophers of biology in reflecting upon possible distinctions within science (between hypotheses, theories, and models, for example), between science and pseudo-science, and between science and religious beliefs. With their competence, philosophers can help scientists in what sociologist Thomas F. Gieryn termed “boundary work” (1999).
The need to work on demarcation can also stimulate critical revisions of the patterns of scientific explanation in biology. In a book on evidence, Elliott Sober (2008) stated provocatively:

If the only thing that evolutionary biologists do is go around saying ‘that’s due to natural selection’ when they examine the complex and useful traits that organisms have, they are engaged in the same sterile game that creationists play when they declare ‘that’s due to intelligent design’. Assumptions about natural selection of course can be invented that allow the hypothesis of natural selection to fit what we observe. But that is not good enough: the question is whether there is independent evidence for those auxiliary propositions. (Sober 2008, p. 189)

The semantic view of scientific theories is an example of a general account of science that was put to work in justifying evolutionary biology as a model-based science, in the face of the inveterate poor fit of the received view based on syntactic theories and universal laws (Lloyd 1983, 1984, 1988). Under the semantic view, models constitute the core of scientific work; theories are to be seen as combinations of families of models plus hypotheses about the empirical scope of those models. This framework contrasts with the received view which described theories as sets of sentences, including laws, about the world. In the semantic view, laws are conceived in a new way: from universal empirical claims about the world, they become specifications of models, embedded in them. An important argument for philosophers of biology in the 1980s was that the semantic view was not developed ad hoc for evolutionary theory: it had already been successful in describing Newtonian mechanics, equilibrium thermodynamics, and quantum mechanics. However, it was also said, these sciences fitted the received view as well. The semantic view was very demanding in terms of formalization, so it worked only for a very limited area of biology. Not all scientific work consists in model building, so the semantic view is hardly considered an exhaustive view of science, and things work so differently for different kinds of models (for example, mathematical vs. non-mathematical) that the existence of a unitary view was questioned (Downes 1992). But new versions of the semantic view continue to be the topic of papers and debates in philosophy of biology.

d. Generalization as a Possible Distinctive Feature of Philosophy with Respect to Biology

As we have seen, philosophy of biology formulates various degrees of generalizations about biology. These generalizations may concern biology as a whole or some subfield of biology. Even in close-distance case studies, where generalization seems far from the goals, it can be shown that generalizations are made (2.b). Philosophy is a generalizing activity. This feature of philosophy may be a way to approach a tricky issue, namely the distinction between biology and philosophy of biology.

Intuitively, philosophy is different from science. The methods of philosophy are based, for example, on logical differences and implications (Hull 1969: 162) or on thought experiments (Rosenberg and McShea 2008: 6), while the methods of science are based on empirical phenomena. But this distinction is not as clear-cut as it might seem. Scientists also use thought experiments. Einstein famously used them for developing both special relativity and general relativity, and Darwin worked intensely with thought experiments, to mention only two representative cases. Conceptual, foundational, and linguistic analyses are an integral part of scientific research too. This means that scientists autonomously do philosophy in their day-to-day work. Indeed, professional biologists, as thinkers, could and should shift from their own main activity to more philosophical ones whenever needed. Conversely, philosophers of biology usually want to contribute to the advancement of science. In a sense, they want to be part of science. Furthermore, the methods of philosophy are constantly evolving, stimulated by advancements in biology, so that it is not rare now to find philosophical studies that include mathematical analysis, computer simulations, or even empirical research. The science-philosophy distinction is thus very blurred. Given this situation, is there any possible demarcation between philosophy and biology?

David Hull (2002) characterized philosophy of biology as meta-science:

Knowing some science can be of great assistance to philosophers of science, but philosophers of science as philosophers of science do not do science…. For example, the equation F = ma is part of science, in particular physics. The claim that F = ma is a law of nature is part of the domain of philosophy of science. To the extent that science and philosophy can be distinguished in practice, scientists tell us what the world is like, and philosophers of science tell us what science is like. (Hull 2002, p. 117)

One interpretation of philosophy of science as meta-science, which seems faithful to Hull’s idea, is that philosophy of biology seeks generalizations about biology.

The idea of philosophy of biology as generalization about biology has descriptive advantages, but it would be contested by some philosophers. In fact, no idea in the literature about the specificity of philosophy remains uncontroversial. When it comes to ambiguous science-philosophy cases, philosophers debate the right way of doing philosophy. They are much clearer in saying what it means to be a scientist than they are in saying what it means to be a philosopher.

In a view of philosophy as meta-science, the mark of philosophy is the leaning towards generalizations about science. This is not a yes/no requirement; it comes in degrees. Sometimes, for instance, works in biology make generalizations about biology (see Mayr and Gould’s examples below). These works sometimes become philosophical while other times they remain scientific because they generalize only to a degree which is functional to the scientific activity. This continuity accounts for the demarcation uncertainties that are going to persist in the field.

3. Philosophy Flanking Biology

As is demonstrated in other parts of this article, philosophy can take a theme and develop it with great autonomy from science or adopt a critical focus on science and on the complex relationship between science and society. According to many authors, however, there is also potential for philosophy to actively and directly contribute to the advancement of life sciences. While it is commonly accepted that biologists can undertake philosophy of biology, it is more controversial whether philosophers can do biology in a proper sense. But biologists have generally been a receptive community. In 2002, David Hull wrote:

Philosophers are attempting to join with biologists to improve our understanding of…biological phenomena. As such, they run the risk of being considered by biologists to be ‘intruders’. In point of fact, biologists have been amazingly receptive to philosophers who have turned their hand to philosophy of biology with a significant emphasis on “biology.” (Hull 2002: 124)

When philosophers are close to scientific practice and current problems, they see how they can help science in advancing towards its aims, either by criticizing existing ideas and practices or by proposing revised ones. In Hull’s line of thought, philosophers are encouraged to devote time and effort to work with biologists, or to formulate problems in ways that are interesting and understandable to biologists. Philosophical treatments that are too extensive, stereotypical, or posed in a way that is not relevant to biologists, or those with an excess of philosophical formalization that prevents access to scientists, should be avoided: “Formalization may be an excellent way of working out problems in the philosophy of science. It is not a very good way of communicating the results” (Hull 1969: 178).

Philosophers try to help biologists to better frame their questions, for example by “uncovering presuppositions and making them explicit” (Sober 1984), by analyzing conceptual foundations (Pigliucci and Kaplan 2006), or by pursuing the detection, analysis, and sometimes solution of theoretical and methodological problems. A great deal of the philosophy of biology consists in working on scientific language to clarify the meaning of concepts such as life, purpose, progress, complexity, genetic program, adaptation, and so on (Rosenberg and McShea 2008: 4). Since these concepts frame the scientific questions, philosophy of biology can “clarify, broaden or narrow the domain of theories, uncovering ‘pseudo-questions’” (Rosenberg and McShea, cit.: 6). Critical analysis and conceptual clarification are particularly valued tasks, and conceptual clarity is seen as a necessary virtue of philosophy of biology. One mode of work consists in looking, together with biologists, at concepts or mathematical models and their interpretations. This line of work is exemplified hereafter (3.a, 3.b): two traditional areas where philosophy has contributed by casting some light are the connections among biological enterprises like taxonomy, classification, and systematics (3.a), and the abstract descriptions of natural selection (3.b).

a. Clarifying Taxonomy, Classification, Systematics, Phylogeny, Homology

In biology, taxonomy consists in the recognition of natural groups, that is, the taxa (singular: taxon). Classification deals with the categories and ranks to be assigned to taxa (for example, species, genus, family). Systematics is, by definition, a systematic study of the living world in search of order, or, in other words, the search for the relationships among taxa. And phylogeny is the reconstruction of the temporal scheme of common descent and relatedness among taxa. In fact, despite these gross characterizations, the four activities are intertwined and depend on each other. Their distinctions, definitions, and relationships are a traditional matter of reflection for philosophy of biology (Wilkins and Ebach 2011).

In the 1960s, philosophers, perhaps thanks to the heritage of the Western philosophical tradition, proved to be particularly ready and equipped for helping scientists understand their own various ways of finding order in the living world. According to Hull (1969), one of the important early contributions of philosophy of biology to logical clarity was the taxon-category distinction, the distinction between “individuals, classes, and classes of classes” (p. 171). Philosophers were able to contribute, for Hull, once they agreed to adapt their formalisms to communicate with biologists. Hull himself (1976) long argued for species having the ontological status of evolutionary individuals unlike taxa in other categories. Hull relied on the Biological Species Concept (BSC) that defines species as reproductive communities. In accordance with the BSC, whereas a taxon is determined by a set of shared characteristics, the species-rank of the taxon (its membership in the species category) is determined by actual evolutionary relationships, in particular by interbreeding habits. With a similar line of reasoning, Hull defended the idea of species as historical individuals, as opposed to classes or types. Species, perhaps, are leaving the limelight of philosophical reflection as a consequence of the body of discoveries about the fluidity of their boundaries, the heterogeneity of their phenomenology, and the rarity of canonical biological species across the biological world. But the debate on individuality was complex and lively, and is still partly open today (Wilkins 2011).

In early years, philosophy of biology demonstrated its value also by analyzing the entanglement among biological hypotheses—explicit and implicit—in different domains. For example, philosophers refuted the idea of taxonomy as a theory-free activity, prior to functional attributions and to evolutionary hypotheses. Character recognition does bring into play theoretical considerations: the definition of “kidney,” for instance, presupposes physiological knowledge of kidney function, and/or hypotheses on the evolutionary derivation of kidneys. The acceptance of evolutionary theory in all fields of biology, completed during the first half of the 20th century, triggered hot debates on its relevance to taxonomy, systematics, and classification (Hull 1970): does taxonomy have to reflect evolution? And in what sense? Could adaptation by natural selection be a criterion for systematics, or is pure common descent the candidate? Philosophers took part in trying to clarify these issues.

The network of assumptions and hypotheses of biology is an enduring object of study for philosophy. There is a certain consensus on the fact that phylogeny should constrain systematics; that is, systematics should reflect phylogeny as much as possible. Many authors even equate the two tasks altogether: “Practitioners of systematics study the historical pattern of evolution among groups of living things, i.e., phylogeny” (Haber 2008). Yet, taxonomy, classification, systematics, and phylogeny are unequally performed by different professionals upon significantly uneven living and fossil taxa, and there are many conflicting needs and goals. Species concepts in paleontology are necessarily very different from those that fit neontology (Wilkins 2011), an issue that yields differently organized classifications (Wilkins & Ebach 2013). Even among biologists who study currently living organisms, the ideas on how to integrate taxonomy and classification with systematics and phylogeny vary much across specialists at all degrees of specialization (for example, entomologists, botanists, and mammalogists). Furthermore, a classification of domestic animals or plants is likely to incorporate much morphological, physiological, and ecological information beyond phylogenetic relationships if it is to be of any practical use to the knowledge communities that are involved in rearing. And when it comes to deciding what information is relevant to identify endangered species (or other units) that should be the object of conservation biology, different ideas and ethical stances are on the table (Casetta and Marques da Silva 2015).

Philosophy of biology can make or support concrete proposals about how to integrate and refine the various ways of ordering the living world. Philosopher Marc Ereshefsky (1997), for example, has been arguing for systems of nomenclature that are alternative to the Linnaean hierarchy (species, genera, orders, etc.). The reason for this position is that all the scientific changes that have happened since Linnaeus’ century (Darwinism, neo-Darwinism, cladistics, and new methods in systematics) have turned the hierarchy into an obstacle rather than an aid to taxonomical work.

A specific challenge for philosophy of biology is phylogenetic trees, which are ever revisable hypotheses corroborated to varying degrees. Epistemological and methodological discussions concern the ways of building, interpreting, testing, and revising trees, as well as of relating them to other domains of knowledge such as taxonomy or adaptation. “So what can biologists meaningfully say about phylogeny?” philosopher Matt Haber asks (2008).

Broadly, two different issues have been at the center of recent systematics debates: given epistemic limitations, whether any inference of phylogeny may justifiably be drawn; and given an affirmative answer, what methods ought biologists use to justifiably infer phylogenies, and what are the limits of these inferences? (Haber 2008, p. 231)

Competing and sometimes conflicting methods have been developed to make phylogenetic inference more exact, manageable, or informative. In this domain, methodological differences raised heated conflicts among scientists. The parsimony principle was questioned in its legitimacy and importance. The principle states that the most economical hypothesis is the one to be preferred (Sober 1988). More generally, trees have been examined for their theory-ladenness, that is, their sensitivity to background theoretical assumptions. Other contentious matters were the acceptability of different kinds of data (genetic vs. morphological, for example), and the limits of the domain of phylogenetic inference. Some workers maintain that methods in phylogenetics should be able to detect homologies (see also 2.a) with certainty, discerning them from analogies. Many others argue under different conceptions that homology detection should rely on large amounts of data, mathematical models of evolution, and probability. Supporters of cladistics reject probability and likelihood as an unstable ground on which to draw evolutionary trees. They put more confidence in the cladist practitioner’s ability to recognize derivation between characters (Hull 1970, Haber 2008). Meanwhile, the great majority of phylogenetic trees are built by relying on huge sets of genetic sequences and a few morphological characters by means of more and more cost-effective computer programs (this is an example of novel computer-based scientific techniques posing new philosophical problems, see more below). “Total evidence” methods (Sober 2008) address the challenge of building phylogenetic trees by integrating all the available evidence: morphological, geological, ecological, and fossil.
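One common way of making the parsimony principle operational is to count the minimum number of character changes that a candidate tree requires and to prefer the tree with the fewest. The sketch below uses a Fitch-style counting procedure on a single character. It is a toy under stated assumptions, not a reconstruction of any method discussed in the cited works: the taxa, the one "wings" character, the two candidate trees, and the function name `fitch` are all chosen for illustration.

```python
def fitch(tree, states):
    """Fitch-style parsimony count for one character on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees.
    `states` maps each leaf name to its observed character state.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    shared = left_states & right_states
    if shared:
        # Both subtrees can agree on a state: no extra change is needed here.
        return shared, left_changes + right_changes
    # No common state: at least one change must occur at this node.
    return left_states | right_states, left_changes + right_changes + 1

# Hypothetical single-character data set.
states = {"bat": "wings", "mouse": "no wings",
          "bird": "wings", "lizard": "no wings"}

# Tree A groups the two mammals together; tree B groups the two winged taxa.
tree_a = (("bat", "mouse"), ("bird", "lizard"))
tree_b = (("bat", "bird"), ("mouse", "lizard"))

changes_a = fitch(tree_a, states)[1]
changes_b = fitch(tree_b, states)[1]
```

On this one character, the tree that groups bat with bird needs fewer changes than the tree that groups the two mammals, so a naive parsimony count favors a grouping based on what is usually regarded as an analogy. This illustrates, in miniature, why the choice of characters and the distinction between homology and analogy matter to the debates described above.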

The question of homology is a last example of philosophical enquiry into entangled theoretical backgrounds and hypotheses: “when are two instances of a character to be considered instances of the same character and in what sense?” (Hull 1969: 174). Griffiths and Brigandt (2007) recognize different concepts of homology. The taxic approach to homology—the best known in systematics and philosophy—uses points of resemblance between organisms (shared character states) to diagnose their evolutionary relationships. The transformational approach focuses on the different states in which the same character can exist and be transformed by evolution. A third approach to homology has emerged in conjunction with findings of evolutionary developmental biology (EvoDevo): homology at the phenotypic level is potentially decoupled from homology of developmental processes and homology at the level of the genome (a phenomenon called hierarchical disconnect, see also 2.a). The level of depth thus becomes a necessary specification for homology. Griffiths and Brigandt (cit.) see the different concepts of homology as not only compatible but strongly complementary, and trigger a series of reflections on their relationships.

b. Formulating Natural Selection

Charles Darwin (1859) conceived natural selection as the mechanism of change, splitting, and divergence of lineages. Natural selection is thus the most essential notion of evolutionary theory. Darwin’s formulation of natural selection, substantiated by many empirical studies, was essentially verbal. After Darwin, definitions of natural selection underwent diversification as well as progressive refinement. Mathematical models of natural selection, created in the 20th century, made the process more precise, as population genetics introduced technical terms such as fitness and selection pressures (Haldane 1924), and established different formulas to quantify natural selection and to measure its intensity and effects. But population genetics and mathematics didn’t exhaust the task of formulating natural selection in a theoretical and general way, and philosophers of biology eventually became very actively involved. A classical abstraction of natural selection was elaborated by Richard Lewontin (1970). It was based on variation, fitness, and heritability. Today philosophers acknowledge the value of that account, but they also criticize it as, at once, too simple and too demanding (Godfrey-Smith 2009), ascribing it to a category of recipe descriptions that list the supposedly few and simple ingredients of natural selection. David Hull and Richard Dawkins, for example, independently introduced the original distinction between replicator and interactor, or vehicle. A replicator is anything that passes its structure on largely intact, while an interactor is a cohesive unit whose action in the environment makes a difference in the replication of the replicators it carries. Richard Dawkins (1976), interpreting the work of William Hamilton and George C. Williams (1966), famously took the basic idea of kin selection and developed it into the gene’s eye view, a general view of evolution where selection in the long run is seen as operating basically on genes and organisms are seen as their vehicles. Kin selection (Maynard Smith 1964) is based on shared inheritance among relatives. Social donor traits are expected to spread in the population if they increase the fitness of the donor’s close relatives, which are likely bearers of the same traits. Inclusive fitness (Hamilton 1963, 1964) is the fitness of a trait deriving from the bearer’s survival and reproduction plus survival and reproduction of relatives (proportionally to the amount of genetic sharing). Dawkins’s “selfish gene” metaphor, based on these mathematical findings, was welcomed by many biologists as a clarifying device in their day-to-day work. Its effects in the public perception of science are a different story that will be addressed later in the article. Some philosophers of biology got involved in the metaphor, either to develop it (for example, Dennett 1995) or, more often, to criticize it (see Oyama 1998). Several philosophers tried to find other ways of characterizing natural selection. We have already mentioned Sober’s (1984) work on this problem (2.a). Sober (1983) also described evolutionary theory as a theory of forces and population genetics as a theory of “equilibrium models.” Abstract accounts of natural selection and its enabling conditions flourished again with the turn of the 21st century. For Okasha (2006), a replicators-interactors account is too demanding and narrow, imposing unnecessary requirements for natural selection to occur. For Godfrey-Smith (2009), the concept of a population is the crucial one: some features of a population render it “paradigmatically Darwinian”, making natural selection happen. One motivating idea of such descriptions is their potential use for evaluating the operation of natural selection among units at different levels and in different domains (Rosenberg and McShea 2008). We will see below the example of the domain of culture: what kind of selection, if any, is plausible in the cultural domain?
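The inclusive fitness idea described above is commonly summarized by Hamilton's rule: a social trait is expected to spread when the benefits it confers on relatives, weighted by their relatedness to the donor, exceed the cost to the donor (rb > c). The following is a minimal sketch of that bookkeeping. The function names and the numerical values are illustrative assumptions, not taken from the sources cited.

```python
def inclusive_fitness_effect(cost, recipients):
    """Net inclusive-fitness effect of a social act on the donor's trait.

    cost:       fitness cost c paid by the donor.
    recipients: iterable of (relatedness r, benefit b) pairs, one per relative helped.
    Returns -c + sum(r * b), the quantity whose sign Hamilton's rule examines.
    """
    return -cost + sum(r * b for r, b in recipients)


def is_favored(cost, recipients):
    """Hamilton's rule: the trait is expected to spread if the net effect is positive."""
    return inclusive_fitness_effect(cost, recipients) > 0


# Hypothetical numbers: the same act, directed at two full siblings (r = 0.5)
# or at two first cousins (r = 0.125), each receiving a benefit of 1.5.
helping_siblings = [(0.5, 1.5), (0.5, 1.5)]
helping_cousins = [(0.125, 1.5), (0.125, 1.5)]

print(inclusive_fitness_effect(1.0, helping_siblings), is_favored(1.0, helping_siblings))
print(inclusive_fitness_effect(1.0, helping_cousins), is_favored(1.0, helping_cousins))
```

Under these assumed values the act is favored when directed at siblings (net effect 0.5) but not when directed at cousins (net effect -0.625), which illustrates why relatedness matters to the kin selection explanation of donor traits.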

Darwin thought natural selection happened chiefly among individuals: the individual organism was the unit of selection. But the later Darwin introduced “group selection” for explaining some traits of humans and social insects (Darwin 1871). Groups were foreshadowed as a larger unit of selection, and group selection seemed to be the explanation for traits that jeopardize individual interest, such as cooperation and abnegation. In the 1960s, evolutionary modeling showed that group selection required repeated isolation, mixture, and re-isolation, namely conditions too narrow to be found in nature with any significant frequency (Maynard Smith 1964). Group selection was disavowed; traits would not evolve simply because they are good for a group, they have to be selectively advantageous in inter-individual competition from their inception. In the 1970s, some philosophers of biology participated in a movement along with evolutionary biologists and social scientists to try and develop group selection into a scientifically respectable concept (Wilson 1975). In the meantime, kin selection had emerged as a potential alternative explanation for group-beneficial, unselfish traits, leading many scientists to conclude that group selection may be apparent and re-described as kin selection if group members are relatives. Philosophers got directly involved in the debate, sometimes working directly with biologists. In 1984, Elliott Sober, citing Williams (1966) among others, talked about the “mirage” of group fitness, seen as a mere statistical summation of individual fitnesses: “Selection works for the good of the organism; a consequence may be that some groups are better than others. However, it does not follow that selection works for the good of the group” (Sober 1984: 2). In many cases, the individual-based hypothesis is simpler than the group-level characterization, so the principle of parsimony would recommend not adding explanatory mechanisms. However, more generally, “Group, kin, and individual selection need to be disentangled, their difference made clear” (Sober 1984: 4). In fact, conceptual difficulties and fallacies in the units of selection framework were what attracted philosophers to this debate. Sober changed his mind about group selection through a more careful conceptual analysis and by joining the work of biologist D.S. Wilson (Sober and Wilson 1998). Most philosophers now think that unselfish traits may be explained or made plausible by some combination of trait-level selection, organismal selection, and group selection in a weak sense, together with a multiplication of hierarchies of evolutionary entities and an “extended taxonomy of fitness” that contemplates co-opted by-products and functional shifts (Pievani 2011).

Beyond gene, individual, and group selection, some authors have also attempted to recognize higher levels, such as family selection, species selection, and clade selection, although authoritative biologists such as Ernst Mayr contested this idea: “in no case are these entities as such the object of selection. Selection in these cases always takes place at the level of individuals” (Mayr 1997). Today, for many biologists, the question of what unit is the “true” fundamental unit of selection has been satisfactorily settled—there are several—but there are now new theoretical and empirical questions. Given that multiple levels of vehicles exist, how does natural selection affect selection at lower or higher levels, and how are higher-level vehicles created by lower-level selection (Keller 1999)? The explanatory scope of multi-level selection, as philosophers have often emphasized, is challenged by the major evolutionary transitions in the history of life (Maynard Smith and Szathmáry 1995). Philosophers tend to describe a major transition in evolution as a phase of emergence of a new level—with new units—of selection. The new level contrasts or suppresses selection at the lower level, a process baptized “de-Darwinization” by Godfrey-Smith (2009).

4. Who Can Do Philosophy of Biology?

As David Hull observed in 2002, philosophers “are attempting to join with biologists to improve our understanding of…biological phenomena” (p. 124). As we have seen (2.d), a demarcation between the two profiles—the philosopher and the biologist—is possible but labile, and the distinction is perfectly compatible with any one scholar doing both, even simultaneously (cf. Pradeu 2009). Consider now how Hull’s quote goes on: “But sometimes the tables are turned. Biologists take up traditional philosophical topics and attempt to treat them even if they are not professional philosophers” (Ibidem). Biologists can turn to philosophy in two different ways: by reflecting philosophically on their own work or by naturalizing philosophical problems. In the first case, biologists get interested in issues of epistemology or methodology, and, from the ground of their work, they extrapolate ways of thinking or modes of inference. Naturalization happens when science becomes capable of saying something significant and constraining about a traditional topic of philosophy, exemplified here by the origin of morality.

a. Philosophical Biologists

It is not infrequent that biologists undertake philosophical reflections on their own work. This is eased by the fact that some tasks of philosophy, such as conceptual analysis or linguistic clarification, are integral parts of scientific work (see also 2.d). Indeed, one might say that biologists, being experts about their own theories, are sometimes more qualified than philosophers to reflect on their own work. But what about the tendency to generality that characterizes philosophy of biology (section 2)? Well, the generalizing route, too, can be followed by working scientists. A biologist’s philosophical effort may certainly vary in depth, richness, reach, and influence, depending on many factors such as the range of his or her interests in terms of both philosophical background and aims. With a good philosophical background, a biologist can engage more effectively with debates in philosophical areas. Their aims may go beyond strict functionality for their own research and reach a genuine desire for capturing, defending, or criticizing something deep about their own science. Two examples—among many others—of very influential, philosophically-oriented biologists are ornithologist Ernst Mayr and paleontologist Stephen Jay Gould.

i. Mayr and Population Thinking

Ernst Mayr (1904-2005) was one of the greatest evolutionary biologists of the 20th century, but he also increasingly worked in the history and philosophy of biology. He was “a crucial link between professional philosophers of science and professional biologists” (O’Malley 2010b: 530-1). Some of his areas of reflection were the distinction between proximate and ultimate causes, the nature of the neo-Darwinian synthesis, and the centrality of speciation and species defined by his Biological Species Concept (see 3.a). This article will focus on Mayr’s idea of population thinking as an example of how “one scientist uses ‘scientific concepts’ to forge a conceptual tool with a wider range of historical and philosophical applicability” (Chung 2003: 278).

The population notion was already central in Mayr’s work in 1942. His main concern was the methodology of systematics. Mayr had promoted a new systematics, looking for variation in large samples as opposed to few type specimens, and considering geography, genetics, and other sources as opposed to morphology only. As Carl Chung reconstructs, Mayr was the first to formulate the distinction between typological and population thinking in 1955. A few years later, Mayr (1959) started pushing population thinking as the major innovation introduced by Charles Darwin and developed by biology as a natural historical science, different and autonomous from other sciences. According to population thinking, “no two individuals or biological events are exactly the same and processes in biology can be understood only by a study of variation” (Mayr 1955, cit. in Chung 2003: 288). In opposition, Mayr described typological thinking as having its deep roots in the Western cultural tradition, particularly in Platonic philosophy: “Implicit in this concept is that variation as such is unimportant since it represents only the ‘shadows’ of the eidos” (Mayr 1955: 485), the essences or ideas that lie behind diversity. As Chung points out, the main battlefield for this opposition was the concept of species. For Mayr, the incarnation of typological thinking in biology is the morphological species concept, according to which individuals belong to the same species by virtue of sharing their morphological characteristics. By contrast, biological species concepts are population concepts. The latter take into account variation, in particular reproductive barriers (two populations are different species if they coexist with no fertile mating) and/or geographical variation with reproductive flow between populations.

What is interesting here is the philosophical reach of Mayr’s ideas: he consciously worked them up from his systematics field work, used them for a philosophical interpretation of biology, and tied them to broader philosophical themes. In Chung’s words, the enterprise was “an attempt to liberate certain key ideas of the ‘students of diversity’ from their disciplinary constraints, and to render them more generally applicable by repackaging them into a broader historical and philosophical distinction that pertains to all of biology” (Chung 2003: 294-5). For sure, one goal was

to legitimize the natural historical sciences, including systematics, taxonomy, and evolutionary biology, against the criticisms of ‘the new biology’—molecular, reductionistic, and drawing explicit inspiration from the physical sciences…philosophically he could argue for the ‘in principle’ need for an evolutionary (population thinking) approach in order to offer adequate and complete explanations of biological phenomena. (Chung 2003: 295)

Population thinking, once devised, shaped Mayr’s own interpretations of the history of biology, and provided him with solutions to philosophical problems such as the method and the autonomy of biology. It was generally taken up by philosophers and philosophically-minded scientists such as Michael T. Ghiselin—notice the philosophical title of his 1997 book Metaphysics and the Origin of Species. There would be many other examples from Mayr’s work in which he developed philosophical ideas out of science with precise polemic aims and great influence on all scholars of biology. Ideas like the distinction between ultimate and proximate causes became common currency for scientists. This does not mean that Mayr’s ideas weren’t criticized, as they increasingly are (Ariew 2003, Laland et al. 2011, 2013).

ii. Gould and Adaptationism

Scientists may end up doing philosophy if they get interested in the inferential structure of their own field. Inference, that is reasoning and its rules, is a classic topic in philosophy of science. Induction, deduction, and inference to the best explanation (IBE) are basic, well-known types of inference, but, in scientific practice, they multiply and get combined and put to work in different ways, producing interesting conceptual and philosophical problems. Stephen Jay Gould (1942-2002) was a prolific writer (see more in section 5.d). Among his many favorite targets were a few inferential patterns employed by evolutionary biologists. The adaptationist inference was definitely one of the main ones since the famous Spandrels paper with Richard Lewontin (1979). Other biologists, like G.C. Williams (1966), had advanced proposals for revising adaptationist inferences on different grounds. Gould’s campaign can be seen as an expression of his being a paleontologist with great interest in inferential patterns and with a view of evolution directly inspired by Darwin’s works.

Adaptationism consists in explaining biological phenomena by claiming that they are adaptations. In the Spandrels paper, Gould and Lewontin used the metaphor of the San Marco cathedral in Venice to argue that even structures that serve fundamental functions can nonetheless originate as structural byproducts of a whole architecture. They wanted to criticize their colleagues’ habit of tolerating lazy tests of adaptive hypotheses that “consisted in little more than a decent qualitative fit between observed behaviour or form, and a set of posited adaptive pressures and constraints” (Lewens 2008: 180). As Gould elaborated after the Spandrels paper, between structures and functions there is no one-to-one strict correspondence, but rather, redundancy. Functions are distributed over several parts of organisms, and conversely any part we may call a trait or structure is involved in several mechanisms, functions, and processes in the organism’s life. In the appreciation of trade-offs between structural internal constraints and selected functions, Gould saw a revival of Charles Darwin’s original attention to “contrivances” (1877). Adaptive explanations will rarely suffice. In any case, they will need to be made testable and tested (Pievani and Serrelli 2011). From 1982 onward, together with Elisabeth Vrba, Gould proposed and promoted the neologism “exaptation” to address what he saw as two evolutionary mechanisms, distinct from adaptation, involving nonetheless natural selection and primary functions: functional shift of a structure with previously different purposes, a process already identified by Darwin; and functional cooptation of a trait whose origin is non-adaptive—for example, a side effect due to a developmental constraint or a random insertion.

Gould’s ideas about adaptationism and other aspects of biological inference triggered cascades of philosophical reflections, for example on the multiplicity of meanings of adaptation: by adaptation we may mean either something ensuring or increasing fitness or something that seems designed for the performance in a particular environment, in a range of environments, or for a particular function. We may intend something which is being positively selected, or something whose existence is due to natural selection in the past (Godfrey-Smith 2001, Lewens 2008). Biologists with philosophers, or philosophically-minded biologists, reflected on adaptationism, often reacting, in one sense or the other, to Gould’s school of thought. Some scholars elaborated on the dubious ontology and instrumental nature of adaptations, while many others interpreted the challenge as the necessity of making adaptationist hypotheses testable (Pievani and Serrelli, cit.). Recently, plant biologist Mark E. Olson (2012) acknowledged the relevance of the “post-Spandrels consensus” on the importance of constraints in evolution among biologists. But he also pointed out a post-Spandrels proliferation of contradictory “selection vs. constraints” and “externalist vs. internalist” explanations for the same data. These contradictions were due, for Olson, to Gould’s vagueness in defining “constraint,” as well as to the lack of experimental techniques for exploring the accessibility of unobserved forms. But today’s embryological, manipulative, and comparative empirical strategies allow for experimental exploration of morphologies that are not observed in nature. By combining these techniques, biologists can turn “internalism” and “externalism” from a-priori positions into case-by-case, testable hypotheses. Also, selection and constraint are more properly seen as complementary and not in mutual contrast as, for Olson, the Spandrels paper tended to suggest. For Olson, thus, a developmental “renaissance of adaptationism” is under way.

As customary, philosophical issues raised by biologists have been taken up and elaborated further by philosophers. Commenting on the philosophical literature on adaptationism, Godfrey-Smith (2001) distinguished three different issues on which adaptationist, anti-adaptationist and moderate positions can be taken up. The empirical issue is whether or not natural selection is a powerful and ubiquitous force in the natural world, with few constraints coming from biological variation, and with no comparable, competing causal factor. The explanatory issue is whether the most important questions in biology are about the fitting of organisms and environments, given that natural selection is the only answer to such big problems (other processes and explanations are good for less important questions). The methodological issue is whether or not starting with adaptive hypotheses—and holding to them—is good scientific practice. In 2009, a special issue of Biology & Philosophy was published as both a celebration and a critical appraisal of the influence of the Spandrels paper. Therein, Lewens (2008) elaborated on Godfrey-Smith’s taxonomy, recognizing seven types of adaptationism, and arguing for the importance of asking a prerequisite question: “what is a trait?”.

b. Philosophical Issues Naturalized

The relevance and implications of biology for humanity became a thought-provoking and heartfelt issue as soon as intellectuals and laypeople reacted to Darwin’s works (1859, 1871). More than a century later, E.O. Wilson (1975) proposed a “new synthesis” as a project of explaining the most diverse human behavioral and psychological traits by means of evolutionary hypotheses. Wilson’s proposal led to sociobiology and more recently to evolutionary psychology (Barkow et al. 1992). Human behavior, mind, morality, and systems of beliefs constitute the most interesting targets of possible, and controversial, naturalization. Naturalization is what happens when matters that are traditionally philosophical become empirically accessible by some scientific approach or method. David Hull wrote provocatively that “Philosophy lost physics, then biology, then psychology. Geometry, logic and mathematics became separate disciplines with no necessary ties to philosophy,” therefore philosophers hold very tenaciously to “epistemology, metaphysics, ethics and aesthetics” (Hull 2002: 124), because many of their other objects have been taken by science. But Hull also optimistically observed that one of the strengths of philosophy of biology “is that philosophers and biologists have ignored this distinction, working with each other on both sides of the divide” (Ivi: 117). In fact, naturalization is critically analyzed by philosophy, and there are many different philosophical positions on naturalism. So, naturalization is a fruitful object of study for philosophy of biology rather than a topic thief, and philosophy has a warranted place that is not going to evaporate by naturalization.

Philosophers’ reactions to Wilson’s “new synthesis” were, for Hull, a virtuous example of interactivity. After an initial wholesale opposition, many of them validated the challenge of naturalizing humans. Our species has to be seen as a proper part of the biological world, not as separated by any ontological divide. At the same time, philosophers highlighted inferential errors in biological explanations of human behaviors, epistemological limits in reconstructing the past, and ethical risks. They pursued theoretical refinements of the project, by improving multi-level selection models, for example. Many criticized the logical-deductive architecture of sociobiology and evolutionary psychology, frequently built on ad hoc hypotheses and adaptationist just-so stories. The debate is ongoing, and many arguments have been developed. For example, if many human psychological mechanisms are evolutionary novelties due to the interaction of ancestral genes and new environments, then many of these mechanisms are not adaptations and adaptive thinking in evolutionary psychology will fail to identify or explain them. More cautious, problematized, and integrated endeavors of biological explanation applied to humans are emerging, also under the stimulation of philosophy of biology (Sterelny 2003).

i. An Example: The Biology of Morality

It was for explaining human morality that Darwin hinted at ‘group selection’ (see 3.b), and the endeavor wasn’t finished with him. Evolutionary ethics, the association of morality with natural selection and evolution, lends support to Rosenberg and McShea’s remark that biology “…is the only scientific discipline that anyone has ever supposed might be able to answer the questions of moral and political philosophy” (2008: 3). Fast-growing fields like neurobiology or cognitive neurosciences seem to be making biology more and more capable of addressing topics such as the origins of morality. Philosopher Patricia Churchland (2011), for example, studied the scientific literature and hypothesized a particular pattern of neural activation that would constitute a “neurobiological platform” for morality. Churchland highlighted the role played in this platform by molecules such as oxytocin, an ancient and simple peptide, found in all vertebrates. In mammals, oxytocin “is at the hub of the intricate network of mammalian adaptations for caring for others” (Churchland 2011: 14). In fact, morality would share its neurobiological platform with other familiar phenomena of human life—attachment and bonding. More generally, “the palette of neurochemicals affecting neurons and muscles is substantially the same across vertebrates and invertebrates” (Churchland 2011: 45). Among mammals, then, there is a wide “range of social patterns…, but underlying them are probably different arrangements of receptors for oxytocin and other hormones and neurochemicals” (p. 32). The striking thing is, for Churchland, that modest modifications in existing neural structures can lead to new outcomes. Morality and other phenomena would result from a not-so-exceptional modification of a pre-existing platform involved in mammalian parental care. Churchland’s picture of evolution is again a familiar Gouldian one (4.a.ii):

Biological evolution does not achieve adaptations by designing a whole new mechanism from scratch, but modifies what is already in place, little bit by little bit. Social emotions, values, and behavior are not the result of a wholly new engineering plan, but rather an adaptation of existing arrangements and mechanisms that are intimately linked with the self-preserving circuitry for fighting, freezing, and flight, on the one hand, and for rest and digest, on the other. (Churchland 2011: 46)

Does the evolutionary continuity from mammalian parental care to morality constrain ethics and traditional philosophical theories of morality? Churchland’s view, while being only an example, is interesting in its intermediate position between strict biological determinism and cultural determinism. While biology can provide information and explanation on the platform for morality, the complexity of cultures provides scaffolding for moral development and definition so that moral decision remains a practical, dialogic, and social problem. However, the more general question is whether and how moral philosophy—a large and highly technical field—should take into account what the sciences are discovering. The answers to this question are expected to come from philosophical studies of naturalization (Dupré 2001, De Caro and Macarthur 2004). In any case, most philosophers of biology recognize a naturalistic fallacy in the idea that knowing more about the natural world would suffice for making moral, political, and social decisions.

ii. Philosophy Versus Naturalization?

To some philosophers, the naturalization of philosophical problems is rather uncontroversial, to the point that “the difference between philosophy and theoretical science is not a matter of kind but of degree,” and the domain of philosophy is partly “the sum of all the questions to which science cannot (yet) answer” (Rosenberg and McShea 2008: 5). For many others, defining philosophy as some kind of underdeveloped science is an expression of scientism and a category mistake regarding fields of knowledge. Many philosophers point out that the biological explanation of social actions, behaviors, and culture, may imply a Darwinian dimension without boiling down to it. Naturalism is related to, but different from, other very general issues, like determinism or reductionism. Some philosophers draw a distinction between a strong “scientistic” naturalism and a pluralistic, or “liberalized,” naturalism (De Caro and Macarthur 2004). Scientistic naturalism considers philosophy as a branch of the natural sciences. Liberalized naturalism includes different epistemic levels of analysis of human nature—from natural sciences to humanities—that share the exclusion of non-natural causes or principles.

Emphasizing the impact of biology on human capacities, social institutions, and ethical values is also a way to justify philosophy of biology as useful or even indispensable to philosophy and, more generally, to the humanities. Some presentations of philosophy of biology tend to justify the field by its particularity of “concerning human affairs” (Rosenberg and McShea 2008: 8) and its being ultimately oriented to them. Some authors (Pradeu 2009) dislike this anthropocentric strategy in philosophy of biology and think the field could be otherwise justified. Under these overarching debates, human organisms and the human species are understandably a hotspot of problems for philosophy of biology, and biologists and philosophers must confront the growing biological knowledge of humans.

The neurobiology of morality (4.b.i) is not automatically a subtraction of morality as a philosophical problem. Indeed, many reflections from a scientifically-informed philosophy are of primary importance to maintain vigilance and scrutiny: what are the aims and uses of these biological studies of humans? Could they be used, for example, for a classification of people with consequences on the distribution of rights? Would this be justified? Would the consequent choices and decisions be acceptable? How are scientific results co-opted in clinical practice and health care? How are communication issues with patients handled? How influential are social and cultural biases in the construction of the object of research? Is this acceptable and justifiable? What ideas of morality underlie the studies? Scientists working in these difficult grounds will constantly exercise philosophical thought (see also 5.b). For them, the rich tradition of moral philosophy might constitute a precious aid.

Meanwhile, philosophy of biology can inform philosophy about new and obsolete approaches in the scientific explanation of morality. Evolutionary ethics is not necessarily tied to adaptationism (4.a.ii) or genetic determinism. The evolution of morality seems to resemble more bricolage than engineering. Against stereotypical and simplified views of evolution, morality cannot be identified with a set of genes, even though genes do not necessarily lose their importance, and the question about heritable patterns remains crucial in order to define what morality is in evolution.

Philosophy of biology can fight easy deterministic conclusions or identifications between naturalism and determinism while promoting useful definitions of the concepts involved: what is meant by morality in different contexts, and why finding an underlying arrangement of receptors is not going to replace the need for ethical reasoning and moral philosophy. Science is a form of knowledge and therefore subject to epistemology, conceptual analysis and change, and ethical reasoning. On the other hand, biology constantly brings new fuel to philosophical inquiry, even, perhaps especially, when philosophical issues get naturalized.

5. Philosophy Bringing the Life Sciences out of Their Research Context

There are multiple senses in which philosophy of biology brings the life sciences out of their research contexts. First, philosophy of biology can study and sometimes aid interactions among the life sciences or between them and other sciences. Second, sometimes philosophy is seen as capable of developing messages, ways of thinking, and their consequences and implications to elaborate a “philosophy of nature” and an overall vision of the living world (Godfrey-Smith 2009, p. 15). Third, philosophy can reflect on the roles and meanings of science and on the interactions between science and society with an approach different from that of the social sciences. In this way, it can assume critical points of view towards biology and reflect on how scientific claims are, could be, and should be received and elaborated by the public.

a. Philosophy of Biology at Intersections

What happens when different scientific fields or points of view come into intimate contact? Philosophy of biology has always been happily and effectively involved in this matter. A classic example (seen above, 3.a) is the analysis of coexistence, interaction, and implicit reciprocal dependencies between “the morphological, the physiological, and the genetical” viewpoints in taxonomy (Hull 1969: 176-7). Philosophers of biology often notice attractions and tensions and call for integrations. They attend to emergent relationships among life sciences, as in the topical cases of micro- and macroevolution (Serrelli and Gontier 2015b) and evo-devo (see below), or between biology and other sciences, as in cultural evolution studies. Classic debates in philosophy of science, for example on reductionism, provide conceptual coordinates for thinking about these connections.

A major development in biology began in the 1980s, when technical and theoretical advancements enabled the molecular study of development, opening the possibility of relating development to evolution in different ways (Gilbert et al. 1996, Minelli 2010, Olson 2012). Evo-devo, evolutionary developmental biology, was born. The contact was all the more significant because embryology, a discipline with an ancient tradition, had long been seen as far removed from evolutionary biology. Many evo-devo protagonists and observers framed evo-devo within the insufficiency of the mathematical study of the intergenerational transmission of genes. This hegemonic field, population genetics, considered development as a “black box” and assumed a linear relation between genotype and phenotype (Laubichler 2010). In doing so, it would be blind to fundamental kinds of evolutionary innovation, incapable of addressing macroevolutionary change (Minelli 2010), and, more deeply, non-mechanistic, based on “conceptual abstractions” (Laubichler, cit.). Yet, many scholars are pursuing a pluralistic integration. They point out that

To deny the internal consistency and the explanatory power of [the research program developed from population genetics] would be obviously foolish…. The objection is that there can (and should) be more to evolutionary biology than a research program restricted to the concepts and tools of population genetics (Minelli, cit.: 216).

Some scientists and philosophers focus on a broader polemic target: the Modern Synthesis, that is, the foundation of evolutionary biology as it is practiced today, which took shape between the 1910s and 1940s and was further canonized mainly by Ernst Mayr in the subsequent decades. Critics analyze how the Modern Synthesis excluded lines of research as non-legitimate or irrelevant to evolution, while successfully pulling together Darwin’s natural selection, Mendel’s theory of inheritance, mathematical models of population genetics, and the work of the most disparate fields in biology (Gilbert et al. 1996, Serrelli 2015). Some philosophers and scientists are therefore proposing the idea of an Extended Evolutionary Synthesis (Pigliucci and Müller 2010). These movements are very interesting to philosophers of biology. They provide new access to traditional philosophical issues about scientific change, including the unity and disunity of science (Fodor 1974, Callebaut 2010). They also invoke a necessary collaboration between history and philosophy of science, as shown by works that revised the received views on the nature of the divorce of embryology and evolution in the 1930s (Love 2003, Griesemer 2007; cf. section 7).

Philosophy of biology’s remarkable interest in cultural evolution studies is an example involving relationships between biology and other fields. The idea of similarities across biological and cultural evolution was already suggested by Darwin and his contemporaries, and several approaches were formalized in the second half of the 20th century (Cavalli Sforza and Feldman 1981, Boyd and Richerson 1985). Cultural change and stability are understood in terms of variation, selection, and inheritance/transmission of cultural traits, only requiring some correlation in cultural transmission from cultural parents (models from whom a cultural trait is acquired) to offspring (individuals acquiring the cultural trait). These evolution-inspired approaches to culture allow for a variety of unique mechanisms for cultural transmission, and incorporate processes like drift and multi-level selection. Capitalizing on such approaches, some social scientists are proposing that methods, findings, and theories be systematically exchanged between the biological and cultural sciences, particularly between disciplines that lie at the same level on a micro-to-macro scale (Mesoudi et al. 2006). This trans-disciplinarity is sometimes presented as the beginning of a late and needed evolutionary synthesis in the social sciences, similar to the Modern Synthesis in biology, but some philosophers think this is an overstatement while others think the claim is based on a naïve epistemological conception (Serrelli 2016a).

Similar proposals call upon philosophers of biology. The abstract formulations and philosophical issues of natural selection, multi-level selection, and drift (3.b) can help to evaluate the appropriateness of transfers of methods and theories (Godfrey-Smith 2009). General topics like population thinking (4.a.i), adaptationism (4.a.ii), or reductionism will emerge in cultural evolution too. Progressionism is another very basic issue here. The misleading idea of progress was in fact crudely applied in anthropology by evolutionists in the past, with potentially discriminatory effects and reductionist justifications of essential differences within the human species. Now, philosophy can assist cultural anthropology in updating old stereotyped worries about progressionism, overcoming a derived prejudice against naturalism as such. A coherent and analytical criticism of any form of teleological and progressive evolutionism could thus be a way to reconstruct the broken bridge between cultural anthropology and evolutionary studies (Panebianco and Serrelli 2016a). Cultural evolution also raises deep epistemological problems, not only about definitions of terms like ‘culture’, but also about the knowledge processes that are ongoing in this intensification of contact between the biological and social sciences (Serrelli 2016b, Panebianco and Serrelli 2016b).

b. Biology’s Critical Friend

As we have seen, philosophy of biology is expected to help science and to work hard to keep up with scientific research. But many authors point out that philosophy must not forget its critical role towards science. One perspective on philosophy of science sees it as an enterprise that aims “to understand the life of science, that is, to understand the development of science as a Wittgensteinian family of ongoing human epistemic practices” (Reydon 2005: 252, cf. Grene and Depew 2004). Science can be seen as a part of culture and as a sub-system of society. This critical perspective on science, with its complex relationship with society, is always available to philosophers. Authors like Linda Van Speybroeck expressed the concern that philosophy of science might become “…a servant of science, leaving science undisturbed and calling only for a ‘justification’ of the philosophical practice against scientific standards” (Van Speybroeck 2007: 54).

Stephen Jay Gould’s critique of progressionism in paleontology and paleoanthropology, and his opposition to human racial classification and measurements of intelligence, are examples of constructive criticisms of the scientific enterprise. Current knowledge overwhelmingly shows that human evolution is a bushy tree of coexistent and sometimes interacting hominin species (compare Ruse 2012). But cultural and psychological biases have shaped science in some periods. Human evolution was expected to be, and represented as, a ladder of progress, a sequence of progressively more evolved hominid species, substituting one another, and approaching Homo sapiens, the climax species, often represented as a European male (Eldredge and Tattersall 1982). Stephen Jay Gould, in books such as Wonderful Life (1989), with a peculiar method that could be called an archaeology of scientific ideas (Pievani 2012a), coined expressions like the “iconography of hope” and tied them to social history, to some deep teleological preferences, and to our habit of using the present as a key for understanding the past. For many years, Gould fought for human evolution to be separated from human hopes, satisfying tales, and “great narratives.” He concentrated on the jargon, distinguishing evolution and progress, trend and finality, and stressing the ambiguous and appealing fashion for terms like “missing link.” Gould’s critical metaphors put paleoanthropology in contact with the larger cultural context and deeper psychological roots, helping to reinforce the theoretical normalization of human evolution into a branching model of diversification of species, typical of the broader phylogenetic tree of primates.

Gould’s way of reasoning was particularly apt in showing that science is a human activity. Scientists work on the ground of cultural and social biases; they are not naïve collectors of neutral facts. Although Gould was not a sociological relativist, he showed in many cases the importance of history in shaping science, where ideas may also be dismissed and then taken up again. This attitude is also found in Eldredge and Gould’s (1972) famous “punctuated equilibria” paper, where the two paleontologists revealed the deeply theory-laden nature of fossil interpretation. The ubiquitous pattern of evolutionary stasis over geological time periods had been neglected since Darwin (1859), who considered fossils a constitutionally incomplete documentation. But the pattern of stasis and punctuation was, as Eldredge and Gould pointed out, data, not lack of data. “Phyletic gradualism” was a consolidated assumption that had been blinding even paleontologists towards their own data, while perpetuating the subordinate position of paleontologists with respect to theoretically important fields. In fact, Eldredge and Gould formulated a theory that explained punctuations as speciation events, and stasis by other processes. In doing so, they posed several problems, among which was the legitimation of paleontology as a theoretically relevant field.

In The Mismeasure of Man (1981) Gould argued against the general scientific attitude of attaching “universal essences” to human disparities by measuring what is not measurable, like the intelligence quotient, an artificial construct. He also exposed unconscious manipulations of the anthropometric measurements and ranking of skulls in Samuel G. Morton’s work (Crania Americana, 1839): Morton believed in “races” and in their separate origins (polygenism), and he believed human intelligence was a unitary and inheritable object. Gould found his measurements to be biased by his self-confirming preconceptions. For Gould, any scientist is an unconscious victim of his or her preconceptions. Recent studies of Morton’s material (Lewis et al., 2011) rehabilitated Morton’s original measurements. The authors hypothesized that Gould’s severe analysis was biased by his own aprioristic egalitarian and liberal cultural beliefs. “In a paradoxical way Steve had proved his own point”, Ian Tattersall observed (2013). No scientist can be immune to this kind of influence, against which, for Gould, “vigilance and scrutiny” are the necessary, although insufficient, palliatives. Along these lines, philosophy of biology is invited to take on its critical constructive role towards science.

c. Developing Messages from Biology

Many topics in philosophy of biology are evidently relevant in the relationship between biology and the public. Deciding the meaning of concepts, like natural selection, fitness, or function, requires an understanding of the theories and their domains (Rosenberg and McShea 2008). Philosophy of biology clarifies, for example, that “‘natural selection’ is not an entirely apt name for the process [it identifies], as it misleadingly suggests the notions of choice, desire, and belief built into the theological account of adaptations” (p. 17) and corrects popular descriptions of evolution that make it look tautological and unscientific. The notoriously slow-changing world of science education and public understanding is under pressure from the impressive rate of growth of the life sciences. In the popularization of science, hominid evolution is still depicted as a linear sequence of species from simple to complex, from inferior to superior, from archaic to modern. This picture retains its intuitive power, and other influential pictures, like the tree of life, are likely to endure even as they are being questioned and made more complex. And any time a naturalistic study of humans is carried out, misunderstandings and dangerous interpretations of the “genes-for-morality” kind are just around the corner, both in the form of quick reductionistic and deterministic claims and in the form of a-priori anti-naturalistic positions. Even important scientific advances, like evo-devo and other fields that are pushing for an Extended Evolutionary Synthesis, can be negligently interpreted as ruptures that invalidate all previous knowledge. Opponents of science can use this to covertly reintroduce non-naturalist and non-scientific explanations. Instead, philosophy of biology could help citizens become more scientific, and more able to exploit the directional role they have towards science (cf. Haarsma et al. 2014-2015).

According to a principle called “scientific citizenship,” science should be properly understood by everyone in society. A fundamental pillar is a shared awareness of “the nature of science,” which is the wealth of philosophical reflections about what science is, what biology is, how our scientific knowledge is best acquired, and how confident we can be in it (Matthews 1994). What are the ways of thinking used in biology that can help people to get a grasp of it? Are they similar to everyday ways of reasoning? What are the possible conceptual traps and pitfalls? What kind of knowledge can we expect from ecological simulations? What are model organisms, and why is it important to establish and fund them? What kinds of predictions can really be made, for example, in medicine or ecology? Understanding the probabilistic nature of predictions would produce not only a more educated picture of nature, but also, for example, an “awakening of public opinion to environmental problems, hydro-geological instability, and maintenance of territory” (Pievani 2012b: 352).

Philosophy of biology’s endeavor can sometimes go far away from day-to-day biology, and work out general pictures, messages, and worldviews. A classic and pervasive example is “Universal Darwinism,” developed by Richard Dawkins, Daniel Dennett and others (Dawkins 1983) along a line of reasoning from kin selection theory to a “gene’s eye view” to a “replicator view”. According to the “selfish gene” part of the argument (Dawkins 1976), reliably replicating molecules (precursors of genes) would be at the origin of life, and organisms including humans would be late-comer vehicles built up in all their minute details by genes that survived the competition. The selfish gene has been particularly appealing and controversial for its philosophical implications. We are unaware machines for our genes, whose interests furthermore sometimes conflict with (and win over) ours. The selfish gene view even came to be considered the official version of Darwinism, being the one defended and advocated on many public stages against non-scientific views. Universal Darwinism is a philosophical view, according to which natural selection, intended as the selective retention and accumulation of blind variations that prove to be stable and fit, is the fundamental mechanism in the Universe. Many philosophers of biology fought these views and all their philosophical implications, seeing them as a hardly justified reification of some methodologically operationalized idealizations by population genetics (Oyama et al. 2001). Other lines of attack were the incredible polysemy and theoretical complexity of “gene” and the ongoing revision of its causal power on the organism (Griffiths and Stotz 2013). Critics were all the more motivated by their discontent with the picture of evolutionary biology that was being conveyed to the public and to philosophy by the selfish gene view. Many philosophers pointed out that natural selection is arbitrarily chosen as a process to be universalized into a general philosophical view (Godfrey-Smith 2009), and some outlined provocative alternative views like Universal Symbiogenesis (Gontier 2007).

Other philosophical views were elaborated from the importance of chance, randomness, and, more comprehensively, contingency in evolution. Philosophers point out studies demonstrating that chance variation can influence evolutionary outcomes without being constrained or directed: “Evolutionary divergence is sometimes due to differences in the order of appearance of chance variations, and not to differences in the direction of selection” (Beatty 2010: 39). Since the 1980s, the neutral theory of genetic evolution promoted by Motoo Kimura (Ohta & Kimura 1971), now developed in weak neutralism in the scientific community (Hartl & Clark 2007), exposed the huge proportion of neutral variation all around. Population genetics models show that important events like speciation can well happen in conditions of fitness neutrality (Gavrilets 2004). Much earlier, population geneticists demonstrated the importance of drift (see above, 2.a). But is evolution random or contingent at any spatio-temporal scale? Are there law-like tendencies in large-scale evolution? If so, do these concern adaptedness, complexity, or other features? Since environments change over time, “what is adaptive” changes constantly through evolutionary time, so there is no strict commitment in the Darwinian theory to long-term adaptive progress or trends (Serrelli and Gontier 2015a).
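The population-genetic point about drift and fitness neutrality can be made concrete with a minimal sketch. It assumes a textbook Wright-Fisher-style setup, which is an illustrative choice and not a model taken from the works cited here; the function name `wright_fisher_neutral` and all parameter values are hypothetical. Every gene copy in the next generation is drawn at random from the current one, so no variant has any selective advantage, and yet populations that start out identical end up in different states.

```python
import random

def wright_fisher_neutral(pop_size, start_freq, generations, seed):
    """Track the frequency of one neutral variant in a fixed-size population."""
    rng = random.Random(seed)
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Each gene copy of the next generation is sampled at random from the
        # current one; no copy has a fitness advantage over any other.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = copies / pop_size
        history.append(freq)
        if freq in (0.0, 1.0):  # variant lost or fixed: nothing left to drift
            break
    return history

# Same starting conditions, different chance histories (one seed per run).
runs = [wright_fisher_neutral(pop_size=100, start_freq=0.5,
                              generations=2000, seed=s) for s in range(5)]
final_freqs = [run[-1] for run in runs]
print(final_freqs)
```

Any divergence among the runs is due only to the order in which chance events happen, which is the sense of contingency at stake in the passage above; nothing in the sketch speaks to whether larger-scale evolution shows law-like trends.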

Are trends towards greater complexity a better candidate? Some philosophers think so (McShea and Brandon 2010). Others think that nothing in the current understanding of evolution predicts a drive towards increased complexity. Another main disagreement concerns how to define complexity: is it through the integrated organization of interrelated parts in a whole or just through the number and diversity of parts? Some philosophers see evolution as a texture of contingent histories, the most ordinary (but most relevant to us) being the story of our species and its relatives. The human tree is just like that of other mammals. This emerging view is very different from the one inspired by Universal Darwinism:

Evolution is a process that abounds in redundancies and imperfections, and adaptation could be a collateral effect rather than a direct optimisation. Biology is a field of potentialities, and not determinations…. Complex organisms exist thanks to imperfections, to multiplicity of use and redundancy. (Pievani 2012a: 142)

For authors like Stephen Jay Gould, the preeminence of ecological contingencies and macro-evolutionary patterns, like mass-extinctions, in natural history seems to dismiss any idea of progress in evolution:

We are the offspring of the material and contingent relationships between localised populations and ever-changing environments. The massive contingency of human evolution means that particular events, or apparently meaningless details, were able to shape irreversibly the course of natural history. (Ibid.: 139)

Some theorists identify the source of contingency in the complex interplay between ecological systems and genealogical entities at multiple and very diverse scales (Eldredge et al. 2016).

From the disruption of the idea of a great progressive tendency in evolution, in particular human evolution, some philosophers develop general implications. Evolutionary humility results from a naturalistic way of seeing Homo sapiens as a part of a contingent process and not as its culmination. From another point of view, the discovery of the determinant role of contingent ecological events like floods and earthquakes gives a new vision of nature. Nature is neither a harmonic Eden nor a wicked nemesis. “The expressions of violence and unpredictability of natural phenomena that shock our societies so much today are the normal ecological niches where we were born. We would not be here at this moment without them” (Pievani 2012b: 352).

6. Scientifically Up-to-Date Philosophy

In a famous paper titled “What the philosophy of biology is not,” David Hull wrote that any work in philosophy of biology should not skip “all the intricacy of evolutionary relationships, the difficulties with various mechanisms, the recalcitrant data, the wealth of supporting evidence” (Hull 1969: 162). Along the same line, philosophy of biology tends to be grounded in deep knowledge and understanding of current biology, maintaining a non-episodic familiarity with many fields that are outside philosophical specialization.

It is important to keep in mind that the life sciences and their objects change and grow. A basic example is the very definition of “life” (bios). Answers, as well as philosophical problems, for such a topic come, for example, from scientific research on the origin of life (Penny 2005) or the search for extra-terrestrial life. The origin of life has been considered as a backwards extension of the roots of the tree of common descent. It regresses from some universal “ancestral organisms” (LUCA, Last Universal Common Ancestor) to simpler, minimally living entities that might be referred to as “protoliving systems,” and it further dissolves into non-living matter along several dimensions (Malaterre 2010). In general, philosophers work together with biologists on interpreting the history of life, for example, on trying to make sense of major evolutionary transitions (Maynard Smith and Szathmáry 1995). The symbiogenesis of eukaryotes and the advent of multicellularity are examples of major transitions. Those few moments in the history of life may be seen as points of emergence of a new level of organization. Today the very idea of a tree of life is being challenged by evidence of massive transfers and blurred boundaries among its branches (O’Malley 2010a). The history of biology is a history of changing views of life and of its history, and philosophy of biology participates in this process.

In the 50 years of existence of philosophy of biology as an academic field, the expansion of life sciences has been explosive. Huge global problems like climate change, biodiversity loss, new forms of diseases, and needs for resource management in our societies have contributed to that growth. Life sciences, including biomedical, ecological, and microbiological fields, have faced challenges related to the modern world, not only by being called into question or by actively catching opportunities to develop big research projects and programs but also by contributing to the very discovery and perception of those challenges. In parallel, life sciences have seen incredible technical advancements: cheaper and faster technologies for DNA sequencing and molecular analysis; computational methods allowing analysis of billions of nucleotides in search of the most likely phylogeny and simulations of the evolution of proteins or complex phenotypes; huge shared information databases like those of the genome projects, particularly the Human Genome Project, and of the various “-omics” projects (such as genomics, proteomics, metabolomics) monitoring whole complexes of functional molecules; finally, the applications of these technical advancements in creating new kinds of organisms or in disease detection and therapy. A parallel phenomenon of the decades leading into the 21st century has been the explosion of scientific literature and of access to it.

What are the consequences for philosophy of biology? The discipline is supposed to have a role not only in understanding, describing, and communicating science but also in aiding the development of scientific programs. Given the explosive historical dynamics outlined above, almost every subfield of biology demands a great deal of a philosopher who wants to delve into its peculiar concepts, methods, objects, and conceptual issues. Hence, several presentations of philosophy of biology follow a field-by-field criterion, enumerating items like “philosophy of ecology” or “philosophy of molecular biology” (Griffiths 2011). Such multi-field presentations reflect the dynamic and lively development of biology. Meanwhile, periodical shifts of focus and the emergence of new fields and techniques in the scientific literature attract the curiosity and call for the contribution of philosophers. But the identities of supposed sub-fields like “philosophy of ecology” or “philosophy of microbiology” are not crystallized, and rarely will a philosopher of biology limit himself or herself to one sub-field. Therefore, the field-by-field approach is not followed here.

This first section provides a few examples of how philosophers of biology can chase the developments of some particular field of life sciences. In certain moments, this pursuit can lead to extensions of philosophy of biology itself to embrace not only new scientific knowledge, but also newborn ways of doing science. Furthermore, deep revisions of philosophical approaches themselves may be necessary to address new aspects of science, namely facets of scientific practice. The examples concern molecular studies of gene exchange that started in microbial evolution, advances in ecological modeling, and the construction and management of model organisms.

a. Questioning Influential Ideas

In recent years, several philosophers have become interested in the growing evidence for a variety of gene exchange mechanisms widespread in fungi, plants, and animals, not only in prokaryotes (unicellular organisms that have long been known to wildly transform, conjugate and acquire DNA by transduction). Following biologists such as W. Ford Doolittle, philosophers contrasted this evidence with the idea of a universal tree of life as a tree of sexually reproducing, genetically isolated species that multiply by genetic isolation. For O’Malley (2010b), the idea of a universal tree of life has come to be an unacceptable reach given the abundance of reticulation and lateral gene transfer (LGT) in all kinds of organisms, including animals. O’Malley proposed to give up

an animal-centric philosophy of evolution. Its key tenets are that the BSC (Biological Species Concept) is ‘universal’ to species-forming organisms, that bifurcating lines of descent are all that matter in ancestry reconstruction, and that the rest of life (non-animal) is simpler, less diverse, and less ‘true’ evolutionarily. The consequences of [such an] animal-centric philosophy of evolution are that it can include at best a severely truncated history of evolutionary events. (p. 544)

The animal-based, constraining idea of a universal tree of life was consolidated around the mid-20th century through an attempt at “excluding the messy,” such as prokaryotes and bacteria. For O’Malley, such exclusion was mainly due to the influence of ornithologist and philosopher Ernst Mayr.

As this example illustrates, philosophers, by following frontier developments of particular fields, can sense the need for a revision of overarching theoretical choices and frames of biology. At the same time, they can wriggle out of canonical problems and concepts and expand their philosophical interests: “It is definitely the case – for O’Malley – that philosophy of biology has undergone a rapid radiation of topics in the last decade and this has meant going beyond Mayr’s focus” (p. 544). The resilience of issues like the species concept, molded on animal breeding, is challenged by new concepts amenable to philosophical analysis. Further, the very task and range of philosophy of biology happen to be questioned: “Mayr’s vision of philosophy of biology as the clarification of evolutionary concepts has…been challenged” (O’Malley 2010b: 531).

b. Understanding New Scientific Practices

According to philosopher Thomas Pradeu (2009), ecology has made “a conspicuous and very welcome entry” in philosophy of biology, and several philosophers advocate aspects of ecology as targets of philosophical attention. Ecology has its own epistemological issues, for example, the weakness of ecological laws, the debate around the idea of balance of nature, the complex problem of the predictive value of ecological models, and the involvement in environmental decision making (Cooper 2003, Mikkelson 2007, Plutynski 2008). Other problems concern the individuation of ecological units, scale-dependence, and generalizability of ecological models. Some philosophers have become interested in new modeling methodologies of ecological inquiry, the area of the following example.

Ecologist Steven Peck (2008) contributed to the debate from the standpoint of an author of complex computer simulations. He pointed out the many differences between simulations and “analytic models,” which can be written as mathematical equations. In simulations, many of the entities, “dispositions,” rules, and relationships are not captured by any equation; rather, they are directly written in computer code: “conditional if-then statements, looping structures, and calls to procedures” (Peck 2008: 390). Moreover, simulations are not at all complicated versions of analytic models because, as Peck explains,

The complex computer representation is an ecological system. One that you have complete control over, but which provides insights and allows complex behavior to bubble up from lower level processes and allows one to capture the emergent behavior often seen in ecological systems. (p. 387)
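Peck’s contrast can be illustrated with a deliberately small sketch. Nothing here is taken from Peck’s own simulations: the logistic update, the function names, and the particular survival and reproduction rules are hypothetical choices made for illustration. The first function is an “analytic model” in which a single equation yields the next population size; the second contains no equation for population size at all, only conditional statements and loops applied to individuals, from which the population-level behavior emerges.

```python
import random

def logistic_analytic(n, r, k):
    """Analytic model: one equation gives the next population size."""
    return n + r * n * (1 - n / k)

def simulate_individuals(n, birth_prob, capacity, years, seed):
    """Rule-based simulation: size emerges from the fates of individuals."""
    rng = random.Random(seed)
    sizes = [n]
    for _ in range(years):                      # looping structure over time
        crowding = min(1.0, n / capacity)
        survivors, newborns = 0, 0
        for _ in range(n):                      # loop over individuals
            if rng.random() < 0.5 * crowding:   # if-then rule: crowding kills
                continue
            survivors += 1
            if rng.random() < birth_prob:       # if-then rule: reproduction
                newborns += 1
        n = survivors + newborns
        sizes.append(n)
    return sizes

print(logistic_analytic(50, 0.5, 100))
print(simulate_individuals(n=50, birth_prob=0.4, capacity=100, years=20, seed=1))
```

Changing a rule inside the loop and rerunning the program is a way of experimenting on the simulated system itself, which is the sense in which such a representation can be explored in its own right.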

For Peck, models of this kind raise provocative philosophical questions: what are models? What are their aims? How do they work? A classic and influential framework by Richard Levins (1966) identifies three constraints among which a model has to trade off: generality, precision, and realism. But building a simulation, for Peck, does not consist in adding more variables and parameters in order to capture more parts and processes of some targeted biological system. It is a creative effort yielding something autonomous with interesting but complex relationships with other models and with the world. Some notions in the philosophy of modeling can make a step in the direction of simulations, like the idea of “indirect representation” in which the model descriptions themselves are examined as opposed to “direct representation” in which representation is used to describe a real-world system (Godfrey-Smith 2006). But this distinction for Peck is not enough to capture the fact that simulations become experimental systems in themselves. Playing with the model means a “bracketing out of nature to explore the model itself [yielding] important insights to the model – separate from what it is supposed to represent” (Peck 2008: 395).

Complex computer simulations push philosophy of biology to reflect on new accounts of model building and interpretation, and also to deal with new ways in which scientific communities structure themselves.

c. Rethinking the Philosophical Approach from New Ways of Doing Science

Keeping up with the state of the art in the life sciences, even in one or a few fields at once, is very demanding. By doing so, however, philosophers of biology can continually fuel their thought. We have seen, for example, that philosophers, relying on molecular discoveries about gene exchange, can revise their agenda, downplaying traditionally framed problems (for example, what is a species?) and even questioning the great background pictures against which biology is thought (the tree of life, for example). By striving to understand scientific methodologies, including completely new ones such as computer simulations in ecology, philosophers are led to probe their accounts of science, to work out new ones, and to get a better hold on them and their connections. Related to all these efforts is another tension in philosophy of science, which has been made explicit in the last few years: the philosophical orientation toward scientific practice (Boumans and Leonelli 2013). A recent trend, represented, for example, by the Society for Philosophy of Science in Practice (SPSP), points out the limits of an exclusive use of conceptual analysis, proposing instead “a philosophy of scientific practice based on an analytic framework that takes into consideration theory, practice and the world simultaneously” (Boumans et al. 2011). Practice is defined as an ensemble of “organized or regulated activities aimed at the achievement of certain goals” and is an object of philosophical reflection.

Steven Peck’s analysis of ecological simulation, mentioned earlier, can also be seen as an example of a call to scientific practice. For Peck, simulations are experimental systems more than representations. A simulation is useful because it helps “thinking more deeply and creatively into the nature of the problem” (Peck 2008: 396), but it can be appreciated only by considering the community in which the authors of the simulation are present and maintain the simulation (otherwise, the simulation dies because it cannot be understood or replicated by others). Crucial aspects are the authors’ willingness to expose the multiple perspectives that are encoded in the simulation and to confront them with other modelers and with “those who study the ecology of natural systems directly” (p. 399), as well as the authors’ engagement in discussing, modifying, and exploring the simulation further. In this new kind of engagement, Peck sees a “hermeneutic circle,” a “back and forth conversation among modelers, their models, and those who study the ecology of natural systems directly” characterized by each actor’s attempt to “understand her own perspective in light of others’ perspectives.” The hermeneutic circle is, for Peck, what “opens the door to deeper understanding of what the simulation model is showing us about the world” (p. 399). Scientific community practices are crucial for understanding what is going on. Logical analysis, either of the model alone or of its relationships with some natural ecological system, does not seem to take philosophy very far.

Another example of a rising scientific practice is the use of model organisms, a term introduced in the late 1990s and increasingly used since. Official lists of model organisms include species such as the mouse, zebrafish, fruit fly, nematode worm, and thale cress. Mice and other animals are extremely important in biomedical research because of the extrapolations to Homo sapiens that are considered possible under certain conditions (Piotrowska 2013). In philosophy of biology, there has been interest in understanding what “model organisms” are and in demarcating them from the larger set of experimental organisms. Ankeny and Leonelli (2011) define model organisms as:

Non-human species that are extensively studied in order to understand a range of biological phenomena, with the hope that data and theories generated through use of the model will be applicable to other organisms, particularly those that are in some way more complex than the original model. (p. 313)

The two philosophers work out several concepts embedded in this definition in order to capture the specificity of model organisms. For the issue at hand, the most important aspect is the new kind of structured scientific community that keeps a model organism stable in space and time. Examples of model organisms are the fly Drosophila melanogaster, which has been studied since the dawn of genetics, knockout mice (Mus musculus), and the plant Arabidopsis thaliana. The research community of a model organism performs intensive research with “a strong ethos of sharing resource materials, techniques, and data” (p. 317). While initially the organism may be chosen for experimental advantages (being easy to breed, for example), the cumulative establishment of techniques, practices, and results (for example, through databases and stock centers) leads to self-reinforcing standardization, comparability, and stability: “…the more the model system is studied, and the greater the number of perspectives from which it is understood, the more it becomes established as a model system” (Creager et al. 2007: 6).

To summarize, a recent stream in philosophy of biology associates the need for first-hand, recent, and deep scientific information with a need to consider newly emerged scientific practices that often involve innovative scientific communities and that are capable of generating new questions. It is no surprise that philosopher Leonelli, in a paper on different modes of abstraction that are performed by different communities on the model organism Arabidopsis thaliana, writes: “Focusing primarily on modelling practices, rather than on models thus produced, might prove a useful way to gain insight on some long-standing debates within the philosophy of scientific modelling and representation” (Leonelli 2007: 510). By following biology, philosophy rethinks its own methods and foundations.

7. History and Philosophy of Biology

Good philosophy of biology may be done non-historically, by working in a purely conceptual way. On the other hand, philosophy is sometimes specifically interested in grand historical processes. The inception of empirical and theoretical novelties and the birth of whole new fields provide the opportunity to probe classic accounts of scientific change such as Popper’s falsificationism (Popper 1935, 1963), Kuhn’s paradigm change (Kuhn 1962), Lakatos’s methodology of scientific research programmes (Lakatos 1970), or Kitcher’s theoretical unification (Kitcher 1981). Philosophical categories such as reductionism can be tested for their capacity to account for the historical relationships between biological fields. At the birth of molecular genetics, for example, a philosophical question was whether the older Mendelian genetics was being reduced to it. As Griffiths (2010) remarks, philosophers of biology achieved more adequate models of theory by debating whether or not the molecular revolution in biology was a case of successful scientific reduction.

At all scales, philosophy of biology has a constant need to refer to history, yet some authors complain that philosophy of biology pays too little attention “to the question how and why things in the field have become the way they are today” (Reydon 2005: 249-250). The rediscovered interdependence between history and philosophy of biology may be seen in light of the more general problem of “history and philosophy of science” (HPS) studies, which have come to be viewed in a more integrated way. The recursive and expansive dynamics between history and philosophy of biology proves to be a generator of dense and complex elaborations in all the fields involved.

James Griesemer (2007) offers an example of how philosophical views serve historical analysis. The topic is the molecular study of development at the heart of evo-devo (see also 5.a). Griesemer suggests a revision of conventional narratives that describe evo-devo as a union between genetics and development, the study of which had supposedly been abandoned since the 1930s (see Gilbert et al. 1996). For Griesemer this separation between fields is artificial: embryology and genetics have always been “like the segments of a centipede: moving together with limited autonomy” (p. 376). Griesemer’s long, reframing argument goes through the philosophical characterization of scientists as process followers: “there is no doubt that scientists do follow processes, that this is an important and central activity in their work, and that they achieve causal understanding as a result of doing it” (p. 377). Research styles are, for Griesemer, “commitments to follow processes in a certain way.” Griesemer constructs the idea of genetics and embryology as nothing but research styles that “package commitments to follow processes according to particular sorts of marking interactions and tracking conventions together with commitments to represent processes in particular ways” (p. 381). In this view, the separation between genes and development ceases to be considered an ontological divide: “It does not follow from the divergence of research styles and representational practices in genetics and embryology that nature is divided into separate processes of heredity and development” (p. 414).

An important ingredient of Griesemer’s view is the dual role of representations. Before being pressed into service as tools for communicating results and interpretations, representations are “working objects, developed as bench or field tools for tracking phenomena and following processes” (p. 388). Scientists, to be able to follow processes, produce representations that, in turn, “provide reinforcing feedback that organizes attention into foreground and background concerns” (p. 380). This dynamic becomes important when representations survive the experimental work in which they are produced and get used by other scientists. In scientific communities, process-marking and process-following strongly reinforce each other in an attention-guiding feedback, operated by representations.

Griesemer analyzes Gregor Mendel’s experiments and describes Mendel, universally considered the founding father of genetics, as a process follower and as a developmentalist. The different and successive notations appearing in Mendel’s writings (for example, A + 2Aa + a; then A + A + a + a) are the representations sequentially devised by Mendel in order to pursue his enduring interest, that is, the developmental process of hybrids. Therefore, in this account, Mendel was a developmentalist who offered lasting representations that, in turn, helped some of his followers to focus on patterns of intergenerational transmission, backgrounding development. A particular power is given, in this account, to representations like Mendel’s notation: “In Mendel’s demonstration of notational equivalence, the possibility of reinterpreting his work on the development of hybrid characters in terms of a factor theory of hereditary transmission was so strong that it took very careful analysis by historians to show that Mendel probably did not hold the factor theory with which he is credited” (Griesemer 2007: 404). The consolidation of research styles concealed the awareness of their unity: “theories of heredity entail methodologies of development, and conversely” (p. 414); “genetics entails an idealized and abstracted account of development and development entails an idealized and abstract account of heredity” (p. 417); and “the gene theory not only had an embryological origin, it never really left embryology at all” (p. 414).

An implication for the present and future of evo-devo is that putting heredity and development “back together again” can be thought of, in part, as a problem of conceptual reorientation, change in theoretical perspective, and rehabilitation of research styles that were out of the limelight rather than extinct due to failure. The commonly suggested linear progress from embryology to genetics to evo-devo is a historical artifact: there is no “historical progression of fields or lines of work that take the scientific limelight in turn” (p. 417).

Whether the described implications are historical or philosophical is difficult to discern, particularly in such cases as evo-devo, which are in current philosophical focus. It is more productive to say that Griesemer’s analysis shows the interplay between philosophy and history of biology. The view of scientists as “process-followers” can indeed be seen as a methodological and a philosophical proposal.

Philosophically, the view of scientists as process-followers is well elaborated and detailed through the related notions of marking (mental and manipulative), foregrounding and backgrounding, research styles, and so on. Griesemer explicitly contrasts his proposed view with more widespread approaches. For example:

The notion of following a process unifies [commonly separated] descriptions of science as theoretical representation, as systematic observation, and as technological intervention, [and] cuts across many analytical distinctions commonly used to describe science (e.g., theory vs. observation, theory vs. experiment, hypothesis-testing vs. measurement, active manipulation vs. passive observation, scientific methods vs. scientific goals). (p. 375)

The marking activity, in particular, “is not easily assigned to either of the traditional categories of passive observation and active experiment. It is nonmanipulative yet active work” (p. 379), and since such work allows causality to be identified, theory and methodology are intimately related. This local metatheorizing can be taken up by other philosophers and elaborated further.

Historians, too, are invited to test the view by describing other cases in the history of biology. To historians, however, there is also a reflexive message: just as scientists mark and follow processes in the systems they study, so do historians with the social and historical processes they follow. History and philosophy of science, in a word, would consist in a “following of scientists while they follow nature” (p. 376). This means that history and philosophy of science itself may be subject to research styles, and may need theoretical reorientations to gain a new understanding. In general, for example, since narratives tend to follow historical developments within scientific styles, history often gives the false impression that scientific disciplines (embryology and genetics, for example) are separate because their theories describe different processes, like development and heredity. So historians and philosophers are invited to abandon “limelight narratives” that, while making “instructive drama,” confound our understanding of social processes. In the specific case under study:

Embryology is not well tracked in narratives that foreground the success of genetics after the split…. We must instead follow the ‘bushy’ divergences and reticulations of several sciences as they spawn new lines of work if we are to understand, and follow, science as a process. (p. 376, emphasis added)

8. Conclusion

We have seen that philosophy of biology can be actively involved in some scientific activities. When this is not the case, philosophy of biology is largely an interpretive description or re-description of instances of biology. Generalities about science, whether general descriptions of science, biology, or biology’s sub-fields, or general philosophical problems about science, are usually the goals, sources, and backgrounds of philosophy of biology. At the same time, tentative generalities about science, such as the view just seen of “scientists as followers of processes,” can serve historical reconstructions and translate into methodological proposals for historians, too. However, the service is reciprocal, since present or past historical cases substantiate, if they do not suggest, new philosophical views of science, the production of which is a specific tendency of philosophy. The conclusion is that not only philosophy and science, but also history, form an entangled, integrated whole in current research, thought, and practice.

9. References and Further Reading

a. Cited Examples

  • Ankeny, Rachel, and Sabina Leonelli. 2011. “What’s so special about model organisms?” Studies in History and Philosophy of Science Part A 42(2): 313-323.
  • Ariew, André. 2003. “Ernst Mayr’s ‘ultimate/proximate’ distinction reconsidered and reconstructed.” Biology and Philosophy 18: 553–565.
  • Barkow, Jerome H., Leda Cosmides, and John Tooby, eds. 1992. The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York, NY: Oxford University Press.
  • Beatty, John. 2010. “Reconsidering the importance of chance variation.” Evolution – The Extended Synthesis. Eds. Pigliucci, Massimo, and Gerd B. Müller. Cambridge-London: MIT Press, pp. 21-44.
  • Boudry, Maarten, Stefaan Blancke, and Johan Braeckman. 2010. “How not to attack Intelligent Design creationism: philosophical misconceptions about methodological naturalism.” Foundations of Science 15(3): 227-244.
  • Boumans, Marcel, Hasok Chang, and Rachel Ankeny, eds. 2011. “Philosophy of Science in Practice.” European Journal for Philosophy of Science 1(3).
  • Boumans, Marcel, and Sabina Leonelli. 2013. “Introduction: on the Philosophy of Science in Practice.” Journal for General Philosophy of Science 44(2): 259-261.
  • Boyd, Robert, and Peter J. Richerson. 1985. Culture and the Evolutionary Process. Chicago: University of Chicago Press.
  • Callebaut, Werner. 2010. “The dialectics of dis/unity in the evolutionary synthesis and its extensions.” Evolution – The Extended Synthesis. Eds. Pigliucci, Massimo, and Gerd B. Müller. Cambridge-London: MIT Press, pp. 443-481.
  • Casetta, Elena, and Jorge Marques da Silva. 2015. “Facing the Big Sixth: from prioritizing species to conserving biodiversity.” Macroevolution: Explanation, Interpretation and Evidence. Eds. Serrelli, Emanuele, and Nathalie Gontier. Berlin: Springer, pp. 377-403.
  • Cavalli-Sforza, Luigi L., and Marcus W. Feldman. 1981. Cultural Transmission and Evolution: A Quantitative Approach. Princeton, NJ: Princeton University Press.
  • Chung, Carl. 2003. “On the origin of the typological/population distinction in Ernst Mayr’s changing views of species, 1942–1959.” Studies in History and Philosophy of Biological and Biomedical Sciences 34(2): 277-296.
  • Churchland, Patricia S. 2011. Braintrust. What Neuroscience Tells Us about Morality. Princeton, NJ: Princeton University Press.
  • Claramonte Sanz, Vicente. 2011. Diseño Inteligente: La Pseudociencia del Siglo XXI: Aspectos Filosóficos, Sociológicos y Políticos de la Nueva Ideología Antievolucionista. Editorial Académica Española.
  • Cooper, Gregory. 2003. The Science of the Struggle for Existence: On the Foundations of Ecology. Cambridge University Press.
  • Creager, Angela N. H., Elizabeth Lunbeck, and M. Norton Wise, eds. 2007. Science without Laws: Model Systems, Cases, Exemplary Narratives. Durham and London: Duke University Press.
  • Darwin, Charles R. 1859. On the Origin of Species. John Murray, London. Quotations are from the sixth edition, 1872.
  • Darwin, Charles R. 1871. The Descent of Man, and Selection in Relation to Sex. John Murray, London.
  • Darwin, Charles R. 1877. The Various Contrivances by Which Orchids are Fertilized by Insects. John Murray, London.
  • Dawkins, Richard. 1976. The Selfish Gene. Oxford: Oxford University Press.
  • Dawkins, Richard. 1983. “Universal Darwinism.” Evolution from Molecules to Men. Ed. Bendall, D. S. Cambridge: Cambridge University Press.
  • De Caro, Mario, and David Macarthur, eds. 2004. Naturalism in Question. Cambridge (MA): Harvard University Press.
  • Dennett, Daniel C. 1995. Darwin’s Dangerous Idea: Evolution and the Meanings of Life. Simon & Schuster, New York.
  • Downes, Stephen M. 1992. “The importance of models in theorizing: a deflationary semantic view.” Proc. of the Phil. of Science Ass. (PSA), volume 1. The University of Chicago Press, pp. 142-153.
  • Dupré, John. 2001. Human Nature and the Limits of Science. Oxford: Oxford University Press.
  • Eldredge, Niles. 1999. The Pattern of Evolution. New York: Freeman and Co.
  • Eldredge, Niles, and Stephen J. Gould. 1972. “Punctuated equilibria: an alternative to phyletic gradualism.” Models in Paleobiology. Ed. Schopf, Thomas J. M. San Francisco CA: Freeman.
  • Eldredge, Niles, Telmo Pievani, Emanuele Serrelli, and Ilya Temkin. 2016. Evolutionary Theory: A Hierarchical Perspective. University of Chicago Press.
  • Eldredge, Niles, and Ian Tattersall. 1982. The Myths of Human Evolution. New York: Columbia University Press.
  • Ereshefsky, Marc. 1997. “The evolution of the linnaean hierarchy.” Biology and Philosophy 12(4): 493–519.
  • Ereshefsky, Marc. 2012. “Homology thinking.” Biology and Philosophy 27(3): 381-400.
  • Fodor, Jerry. 1974. “Special sciences and the disunity of science as a working hypothesis.” Synthese 28: 97-115.
  • Gavrilets, Sergey. 2004. Fitness Landscapes and the Origin of Species. Princeton: Princeton University Press.
  • Ghiselin, Michael T. 1997. Metaphysics and the Origin of Species. Albany, NY: State University of New York Press.
  • Gilbert, Scott F., John M. Opitz, and Rudolf A. Raff. 1996. “Resynthesizing evolutionary and developmental biology.” Developmental Biology 173: 357-372.
  • Godfrey-Smith, Peter. 2001. “Three kinds of adaptationism.” Adaptationism and Optimality. Eds. Orzack, Steve H., and Elliott Sober. Cambridge University Press, pp. 335-357.
  • Godfrey-Smith, Peter. 2006. “The strategy of model-based science.” Biology and Philosophy 21(5): 725-740.
  • Godfrey-Smith, Peter. 2009. Darwinian Populations and Natural Selection. Oxford: Oxford University Press.
  • Gontier, Nathalie. 2007. “Universal symbiogenesis: an alternative to universal selectionist accounts of evolution.” Symbiosis 44: 167-181.
  • Gould, Stephen J. 1981. The Mismeasure of Man. New York: W. W. Norton.
  • Gould, Stephen J. 1989. Wonderful Life. The Burgess Shale and the Nature of History. New York: W. W. Norton.
  • Gould, Stephen J., and Richard C. Lewontin. 1979. “The spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist programme.” Proceedings of the Royal Society of London Part B: Biological Sciences 205: 581-598.
  • Gould, Stephen J., and Elisabeth S. Vrba. 1982. “Exaptation – a missing term in the science of form.” Paleobiology 8: 4-15.
  • Griesemer, James R. 2007. “Tracking organic processes: representations and research styles in classical embryology and genetics.” From Embryology to Evo-Devo: A History of Developmental Evolution. Eds. Laubichler, Manfred D., and Jane Maienschein. Cambridge, MA: MIT Press, pp. 375-433.
  • Griffiths, Paul E., and Ingo Brigandt, eds. 2007. “The Importance of Homology for Biology and Philosophy.” Biology and Philosophy 22(5).
  • Griffiths, Paul E., and Karola Stotz. 2013. Genetics and Philosophy: An Introduction. New York: Cambridge University Press.
  • Grote, Mathias, and Maureen A. O’Malley. 2011. “Enlightening the life sciences: the history of halobacterial and microbial rhodopsin research.” FEMS Microbiology Reviews 35(6): 1082-99.
  • Haarsma, Deborah, et al. 2014-2015. “The President’s notebook: reviewing Darwin’s Doubt.” BioLogos. BioLogos. Web. 20 July 2016.
  • Haber, Matt H. 2008. “Phylogenetic Inference.” A Companion to the Philosophy of History and Historiography. Ed. Tucker, Aviezer. Blackwell Publishing, pp. 231–242.
  • Haldane, John B. S. 1924. “A mathematical theory of natural and artificial selection. Part I.” Transactions of the Cambridge Philosophical Society 23: 19-41.
  • Hamilton, William D. 1963. “The evolution of altruistic behavior.” American Naturalist 97: 354-356.
  • Hamilton, William D. 1964. “The genetical evolution of social behavior.” Journal of Theoretical Biology 7(1): 1-52.
  • Hartl, Daniel L., and Andrew G. Clark. 2007. Principles of Population Genetics, Fourth ed. Sunderland, Mass.: Sinauer Associates.
  • Keller, Laurent, ed. 1999. Levels of Selection in Evolution. Princeton, NJ: Princeton University Press.
  • Kuhn, Thomas. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.” Criticism and the Growth of Knowledge. Eds. Lakatos, Imre, and Alan Musgrave. Cambridge: Cambridge University Press.
  • Laland, Kevin N., Kim Sterelny, John Odling-Smee, William Hoppitt, and Tobias Uller. 2011. “Cause and effect in biology revisited: is Mayr’s proximate-ultimate dichotomy still useful?” Science 334(6062): 1512-1516.
  • Laland, Kevin N., John Odling-Smee, William Hoppitt, and Tobias Uller. 2013. “More on how and why: a response to commentaries.” Biology and Philosophy 28(5): 793–810.
  • Laubichler, Manfred D. 2009. “Evolutionary developmental biology does offer a significant challenge to the neo-Darwinian paradigm.” Contemporary Debates in Philosophy of Biology. Eds. Ayala, Francisco J., and Robert Arp. Wiley-Blackwell, chapter 11.
  • Leonelli, Sabina. 2007. “Performing abstraction: two ways of modelling Arabidopsis thaliana.” Biology and Philosophy 23(4): 509-528.
  • Levins, Richard. 1966. “The strategy of model building in population biology.” American Scientist 54(4): 421-431.
  • Lewens, Tim. 2008. “Seven types of adaptationism.” Biology and Philosophy 24(2): 161-182.
  • Lewis, Jason E., David DeGusta, Marc R. Meyer, Janet M. Monge, Alan E. Mann, and Ralph L. Holloway. 2011. “The mismeasure of science: Stephen J. Gould versus Samuel G. Morton on skulls and bias.” PLOS Biology 9: e1001071.
  • Lewontin, Richard C. 1970. “The units of selection.” Annual Review of Ecology and Systematics 1: 1-18.
  • Love, Alan C. 2003. “Evolutionary morphology, innovation, and the synthesis of evolutionary and developmental biology.” Biology and Philosophy 18(2): 309–345.
  • Machamer, Peter, Carl Craver, and Lindley Darden. 2000. “Thinking about mechanisms.” Philosophy of Science 67: 1-25.
  • Malaterre, Christophe. 2010. “Lifeness signatures and the roots of the tree of life.” Biology and Philosophy 25(4): 643-658.
  • Matthen, Mohan. 2009. “Drift and ‘statistically abstractive explanation’.” Philosophy of Science 76(4): 464-487.
  • Matthen, Mohan. 2010. “What is drift? a response to Millstein, Skipper, and Dietrich.” Philosophy and Theory in Biology 2: e102.
  • Matthen, Mohan, and André Ariew. 2009. “Selection and causation.” Philosophy of Science 76(2): 201-224.
  • Maynard Smith, John. 1964. “Group selection and kin selection.” Nature 201(4924): 1145-1147.
  • Maynard Smith, John, and Eörs Szathmáry. 1995. The Major Transitions in Evolution. New York: Oxford University Press.
  • Mayr, Ernst. 1955. “Karl Jordan’s contribution to current concepts in systematics and evolution.” Transactions of the Royal Entomological Society of London 107(1-14): 25-66.
  • Mayr, Ernst. 1959. “Where are we?” Cold Spring Harbor Symposium on Quantitative Biology 24: 409–440.
  • Mayr, Ernst. 1982. The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Belknap Press.
  • Mayr, Ernst. 1997. “The objects of selection.” Proceedings of the National Academy of Sciences USA 94(6): 2091-2094.
  • Mayr, Ernst. 2004. What Makes Biology Unique? Cambridge: Cambridge University Press.
  • McShea, Daniel W., and Robert N. Brandon. 2010. Biology’s First Law: The Tendency for Diversity and Complexity to Increase in Evolutionary Systems. University of Chicago Press.
  • Merlin, Francesca. 2010. “Evolutionary chance mutation: a defense of the Modern Synthesis’ consensus view.” Philosophy and Theory in Biology 2: e103.
  • Mesoudi, Alex, Andrew Whiten, and Kevin N. Laland. 2006. “Towards a unified science of cultural evolution.” The Behavioral and Brain Sciences 29(4): 329-347; discussion: 347-383.
  • Mikkelson, Gregory M. 2007. “Ecology.” The Cambridge Companion to the Philosophy of Biology. Eds. Hull, David, and Michael Ruse. Cambridge: Cambridge University Press, pp. 372-387.
  • Millstein, Roberta L. 2002. “Are random drift and natural selection conceptually distinct?” Biology and Philosophy 17: 33-53.
  • Millstein, Roberta L. 2006. “Natural selection as a population-level causal process.” British Journal for the Philosophy of Science 57: 627–653.
  • Millstein, Roberta L., Robert A. Skipper, and Michael R. Dietrich. 2009. “(Mis)interpreting mathematical models: drift as a physical process.” Philosophy and Theory in Biology 1: e002.
  • Minelli, Alessandro. 2009. “Evolutionary developmental biology does not offer a significant challenge to the neo-Darwinian paradigm.” Contemporary Debates in Philosophy of Biology. Eds. Ayala, Francisco J., and Robert Arp. Wiley-Blackwell, chapter 12.
  • Morrison, Margaret. 2006. “Unification, explanation and explaining unity: The Fisher-Wright controversy.” The British Journal for the Philosophy of Science 57(1): 233-245.
  • Matthews, Michael R. 1994. Science Teaching. The Role of History and Philosophy of Science. New York-London: Routledge.
  • Ohta, Tomoko, and Motoo Kimura. 1971. “On the constancy of the evolutionary rate of cistrons.” Journal of Molecular Evolution 1(1): 18-25.
  • Okasha, Samir. 2006. Evolution and the Levels of Selection. Oxford: Oxford University Press.
  • Olson, Mark E. 2012. “The developmental renaissance in adaptationism.” Trends in Ecology and Evolution 27(5): 278-87.
  • O’Malley, Maureen A., ed. 2010. The Tree of Life. Spec. issue of Biology and Philosophy 25(4).
  • O’Malley, Maureen A. 2010. “Ernst Mayr, the tree of life, and philosophy of biology.” The Tree of Life. Spec. issue of Biology and Philosophy 25(4): 529-552.
  • Oyama, Susan. 1998. Evolution’s Eye. Durham (NC): Duke University Press.
  • Oyama, Susan, Paul E. Griffiths, and Russell D. Gray eds. 2001. Cycles of Contingency. Cambridge (MA): The MIT Press.
  • Panebianco, Fabrizio, and Emanuele Serrelli, eds. 2016. Understanding Cultural Traits: A Multidisciplinary Perspective on Cultural Diversity. Springer, Switzerland. DOI 10.1007/978-3-319-24349-8
  • Panebianco, Fabrizio, and Emanuele Serrelli. 2016. “Cultural traits and multidisciplinary dialogue.” Understanding Cultural Traits: A Multidisciplinary Perspective on Cultural Diversity. Springer, Switzerland, Chapter 1.
  • Peck, Steven L. 2008. “The hermeneutics of ecological simulation.” Biology and Philosophy 23(3): 383-402.
  • Penny, David. 2005. “An interpretive review of the origin of life research.” Biology and Philosophy 20(4): 633-671.
  • Pievani, Telmo. 2011. “Born to cooperate? Altruism as exaptation, and the evolution of human sociality.” Origins of Cooperation and Altruism. Eds. Sussman, Robert W. and C. Robert Cloninger. New York: Springer.
  • Pievani, Telmo. 2012. “Many ways of being human, the Stephen J. Gould’s legacy to Palaeo-Anthropology (2002-2012).” Journal of Anthropological Sciences 90: 33-49.
  • Pievani, Telmo. 2012. “Geoethics and philosophy of earth sciences: the role of geophysical factors in human evolution.” Annals of Geophysics 55(3): 349-353.
  • Pievani, Telmo, and Emanuele Serrelli. 2011. “Exaptation in human evolution: how to test adaptive vs exaptive evolutionary hypotheses.” Journal of Anthropological Sciences 89: 9-23. [DOI 10.4436/jass.89015]
  • Pigliucci, Massimo, and Gerd B. Müller, eds. 2010. Evolution – The Extended Synthesis. Cambridge-London: MIT Press.
  • Piotrowska, Monika. 2013. “From humanized mice to human disease: Guiding extrapolation from model to target.” Biology and Philosophy 28(3): 439-455.
  • Plutynski, Anya. “Explanatory unification and the early synthesis.” The British Journal for the Philosophy of Science 56(3): 595-609.
  • Plutynski, Anya. 2008. “Ecology and the environment.” The Oxford Handbook of Philosophy of Biology. Ed. Ruse, Michael. New York: Oxford University Press, pp. 504-524.
  • Popper, Karl R. 1935. Logik der Forschung, Vienna: Julius Springer Verlag. Eng. Tr. The Logic of Scientific Discovery. London: Hutchinson, 1959.
  • Popper, Karl R. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
  • Psillos, Stathis. 2012. “What is General Philosophy of Science?” Journal for General Philosophy of Science 43(1): 93-103.
  • Ramsey, Grant. 2013. “Can fitness differences be a cause of evolution?” Philosophy and Theory in Biology 5: e401.
  • Reydon, Thomas A. C. E. 2005. “Bridging the gap between history and philosophy of biology.” Metascience 14(2): 249-253.
  • Rosenberg, Alexander. 2006. Darwinian Reductionism. Or, How to Stop Worrying and Love Molecular Biology. Chicago: The University of Chicago Press.
  • Ruse, Michael. 2012. The Philosophy of Human Evolution. Cambridge & New York: Cambridge University Press.
  • Schaffner, Kenneth F. 1993. Discovery and Explanation in Biology and Medicine. Chicago: University of Chicago Press.
  • Schaffner, Kenneth F. 1998. “Model organisms and behavioral genetics: a rejoinder.” Philosophy of Science 65: 276-288.
  • Serrelli, Emanuele. 2015. “Visualizing macroevolution: from adaptive landscapes to compositions of multiple spaces.” Macroevolution: explanation, interpretation and evidence. Eds. Serrelli, Emanuele, and Niles Gontier. Interdisciplinary Evolution Research series, Springer, pp. 113-162. DOI 10.1007/978-3-319-15045-1_4
  • Serrelli, Emanuele. 2016. “Evolutionary genetics and cultural traits in a ‘body of theory’ perspective.” Understanding Cultural Traits. A Multidisciplinary Perspective on Cultural Diversity. Eds. Panebianco, Fabrizio, and Emanuele Serrelli. Springer, Switzerland.
  • Serrelli, Emanuele. 2016. “Removing barriers in scientific research: concepts, synthesis and catalysis.” Understanding Cultural Traits. A Multidisciplinary Perspective on Cultural Diversity. Eds. Panebianco, Fabrizio, and Emanuele Serrelli. Springer, Switzerland.
  • Serrelli, Emanuele, and Niles Gontier, eds. 2015. Macroevolution: Explanation, Interpretation and Evidence. Springer. [DOI 10.1007/978-3-319-15045-1]
  • Serrelli, Emanuele, and Niles Gontier. 2015. “Macroevolutionary issues and approaches in evolutionary biology.” Macroevolution: explanation, interpretation and evidence. Eds. Serrelli, Emanuele, and Niles Gontier. Interdisciplinary Evolution Research series, Springer, pp. 1-25. [DOI 10.1007/978-3-319-15045-1_1]
  • Skipper, Robert A., and Roberta L. Millstein. 2005. “Thinking about evolutionary mechanisms: natural selection.” Studies in the History and Philosophy of Biological and Biomedical Sciences 36: 327-347.
  • Sober, Elliott. 2007. “What is wrong with Intelligent Design?” The Quarterly Review of Biology 82(1): 3-8.
  • Sober, Elliott, and David S. Wilson. 1998. Unto Others: The Evolution and Psychology of Unselfish Behavior. Cambridge, MA: Harvard University Press.
  • Sterelny, Kim. 2003. Thought in a Hostile World: The Evolution of Human Cognition. Oxford: Blackwell.
  • Tattersall, Ian. 2013. “Stephen J. Gould’s intellectual legacy to anthropology.” Stephen J. Gould’s Legacy. Nature, History, Society. Eds. Daniele, Antonio G., Alessandro Minelli, and Telmo Pievani. New York: Springer-Verlag.
  • Van Speybroeck, Linda. 2007. “Philosophy of biology: about the fossilization of disciplines and other embryonic thought.” Acta Biotheoretica 55(1): 47-71.
  • Walsh, Denis M. 2007. “The pomp of superfluous causes: the interpretation of evolutionary theory.” Philosophy of Science 74(3): 281-303.
  • Walsh, Denis M., Tim Lewens, and André Ariew. 2002. “The trials of life: natural selection and random drift.” Philosophy of Science 69(3): 452-473.
  • Waters, C. Kenneth. 1998. “Causal regularities in the biological world of contingent distributions.” Biology and Philosophy 13(1): 5-36.
  • Wilkins, John S. 2011. Species: A History of the Idea. Berkeley, CA: University of California Press.
  • Wilkins, John S., and Malte C. Ebach. 2013. The Nature of Classification: Relationships and Kinds in the Natural Sciences. Palgrave Macmillan.
  • Williams, George C. 1966. Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought. Princeton University Press.
  • Wilson, David S. 1975. “A theory of group selection.” Proceedings of the National Academy of Sciences USA 72(1): 143-146.
  • Wilson, Edward O. 1975. Sociobiology: The New Synthesis. Cambridge (MA): Harvard University Press.

b. Classics

i. First Generation

  • Beckner, Morton. 1959. The Biological Way of Thought. New York: Columbia University Press.
  • Hull, David L. 1964. “The metaphysics of evolution.” British Journal for the History of Science 3: 309-337.
  • Hull, David L. 1969. “What philosophy of biology is not.” Synthese 20: 157-84.
  • Hull, David L. 1970. “Contemporary systematic philosophies.” Annual Review of Ecology and Systematics 1(1): 19-54.
  • Hull, David L. 1974. Philosophy of Biological Science. Englewood Cliffs, NJ: Prentice-Hall.
  • Hull, David L. 1976. “Are species really individuals?” Systematic Biology 25(2): 174-191.
  • Grene, Marjorie G. 1959. “Two evolutionary theories, I–II.” British Journal for the Philosophy of Science 9: 110–27; 185–93.
  • Grene, Marjorie G., and Everett Mendelsohn, eds. 1976. Topics in the Philosophy of Biology. Dordrecht: D. Reidel.
  • Schaffner, Kenneth F. 1967. “Approaches to reduction.” Philosophy of Science 34: 137-147.
  • Schaffner, Kenneth F. 1967. “Antireductionism and molecular biology.” Science 157: 644-647.
  • Ruse, Michael. 1973. The Philosophy of Biology. London: Hutchinson.
  • Wimsatt, William C. 1972. “Teleology and the logical structure of function statements.” Studies in History and Philosophy of Science 3: 1-80.

ii. Second Generation

  • Amundson, Ron. 1988. “Logical adaptationism.” Behavioral and Brain Sciences 11: 505-506.
  • Amundson, Ron. 1989. “The trials and tribulations of selectionist explanations.” Issues in Evolutionary Epistemology. Eds. Hahlweg, Kai, and Clifford A. Hooker. State University of New York Press.
  • Beatty, John. 1980. “Optimal-design models and the strategy of model building in evolutionary biology.” Philosophy of Science 47: 532-561.
  • Beatty, John. 1982. “Classes and cladists.” Systematic Zoology 31: 25-34.
  • Brandon, Robert. 1990. Adaptation and Environment. Cambridge: MIT Press.
  • Burian, Richard M. 1988. “Challenges to the evolutionary synthesis.” Evolutionary Biology 23: 247-259.
  • Darden, Lindley. 1977. “William Bateson and the promise of Mendelism.” Journal of the History of Biology 10: 87-106.
  • Darden, Lindley. 1976. “Reasoning in scientific change: Charles Darwin, Hugo de Vries, and the discovery of segregation.” Studies in History and Philosophy of Science 7: 127-169.
  • Depew, David, and Bruce H. Weber. 1985. Evolution at a Crossroads: The New Biology and the New Philosophy of Science. Cambridge MA: Bradford Books/MIT Press.
  • Depew, David, and Bruce H. Weber. 1995. Darwinism Evolving: Systems Dynamics and the Genealogy of Natural Selection. Cambridge, MA: Bradford Books/MIT Press.
  • Dupré, John A. 1987. “Human kinds.” The Latest on the Best. Ed. Dupré, John A. Bradford Books/MIT Press, pp. 327-348.
  • Gieryn, Thomas F. 1999. Cultural Boundaries of Science: Credibility on the Line. Chicago: University of Chicago Press.
  • Griesemer, James R. 1988. “Genes, memes and demes.” Biology and Philosophy 3: 179-184.
  • Griesemer, James R., and Michael J. Wade. 1988. “Laboratory models, causal explanation and group selection.” Biology and Philosophy 3: 67-96.
  • Kitcher, Philip. 1981. “Explanatory unification.” Philosophy of Science 48: 507-531.
  • Kitcher, Philip. 1982. “Genes.” British Journal for the Philosophy of Science 33: 337-359.
  • Kitcher, Philip. 1984. “Species.” Philosophy of Science 51: 308-333.
  • Lloyd, Elisabeth A. 1983. “The nature of Darwin’s support for the theory of natural selection.” Philosophy of Science 50: 112-129.
  • Lloyd, Elisabeth A. 1984. “A Semantic approach to the structure of population genetics.” Philosophy of Science 51: 242-264.
  • Lloyd, Elisabeth A. 1988. The Structure and Confirmation of Evolutionary Theory. Greenwood Press.
  • Mills, Susan K., and John Beatty. 1979. “The propensity interpretation of fitness.” Philosophy of Science 46: 263-286.
  • Rosenberg, Alexander. 1978. “The supervenience of biological concepts.” Philosophy of Science 45: 368-386.
  • Rosenberg, Alexander. 1982. “On the propensity definition of fitness.” Philosophy of Science 49: 268-273.
  • Sober, Elliott. 1983. “Parsimony in systematics: philosophical issues.” Annual Review of Ecology and Systematics 14(1): 335-357.
  • Sober, Elliott. 1983. “Equilibrium explanation.” Philosophical Studies 43: 201-210.
  • Sober, Elliott. 1984. The Nature of Selection. Evolutionary Theory in Philosophical Focus. Chicago: University of Chicago Press.
  • Sober, Elliott. 1988. Reconstructing the Past: Parsimony, Evolution, and Inference. Cambridge: Cambridge University Press.
  • Weber, Bruce H., James Smith, and David Depew. 1988. Entropy, Information, and Evolution: New Perspectives on Physical and Biological Evolution. Cambridge, MA: Bradford Books/MIT Press.

c. Contemporary

i. Reviews

  • Byron, Jason M. 2007. “Whence philosophy of biology?” The British Journal for the Philosophy of Science 58(3): 409-422.
  • Callebaut, Werner. 2005. “Again, what the philosophy of biology is not.” Acta Biotheoretica 53: 93-122.
  • Griffiths, Paul E. 2011. “Philosophy of biology.” The Stanford Encyclopedia of Philosophy (Summer 2011 Edition). Ed. Zalta, Edward N. URL = <http://plato.stanford.edu/archives/sum2011/entries/biology-philosophy/>.
  • Hull, David L. 2002. “Recent philosophy of biology: a review.” Acta Biotheoretica 50: 117-128.
  • Gers, Matt. 2009. “The long reach of philosophy of biology.” Biology and Philosophy 26(3): 439-447.
  • Müller-Wille, Staffan. 2007. “Philosophy of biology beyond evolution.” Biological Theory 2(1): 111-112.
  • O’Malley, Maureen A., and John A. Dupré. 2007. “Size doesn’t matter: towards a more inclusive philosophy of biology.” Biology and Philosophy 22(2): 155-191.
  • Pradeu, Thomas. 2009. “What philosophy of biology should be.” Biology and Philosophy 26(1): 119-127.

ii. Some Monographs

  • Pigliucci, Massimo, and Jonathan Kaplan. 2006. Making Sense of Evolution. Chicago: University of Chicago Press.
  • Sober, Elliott. 2008. Evidence and Evolution: The Logic Behind the Science. Cambridge University Press.
  • Sober, Elliott, and David S. Wilson. 1998. Unto Others: the Evolution and Psychology of Unselfish Behavior. Cambridge (MA): Harvard University Press.

d. Anthologies and Textbooks

  • Ayala, Francisco, and Robert Arp, eds. 2009. Contemporary Debates in Philosophy of Biology. Wiley-Blackwell.
  • Callebaut, Werner, ed. 1993. Taking the Naturalistic Turn: Or, How Real Philosophy of Science is Done. Chicago: University of Chicago Press.
  • Grene, Marjorie G., and David Depew. 2004. The Philosophy of Biology: An Episodic History. Cambridge: Cambridge University Press.
  • Hull, David L., and Michael Ruse, eds. 1998. The Philosophy of Biology. Oxford: Oxford University Press.
  • Hull, David L., and Michael Ruse, eds. 2007. The Cambridge Companion to the Philosophy of Biology. Cambridge: Cambridge University Press.
  • Kampourakis, Kostas, ed. 2013. The Philosophy of Biology: A Companion for Educators. Springer Science & Business.
  • Linquist, Stefan, ed. 2010. Philosophy of Evolutionary Biology. Vol. 1. Burlington, VT: Ashgate.
  • Rosenberg, Alexander, and Robert Arp, eds. 2009. Philosophy of Biology: An Anthology. Wiley-Blackwell.
  • Rosenberg, Alexander, and Daniel W. McShea. 2008. Philosophy of Biology: A Contemporary Introduction. Routledge.
  • Ruse, Michael, ed. 1989. What the Philosophy of Biology is. Dordrecht, The Netherlands: Kluwer.
  • Ruse, Michael, ed. 2008. The Oxford Handbook of Philosophy of Biology. New York: Oxford University Press.
  • Sarkar, Sahotra, and Anya Plutynski. 2008. A Companion to the Philosophy of Biology. Blackwell Publishing.
  • Sober, Elliott. 1993. Philosophy of Biology. Westview Press/Oxford University Press.
  • Sober, Elliott. 2006. Conceptual Issues in Evolutionary Biology. Cambridge, MA: MIT Press.
  • Sterelny, Kim, and Paul E. Griffiths. 1999. Sex and Death: An Introduction to the Philosophy of Biology. Chicago: The University of Chicago Press.

e. Journals

i. Dedicated

  • Acta Biotheoretica (since 1935)
  • History and Philosophy of the Life Sciences (since 1979)
  • Biology and Philosophy
  • Journal of the History of Biology
  • Studies in History and Philosophy of Biological and Biomedical Sciences
  • Biological Theory
  • Studies in History and Philosophy of Science Part A
  • Philosophical Transactions of the Royal Society B
  • Philosophy and Theory in Biology (since 2009)

ii. Generalist

  • Philosophy of Science
  • The British Journal for the Philosophy of Science
  • Foundations of Science
  • Metascience
  • Studies in History and Philosophy of Science

f. Organizations

  • International Society for the History, Philosophy, and Social Studies of Biology (http://www.ishpssb.org/)
  • Philosophy of Science Association (PSA) (http://www.philsci.org/)
  • Society for Philosophy of Science in Practice (SPSP) (http://www.philosophy-science-practice.org/)
  • Inter-Divisional Teaching Commission (IDTC) of the International Union for the History and Philosophy of Science (IUHPS): http://www.idtc-iuhps.com/

g. Online Resources

  • T. Lewens, “The philosophy of biology: a selection of readings and resources” page at the University of Cambridge: http://www.hps.cam.ac.uk/research/philofbio.html

Author Information

Emanuele Serrelli
Email: emanuele.serrelli@epistemologia.eu
University of Milano-Bicocca
Italy

Ancient Greek Philosophy

From Thales, who is often considered the first Western philosopher, to the Stoics and Skeptics, ancient Greek philosophy opened the doors to a particular way of thinking that provided the roots for the Western intellectual tradition. Here, there is often an explicit preference for the life of reason and rational thought. We find proto-scientific explanations of the natural world in the Milesian thinkers, and we hear Democritus posit atoms—indivisible and invisible units—as the basic stuff of all matter. With Socrates comes a sustained inquiry into ethical matters—an orientation towards human living and the best life for human beings. With Plato comes one of the most creative and flexible ways of doing philosophy, which some have since attempted to imitate by writing philosophical dialogues covering topics still of interest today in ethics, political thought, metaphysics, and epistemology. Plato’s student, Aristotle, was one of the most prolific of ancient authors. He wrote treatises on each of these topics, as well as on the investigation of the natural world, including the composition of animals. The Hellenists—Epicurus, the Cynics, the Stoics, and the Skeptics—developed schools or movements devoted to distinct philosophical lifestyles, each with reason at its foundation.

With this preference for reason came a critique of traditional ways of living, believing, and thinking, which sometimes caused political trouble for the philosophers themselves. Xenophanes directly challenged the traditional anthropomorphic depiction of the gods, and Socrates was put to death for allegedly inventing new gods and not believing in the gods mandated by the city of Athens. After the death of Alexander the Great, and because of Aristotle’s ties with Alexander and his court, Aristotle escaped the same fate as Socrates by fleeing Athens. Epicurus, like Xenophanes, claimed that the mass of people is impious, since the people conceive of the gods as little more than superhumans, even though human characteristics cannot appropriately be ascribed to the gods. In short, not only did ancient Greek philosophy pave the way for the Western intellectual tradition, including modern science, but it also shook cultural foundations in its own time.

Table of Contents

  1. Presocratic Thought
    1. The Milesians
    2. Xenophanes of Colophon
    3. Pythagoras and Pythagoreanism
    4. Heraclitus
    5. Parmenides and Zeno
    6. Anaxagoras
    7. Democritus and Atomism
    8. The Sophists
  2. Socrates
  3. Plato
    1. Background of Plato’s Work
    2. Metaphysics
    3. Epistemology
    4. Psychology
    5. Ethics and Politics
  4. Aristotle
    1. Terminology
    2. Psychology
    3. Ethics
    4. Politics
    5. Physics
    6. Metaphysics
  5. Hellenistic Thought
    1. Epicureanism
      1. Physics
      2. Ethics
    2. The Cynics
    3. The Stoics
      1. Physics
      2. Epistemology
      3. Ethics
    4. Skeptics
      1. Academic Skepticism
      2. Pyrrhonian Skepticism
  6. Post-Hellenistic Thought
    1. Plotinus
      1. Intellect, Soul, and Matter
      2. The True Self and the Good Life
    2. Later Neoplatonists
    3. Cicero and Roman Philosophy
  7. Conclusion
  8. References and Further Reading
    1. Presocratics
      1. Primary Sources
      2. Secondary Sources
    2. Socrates and Plato
      1. Primary Sources
      2. Secondary Sources
    3. Aristotle
      1. Primary Sources
      2. Secondary Sources
    4. Hellenistic Philosophy
      1. Primary Sources
      2. Secondary Sources

1. Presocratic Thought

An analysis of Presocratic thought presents some difficulties. First, the texts we are left with are primarily fragmentary, and sometimes, as in the case of Anaxagoras, we have no more than a sentence’s worth of verbatim words. Even these purportedly verbatim words often come to us in quotation from other sources, so it is difficult, if not impossible, to attribute with certainty a definite position to any one thinker. Moreover, “Presocratic” has been criticized as a misnomer since some of the Presocratic thinkers were contemporary with Socrates and because the name might seem to grant philosophical primacy to Socrates. The term “Presocratic philosophy” is also difficult since we have no record of Presocratic thinkers ever using the word “philosophy.” We must therefore approach any study of Presocratic thought with caution.

Presocratic thought marks a decisive turn away from mythological accounts towards rational explanations of the cosmos. Indeed, some Presocratics openly criticize and ridicule traditional Greek mythology, while others simply explain the world and its causes in material terms. This is not to say that the Presocratics abandoned belief in gods or things sacred, but there is a definite turn away from attributing causes of material events to gods, and at times a refiguring of theology altogether. The foundation of Presocratic thought is the preference and esteem given to rational thought over mythologizing. This movement towards rationality and argumentation would pave the way for the course of Western thought.

a. The Milesians

Thales (c.624-c.545 B.C.E.), traditionally considered to be the “first philosopher,” proposed a first principle (arche) of the cosmos: water. Aristotle offers some conjectures as to why Thales might have believed this (Graham 29). First, all things seem to derive nourishment from moisture. Next, heat seems to come from or carry with it some sort of moisture. Finally, the seeds of all things have a moist nature, and water is the source of growth for many moist and living things. Some assert that Thales held water to be a component of all things, but there is no evidence in the testimony for this interpretation. It is much more likely, rather, that Thales held water to be a primal source for all things—perhaps the sine qua non of the world.

Like Thales, Anaximander (c.610-c.545 B.C.E.) also posited a source for the cosmos, which he called the boundless (apeiron). That he did not, like Thales, choose a typical element (earth, air, water, or fire) shows that his thinking had moved beyond sources of being that are more readily available to the senses. He might have thought that, since the other elements seem more or less to change into one another, there must be some source beyond all these—a kind of background upon or source from which all these changes happen. Indeed, this everlasting principle gave rise to the cosmos by generating hot and cold, each of which “separated off” from the boundless. How it is that this separation took place is unclear, but we might presume that it happened via the natural force of the boundless. The universe, though, is a continual play of elements separating and combining. In poetic fashion, Anaximander says that the boundless is the source of beings, and that into which they perish, “according to what must be: for they give recompense and pay restitution to each other for their injustice according to the ordering of time” (F1).

If our dates are approximately correct, Anaximenes (c.546-c.528/5 B.C.E.) could have had no direct philosophical contact with Anaximander. However, the conceptual link between them is undeniable. Like Anaximander, Anaximenes thought that there was something boundless that underlies all other things. Unlike Anaximander, Anaximenes made this boundless thing something definite—air. For Anaximander, hot and cold separated off from the boundless, and these generated other natural phenomena (Graham 79). For Anaximenes, air itself becomes other natural phenomena through condensation and rarefaction. Rarefied air becomes fire. When it is condensed, it becomes water, and when it is condensed further, it becomes earth and other earthy things, like stones (Graham 79). This then gives rise to all other life forms. Furthermore, air itself is divine. Both Cicero and Aetius report that, for Anaximenes, air is God (Graham 87). Air, then, changes into the basic elements, and from these we get all other natural phenomena.

b. Xenophanes of Colophon

Xenophanes (c.570-c.478 B.C.E.) directly and explicitly challenged Homeric and Hesiodic mythology. “It is good,” says Xenophanes, “to hold the gods in high esteem,” rather than portraying them in “raging battles, which are worthless” (F2). More explicitly, “Homer and Hesiod have attributed to the gods all things that are blameworthy and disgraceful for human beings: stealing, committing adultery, deceiving each other” (F17). At the root of this poor depiction of the gods is the human tendency towards anthropomorphizing the gods. “But mortals think gods are begotten, and have the clothing, voice and body of mortals” (F19), despite the fact that God is unlike mortals in body and thought. Indeed, Xenophanes famously proclaims that if other animals (cattle, lions, and so forth) were able to draw the gods, they would depict the gods with bodies like their own (F20). Beyond this, all things come to be from earth (F27), not the gods, although it is unclear whence came the earth. The reasoning seems to be that God transcends all of our efforts to make him like us. If everyone paints different pictures of divinity, and many people do, then it is unlikely that God fits into any of those frames. So, holding “the gods in high esteem” at least entails something negative, that is, that we take care not to portray them as superhumans.

c. Pythagoras and Pythagoreanism

Although Pythagorean influence left a strong presence and legacy in ancient thought, little is known with certainty about Pythagoras of Samos (c.570-c.490 B.C.E.). Many know Pythagoras for his eponymous theorem—the square of the hypotenuse of a right triangle is equal to the sum of the squares of the other two sides. Whether Pythagoras himself invented the theorem, or whether he or someone else brought it back from Egypt, is unknown. He developed a following that continued long past his death, on down to Philolaus of Croton (c.470-c.399 B.C.E.), a Pythagorean from whom we may gain some insight into Pythagoreanism. Whether or not the Pythagoreans followed a particular doctrine is up for debate, but it is clear that, with Pythagoras and the Pythagoreans, a new way of thinking was born in ancient philosophy that had a significant impact on Platonic thought.
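
In modern notation, the theorem just described can be written compactly. The symbols a, b, and c are labels introduced here for illustration and do not appear in the ancient sources: a and b stand for the two shorter sides of a right triangle, and c for its hypotenuse. The numerical case is simply a familiar instance.

```latex
a^2 + b^2 = c^2, \qquad \text{for example } 3^2 + 4^2 = 9 + 16 = 25 = 5^2 .
```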

The Pythagoreans believed in the transmigration of souls. The soul, for Pythagoras, finds its immortality by cycling through all living beings in a 3,000-year cycle, until it returns to a human being (Graham 915). Indeed, Xenophanes tells the story of Pythagoras walking by a puppy who was being beaten. Pythagoras cried out that the beating should cease, because he recognized the soul of a friend in the puppy’s howl (Graham 919). What exactly the Pythagorean psychology entails for a Pythagorean lifestyle is unclear, but we pause to consider some of the typical characteristics reported of and by Pythagoreans.

Plato and Aristotle tended to associate the holiness and wisdom of number—and along with this, harmony and music—with the Pythagoreans (Graham 499). Perhaps more basic than number, at least for Philolaus, are the concepts of the limited and unlimited. Nothing in the cosmos can be without limit (F1), including knowledge (F4). Imagine if nothing were limited, but matter were just an enormous heap or morass. Next, suppose that you are somehow able to gain a perspective of this morass (to do so, there must be some limit that gives you that perspective!). Presumably, nothing at all could be known, at least not with any degree of precision, the most careful observation notwithstanding. Additionally, all known things have number, which functions as a limit of things insofar as each thing is a unity, or composed of a plurality of parts.

d. Heraclitus

Heraclitus of Ephesus (c.540-c.480 B.C.E.) stands out in ancient Greek philosophy not only with respect to his ideas, but also with respect to how those ideas were expressed. His aphoristic style is rife with wordplay and conceptual ambiguities. Heraclitus saw reality as composed of contraries—a reality whose continual process of change is precisely what keeps it at rest.

Fire plays a significant role in his picture of the cosmos. No God or man created the cosmos, but it always was, is, and will be fire. At times it seems as though fire, for Heraclitus, is a primary element from which all things come and to which they return. At others, his comments on fire could easily be read metaphorically. What is fire? It is at once “need and satiety.” This back and forth, or better yet, this tension and distension is characteristic of life and reality—a reality that cannot function without contraries, such as war and peace. “A road up and down is one and the same” (F38). Whether one travels up the road or down it, the road is the same road. “On those stepping into rivers staying the same other and other waters flow” (F39). In his Cratylus, Plato quotes Heraclitus, via the mouthpiece of Socrates, as saying that “you could not step twice into the same river,” comparing this to the way everything in life is in constant flux (Graham 158). This, according to Aristotle, supposedly drove Cratylus to the extreme of never saying anything for fear that the words would attempt to freeze a reality that is always fluid, and so, Cratylus merely pointed (Graham 183). So, the cosmos and all things that make it up are what they are through the tension and distension of time and becoming. The river is what it is by being what it is not. Fire, or the ever-burning cosmos, is at war with itself, and yet at peace—it is constantly wanting fuel to keep burning, and yet it burns and is satisfied.

e. Parmenides and Zeno

If it is true that for Heraclitus life thrives and even finds stillness in its continuous movement and change, then for Parmenides of Elea (c.515-c.450 B.C.E.) life is at a standstill. Parmenides was a pivotal figure in Presocratic thought, and one of the most influential of the Presocratics in determining the course of Western philosophy. According to McKirahan, Parmenides is the inventor of metaphysics (157)—the inquiry into the nature of being or reality. While the tenets of his thought have their home in poetry, they are expressed with the force of logic. The Parmenidean logic of being thus sparked a long lineage of inquiry into the nature of being and thinking.

Parmenides recorded his thought in the form of a poem. In it, there are two paths that mortals can take—the path of truth and the path of error. The first path is the path of being or what-is. The right way of thinking is to think of what-is, and the wrong way is to think both what-is and what-is-not. The latter is wrong, simply because non-being is not. In other words, there is no non-being, so properly speaking, it cannot be thought—there is nothing there to think. We can think only what is and, presumably, since thinking is a type of being, “thinking and being are the same” (F3). It is only our long entrenched habits of sensation that mislead us into thinking down the wrong path of non-being. The world, and its appearance of change, thrusts itself upon our senses, and we erroneously believe that what we see, hear, touch, taste, and smell is the truth. But, if non-being is not, then change is impossible, for when anything changes, it moves from non-being to being. For example, for a being to grow tall, it must have at some point not been tall. Since non-being is not and therefore cannot be thought, we are deluded when we believe that this sort of change actually happens. Similarly, what-is is one. If there were a plurality, there would be non-being, that is, this would not be that. Parmenides thus argues that we must trust in reason alone.

In the Parmenidean tradition, we have Zeno (c.490-c.430 B.C.E.). As Daniel Graham says, while “Parmenides argues for monism, Zeno argues against pluralism” (Graham 245). Zeno seems to have composed a text wherein he claims to show the absurdity in accepting that there is a plurality of beings, and he also shows that motion is impossible. Zeno shows that if we attempt to count a plurality, we end up with an absurdity. If there were a plurality, then it would be neither more nor less than the number that it would have to be. Thus, there would be a finite number of things. On the other hand, if there were a plurality, then the number would be infinite because there is always something else between existing things, and something else between those, and something else between those, ad infinitum. Thus, if there were a plurality of things, then that plurality would be both infinite and finite in number, which is absurd (F4).

The most enduring paradoxes are those concerned with motion. It is impossible for a body in motion to traverse, say, a distance of twenty feet. In order to do so, the body must first arrive at the halfway point, or ten feet. But in order to arrive there, the body in motion must first travel five feet. But in order to arrive there, the body must first travel two and a half feet, and so on ad infinitum. Since, then, space is infinitely divisible, but we have only a finite time to traverse it, it cannot be done. Presumably, one could not even begin a journey at all. The “Achilles Paradox” similarly attacks motion, saying that swift-footed Achilles will never be able to catch up with the slowest runner, assuming the runner started at some point ahead of Achilles. Achilles must first reach the place where the slow runner began. This means that the slow runner will already be a bit beyond where he began. Once Achilles progresses to the next place, the slow runner is already beyond that point, too. Thus, motion seems absurd.
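
The regress in the twenty-foot example can be made concrete with a short sketch. It is only an illustration of the arithmetic in the passage above, not anything found in Zeno; the function name `halfway_points` and its parameters are hypothetical labels chosen here. It lists the successive halfway marks the body would first have to reach and shows that, however many are listed, none of them is zero.

```python
from fractions import Fraction


def halfway_points(distance, steps):
    """Return the first `steps` halfway marks demanded by the argument.

    `distance` is the full stretch to be crossed (twenty feet in the text);
    each mark is half of the one before it. Exact fractions are used so that
    repeated halving never rounds down to zero.
    """
    marks = []
    remaining = Fraction(distance)
    for _ in range(steps):
        remaining /= 2
        marks.append(remaining)
    return marks


marks = halfway_points(20, 5)
print([float(m) for m in marks])  # [10.0, 5.0, 2.5, 1.25, 0.625]
```

Asking for more steps only lengthens the list: every entry stays greater than zero, which is the feature of the halving that the paradox exploits.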

f. Anaxagoras

Anaxagoras of Clazomenae (c.500-c.428 B.C.E.) had what was, up until that time, the most original perspective on the nature of matter and the causes of its generation and corruption. Closely predating Plato (Anaxagoras died around the time that Plato was born), Anaxagoras left his impression upon Plato and Aristotle, although they were both ultimately dissatisfied with his cosmology (Graham 309-313). He seems to have been almost exclusively concerned with cosmology and the true nature of all that is around us.

Before the cosmos was as it is now, it was nothing but a great mixture—everything was in everything. The mixture was so thoroughgoing that no part of it was recognizable due to the smallness of each thing, and not even colors were perceptible. He considered matter to be infinitely divisible. That is, because it is impossible for being not to be, there is never a smallest part, but there is always a smaller part. If the parts of the great mixture were not infinitely divisible, then we would be left with a smallest part. Since the smallest part could not become smaller, any attempt at dividing it again would presumably obliterate it.

The most important player in this continuous play of being is mind (nous). Although mind can be in some things, nothing else can be in it—mind is unmixed. We recall that, for Anaxagoras, everything is mixed with everything. There is some portion of everything in anything that we identify. Thus, if anything at all were mixed with mind, then everything would be mixed with mind. This mixture would obstruct mind’s ability to rule all else. Mind is in control, and it is responsible for the great mixture of being. Everlasting mind—the most pure of all things—is responsible for ordering the world.

Anaxagoras left his mark on the thought of both Plato and Aristotle, whose critiques of Anaxagoras are similar. In Plato’s Phaedo, Socrates recounts in brief his intellectual history, citing his excitement over his discovery of Anaxagoras’ thought. He was most excited about mind as an ultimate cause of all. Yet, Socrates complains, Anaxagoras made very little use of mind to explain what was best for each of the heavenly bodies in their motions, or the good of anything else. That is, Socrates seems to have wanted some explanation as to why it is good for all things to be as they are (Graham 309-311). Aristotle, too, complains that Anaxagoras makes only minimal use of his principle of mind. It becomes, as it were, a deus ex machina: whenever Anaxagoras was unable to give any other explanation for the cause of a given event, he fell back upon mind (Graham 311-313). It is possible, as always, that both Plato and Aristotle resort here to a straw man of sorts in order to advance their own positions. Indeed, we have seen that mind set the great mixture into motion, and then ordered the cosmos as we know it. This is no insignificant feat.

g. Democritus and Atomism

Ancient atomism began a legacy in philosophical and scientific thought, and this legacy was revived and significantly developed in modern philosophy. In contemporary times, the atom is not the smallest particle. Etymologically, however, atomos is that which is uncut or indivisible. The ancient atomists, Leucippus and Democritus (c. 5th century B.C.E.), were concerned with the smallest particles in nature that make up reality, particles that are both indivisible and invisible. They were to some degree responding to Parmenides and Zeno by indicating atoms as indivisible sources of motion.

Atoms—the most compact and the only indivisible bodies in nature—are infinite in number, and they constantly move through an infinite void. In fact, motion would be impossible, says Democritus, without the void. If there were no void, the atoms would have nothing through which to move. Atoms take on a variety, perhaps an infinite variety, of shapes. Some are round, others are hooked, and yet others are jagged. They often collide with one another, and often bounce off of one another. Sometimes, though, the shapes of the colliding atoms are amenable to one another, and they come together to form the matter that we identify as the sensible world (F5). This combination, too, would be impossible without the void. Atoms need a background (emptiness) out of which they are able to combine (Graham 531). Atoms then stay together until some larger environmental force breaks them apart, at which point they resume their constant motion (F5). Why certain atoms come together to form a world seems up to chance, and yet many worlds have been, are, and will be formed by atomic collision and coalescence (Graham 551). Once a world is formed, however, all things happen by necessity—the causal laws of nature dictate the course of the natural world (Graham 551-553).

h. The Sophists

Much of what is transmitted to us about the Sophists comes from Plato. In fact, two of Plato’s dialogues are named after Sophists, Protagoras and Gorgias, and one is called simply the Sophist. Beyond this, typical themes of sophistic thought often make their way into Plato’s work, not the least of which are the similarities between Socrates and the Sophists (an issue explicitly addressed in the Apology and elsewhere). Thus, the Sophists had no small influence on fifth-century Greece and Greek thought.

Broadly, the Sophists were a group of itinerant teachers who charged fees to teach on a variety of subjects, with rhetoric as the preeminent subject in their curriculum. A common characteristic among many, but perhaps not all, Sophists seems to have been an emphasis upon arguing for each of the opposing sides of a case. Thus, these argumentative and rhetorical skills could be useful in law courts and political contexts. However, these sorts of skills also tended to earn many Sophists their reputation as moral and epistemological relativists, which for some was tantamount to intellectual fraud.

One of the earliest and most famous Sophists was Protagoras (c.490-c.420 B.C.E.). Only a handful of fragments of his thought exist, and the bulk of the remaining information about him found in Plato’s dialogues should be read cautiously. He is most famous for the apparently relativistic statement that human beings are “the measure of all things, of things that are that they are, of things that are not that they are not” (F1b). Plato, at least for the purposes of the Protagoras, reads individual relativism out of this statement. For example, if the pool of water feels cold to Henry, then it is in fact cold for Henry, while it might appear warm, and therefore be warm for Jennifer. This example portrays perceptual relativism, but the same could go for ethics as well, that is, if X seems good to Henry, then X is good for him, but it might be bad in Jennifer’s judgment. The problem with this view, however, is that if all things are relative to the observer/judge, then the idea that all things are relative is itself relative to the person who asserts it. The idea of communication is then rendered incoherent since each person has his or her own private meaning.

On the other hand, Protagoras’ statement could be interpreted as species-relative. That is, the question of whether and how things are, and whether and how things are not, is a question that has meaning (ostensibly) only for human beings. Thus, all knowledge is relative to us as human beings, and therefore limited by our being and our capabilities. This reading seems to square with the other of Protagoras’ most famous statements: “Concerning the gods, I cannot ascertain whether they exist or whether they do not, or what form they have; for there are many obstacles to knowing, including the obscurity of the question and the brevity of human life” (F3). It is implied here that knowledge is possible, but that it is difficult to attain, and that it is impossible to attain when the question is whether or not the gods exist. We can also see here that human finitude is a limit not only upon human life but also upon knowledge. Thus, if there is knowledge, it is for human beings, but it is obscure and fragile.

Along with Protagoras was Gorgias (c.485-c.380 B.C.E.), another Sophist whose name became the title of a Platonic dialogue. Perhaps flashier than Protagoras when it came to rhetoric and speech making, Gorgias is known for his sophisticated and poetic style. He is known also for extemporaneous speeches, taking audience suggestions for possible topics upon which he would speak at length. His best-known work is On Nature, Or On What-Is-Not wherein he, contrary to Eleatic philosophy, sets out to show that neither being nor non-being is, and that even if there were anything, it could be neither known nor spoken. It is unclear whether this work was in jest or in earnest. If it was in jest, then it was likely an exercise in argumentation as much as it was a gibe at the Eleatics. If it was in earnest, then Gorgias could be seen as an advocate for extreme skepticism, relativism, or perhaps even nihilism (Graham 725).

2. Socrates

Socrates (469-399 B.C.E.) wrote nothing, so what stories and information we have about him come to us primarily from Xenophon (430-354 B.C.E.) and Plato. Both Xenophon and Plato knew Socrates, and wrote dialogues in which Socrates usually figures as the main character, but their versions of certain historical events in Socrates’ life are sometimes incompatible. We cannot be sure if or when Xenophon or Plato is reporting about Socrates with historical accuracy. In some cases, we can be sure that they are intentionally not doing so, but merely using Socrates as a mouthpiece to advance philosophical dialogue (Döring 25). Xenophon, in his Memorabilia, wrote some biographical information about Socrates, but we cannot know how much is fabricated or embellished. When we refer to Socrates, we are typically referring to the Socrates of one of these sources and, more often than not, Plato’s version.

Socrates was the son of a sculptor, Sophroniscus, and grew up an Athenian citizen. He was reported to be gifted with words and was sometimes accused of what Plato later accused the Sophists of, that is, using rhetorical devices to “make the weaker argument the stronger.” Indeed, Xenophon reports that the Thirty Tyrants forbade Socrates to speak publicly except on matters of practical business because his clever use of words seemed to lead young people astray (Book I, II.33-37). Similarly, Aristophanes presents Socrates as an impoverished Sophist whose head was in the clouds to the detriment of his daily, practical life. Moreover, his similarities with the Sophists are even highlighted in Plato’s work. Indeed, Socrates’ courtroom speech in Plato’s Apology includes a defense against accusations of sophistry (18c).

While Xenophon and Plato both recognize this rhetorical Socrates, they both present him as a virtuous man who used his skills in argumentation for truth, or at least to help remove himself and his interlocutors from error. The so-called Socratic method, or elenchos, refers to the way in which Socrates often carried out his philosophical practice, a method to which he seems to refer in Plato’s Apology (Benson 180-181). Socrates aimed to expose errors or inconsistencies in his interlocutors’ positions. He did so by asking them questions, often demanding yes-or-no answers, and then reduced their positions to absurdity. He was, in short, aiming for his interlocutor to admit his own ignorance, especially where the interlocutor thought that he knew what he did not in fact know. Thus, many Platonic dialogues end in aporia, an impasse in thought—a place of perplexity about the topic originally under discussion (Brickhouse and Smith 3-4). This is presumably the place from which a thoughtful person can then make a fresh start on the way to seeking truth.

Socrates practiced philosophy openly, did not charge fees for doing so, and allowed anyone who wanted to engage with him to do so. Xenophon says:

Socrates lived ever in the open; for early in the morning he went to the public promenades and training-grounds; in the forenoon he was seen in the market; and the rest of the day he passed just where most people were to be met: he was generally talking, and anyone might listen. (Memorabilia, Book I, i.10)

The “talking” that Socrates did was presumably philosophical in nature, and this talk was focused primarily on morality. Indeed, as John Cooper claims in his introduction to Plato: Complete Works, Socrates “denied that he had discovered some new wisdom, indeed that he possessed any wisdom at all,” contrary to his predecessors, such as Anaxagoras and Parmenides. Often his discussions had to do with topics of virtue—justice, courage, temperance, and wisdom (Memorabilia, Book I, i.16). This sort of open practice made Socrates well known but also unpopular, which eventually led to his execution.

Socrates’ elenchos, as he recognizes in Plato’s Apology (from apologia, “defense”), made him unpopular. Lycon (about whom little is known), Anytus (an influential politician in Athens), and Meletus, a poet, accused Socrates of not worshipping the gods mandated by Athens (impiety) and of corrupting the youth through his persuasive power of speech. In his Meno, Plato hints that Anytus was already personally angry with Socrates. Anytus has just warned Socrates to “be careful” in the way he speaks about famous people (94e). Socrates then tells Meno, “I think, Meno, that Anytus is angry, and I am not at all surprised. He thinks…that I am slandering those men, and then he believes himself to be one of them” (95a). This is not surprising, if indeed Socrates practiced philosophy in the way that both Xenophon and Plato report that he did by exposing the ignorance of his interlocutors.

Socrates claims to have ventured down the path of philosophy because of a proclamation from the Oracle at Delphi. Socrates’ enthusiastic follower, Chaerephon, reportedly visited the Oracle at Delphi to ask the god whether anyone among the Athenians was wiser than Socrates. The god replied that no one was wiser than Socrates. Socrates, who claims never to have been wise, wondered what this meant. So, in order to understand better the god’s claim, Socrates questioned Athenians from all social strata about their wisdom. In Plato’s Apology, Socrates claims that most people he questioned claimed to know what they did not in fact know (21-22). As a result of showing so many people their own ignorance, or at least trying to, Socrates became unpopular (23a). This unpopularity is eventually what killed him. To add to his unpopularity, Socrates claimed that the Oracle was right, but only in the respect that he had “human wisdom,” that is, the wisdom to recognize what one does not know, and to know that such wisdom is relatively worthless (23b).

Xenophon, too, wrote his own account of Socrates’ defense. Xenophon attributes the accusation of impiety to Socrates’ daimon, a personal god, much like a voice of conscience, who forbade Socrates from doing anything that would not be truly beneficial for him. Both Xenophon (4-7) and Plato (40b) claim that it was this daimon who prevented Socrates from making such a defense as would exonerate him. That is, the daimon did not dissuade Socrates from his sentence of death. In Xenophon’s account, the Oracle claimed that no one was “more free than [Socrates], or more just, or more prudent” (Apology 14). Xenophon’s version might differ from Plato’s since Xenophon, a military leader, wanted to emphasize characteristics Socrates exuded that might also make for good characteristics in a statesman (O’Connor 66). At any rate, Xenophon has Socrates recognize his own unpopularity. Also, like Plato, Xenophon recognizes that Socrates held knowledge of oneself and the recognition of one’s own ignorance in high esteem (Memorabilia, Book III, ix. 6-7).

Socrates practiced philosophy, in an effort to know himself, daily and even in the face of his own death. In Plato’s Crito, in which Crito comes to Socrates’ prison cell to persuade Socrates to escape, Socrates wants to know whether escaping would be just, and imminent death does not deter him from seeking an answer to that question. He and Crito first establish that doing wrong willingly is always bad, and this includes returning wrong for wrong (49b-c). Then, personifying Athenian law, Socrates establishes that escaping prison would be wrong. While he acknowledges that he was wrongly found to be guilty of impiety and corrupting the youth, the legal process itself ran according to law, and to escape would be to “wrong” the laws in which he was raised and to which, by virtue of being a life-long Athenian, he agreed to assent.

Plato’s Phaedo presents us with the story of Socrates’ last day on earth. In it, he famously claims that philosophy is practice for dying and death (64a). Indeed, he spends his final hours with his friends discussing a very relevant and pressing philosophical issue, that is, the immortality of the soul. Socrates is presented to us as a man who, even in his final hours, wanted nothing more than to pursue wisdom. In Plato’s Euthyphro, Socrates aims to dissuade Euthyphro from indicting his own father for murder. Euthyphro, a priest, claims that what he is doing (prosecuting a wrongdoer) is pious. Socrates then uses his elenchos to show that Euthyphro does not actually know what piety is. Once he is thoroughly confused and frustrated, Euthyphro says, “it is a considerable task to acquire any precise knowledge of these things [that is, piety]” (14b). Nevertheless, Euthyphro offers yet another definition of “piety.” Socrates’ response is the key to understanding the dialogue: “You could tell me in far fewer words, if you were willing, the sum of what I asked…You were on the verge of doing so, but you turned away. If you had given that answer, I should now have acquired from you sufficient knowledge of the nature of piety” (14c1-c4). It is, in other words, the very act of philosophizing, that is, the recognizing of one’s own ignorance and the search for wisdom, that is piety. Socrates, we are told, continued this practice even in the final hours of his life.

3. Plato

Plato (427-347 B.C.E.) was the son of Athenian aristocrats. He grew up in a time of upheaval in Athens, especially at the conclusion of the Peloponnesian War, when Athens was conquered by Sparta. Debra Nails says, “Plato would have been 12 when Athens lost her empire with the revolt of the subject allies; 13 when democracy fell briefly to the oligarchy of Four Hundred…; [and] 14 when democracy was restored” (2). We cannot be sure when he met Socrates. Although ancient sources report that he became Socrates’ follower at age 18, he might have met Socrates much earlier through the relationship between Socrates and Plato’s uncle, Charmides, in 431 B.C.E. (Taylor 3). He might have known Socrates, too, through his “musical” education, which would have consisted of anything under the purview of the muses, that is, everything from dancing to reading, writing, and arithmetic (Nails 2). He also seems to have spent time with Cratylus, the Heraclitean, which probably had an impact primarily on his metaphysics and epistemology.

Plato had aspirations for the political life, but several untoward events pushed him away from the life of political leadership, not the least of which was Socrates’ trial and conviction. While the authenticity of Plato’s Seventh Letter is debated among scholars, it might give us some insight into Plato’s biography:

At last I came to the conclusion that all existing states are badly governed and the condition of their laws practically incurable, without some miraculous remedy and the assistance of fortune; and I was forced to say, in praise of true philosophy, that from her height alone was it possible to discern what the nature of justice is, either in the state or in the individual, and that the ills of the human race would never end until either those who are sincerely and truly lovers of wisdom [that is, philosophers] come into political power, or the rulers of our cities, by the grace of God, learn true philosophy. (Letter VII)

Plato saw any political regime without the aid of philosophy or fortune as fundamentally corrupt. This attitude, however, did not turn Plato entirely from politics. He visited Sicily three times, and two of these trips were failed attempts to turn the tyrant Dionysius II to the life of philosophy. He thus returned to Athens and focused his efforts on the philosophical education he had begun at his Academy (Nails 5).

a. Background of Plato’s Work

Since Plato wrote dialogues, there is a fundamental difficulty with any effort to identify just what Plato himself thought. Plato never appears in the dialogues as an interlocutor. If he was voicing any of his own thoughts, he did it through the mouthpiece of particular characters in the dialogues, each of which has a particular historical context. Thus, any pronouncement about Plato’s “theory” of this or that must be tentative at best. As John Cooper says,

Although everything any speaker says is Plato’s creation, he also stands before it all as the reader does: he puts before us, the readers, and before himself as well, ideas, arguments, theories, claims, etc. for all of us to examine carefully, reflect on, follow out the implications of—in sum, to use as a springboard for our own further philosophical thought. (Cooper xxii)

Thus, while we can indubitably highlight recurring themes and theoretical insights throughout Plato’s work, we must be wary of committing Plato in any wholesale fashion to a particular view.

b. Metaphysics

Perhaps the most famous of Plato’s metaphysical concepts is his notion of the so-called “forms” or “ideas.” The Greek words that we translate as “form” or “idea” are eidos and idea. Both of these words are rooted in verbs of seeing. Thus, the eidos of something is its look, shape, or form. But, as many philosophers do, Plato manipulates this word and has it refer to immaterial entities. Why is it that one can recognize that a maple is a tree, an oak is a tree, and a Japanese fir is a tree? What is it that unites all of our concepts of various trees under a unitary category of Tree? It is the form of “tree” that allows us to understand anything about each and every tree, but Plato does not stop there.

The forms can be interpreted not only as purely theoretical entities, but also as immaterial entities that give being to material entities. Each tree, for example, is what it is insofar as it participates in the form of Tree. Likewise, each human being is different from the next, but each human being is human to the extent that he/she participates in the form of Human Being. This material-immaterial emphasis seems directed ultimately towards Plato’s epistemology. That is, if anything can be known, it is the forms. Since things in the world are changing and temporal, we cannot know them; therefore, if knowledge is to be certain and clear, there must be forms, which are unchanging and eternal beings that give being to all changing and temporal beings in the world. In other words, we cannot know something that is different from one moment to the next. The forms are therefore pure ideas that unify and stabilize the multiplicity of changing beings in the material world.

The forms are the ultimate reality, and this is shown to us in the Allegory of the Cave. In discussing the importance of education for a city, Socrates produces the Allegory of the Cave in Plato’s Republic (514a-518b). We are to imagine a cave wherein lifelong prisoners dwell. These prisoners do not know that they are prisoners since they have been held captive their entire lives. They are shackled such that they are incapable of turning their heads. Behind them is a fire, and small puppets or trinkets of various things—horses, stones, people, and so forth—are being moved in front of the fire. Shadows of these trinkets are cast onto a wall in front of the prisoners. The prisoners take this world of shadows to be reality since it is the only thing they ever see.

If, however, we suppose that one prisoner is unshackled and is forced to make his way out of the cave, we can see the process of education. At first, the prisoner sees the fire, which casts the shadows he formerly took to be reality. He is then led out of the cave. After his eyes painfully adjust to the sunlight, he first sees only the shadows of things, and then the things themselves. After this, he realizes that it is the sun by which he sees the things, and which gives life to the things he sees. The sun is here analogous to the form of the Good, which is what gives life to all beings and enables us most truly to know all beings.

The concept of the forms is criticized in Plato’s Parmenides. This dialogue shows us a young Socrates, whose understanding of the forms is being challenged by Parmenides. Parmenides first challenges the young Socrates about the scope of the forms. It seems absurd, thinks Parmenides, to suppose that stones, hair, or bits of dirt have their own forms (130c-d). He then presents the famous “third man” argument. The forms are supposed to be unitary. The multiplicity of large material things, for example, participates in the one form of Largeness, which itself does not participate in anything else. Parmenides argues against this unity: “So another form of largeness will make its appearance, which has emerged alongside largeness itself and the things that partake of it, and in turn another over all these, by which all of them will be large. Each of your forms will no longer be one, but unlimited in multitude” (132a-b). In other words, is the form of Largeness itself large? If so, it would need to participate in another form of Largeness, which would itself need to participate in another form, and so forth.

In short, we can see that Plato is tentative about what is now considered his most important theory. Indeed, in his Seventh Letter, Plato says that talking about the forms at all is a difficult matter. “These things…because of the weakness of language, are just as much concerned with making clear the particular property of each object as the being of it. On this account no sensible man will venture to express his deepest thoughts in words, especially in a form which is unchangeable, as is true of written outlines” (343). The forms are beyond words or, at best, words can only approximately reveal the truth of the forms. Yet, Plato seems to take it on faith that, if there is knowledge to be had, there must be these unchanging, eternal beings.

c. Epistemology

We can say that, for Plato, if there is to be knowledge, it must be of eternal, unchanging things. The world is constantly in flux. It is therefore strange to say that one has knowledge of it, when one can also claim to have knowledge of, say, arithmetic or geometry, which are stable, unchanging things, according to Plato. That is, it seems absurd that one’s ideas about changing things are on a par with one’s ideas about unchanging things. Moreover, like Cratylus, we might wonder whether our ideas about the changing world are ever accurate at all. Our ideas, after all, tend to be much like a photograph of a world, but unlike the photograph, the world continues to change. Thus, Plato reserves the forms as those things about which we can have true knowledge.

How we acquire knowledge is a difficult question. The problem of acquiring knowledge gave rise to “Meno’s Paradox” in Plato’s Meno. In their search for the nature of virtue, Meno asks Socrates, “How will you look for [virtue], Socrates, when you do not know at all what it is? How will you aim to search for something you do not know at all? If you should meet with it, how will you know that this is the thing that you did not know?” (Meno, 80d-e). If one wants to know X, this implies that he/she does not know X now. If so, then it seems that one cannot even begin to ask about X. In other words, it seems that one must already know X in order to ask about it in the first place, but if one already knows X, then there is nothing to ask. Even if one could ask, one would not know when he/she has the answer since one did not know what he/she was looking for in the first place.

Socrates answers this “debater’s argument” with the theory of recollection, claiming that he has heard others talk about this “divine matter” (81a). The theory of recollection rests upon the assumption that the human soul is immortal. The soul’s immortality entails, says Socrates, that the soul has seen and known all things since it has always been. Somehow, the soul “forgets” these things upon its incarnation, and the task of knowledge is to recollect them (81b-e). This, of course, is a poor argument, but Plato knows this, given his preface that it is a “divine matter,” and Socrates’ insistence that we must believe it (not know it or be certain of it) rather than the paradox Meno mentions. Thus, Socrates famously goes on to show recollection in action through a series of questions posed to Meno’s slave. Through a series of leading questions, Meno’s slave provides the answer to a geometrical problem that he did not previously know or, more precisely, he recollects knowledge that he had previously forgotten. We might imagine that this is akin to the “light bulb” moment when something we did not previously understand suddenly becomes clear. At any rate, Socrates shows Meno how the human mind mysteriously, when led in the proper fashion, can arrive at knowledge on its own. This is recollection.

Again, the forms are the most knowable beings and, so, presumably are those beings that we recollect in knowledge. Plato offers another image of knowing in his Republic. True understanding (noesis) is of the forms. Below this, there is thought (dianoia), through which we think about things like mathematics and geometry. Below this is belief (pistis), where we can reason about things that we sense in our world. The lowest rung of the ladder is imagination (eikasia), where our mind is occupied with mere shadows of the physical world (509d-511e). The image of the Divided Line is parallel to the process of the prisoner emerging from the cave in the Allegory of the Cave, and to the Sun/Good analogy. In any case, real knowledge is knowledge of the forms, and is that for which the true philosopher strives, and the philosopher does this by living the life of the best part of the soul—reason.

d. Psychology

Plato is famous for his theory of the tripartite soul (psyche), the most thorough formulation of which is in the Republic. The soul is at least logically, if not also ontologically, divided into three parts: reason (logos), spirit (thumos), and appetite or desire (epithumia). Reason is responsible for rational thought and will be in control of the most ordered soul. Spirit is responsible for spirited emotions, like anger. Appetites are responsible not only for natural appetites such as hunger, thirst, and sex, but also for the desire of excess in each of these and other appetites. Why are the three separate, according to Plato? The argument for the distinction between three parts of the soul rests upon the Principle of Contradiction.

Socrates says, “It is obvious that the same thing will not be willing to do or undergo opposites in the same part of itself, in relation to the same thing, at the same time. So, if we ever find this happening in the soul, we’ll know that we aren’t dealing with one thing but many” (Republic, 436b6-c1). Thus, for example, the appetitive part of the soul is responsible for someone’s thirst. That a person desires a drink, however, does not mean that she will drink at that time. In fact, it is conceivable that, for whatever reason, she will restrain herself from drinking at that time. Since the Principle of Contradiction entails that the same part of the soul cannot, at the same time and in the same respect, desire and not desire to drink, it must be some other part of the soul that helps rein in the desire (439b). The rational part of the soul is responsible for keeping desires in check or, as in the case just mentioned, denying the fulfillment of desires when it is appropriate to do so.

Why is the spirited part different from the appetitive part? To answer this question, Socrates relays a story he once heard about a man named Leontius. Leontius “was going up from the Piraeus along the outside of the North Wall when he saw some corpses lying at the executioner’s feet. He had an appetite to look at them but at the same time he was disgusted and turned away” (Republic, 439e6-440a3). Despite his disgust (issuing from the spirited part of the soul) with his desire, Leontius reluctantly looked at the corpses. Socrates also cites examples when someone has done something, on account of appetite, for which he later reproaches himself. The reproach is rooted in an alliance between reason and spirit. Reason knows that indulging in the appetite is bad, and spirit, on reason’s behalf, becomes angry (440a6-440b4). Reason, with the help of spirit, will rule in the best souls. Appetite, and perhaps to some degree spirit, will rule in a disordered soul. The life of philosophy is a cultivation of reason and its rule.

The soul is also immortal, and one of the more famous arguments for the immortality of the soul comes from the Phaedo. This argument rests upon a theory of the relationship of opposites. Hot and cold, for example, are opposites, and there are processes of becoming between the two. Hot comes to be what it is from cold. Cold must also come to be what it is from the hot, otherwise all things would move only in one direction, so to speak, and everything would therefore be hot. Life and death are also opposites. Living things come to be dead and death comes from life. But, since the processes between opposites cannot be a one-way affair, life must also come from death (Phaedo 71c-e2). Presumably Plato means by “death” here the realm of non-earthly existence. The souls must always exist in order to be immortal. We can see here the influence of Pythagorean thought upon Plato since this also leaves room for the transmigration of souls. The disordered souls in which desire rules will return from death to life embodied as animals such as donkeys while unjust and ambitious souls will return as hawks (81e-82a3). The philosopher’s soul is closest to divinity and a life with the gods.

e. Ethics and Politics

It is relatively easy to see, then, where Plato’s psychology intersects with his ethics. The best life is the life of philosophy, that is, the life of loving and pursuing wisdom—a life spent engaging logos. The philosophical life is also the most excellent life since it is the touchstone of true virtue. Without wisdom, there is only a shadow or imitation of virtue, and such lives are still dominated by passion, desire, and emotions. On the other hand,

The soul of the philosopher achieves a calm from such emotions; it follows reason and ever stays with it contemplating the true, the divine, which is not the object of opinion. Nurtured by this, it believes that one should live in this manner as long as one is alive and, after death, arrive at what is akin and of the same kind, and escape from human evils. (Phaedo 84a-b)

It is the philosopher, too, who must rule the ideal city, as we saw in Plato’s seventh letter. Just as the philosopher’s soul is ruled by reason, the ideal city must be ruled by philosophers.

The Republic begins with the question of what true justice is. Socrates proposes that he and his interlocutors, Glaucon and Adeimantus, might see justice more clearly in the individual if they take a look at justice writ large in a city, assuming that an individual is in some way analogous to a city (368c-369a). So, Socrates and his interlocutors theoretically create an ideal city, which has three social strata: guardians, auxiliaries, and craftspeople/farmers. The guardians will rule, the auxiliaries will defend the city, and the craftspeople and farmers will produce goods and food for the city. The guardians, as we learn in Book VI, will also be philosophers since only the wisest should rule.

This tripartite city mirrors the tripartite soul. When the guardians/philosophers rule properly, and when the other two classes do their proper work—and do not do or attempt to do work that is not properly their own—the city will be just, much as a soul is just when reason rules (433a-b). How is it that auxiliaries and craftspeople can be kept in their own proper position and be prevented from an ambitious quest for upward movement? Maintaining social order depends not only upon wise ruling, but also upon the Noble Lie. The Noble Lie is a myth that the gods mixed in various metals with the members of the various social strata. The guardians were mixed with gold, the auxiliaries with silver, and the farmers and craftspeople with iron and bronze (415a-c). Since the gods intended for each person to belong to the social class that he/she currently does, it would be an offense to the gods for a member of a social class to attempt to become a member of a different social class.

The most salient concern here is that Plato’s ideal city quickly begins to sound like a fascist state. He even seems to recognize this at times. For example, the guardians must not only go through a rigorous training and education regimen, but they must also live a strictly communal life with one another, having no private property. Adeimantus objects to this saying that the guardians will be unhappy. Socrates’ reply is that they mean to secure happiness for the whole city, not for each individual (419a-420b). Individuality seems lost in Plato’s city.

In anticipation that such a city is doomed to failure, Plato has it dissolve, but he merely cites discord among the rulers (545d) and natural processes of becoming as the reasons for its devolution. Socrates says, “It is hard for a city composed in this way to change, but everything that comes into being must decay. Not even a constitution such as this will last forever. It, too, must face dissolution” (546a1-4). We may notice here that Plato cites human fragility and finitude as sources of the ideal city’s devolution, not the city’s possible fascistic tendencies. Yet, it is possible that the lust for power is the cause of strife and discord among the leaders. In other words, perhaps not even the best sort of education and training can keep even the wisest of human rulers free from desire.

It is difficult to overlook the sometimes moralistic and fascistic tendencies in Plato’s ethical and political thought. Yet, just as he challenges his own metaphysical ideas, he also at times loosens up on his ethical and political ideals. In Phaedo, for example, Plato has Phaedo recount the story of Socrates’ final day. Phaedo says that he and other friends of Socrates arrived at the prison early, and when they were granted access to Socrates, Xanthippe, Socrates’ wife, was already there with their infant son (60a), which means that Xanthippe had been there all night. Socrates, to his own pleasure, rubs his legs after the shackles have been removed (60b), which implies that even philosophers enjoy bodily pleasures. Again, Phaedo says that Socrates had a way of easing the distress of those around him—in this case, the distress of Socrates’ imminent death. Phaedo recounts how Socrates eased his pain on that particular day:

I happened to be sitting on his right by the couch on a low stool, so that he was sitting well above me. He stroked my head and pressed the hair on the back of my neck, for he was in the habit of playing with my hair at times. (89a9-b3)

Plato, with these dramatic details, is reminding us that even the philosopher is embodied and, at least to some extent, enjoys that embodiment, even though reason is to rule above all else.

4. Aristotle

Aristotle (384-322 B.C.E.) was born in Stagirus, which was a Thracian coastal city. He was the son of Nicomachus, the Macedonian court physician, which allowed for a lifelong connection with the court of Macedonia. When he was 17, Aristotle was sent to Athens to study at Plato’s Academy, which he did for 20 years. After serving as tutor for the young Alexander (later Alexander the Great), Aristotle returned to Athens and started his own school, the Lyceum. Aristotle walked as he lectured, and his followers therefore later became known as the peripatetics, those who walked around as they learned. When Alexander died in 323, and the pro-Macedonian government fell in Athens, a strong anti-Macedonian reaction occurred, and Aristotle was accused of impiety. He fled Athens to Chalcis, where he died a year later.

Unlike Plato, Aristotle wrote treatises, and he was a prolific writer indeed. He wrote several treatises on ethics, he wrote on politics, he first codified the rules of logic, he investigated nature and even the parts of animals, and his Metaphysics is in a significant way a theology. His thought, and particularly his physics, reigned supreme in the Western world for centuries after his death.

a. Terminology

Aristotle used, and sometimes invented, technical vocabulary in nearly all facets of his philosophy. It is important to have an understanding of this vocabulary in order to understand his thought in general. Like Plato, Aristotle talked about forms, but not in the same way as his master. For Aristotle, forms without matter do not exist. I can contemplate the form of human being (that is, what it means to be human), but this would be impossible if actual (embodied) human beings were non-existent. A particular human being, what Aristotle might call “a this,” is hylomorphic, or matter (hyle) joined with form (morphe). Similarly, we cannot sense or make sense of unformed matter. There is no matter in itself. Matter is the potential to take shape through form. Thus, Aristotle is often characterized as the philosopher of earth, while Plato’s gaze is towards the heavens, as it appears in Raphael’s famous School of Athens painting.

Form is thus both the physical shape and the idea by which we best know particular beings. Form is the actuality of matter, which is pure potentiality. “Actuality” and “potentiality” are two important terms for Aristotle. A thing is in potentiality when it is not yet what it can inherently or naturally become. An acorn is potentially an oak tree, but insofar as it is an acorn, it is not yet actually an oak tree. When it is an oak tree, it will have reached its actuality—its continuing activity of being a tree. The form of oak tree, in this case, en-forms the wood, and gives it shape—makes it actually a tree, and not just a heap of matter.

When a being is in actuality, it has fulfilled its end, its telos. All beings by nature are telic beings. The end or telos of an acorn is to become an oak tree. The acorn’s potentiality is an inner striving towards its fulfillment as an oak tree. If it reaches this fulfillment it is in actuality, or entelecheia, which is a word that Aristotle coined, and is etymologically related to telos. It is the activity of being-its-own-end that is actuality. This is also the ergon, or function or work, of the oak tree. The best sort of oak tree—the healthiest, for example—best fulfills its work or function. It does this in its activity, its energeia, of being. This activity or energeia is the en-working or being-at-work of the being.

One more important set of technical terms is Aristotle’s four causes: material, formal, efficient (moving), and final cause. To know a thing thoroughly is to know its cause (aitia), or what is responsible for making a being who or what it is. For instance, we might think of the causes of a house. The material cause is the bricks, mortar, wood, and any other material that goes to make up the house. Yet, these materials could not come together as a house without the formal cause that gives shape to it. The formal cause is the idea of the house in the architect’s soul. The efficient cause would be the builders of the house. The final cause is that for which the house exists in the first place, namely shelter, comfort, warmth, and so forth. We will see that the concept of causes, especially final cause, is very important for Aristotle, especially in his argument for the unmoved mover in the Physics.

b. Psychology

Aristotle’s On the Soul (Peri Psyches, often cited by its Latin title, De Anima) gives us insight into Aristotle’s conception of the composition of the soul. The soul is the actuality of a body. Alternatively, since matter is in potentiality, and form is actuality, the soul as form is the actuality of the body (412a20-23). Form and matter are never found separately from one another, although we can make a logical distinction between them. For Aristotle, all living things are en-souled beings. Soul is the animating principle (arche) of any living being (a self-nourishing, growing and decaying being). Thus, even plants are en-souled (413a26). Without soul, a body would not be alive, and a plant, for instance, would be a plant in name only.

There are three types of soul: nutritive, sensitive, and intellectual. Some beings have only one of these, or some mixture of them. If, however, a being has the capacity for sensation, as animals do, then it also has a nutritive faculty (414b1-2). Likewise, beings that have minds must also have the sensitive and nutritive faculties of soul. A plant has only the nutritive faculty of soul, which is responsible for nourishment and reproduction. Animals have sense perception in varying degrees, and must also have the nutritive faculty, which allows them to survive. Human beings have intellect or mind (nous) in addition to the other faculties of the soul.

The soul is the source and cause of the body in three ways: the source of motion, the telos, and the being or essence of the body (415b9-11). The soul is that from which and ultimately for which the body does what it does, and this includes sensation. Sensation is the ability to receive the form of an object without receiving its matter, much as the wax receives the form of the signet ring without receiving the metal out of which the ring is made. There are three types of sensible things: particular sensibles, or those qualities that can be sensed by one sense only; common sensibles, which can be sensed by some combination of various senses; and incidental sensibles, as when I see my friend Tom, whose father is Joe, I say that I see “the son of Joe,” but I see Joe’s son only incidentally.

Mind (nous), as it was for Anaxagoras, is unmixed (429a19). Just as senses receive, via the sense organ, the form of things, but not the matter, mind receives the intelligible forms of things, without receiving the things themselves. More precisely, mind, which is nothing before it thinks and is therefore itself when active, is isomorphic with what it thinks (429a24). To know something is most properly to know its form, and mind in some way becomes the form of what it thinks. Just how this happens is unclear. Since the form is what is known, the mind “receives” or becomes that form when it best understands it. So, mind is not a thing, but is only the activity of thinking, and is particularly whatever it thinks at any given time.

c. Ethics

The most famous and thorough of Aristotle’s ethical works is his Nicomachean Ethics. This work is an inquiry into the best life for human beings to live. The life of human flourishing or happiness (eudaimonia) is the best life. It is important to note that what we translate as “happiness” is quite different for Aristotle than it is for us. We often consider happiness to be a mood or an emotion, but Aristotle considers it to be an activity—a way of living one’s life. Thus, it is possible for one to have an overall happy life, even if that life has its moments of sadness and pain.

Happiness is the practice of virtue or excellence (arete), and so it is important to know the two types of virtue: character virtue, the discussion of which makes up the bulk of the Ethics, and intellectual virtue. Character excellence comes about through habit—one habituates oneself to character excellence by knowingly practicing virtues. To be clear, it is possible to perform an excellent action accidentally or without knowledge, but doing so would not make for an excellent person, just as accidentally writing in a grammatically correct way does not make for a grammarian (1105a18-26). One must be aware that one is practicing the life of virtue.

Aristotle arrives at the idea that “the activity of the soul in accordance with virtue” is the best life for human beings through the “human function” argument. If, says Aristotle, human beings have a function or work (ergon) to perform, then we can know that performing that function well will result in the best sort of life (1097b23-30). The work or function of an eye is to see and to see well. Just as each part of the body has a function, says Aristotle, so too must the human being as a whole have a function (1097b30). This is an argument by analogy. The function of the human being is logos or reason, and the more thoroughly one lives the life of reason, the happier one’s life will be (1098a3).

So, the happiest life is a practice of virtue, and this is practiced under the guidance of reason. Examples of character virtues would be courage, temperance, liberality, and magnanimity. One must habitually practice these virtues in order to be courageous, temperate, and so forth. For example, the courageous person knows when to be courageous, and acts on that knowledge whenever it is appropriate to do so (1115a16-34). Each activity of any particular character virtue has a related excessive or deficient action (1105a24-33). The excess related to courage, for example, is rashness, and the deficiency is cowardice. Since excellence is rare, most people will tend more towards an excess or deficiency than towards the excellent action. Aristotle’s advice here is to aim for the opposite of one’s typical tendency, and that eventually this will lead one closer to the excellence (1109a29-1109b6). For example, if one tends towards the excess of self-indulgence, it might be best to aim for insensibility, which will eventually lead the agent closer to temperance.

Friendship is also a necessary part of the happy life. There are three types of friendship, none of which is exclusive of the other: a friendship of excellence, a friendship of pleasure, and a friendship of utility (1155b18). A friendship of excellence is based upon virtue, and each friend enjoys and contemplates the excellence of his/her friend. Since the friend is like another self (1166a31), contemplating a friend’s virtue will help us in the practice of virtue for ourselves (1177b10). A mark of good friendship is that friends “live together,” that is, that friends spend a substantial amount of time together, since a substantial time apart will likely weaken the bond of friendship (1157b5-11). Also, since the excellent person has been habituated to a life of excellence, his/her character is generally firm and lasting. Likewise, the friendship of excellence is the least changeable and most lasting form of friendship (1156b18).

The friendships of pleasure and use are the most changeable forms of friendship since the things we find pleasurable or useful tend to change over a lifetime (1156a19-20). For example, if a friendship forms out of a mutual love for beer, but the interest of one of the friends later turns towards wine, the friendship would likely dissolve. Again, if a friend is merely one of utility, then that friendship will likely dissolve when it is no longer useful.

Since the best life is a life of virtue or excellence, and since we are closer to excellence the more thoroughly we fulfill our function, the best life is the life of theoria or contemplation (1177a14-18). This is the most divine life, since one comes closest to the pure activity of thought (1177b30). It is the most self-sufficient life since one can think even when one is alone. What does one contemplate or theorize about? One contemplates one’s knowledge of unchanging things (1177a23-27). Some have criticized Aristotle, saying that this sort of life seems uninteresting, since we seem to enjoy the pursuit of knowledge more than just having knowledge. For Aristotle, however, the contemplation of unchanging things is an activity full of wonder. Seeking knowledge might be good, but it is done for the sake of a greater end, namely having knowledge and contemplating what one knows. For example, Aristotle considered the cosmos to be eternal and unchanging. So, one might have knowledge of astronomy, but it is the contemplation of what this knowledge is about that is most wonderful. The Greek word theoria is rooted in a verb for seeing, hence our word “theatre.” So, in contemplation or theorizing, one comes face to face with what one knows.

d. Politics

The end for any individual human being is happiness, but human beings are naturally political animals, and thus belong in the polis, or city-state. Indeed, the inquiry into the good life (ethics) belongs in the province of politics. Since a nation or polis determines what ought to be studied, any practical science, which deals with everyday, practical human affairs, falls under the purview of politics (1094a26-1094b11). The last chapter of Nicomachean Ethics is dedicated to politics. Aristotle emphasizes that the goal of learning about the good life is not knowledge, but to become good (1095a5), and he reiterates this in the final chapter (1179b3-4). Since the practice of virtue is the goal for the individual, then ultimately we must turn our eyes to the arena in which this practice plays out—the polis.

A good individual makes for a good citizen, and a good polis helps to engender good individuals: “Legislators make the citizens good by forming habits in them, and this is the wish of every legislator; and those who do not effect it miss their mark, and it is in this that a good constitution differs from a bad one” (1103b3-6). Laws must be instituted in such a way as to make the citizens good, but the lawmakers must themselves be good in order to do this. Human beings are so naturally political that the relationship between the state and the individual is to some degree reciprocal, but without the state, the individual cannot be good. In the Politics, Aristotle says that a man who is so self-sufficient as to live away from a polis is like a beast or a god (1253a29). That is, such a being is not a human being at all. Again, a man who is separated from law and justice is the “worst of all” (1253a32).

In Book III.7 of the Politics, Aristotle categorizes six different political constitutions, naming three as good and three as bad. The three good constitutions are monarchy (rule by one), aristocracy (rule by the best, aristos), and polity (rule by the many). These are good because each has the common good as its goal. The worst constitutions, which parallel the best, are tyranny, oligarchy, and democracy, with democracy being the best of the three evils. These constitutions are bad because they have private interests in mind rather than the common good or the best interest of everyone. The tyrant has only his own good in mind; the oligarchs, who happen to be rich, have their own interest in mind; and the people (demos), who happen not to be rich, have only their own interest in mind.

Yet, Aristotle grants that there is a difference between an ideal and a practically plausible constitution, which depends upon how people actually are (1288b36-37). The perfect state will be a monarchy or aristocracy since these will be ruled by the truly excellent. Since, however, such a situation is unlikely when we face the reality of our current world, we must look at the next best, and the next best after that, and so on. Aristotle seems to favor democracy, and after that oligarchy, but he spends the bulk of his time explaining that each of these constitutions actually takes many shapes. For example, there are farmer-based democracies, democracies based upon birth status, democracies wherein all free men can participate in government, and so forth (1292b22-1293a12).

The most unfortunate aspect of Aristotle’s politics is his treatment of slavery and women, and we might wonder how it affects his overall inquiry into politics:

The male is by nature superior, and the female inferior; and the one rules, and the other is ruled; this principle, of necessity, extends to all mankind. Where then there is such a difference as that between soul and body, or between men and animals (as in the case of those whose business is to use their body, and who can do nothing better), the lower sort are by nature slaves, and it is better for them as for all inferiors that they should be under the rule of a master. For he who can be, and therefore is, another’s, and he who participates in reason enough to apprehend, but not to have, is a slave by nature. Whereas the lower animals cannot even apprehend reason; they obey their passions. (Politics 1254b13-23)

For Aristotle, women are naturally inferior to men, and there are those who are natural slaves. In both cases, it is a deficiency in reason that is the culprit. Women have reason but “lack authority” (1260a14), and slaves have reason enough to take orders and have some understanding of their world, but cannot use reason as the best human being does. It is difficult, if not impossible, to interpret Aristotle charitably here. For slaves, one might suggest that Aristotle has in mind people who can do only menial tasks, and nothing more. Yet, there is a great danger even here. We cannot always trust the judgment of the master who says that this or that person is capable only of menial tasks, nor can we always know another person well enough to say what the scope of his or her capabilities for thought might be. So even a charitable interpretation of his views of slavery and women is elusive.

e. Physics

Aristotle’s physics, which stood as the most influential study of physics until Newtonian physics, could be seen largely as a study of motion. Motion is defined in the Physics as the “actuality of the potentiality in the very way in which [the thing in motion] is in potentiality” (201b5). Motion is not merely a change of place. It can also include processes of change in quality and quantity (201a4-9). For example, the growth of a plant from rhizome to flower (quantity) is a process of motion, even though the flower does not have any obvious lateral change of place. The change of a light skin-tone to bronze via sun tanning is a qualitative motion. In any case, the thing in motion is not yet what it is becoming, but it is becoming, and is thus actually a potentiality qua potentiality. The light skin is not yet sun tanned, but is becoming sun tanned. This process of becoming is actual, that is that the body is potentially tanned, and is actually in the process of this potentiality. So, motion is the actuality of the potentiality of a being, in the very way that it is a potentiality.

In Book 8.1 of the Physics, Aristotle argues that the cosmos and its heavenly bodies are in perpetual motion and always have been. There could not have been a time with no motion. Whatever is moved is moved by itself or by another. Rest is simply a privation of motion. Thus, if there were a time without motion, then whatever existed—which had the power to cause motion in other beings—would have been at rest. If so, then it at some point had to have been in motion since rest is the privation of motion (251a8-25). Motion, then, is eternal. What moves the cosmos? This must be the unmoved mover, or God, but God does not move the cosmos as an efficient cause, but as a final cause. That is, since all natural beings are telic, they must move toward perfection. What is the perfection of the cosmos? It must be eternal, perfectly circular motion. It moves towards divinity. Thus, the unmoved mover causes the cosmos to move toward its own perfection.

f. Metaphysics

Aristotle’s Metaphysics, legendarily known as such because it was literally categorized after (meta) his Physics, was known to him as “first philosophy”—first in status, but last in the order in which we should study his corpus. It is also arguably his most difficult work, which is due to its subject matter. This work explores the question of what being as being is, and seeks knowledge of first causes (aitiai) and principles (archai). First causes and principles are indemonstrable, but all demonstrations proceed from them. They are something like the foundation of a building. The foundation rests upon nothing else, but everything else rests upon it. We can dig to the foundation, but (let’s pretend there’s no further earth under it) we can go no further. Likewise, we can reason our way up (or down) to the first principles and causes, but our reasoning and ability to know ends there. Thus, we are dealing with an inherently difficult and murky subject, but once knowledge of this subject is gained, there is wisdom (Metaphysics 982a5). So, if philosophy is a constant pursuit of wisdom for Plato, Aristotle believed that the attainment of wisdom is possible.

Aristotle says that there are many ways in which something is said to be (Metaphysics 1003b5), and this refers to the categories of being. We can talk about the substance or being (ousia) of a thing (what that thing essentially is), quality (the shirt is red), quantity (there are many people here), action (he is walking), passion (he is laughing), relation (A is to B as B is to C), place (she is in the room), time (it is noon), and so on. We notice in each of these categories that being is at play. Thus, being considered qua being cannot be restricted to any one of the categories but cuts across all of them.

So what is being or substance? The form of a thing makes it intelligible, rather than its matter, since things with relatively the same form can have different matter (metal baseball bats and wooden baseball bats are both baseball bats). Here, we are really getting at the essence of something. Aristotle’s phrase for essence is “to ti en einai,” which could be translated as “what it is (was) to be” this or that thing. Since nothing is what it is outside of matter—there is no form by itself, just as there is no pure matter by itself—the essence of anything, its very being, is its being as a whole. No particular being is identical with its quality, quantity, position in space, or any other incidental features. It is the singular being as a whole, the “this” to which we can apply no further name, that shows us the being in its being.

The Metaphysics then arrives at a similar end as does the Physics, with the first mover. But, in the Metaphysics, we are not primarily concerned with the motion of physical beings but with the being of all beings. This being, God, is pure actuality, with no mixture of any potentiality at all. In short, it is pure being, and is always being itself in completion. Thinking is the purest of activities, according to Aristotle. God is always thinking. In fact, God cannot do otherwise than think. The object of God’s thought is thinking itself. God is literally thought thinking thought (1072b20). We recall from Aristotle’s psychology that mind becomes what it thinks, and Aristotle reiterates this in the Metaphysics (1072b20-22). Since God is thinking, and thinking is identical with its object, which is thought, God is the eternal activity of thinking.

5. Hellenistic Thought

The Hellenistic period in philosophy is generally considered to have commenced with Alexander’s death in 323 B.C.E., and ended approximately with the Battle of Actium in 31 B.C.E. Although the Academy and the Lyceum could be considered in a thorough investigation into Hellenistic philosophy, scholars usually focus upon the Epicureans, Cynics, Stoics, and Skeptics.

Hellenistic philosophy is traditionally divided into three fields of study: physics, logic, and ethics. Physics involved a study of nature while logic was broadly enough construed to include not only the rules of what we today consider to be logic but also epistemology and even linguistics.

a. Epicureanism

Epicurus (341-271 B.C.E.) and his school are often mistakenly considered to be purely hedonistic, such that nowadays an “epicure” designates one who delights in fine foods and drinks. Etymologically, it is accurate to call Epicurus and his followers “hedonists,” where we refer merely to pleasure, without restricting that pleasure to bodily pleasures. Epicurus’ school, the Garden (an actual garden near Athens), was primarily friendly in nature, and non-hierarchical (Dorandi 57). Although Epicurus was a prolific author, we have only three of his letters preserved in Diogenes Laertius’ Lives. Otherwise, we depend in large part upon the Epicurean Lucretius and his work On the Nature of Things, especially in order to understand Epicurean physics, which was essentially materialistic. The goal of all true understanding for Epicurus, which must involve an understanding of physics, was tranquility.

i. Physics

Epicurus and his followers were thoroughgoing materialists. Everything except the void, even the human soul, is composed of material bodies. Epicureans were atomists and accordingly thought that there is nothing but atoms and void. Atoms “vary indefinitely in their shapes; for so many varieties of things as we see could never have arisen out of a recurrence of a definite number of the same shapes” (DL X.42). Moreover, these atoms are always in motion, and will remain in motion in the void until something can offer enough resistance to stop an atom in motion.

Epicurus’ view of atomic motion provides an important point of departure from Democritean atomism. For Democritus, atoms move according to the laws of necessity, but for Epicurus, atoms sometimes swerve, or venture away from their typical course, and this is due to chance. Chance allows room for free will (Lucretius 2.251-262). Epicureans seem to take for granted that there is freedom of the will, and then apply that assumption to their physics. That is, there seems to be free will, so Epicureans then posit a physical explanation for it.

ii. Ethics

Much of what we know about Epicurean ethics comes from Epicurus’ Letter to Menoeceus, which is preserved in Diogenes Laertius’ Lives. The goal of the good life is tranquility (ataraxia). One achieves tranquility by seeking pleasure (hedone), but not just any pleasure will suffice. The primary sort of pleasure is the simplicity of being free from pain and fear, but even here, we should not seek to be free from every sort of pain. We should pursue some painful things if we know that doing so will render greater pleasure in the end (DL X.129-130). So, Epicurus’ hedonism shapes up to be a nuanced hedonism. Indeed, he recommends a plain life, saying that the most enjoyment of luxury comes to those who need luxury least (DL X.130). Once we habituate ourselves to eating plain foods, for example, we gradually eliminate the pain of missing fancy foods, and we can enjoy the simplicity of bread and water (DL X.130-131). Epicurus explicitly denies that sensual pleasures constitute the best life and argues that the life of reason—which includes the removal of erroneous beliefs that cause us pain—will bring us peace and tranquility (DL X.132).

The sorts of beliefs that produce pain and anxiety for us are primarily two: a mistaken conception of the gods, and a misconception of death. Most people, according to Epicurus, have mistaken conceptions about the gods, and are therefore impious (DL X.124). Similar to Xenophanes, Epicurus would encourage us not to anthropomorphize the gods and to think only what is fitting for the most blessed and eternal beings. We are not thinking clearly when we think that the gods get angry with us or care at all about our personal affairs. It is not befitting of an eternal and blessed being to become angry over or involved in the affairs of mortals. Yet, perhaps Epicurus is anthropomorphizing here. The argument seems to rely upon his argument that tranquility is our greatest pleasure and upon the assumption that the gods must experience that pleasure. On the other hand, one could read Epicurus as a sort of proto-negative theologian who merely suggests that it is unreasonable to believe that gods, the best of beings, feel pain at all. One might wonder whether anthropomorphizing is avoidable at all.

We should not fear death because death is “nothing to us, for good and evil imply sentience, and death is the privation of all sentience” (DL X.124). The key here is the first premise that good and evil apply only to sentient beings. We recall that, for Epicurus, we are thoroughly material beings. Both mind and soul are part of the human body, and the human body is nothing if not sentient. Therefore, when the body dies, so too does the mind and soul, and so too does sentience. This means that death is literally nothing to us. The terror that we feel about death now will vanish once we die. Thus, it is better to be free from the fear of death now. When we rid ourselves of the fear of death, and the hope of immortality that accompanies that fear, we can enjoy the preciousness of our mortality (DL X.124-125).

b. The Cynics

The Cynics, unlike the Epicureans, were not properly a philosophical school. While there are identifiable characteristics of Cynic thought, they had no central doctrine or tenets. It was a disparate movement, with varying interpretations of what constituted a Cynic. This interpretative freedom accords well with one of the characteristics that typified ancient Cynicism: a radical freedom from societal and cultural standards. The Cynics favored instead a life lived according to nature.

“Cynic,” from the Greek kunikos, meant “dog-like.” We cannot be sure whether the Dogs thought of themselves as doglike, or whether they were termed as such by non-Cynics, or both. The first of the Dogs, Antisthenes (c.445-366 B.C.E.), was supposedly close with Socrates, and was present at his death, according to Plato’s Phaedo. Yet, it was Diogenes of Sinope (c.404-323 B.C.E.), often called simply, “Diogenes the Cynic,” who was and is the most famous of the Dogs. Most information we have comes from Diogenes Laertius’ Lives, which was written centuries after Diogenes the Cynic’s life, and is therefore historically problematic. It nevertheless provides us with an imaginative description of Diogenes the Cynic’s life, which was apparently unusual and outstanding.

Diogenes the Cynic was purportedly exiled from Sinope for defacing the city’s coins, and this later became his metaphorical modus operandi for philosophy—“driving out the counterfeit coin of conventional wisdom to make room for the authentic Cynic life” (Branham and Goulet-Cazé 8). The cynic life referenced here consisted of a life lived in accordance with nature, a rebellion against and freedom from dominant Greek culture that lives contrary to nature, and happiness through askesis, or asceticism (Branham and Goulet-Cazé 9). Thus, Diogenes wore but a thin, rough cloak all year round, accustomed himself to withstand both heat and cold, ate but a meager diet, and most sensationally, openly mocked everyday Greek life.

He was reportedly at a dinner party where the attendants were throwing bones at him as though he were a dog. So, Diogenes “urinated upon them as a dog would” (DL VI.46). He reportedly masturbated in public, and when reprimanded for it, he replied that he “wished it were as easy to relieve hunger by rubbing an empty stomach” (DL VI.46). Again, “He lit a lamp in broad daylight and said, as he went about, ‘I am looking for a human being’” (DL VI.41), implying that none of the Greeks could appropriately be called “human.” These shenanigans were intended to wake up the Athenians to the life of simplicity and philosophy. One needs very little to be happy. In fact, one should severely limit one’s desires, and live as most animals do, without anxiety, and securing only what one needs to continue living. This all seems a response to the cold fact that much of human life and circumstance is out of our control. So, Diogenes claimed that philosophy was a practice that prepared him for any kind of luck (DL VI.63).

The Cynics seem to have taken certain aspects of Socrates’ life and thought and pushed them to the extreme. One might wonder what drives the ascetic practice of preparing for any sort of luck. Is it that we see that moving from one superficial pleasure to the next is ultimately unfulfilling? Or, is the practice itself driven by a sort of fear, an emotion that the Cynic means to quell? That is, one might read the asceticism of the Cynic as a futile attempt to deny the truth of human fragility; for example, at any moment the things I enjoy can vanish, so I should avoid enjoying those things. On the other hand, perhaps the asceticism of the Cynic is an affirmation of this fragility. By living the ascetic life of poverty, the Cynic is constantly recognizing and affirming his/her finitude and fragility by choosing never to ignore it.

c. The Stoics

Stoicism evolved from Cynicism, but was more doctrinally focused and organized. While the Cynics largely ignored typical fields of study, the Stoics embraced physics, logic, and ethics, making strides especially in logic. Zeno of Citium (c.334-262 B.C.E.) was the founder of the Stoic school, which was named after the Stoa Poikile, a “painted portico” where the Stoics regularly met. This was the beginning of a long and powerful tradition, which lasted into the imperial era. Indeed, one of the most famous Stoic ethicists was the Roman emperor Marcus Aurelius (121-180 C.E.). Epictetus (55-135 C.E.) is another famous Stoic ethicist who carried the tradition of Stoicism beyond the Hellenistic period. Although the Stoics made some strides in logic after Aristotle, this article’s focus is on Stoic physics, epistemology, and ethics.

i. Physics

As Pierre Hadot has shown, the Stoics studied physics in order better to understand their own lives, and to live better lives: “Stoic physics was indispensable for ethics because it showed people that there are some things which are not in their power but depend on causes external to them—causes which are linked in a necessary, rational manner” (Hadot 128). Like the Cynics, the Stoics strove to live in accordance with nature, and so a rigorous study of nature allowed them to do so all the more effectively.

The Stoics were materialists, though not thoroughgoing materialists as the Epicureans were. Also, chance can play no role in the Stoics’ ordered and thoroughly rational and causally determined universe. Since we are part of this universe, our lives, too, are causally determined, and everything in the universe is teleologically oriented towards its rational fulfillment. Diogenes Laertius reports that the Stoics saw matter as passive and logos (god) as active, and that god runs through all of the matter as its organizing principle (DL VII.134). This divinity is most apparent in us via our ability to reason. At any rate, the universe is, as the name implies, a unity, and it is divine.

ii. Epistemology

The knowledge we have of the world comes to us directly through our senses and impresses itself upon the blank slate of our minds. The naked information that comes to us via the senses allows us to know objects, but our judgments of those objects can lead us into error. As Hadot says about these so-called objective presentations, “They do not depend on our will; rather, our inner discourse enunciates and describes their content, and we either give or withhold our consent from this enunciation” (Hadot 131). There might be a problem lurking here regarding the standard of truth, which, for the Stoics, is simply the correspondence of one’s idea of the object with the object itself. If it is true that the correspondence of our descriptions of the object with the actual object can bring us knowledge, how can we ever be sure that our descriptions really match the object? After all, if it is not the bare sense impression that brings knowledge, but my correct description of the object, it seems that there is no standard by which I can ever be sure that my description is correct.

iii. Ethics

Stoic ethics urges us to be rid of our desires and aversions, especially where these desires and aversions are not in accord with nature. For instance, death is natural. To be averse to death will bring misery. Stoic ethics can perhaps be best summed up in the first paragraph of Epictetus’ Handbook:

Some things are up to us and some are not up to us. Our opinions are up to us, and our impulses, desires, aversions—in short, whatever is our own doing. Our bodies are not up to us, nor are our possessions, our reputations, or our public offices, or, that is, whatever is not our own doing. The things that are up to us are by nature free, unhindered, and unimpeded; the things that are not up to us are weak, enslaved, hindered, not our own…If you think that only what is yours is yours, and that what is not your own is…not your own, then no one will ever coerce you, no one will hinder you, you will blame no one, you will have no enemies, and no one will harm you, because you will not be harmed at all.

This passage might be shocking to us today when, especially in the United States, many of the things that Epictetus tells us to avoid are what we are told to pursue. We therefore might wonder why our bodies, possessions, reputations, wealth, or jobs are not in our control. For Epictetus, it is simple. Possessions come and go—they can be destroyed, lost, stolen, and so forth. Reputations are determined by others, and it is reasonable to believe that even the best people will be hated by some, and even the worst people will be loved by some. Try as we might, we might never gain wealth, and even if we do, it can be lost, destroyed, or stolen. Again, public office, like reputation, is up to others to determine. So, the adage that “you can be anything you want in life” is not only false under stoic ethics, but dangerously misleading since it will almost inevitably lead to misery.

Even if I live as Epictetus recommends, however, how can I be sure that I will never be harmed? Even if I fully grant that someone who, for instance, pushes me down a flight of stairs has committed his own wrong, and that his wrong actions are not in my control, will I not still feel pain? Physical pain, for a Stoic, is not harm. The only real harm is when one harms oneself by doing evil, just as the only real good is living excellently and in accordance with reason. In this example, I would harm myself with the judgment that what happened to me was bad. One might object here, as one might object to Cynicism, that Stoic ethics ultimately demands a repression of what is most human about us. Indeed, Epictetus says, “If you kiss your child or wife, say that you are kissing a human being; for when it dies you will not be upset” (Handbook 12). For the Stoic, being moved by others brings us away from tranquility. However, kissing a “human being” is not the same as kissing this human being, this individual who would be deeply hurt by knowing that I treat them merely as a human being, and to whom I relate only through a sense of duty, rather than a real sense of love. Stoic ethics risks removing our humanity from us in favor of its own notion of divinity.

d. Skeptics

The two strands of Skepticism in the Hellenistic era were Academic Skepticism and Pyrrhonian Skepticism. Somewhat like the Cynics, each major Skeptic had his own take on Skepticism, and so it is difficult to lump them all under a tidy label. Also like the Cynics, however, there are certain characteristics that can be highlighted, despite differences between particular thinkers. Skepsis means “inquiry,” but the Skeptics did not seek solid or absolute answers as the goal of their inquiry. Rather, the goal of their skepsis was tranquility and freedom from judgments, opinions, or absolute claims to knowledge. Skepticism, broadly speaking, constituted a challenge to the possibility and nature of knowledge.

i. Academic Skepticism

The sixth scholarch (leader) of Plato’s Academy was Arcesilaus (318-243 B.C.E.), who initiated a substantial tradition of Skepticism in the Academy that lasted into the first century B.C.E. Arcesilaus found the inspiration for his skepticism in the figure of Socrates. Arcesilaus would argue both for and against any given position, ultimately showing that neither side of the argument can be trusted. He directed his skepticism primarily toward the Stoics and the empirical basis of their claims to knowledge. We recall that, for the Stoics, a grasping of sense impressions in the proper way is the true foundation for knowledge. Arcesilaus’ argument against Stoic empiricism is not clear (the argument is recounted in Cicero’s Academica 2.40-42), but it seems ultimately to reach the conclusion suggested above, namely that we can never be sure whether the way we have perceived (judged) an object via the senses is true or false. The argument runs roughly as follows. For any given presentation of an object to the senses, we can imagine that something else could be presented to the senses in just the same way, such that the perceiver cannot distinguish between the two objects being presented, which Arcesilaus thought the Stoics would grant. The perceiver can present these objects to him/herself, via the senses, in a true or false way, which the Stoics would also grant. It is possible, then, that the perceiver thinks one presentation is true and the other is false, but has no way of distinguishing between them. Arcesilaus’ conclusion is that we should always suspend our judgment.

Carneades (213-129 B.C.E.), the tenth scholarch of Plato’s Academy, seems to have cleverly answered a typical objection raised against Skepticism. It is inconsistent, goes the objection, to insist that it is impossible for anything to be known (“grasped”), since the statement “nothing can be known” is itself a claim to knowledge. Carneades recognized that even the claim “nothing can be known” should be called into doubt. Again, like Arcesilaus, Carneades relied upon the typical skeptic tactic of presenting arguments both for and against the same thing and concluding that we cannot therefore hold either side to be correct.

ii. Pyrrhonian Skepticism

We know almost nothing for sure about Pyrrho of Elis (360-270 B.C.E.). He wrote nothing, which is perhaps a sign of his extreme skepticism: if we cannot know anything, or cannot be sure whether knowledge is possible, then nothing can definitively be said, especially in writing. Perhaps what most differentiates Pyrrhonian Skepticism from Academic Skepticism is the profound indifference that Pyrrhonian Skepticism is meant to generate. Diogenes Laertius relates the story that, when his master Anaxarchus had fallen into a swamp, Pyrrho simply passed him by, and was later praised by Anaxarchus for his supreme indifference (DL IX.63). Pyrrhonian Skepticism refutes all dogmas and opinions and vehemently clings to indeterminacy, even regarding the idea that “nothing can be known.”

Aenesidemus, the Pyrrhonian Skeptic, advanced the “Ten Modes,” arguments that address typical difficulties in appearances and judgment, each aimed toward the conclusion that we ought to suspend judgment if we are to be at peace. The first mode argues that other animals sense things differently from human beings, and that we cannot therefore pretend to place any absolute value on the things sensed. Since the qualities of sensation vary from species to species, for example, “the quail thrives on hemlock, which is fatal to man” (DL IX.80), we ought to suspend value judgments upon those things. In the quoted example, then, the hemlock is clearly not in itself evil, but neither is it in itself good; it is a matter of indifference. The remaining modes follow a similar pattern, highlighting relativity (whether cultural, personal, sensory, qualitative, or quantitative) as evidence that we ought to suspend judgment.

The Skeptics, as Pierre Hadot says, use “philosophical discourse…to eliminate philosophical discourse” (143). That is, they do not adhere to any philosophical position, but use the tools of philosophy to gain a sense of simplicity and tranquility in life, thereby ridding themselves of the need for philosophy. By using dialectic, and opposing one argument to another, the Skeptic suspends judgment, and is not committed to any particular position. The Skeptic,

In everything he did…was to limit himself to describing what he experienced, without adding anything about what things are or what they are worth. He was to be content to describe the sensory representations he had, and to enunciate the state of his sensory apparatus, without adding to it his opinion. (Hadot 145)

We might wonder just how practical such an approach to life would be. Can we flourish or thrive, effectively communicate, or find cures for diseases by merely describing our experience of the world? For example, antibiotics can help, more often than not, to cure diseases caused by certain bacteria. Could we not say, for practical purposes, that we know this to be the case? We are not, after all, ignorant of the fact that bacteria are becoming resistant to certain antibiotics, but this does not mean that antibiotics do not work, or that we cannot someday find alternative cures for bacterial infections.

The Skeptic could reply in several ways, but the most effective reply to the example provided might go something like this: Medicine does not bring us knowledge, if knowledge is certainty. Medicine, and what it claims to know, has, after all, changed significantly. The practice of medicine is just another way of describing the way certain bodies interact with other bodies in a given time and place. But the Skeptic would go further. The curing of a disease, he would say, is neither good nor bad. Perhaps my disease is cured, and the next day, I am killed in some other way. If death is a matter of indifference, then the cure for illnesses must be, too. Again, we might wonder in this case how one is ever spurred to action.

6. Post-Hellenistic Thought

Platonic thought was the dominant philosophical force in the time period following Hellenistic thought proper. This article focuses on the reception and reinterpretations of Plato’s thought in Neoplatonism and particularly in its founder, Plotinus.

a. Plotinus

Plotinus (204-270 C.E.), in his Enneads (a collection of six books, each containing nine treatises), builds upon Plato’s metaphysical thought, and primarily upon his concept of the Good. Plotinus is also informed by Aristotle’s work, the Unmoved Mover (thought thinking thought) in particular, and draws on the bulk of the ancient philosophical tradition. As Kevin Corrigan says, “Plotinus transforms everything he inherits by the very activity of thinking through that inheritance critically and creatively” (23). In other words, Plotinus inherits concepts of unity, the forms, divine intellect, and soul, but makes these concepts his own. The result is a philosophy that comes close to a religious spiritual practice.

There are three aspects to Plotinus’ metaphysics: the One, Intellect, and Soul. The One is the ineffable center of all reality and the wellspring of all that is—more precisely, it is the condition of the possibility for all being, but is itself beyond all being. The One cannot be accurately accounted for in discourse. We can only contemplate it, and at most relay our own experience of this contemplation (Corrigan 26). We can speak negatively about the One (VI, 9.3). Thus, for example, we say that it is impassive. It does not create Intellect or Soul or anything else; rather, by its supreme nature, it merely emanates Intellect and Soul.

i. Intellect, Soul, and Matter

The Intellect emanates from the One because of the One’s fullness. The One, by being the One, simply gives off the Intellect, so to speak (Enneads V, 2.7-18). Since being moves out from its source and returns to its source (Corrigan 28), the Intellect turns towards the One and contemplates it. The Intellect is other than the One, but united with it in contemplation. As other, it gives rise to multiplicity, namely the forms that it is and that it thinks (it thus thinks itself). The Intellect generates Soul, which shares in intellect, but also animates the material world. Thus, the material world is generated by Soul, and this includes every individual being. A particular human being, then, has its share of soul, and its highest part of the soul is intellect, where true selfhood is.

ii. The True Self and the Good Life

The best life for human beings necessitates that each human become his or her own true self, which is the intellect. That is, we must turn away, as much as is possible, from matter and the sensible world, which are distractions, and be intellect (Enneads I.4). To become one’s true self is to live the best life. Being oneself in this sense, however, is quite different from the individuality promoted in the Western world. Hadot says, “To become a determinate individual is to separate oneself from the All by adding a difference which, as Plotinus says, is a negation. By cutting off all individual differences, and therefore our own individuality, we can become the All once again” (166). The best life depends upon becoming one’s true self via the intellect, which means to step away from the part of the soul by which we typically identify ourselves, the passionate and desiring part of the soul. If we are now accustomed to identify ourselves by our likes, dislikes, and opinions, then a true Plotinian self would not be a self at all. For Plotinus, however, this is true selfhood since it is closest to the center of all life, the One.

b. Later Neoplatonists

Plotinus set off a tradition of thought that had great influence in medieval philosophy. This tradition has been known since the 19th century as “Neoplatonism,” but Plotinus and other Neoplatonists saw themselves merely as followers and interpreters of Plato (Dillon and Gerson xiii). Plotinus’ student, Porphyry, without whom we would know little to nothing about Plotinus or his work, carried on the tradition of his master, although we do not possess a full representation of his work. With Iamblichus came a focus upon Aristotle’s work, since he took Aristotle as an informative source on Platonism. Neoplatonism also saw the rise of Christianity, and therefore saw itself to some degree in a confrontation with it (Dillon and Gerson xix). Perhaps in part because of this confrontation with Christianity, later Neoplatonists aimed to develop the religious aspects of Neoplatonic thought. Thus, the later Neoplatonists introduced theurgy, claiming that thought alone cannot unite us with gods, but that symbols and rites are needed for such a union (Hadot 170-171).

c. Cicero and Roman Philosophy

Greek philosophy was the dominant philosophy for centuries, including in the Roman Republic and in the imperial era. Cicero (106-43 B.C.E.) considered himself to be an Academic Skeptic, although he did not take his skepticism as far as a renunciation of politics and ethics. He is a very useful source for the preservation of and commentary upon not only Academic Skepticism, but also the Peripatetics, Stoics, and Epicureans. He was also an accomplished orator and politician, and authored many works of his own, which often employed skeptic principles or commented upon other philosophies. He took pains, as a true Skeptic, to present both sides of an argument. Cicero was murdered during the rise of the Roman Empire.

Stoicism played an important role in the imperial period, especially with the Roman emperor Marcus Aurelius. Marcus is most famous for his so-called Meditations, which is a translation of the Greek ta eis heauton, “[things] to himself.” As the Greek title clearly shows, these meditations were meant for Marcus himself. These were reminders on how to live, especially as an emperor who saw turbulent times. This work, in its usually short, pithy statements, reveals some principles of Stoic physics, but only in service of its larger ethical orientation. It advocates a life of simplicity and tranquility lived according to nature.

7. Conclusion

From the Presocratics to the Hellenists, there is a preference for reason, whether it is used to find truth or tranquility. The Presocratics prefer reason or reasoned accounts to mythology, sometimes in order to find physical explanations for the phenomena all around us, to think more clearly about the gods, or sometimes to find out truths about our own psychology. For Socrates, the exercise of reason and argumentation was important to recognize one’s own limitations as a human being. For Plato, the life of reason is the best life, even if it cannot ultimately answer every question. Aristotle used reason to investigate the world around him, in some sense resuscitating the Presocratic preference for physical explanations, and returning lofty discussions to earth. The Hellenists emphasized philosophical practice, always in accordance with reason. We have also seen the profoundly influential tradition set in motion by Plato with the development of his thought into the so-called Neoplatonic era. That scholars and the intellectually curious alike still read these works, and not merely for historical purposes, is a testament to the depth of thought contained therein.

8. References and Further Reading

a. Presocratics

i. Primary Sources

  • Diels, Hermann and Walther Kranz. Die Fragmente der Vorsokratiker: Griechisch und Deutsch. Berlin: Weidmannsche Buchhandlung, 1910.
    • This is the first and most traditionally used collection of Presocratic fragments and testimonies. This edition has the fragments in Greek with German translations. The book is no longer in print, and while it is often still cited in most scholarship, it is not the work cited in this article.
  • Graham, Daniel W. The Texts of Early Greek Philosophy: The Complete Fragments and Selected Testimonies of the Major Presocratics. 2 vols. Cambridge: Cambridge University Press, 2010.
    • This is the first collection of the Presocratic fragments and testimonies published with the original Greek and English translations. It is the work cited in this text. Graham offers a short commentary on the fragments, as well as references for further reading for each thinker. He has organized by topic the fragments for each thinker, and labels the fragments with an F, followed by the number of the fragment. That is how the fragments have been cited in this article. Testimonies are cited merely by their designated numbers.

ii. Secondary Sources

  • Barnes, Jonathan. The Presocratic Philosophers. London and New York: Routledge, 1982.
    • A classic work with interpretations of the Presocratics.
  • Burnet, John. Early Greek Philosophy. London: A&C. Black Ltd., 1930.
    • Another classic work with interpretations of the Presocratics.
  • Long, A.A. ed. The Cambridge Companion to Early Greek Philosophy. Cambridge: Cambridge University Press, 1999.
    • A collection of sixteen essays by some of the foremost scholars on Presocratic thought. The essays are generally accessible, but some are more appropriate for specialists in the field.
  • McKirahan, Richard D. Philosophy Before Socrates: An Introduction with Texts and Commentaries. Indianapolis: Hackett, 1994.
    • This is a book for non-specialists and specialists. It contains most fragments for most thinkers and reasonable explanations and interpretations of each. There is also a helpful chapter at the end of the book on the nomos-phusis debate. The text includes a fairly extensive section of suggestions for further reading.
  • Vlastos, Gregory. “Ethics and Physics in Democritus.” Philosophical Review, vol. 54, 578-592, 1945.
    • This article is technical but offers insight into the connection between Democritean physics and ethics, and it was cited in the current overview.

b. Socrates and Plato

i. Primary Sources

  • Cooper, John, ed. Plato: Complete Works. Indianapolis: Hackett, 1997.
    • This work is the most comprehensive and is also used throughout this article. This collection includes all of Plato’s authentic work as well as every work considered to be spurious or likely spurious. There is no other such collection in English. Any citations of John Cooper in this article come from Cooper’s introduction to this work.
  • Xenophon, IV: Memorabilia, Oeconomicus, Symposium, and Apology. Jeffrey Henderson ed. E.C. Marchant and O.J. Todd trans. Cambridge: Harvard University Press, 2002.
    • This is from the Loeb Classical Library, and accordingly has the original Greek with English on the facing page.

ii. Secondary Sources

  • Benson, Hugh H. ed. A Companion to Plato. Malden: Blackwell Publishing, 2006.
    • This is a collection of scholarly articles on Plato’s work, and on Plato’s version of Socrates.
  • Brickhouse, Thomas C. and Nicholas D. Smith, Plato’s Socrates. New York: Oxford University Press, 1994.
    • This is a scholarly yet approachable book on just what the title suggests. It covers a range of problems that thoughtful readers will encounter when reading Plato.
  • Kraut, Richard ed. The Cambridge Companion to Plato. Cambridge: Cambridge University Press, 1992.
    • This is a collection of articles from premier Plato scholars on a variety of topics.
  • Morrison, Donald R. ed. The Cambridge Companion to Socrates. Cambridge: Cambridge University Press, 2011.
    • This is a collection of scholarship on historical, fictional, and philosophical perspectives of Socrates from Aristophanes to Plato.
  • Nails, Debra, “The Life of Plato of Athens,” in A Companion to Plato. Hugh H. Benson ed. Malden: Blackwell Publishing, 2006.
  • Nails, Debra, The People of Plato: a prosopography of Plato and other Socratics. Indianapolis: Hackett, 2002.
    • While this book can be laden with details, it is an indispensable resource for Plato scholars, as well as for anyone curious enough to know more about the various interlocutors and character references in Plato’s dialogues.
  • Taylor, A.E., Plato: The Man and His Work. New York: Meridian Books, 1960.
    • Although dated, this book offers a survey and assessment of the bulk of Plato’s dialogues.
  • Tigerstedt, E.N., Interpreting Plato. Stockholm: Almqvist & Wiksell International, 1977.
    • This book is heavy on detail, but it provides a valuable survey of problems in and interpretations of Plato.
  • Vlastos, G., Socrates: Ironist and Moral Philosopher. Ithaca: Cornell University Press, 1991.
    • This book was especially influential for its chronological categorization of Plato’s dialogues, although the chronological reading has since lost its influence.

c. Aristotle

i. Primary Sources

  • Barnes, Jonathan ed. The Complete Works of Aristotle. Princeton: Princeton University Press, 1984.
    • This book is the most comprehensive, and it includes spurious works or works thought to be spurious. It is also the edition cited in this article.

ii. Secondary Sources

  • Barnes, Jonathan, The Cambridge Companion to Aristotle. Cambridge; New York: Cambridge University Press, 1995.
    • This book contains scholarly articles on a variety of subjects in Aristotle’s thought.
  • Broadie, Sarah, Ethics With Aristotle. New York; Oxford: Oxford University Press, 1991.
    • This book is a good overview of and commentary upon Aristotelian ethics.
  • Burnyeat, Miles, Map of Metaphysics Zeta. Pittsburgh: Mathesis Publications, 2001.
    • This book is meant to help readers navigate one of the most difficult books of Aristotle’s most difficult work.
  • Irwin, Terence, Aristotle’s First Principles. Oxford: Oxford University Press, 1988.
    • Although somewhat dense, this work provides insight into Aristotle’s metaphysical first principles, which underlie much of his work.

d. Hellenistic Philosophy

i. Primary Sources

  • Empiricus, Sextus, Outlines of Scepticism. Julia Annas and Jonathan Barnes eds. Cambridge: Cambridge University Press, 2000.
    • This book gives a good overview of Hellenistic Skepticism, and contains helpful notes from Annas and Barnes.
  • Epictetus, The Handbook, Nicholas P. White trans. Indianapolis: Hackett, 1983.
    • Although Epictetus was not a Hellenist, his formulation of stoic ethics is concise and highly influential. This work was also cited in this article.
  • Corrigan, Kevin, Reading Plotinus: A Practical Introduction to Neoplatonism. West Lafayette: Purdue University Press, 2005.
    • Corrigan presents key readings representative of Plotinus’ philosophy, and after each section of primary readings, provides his own lucid and helpful commentary.
  • Dillon, John and Lloyd P. Gerson, Neoplatonic Philosophy. Indianapolis: Hackett, 2004.
    • This is a helpful introduction to Neoplatonic thought. The bulk of the text consists of selections from Plotinus’ work, but it also contains selections from Porphyry, Iamblichus, and Proclus.
  • Inwood, Brad, and L.P. Gerson trans. and ed. Hellenistic Philosophy: Introductory Readings. Indianapolis: Hackett, 1988.
    • Since, like the Presocratics, original works are lacking in Hellenistic thought, this book is a good place to begin. It collects central texts, including ancient commentaries, covering the central themes of physics, logic, and ethics from epicurean, stoic, and skeptic perspectives.
  • Inwood, Brad, and L.P. Gerson trans. and ed. The Stoics Reader: Selected Writings and Testimonia. Indianapolis: Hackett, 2008.
    • Again, few original works survive from Hellenistic Stoicism proper, but this book provides central readings in Hellenistic Stoicism.
  • Laertius, Diogenes, Lives of Eminent Philosophers II. Jeffrey Henderson ed. R.D. Hicks trans. Cambridge: Harvard University Press, 1931.
    • This volume of Diogenes’ famous work contains the three letters purported to be Epicurus’ letters on physics and ethics.
  • Plotinus, Enneads. 7 vols. ed. Jeffrey Henderson and trans. A.H. Armstrong. Cambridge: Harvard University Press, 1966.
    • This is the Loeb edition of Plotinus’ complete Enneads, along with Porphyry’s “Life of Plotinus.” This edition has the Greek facing the English translation.

ii. Secondary Sources

  • Algra, Keimpe, Jonathan Barnes, Jaap Mansfeld, and Malcolm Schofield, eds. The Cambridge History of Hellenistic Philosophy. Cambridge: Cambridge University Press, 1999.
    • Although this work is intended for specialists and non-specialists alike, it is dense and sometimes overburdened with details for the non-specialist’s taste. It does, however, provide valuable historical information and commentary.
  • Branham, R. Bracht and Marie-Odile Goulet Cazé, The Cynics: The Cynic Movement in Antiquity and Its Legacy. Berkeley: University of California Press, 1996.
    • This is an informative collection of scholarly articles on a variety of topics in Cynicism. It also has a very helpful historically oriented introduction, which was cited in this article.
  • Hadot, Pierre, What is Ancient Philosophy? Michael Chase trans. Cambridge: Harvard University Press, 2002.
    • This book contains informative and innovative readings of ancient philosophy in general, and was cited in this article for its treatment of Hellenistic philosophy.

 

Author Information

Jacob N. Graham
Email: jgraham@bridgewater.edu
Bridgewater College
U. S. A.

Thomas Reid: Theory of Action

Thomas Reid (1710-1796) made important contributions to the fields of epistemology and philosophy of mind, and is often regarded as the founder of the common sense school of philosophy. However, he also offered key arguments and observations concerning human agency and morality.

Reid carefully criticized the views of his contemporaries, and defended an account of human freedom in which he argues that only beings endowed with will and understanding, who also have power over their will and actions, and who are directed by motives and reasons, are agents, beings capable of acting freely. Agents are, in Reid’s terms, efficient causes of some of their actions, causing them by exerting their power to act. To say that an agent possesses active power is, for Reid, to say that the agent is capable of exerting a productive capacity and that in such cases it is really up to the agent to produce or to refrain from producing the actions which the agent has the power to produce. Reid presents this theory of action and human liberty primarily in his last published work, the Essays on the Active Powers of Man (1788).

Reid argues that human beings have power not only over their actions, but also over their choices and intentions. Reid therefore opposes what he calls the system of necessity, and openly challenges the views of philosophers such as John Locke, Joseph Priestley, Anthony Collins, and David Hume, who tend to share the view that human beings have freedom over their actions but not over their wills.

In the Essays on the Active Powers of Man, Reid develops his theory of action around an examination of active power, of the will, of human motives and beliefs, and around a defense of his account of human freedom.

Table of Contents

  1. On Active Power
    1. On the Notion of Power
    2. Active Power, Will, and Understanding
      1. General Observations Concerning Active Power
      2. Agents with Active Power are Endowed with Will and Understanding
  2. Of the Will
    1. General Observations
    2. Voluntary Operations of the Mind
  3. The Principles of Action or Motives
    1. Mechanical Principles
    2. Animal Principles
    3. Rational Principles
    4. Influence of Motives on the Will
  4. On Moral Liberty
    1. The Notion of Moral Liberty
    2. Of Causes
      1. Efficient Causes and First Principles
      2. Physical Causes
      3. Principles of Action and Causation
    3. The Problem of Infinite Regress
    4. First Argument for Moral Liberty
    5. Second Argument for Moral Liberty
    6. Third Argument for Moral Liberty
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. On Active Power

a. On the Notion of Power

Reid writes in the first essay of the Essays on the Active Powers that active power is the power of acting, that is, the power of producing a work of art or of labor (EAP, 12). Reid starts his book by arguing that, contrary to what David Hume had defended (T.1.3.14), human beings have an idea of power. In order to show that we do have such a notion, Reid observes that the ideas of acting and being acted upon are found very early and universally in the minds of children. If one is able to form the thought of hitting or of being hit, and if all languages have active and passive voices, Reid argues, then one must be able to form the ideas of activity and of passivity (EAP, 13). Moreover, Reid contends that all humans have the notion of active power because many ordinary operations of mind imply a belief in active power, both in ourselves and in others. Actions such as deliberations, purposes, making promises, counsels, encouragements and commands imply, Reid writes, the belief that we have some degree of power over these operations and over their effects.

When Reid writes that humans have an idea or a notion of power, he means that humans have at least a conception of power. Conception, for Reid, is a term of art that should not be understood in the Kantian sense of subsuming some object under a particular concept, or as predicating some quality of the object. To the contrary, conception, for Reid, is one of the most basic cognitive capacities, whereby beings (both humans and some non-human animals) hold something in mind without predicating anything of it. The being grasps the object by holding it in mind without thinking anything about or of the object. Reid thinks that human beings all have, at the very least, a conception of power, even though they often also form more complex beliefs or judgments about power.

Hume, Reid points out, argues that since the notion of power is neither produced by sensation nor reflection, and since we cannot properly define the notion, then we have no such idea (EAP, 24). Reid, in reply, points out that although we do not directly perceive an agent’s power, we are nonetheless conscious of our exertions of power. Moreover, even though ‘power’ cannot be defined by appealing to more general or more simple categories, one may still form a conception of power from its effects, and one may still offer observations about its attributes and qualities. Reid therefore writes that since we do have a notion of power we may now study its characteristics. The first attribute Reid observes is that active power only exists in beings with will and understanding.

b. Active Power, Will, and Understanding

Reid turns to the question of whether beings who have active power could at the same time lack understanding and will (EAP, 27). We cannot answer this question, Reid points out, by observing changes in nature, since we perceive neither the agent nor the power behind these changes (more on changes in nature in section 4.b.ii). Reid writes that it is better to turn our attention to human agents, since, in agreement with Locke, Reid points out that our first ideas of power are taken from the power we find ourselves experiencing when we produce changes in our bodies and in the world around us (EAP, 20).

i. General Observations Concerning Active Power

By turning our view to our own exertions of power we notice, according to Reid, that power is brought into action by volition. A person’s capacity to act is actualized by exerting that power. There are many times when a person may not use his or her capacity to act. However, Reid argues, when they do exert that power, it is because the agents willed, at some point, to perform the action either immediately or later in time. An agent’s power, Reid writes, “is measured by what he can do if he will” (Of Power 10). For Reid, we notice that our willings or choices are exertions of power when we turn our attention to them. Opinions vary about Reid’s use of various terms related to volition, but one way to understand Reid is to think of volitions as willings, choices, or decisions. And, for Reid, we are conscious that our choices are something over which we have some control. When we choose to do something immediately, the choice (or volition) “is accompanied with an effort to execute that which we willed” (EAP, 50). We may not always pay attention to this effort, but, Reid continues, “this effort we are conscious of, if we will but give attention to it; and there is nothing in which we are in a stricter sense active” (EAP, 51). Reid’s view, therefore, is that choices or willings are exertions of power; they are mental events through which the agent puts his or her power into action, and agents are conscious of these exertions of power.

Reid’s account is hence different from an account such as the one Thomas Hobbes defended, where an agent is free or has active power (freedom) to perform an action if he willed to perform the action. If the agent wills to perform the action, but is prohibited from performing it, then the agent is not free, according to the Hobbesian picture (Hobbes, 1648, 240). Reid, however, contends that an agent is free not only to act if he willed to act, but that the agent is also free to will or to refrain from willing. The anti-Hobbesian view Reid defends is that an agent is not free unless he has power over his willing or refraining from willing. First, Reid argues, we have an idea of power that implies having control or power over our choices, and this belief is presupposed in many other beliefs, and in our everyday actions (see previous section and arguments for moral liberty). Second, Reid observes that we are conscious of the effort we exert when we choose. The exertion of power is the object of consciousness, which we directly observe if we carefully attend to the objects of our consciousness.

ii. Agents with Active Power are Endowed with Will and Understanding

That beings with active power possess a will follows immediately, Reid holds, from his claim that the power to act implies the power to will. Reid develops his argument for this claim in several steps. First, we are conscious of having power over many of our actions: over the movements of our bodies, and the movements of our minds. Second, for Reid, having power over an end such as an external action that depends on our will requires having power over the actions of the agents which bring about the end. Reid writes that the action or effect an agent produces “cannot be in his power unless all the means necessary to its production be in his power” (EAP, 203). By ‘necessary means’ Reid does not refer to all the involuntary physical or biological events that are necessary conditions for performing the action. After all, he writes that the man who intends to shoot his neighbor dead is the cause of the death, but “he neither gave to the ball its velocity, nor to the powder its expansive force, nor to the flint and steel the power to strike fire…” (EAP, 41). The velocity of the ball is not something that is in the agent’s power, even though it is a necessary condition for the killing of the man. The choice and intention of the agent, however, are events over which the agent does have power (1.b.i).

Reid’s third step, therefore, is the premise that choices and intentions are actions; they are mental events over which the agent has power. The evidence for this claim is that agents are conscious of the effort of exerting their power when they choose. Furthermore, choices and intentions are necessary means to performing effects. Hume’s predecessors all agreed that we have power over our actions or effects. Reid, therefore, points out that “to say that what depends upon the will is in a man’s power, but the will is not in his power, is to say that the end is in his power, but the means necessary to that end are not in his power, which is a contradiction” (EAP, 201). For Reid, this statement follows from the claim that willings (choices) are actions. If an agent has the power to perform an external action A, then the agent has the power to perform another external action B required to perform A. But if an internal action C is required to perform A, then the agent who has the power to perform A also has the power to perform C. Reid’s conclusion, therefore, is that agents who must carry out a chain of actions in order to perform an end must have the power over each specific action (including volitions) in the chain of actions in order to perform the end (see Yaffe 2004, chapter 1, for a more complete development of this argument).

Now, if the performance of an external action requires having power over the internal action of choosing, then the power to act requires having a will. Only agents with a will are capable of willing. This does not mean that all beings with a will have power over their wills. After all, some animals might be endowed with a will and act voluntarily but not have power to control their will. But having power over one’s will implies having a will. For Reid, therefore, human beings who are able to have some control over their actions and choices, who have active power, must be endowed with a will.

According to Reid, active power or human freedom requires having, not only a will, but also understanding to direct that will. To reach this conclusion Reid argues it is important to observe that the power to will implies the power to refrain from willing. To have control over our choices and to observe that our choices are exertions of active power leads to the observation that in many cases we are free to will or to refrain from willing. This is what Reid means when he argues that we have power over our actions. Actions (external actions or internal willings), as long as they are truly the agent’s actions, are capacities to act or to refrain from acting. The inability to refrain from acting is the consequence of forces with respect to which an agent is passive. Reid therefore writes that

If, in any action [an agent] had the power to will what he did, or not to will it, in that action he is free. But if, in every voluntary action, the determination of his will be the necessary consequence of something involuntary in the state of mind, or of something in his external circumstances, he is not free; he has not what I call the liberty of a moral agent, but is subject to necessity. (EAP, 197)

To be active, for Reid, requires the ability to will and to refrain from willing to perform an action. The two-way power of willing and refraining from willing defines a person’s capacity to truly have control over his choices and actions.

If active power is a power to will and to refrain from willing, Reid thinks only beings with understanding could have such a two-way power. This two-way power implies the capacity to weigh reasons, or at least to be able to act in light of a reason, or against some good reason, and this would require possessing intellectual capacities. Hence liberty, Reid writes, “implies, not only a conception of what he wills, but some degree of practical judgment or reason” (EAP, 196). The power to act, therefore, implies the power to will. The power to will, together with the capacity to weigh reasons and act in light of them or not (the ability to will and to refrain from willing for certain reasons), requires that the being with active power be endowed with both a will and some degree of understanding.

2. Of the Will

a. General Observations

Reid introduces his essay on the will (Essay II of EAP) by pointing out that the will is the power to determine to act or to refrain from acting. If active power is the power to act freely, to have power or control of the direction of (some of) our thoughts, of our bodies, and to initiate changes outside of us, the will seems to be simply the capacity to determine, that is, to choose, to decide, and to intend a certain course of action. Since we have no direct knowledge of the will other than by its effects, Reid focuses his discussion on what he calls ‘volitions,’ a term that refers to the acts of the will.

Reid continues by describing five essential qualities of every act of will. First, acts of will are about something—they have an object. The person who wills must will something, Reid writes, and this implies being able to have a conception of what is willed, and an intention to carry out what is willed. Acts of will produce voluntary actions, and what distinguishes them from things done by instinct or from habit is precisely that voluntary actions involve a conception and intention, whereas things done instinctively do not require any thought or intention.

Second, the object of will must be some action of the person who wills. For Reid, this is what distinguishes willings (acts of will) from commands and desire. We may desire things that are not within our power, and we may command others to do things we do not desire them to do (think of the judge who commands that one be punished even though he or she does not desire that the criminal be punished). Acts of will, for Reid, are therefore to be distinguished from desires. Third, the object of the act of will must be something we believe to be in our power and to depend upon our will (EAP, 50). If one loses the power to speak, for instance, and if one believes one has no such power, one does not will to speak, but only to try to speak if recovery, say, is possible.

Fourth, Reid points out that the volition to act immediately is accompanied with an effort to do what is willed. We might not always be conscious of this effort, especially when the action is easy, but Reid thinks we might still notice and be conscious of the effort involved if we are attentive to what is happening when we determine to act. Finally, Reid observes that in decisions and intentions that are important to an agent, there is always some motive or reason that influences and inclines the agent in willing one way or another. Unimportant actions might not be performed for particular reasons, but actions that are wise, virtuous, and meaningful are performed for reasons (they are motivated by rational principles of action; see section 3.c).

b. Voluntary Operations of the Mind

The intellect and the will are, Reid writes, always conjoined in the operations of the human mind (as far as we know). Even acts usually attributed to the understanding only, like perceiving and remembering, for instance, involve some degree of activity. Conversely, every act of will involves at least some conception, intention, belief, and often also some belief about the value or worth of the action. These operations clearly involve the understanding, Reid points out (EAP 60).

Reid writes that there are three operations of the mind that have mistakenly been thought to be intellectual operations only. Reid, however, holds that these operations are also active capacities. These voluntary operations he has in mind are attention, deliberation, and fixed purposes. It is important for Reid to show the voluntary aspect of these operations because they are involved in a person’s character traits and personalities. Reid’s ‘necessitarian’ opponents might think persons have no control over their personalities, but, for Reid, since the operations involved in them are active then persons will have some degree of control and hence responsibility over them.

The first voluntary operation Reid discusses is attention. The attention we might give to a subject or to an action for the most part depends upon our will. For Reid this does not mean, however, that attention is completely within the control of agents and that attention is always the result of an act of will. To the contrary, attention is often the involuntary result of some impulse or habit. Our passions and affections direct our attention to the objects that move them. Still, Reid holds, the attention one finds oneself to have may be changed or focused. Even though the wonderful smell of the garden draws my attention to it as I walk past, the act of stopping to consider the garden and to really pay attention to it is an act of the will, Reid argues. Since attention is a voluntary act, Reid will later be in a position to argue that “we ought to use the best means we can to be well informed of our duty, by serious attention to moral instruction…” (EAP, 271).

The next voluntary operation of the mind Reid discusses is deliberation. Genuine human actions are not always the result of deliberation. One may act freely out of a fixed resolution or habit to act virtuously, for instance. Moreover, one may not always have time to deliberate. And one does not deliberate in cases that are perfectly clear: “no man deliberates,” Reid writes, “whether he ought to choose happiness or misery” (EAP, 63). But when time permits, and when the situation is unclear, then a person might deliberate. Deliberation, for Reid, is the exercise of the agent’s capacity to consider various reasons, various desires, or various feelings and affections that move him to perform a particular action or set of actions. Deliberation is an active capacity to consider outcomes and motives and then to determine to act according to, or against, some of these reasons. Since deliberation is an active operation, Reid will later be able to show that we have a duty to deliberate coolly and impartially about our actions.

Finally, fixed purposes and resolutions are also, and essentially, active operations of the mind according to Reid. He distinguishes them from volitions or determinations to act immediately. They are, rather, intentions to act at a distance. One may decide to perform a single action in the future, or one may resolve to follow a course of actions, or to pursue some general end. In fact, Reid’s view is that the general purposes and fixed resolutions of agents are the basis of the character traits of agents. Character traits, as opposed to natural tempers, are for Reid general and regular tendencies to act in a certain manner, and the tendencies result from the person’s resolution or purpose to act according to some plan or rule. A person who is a person of virtue, for instance, is a person who formed the resolution to be a person of virtue (EAP, 69), Reid writes. This resolution will express itself in the person’s general and regular tendency (the character trait) to act virtuously. Ultimately, therefore, the character traits of agents are based in voluntary determinations of the agent’s will. And a person’s constancy or steadiness depends on the person’s commitment to the person’s general purposes and resolutions. When an agent resolves to act according to his principles, this resolution is clearly an act of will, Reid argues. Reid concludes by pointing out that since resolution is a voluntary act, over which we have some control, it may therefore properly be a virtue, whereas willfulness, inflexibility and obstinacy may properly be recognized as vices.

3. The Principles of Action or Motives

It is vital, Reid maintains, to have a correct understanding of the various motives or incitements to act, because without these, active power would be utterly useless and fickle. Whatever incites us to act is, according to Reid, a principle of action. By ‘principle of action’ Reid means the last answer to the ‘why did you act?’ question. For Reid, motives or principles of action are not instrumental means to reaching some further end (EAP, 110). Principles of action are motives for actions pursued for no further reason, or which are desired for no further end or purpose.

For Reid it is important to pay close attention to those principles on which human beings do or could act, recognizing that humans often act from a variety of principles concurring in one direction. A correct account of the various motives or principles of action will help us understand the character of the action performed: whether it is instinctive, the outcome of some natural affection, intelligible, meaningful, wise, virtuous, etc. (EAP, 75).

Reid categorizes these principles into three classes: the mechanical, the animal, and the rational principles of action.

a. Mechanical Principles

Mechanical principles of action are motives that influence beings to act and which do not require any thought or intention. Reid writes that they require no attention, no deliberation, no will, no conception and no intention. They are completely blind impulses. Among the collection of mechanical motives, Reid points out, we find instincts and habits. Instincts are blind tendencies that are natural, found early in animals and humans, and which do not require repeated use in order to motivate. The fact that infants cry when they are hurt, that they are afraid when left alone, that they are terrified by an angry tone of voice but soothed by soft and gentle voices, are the result, according to Reid, of such mechanical instincts (EAP, 79). Brute animals are also moved to perform amazing accomplishments (like honey combs or spider webs) from mere instinct, without any intelligence of the mechanics or mathematics displayed by their artworks. Another instinct Reid observes in both humans and animals is the instinct of imitation. And even belief, in the early part of our lives, is guided by instinct and imitation, Reid points out.

Human beings can also learn to perform an action easily, without any thought, by having performed the action frequently, as well as by the natural instinct to imitate others. The habit or facility to perform an action is therefore a mechanical principle, according to Reid. Speaking, writing, riding a bicycle are, Reid points out, often performed blindly by those who have developed the habit of such activities (EAP, 88-90). These actions are carried out involuntarily and without any particular intention. The mechanical principle of habit will therefore be of great help, according to Reid, in developing the habit of performing those types of action a person has resolved to perform. Certain character traits, Reid holds, are the effect of a person’s fixed resolutions or purposes. These fixed resolutions and the actions one performs in line with them are, for Reid, voluntary operations (see 2.b.). But when these actions are performed regularly, the mechanical principle of habit helps agents acquire a facility to perform them.

b. Animal Principles

The second category of principles of action are those that Reid calls ‘animal’ because, he holds, we have them in common with many brute animals (EAP, 92). Among the animal motives, Reid notes the appetites (hunger, thirst, lust), constant desires (such as a desire for knowledge, power and esteem), benevolent affections (gratitude, love, parental affection, pity, esteem, friendship, and public spirit), and malevolent affections (emulation and resentment).

Reid argues that animal motives move all human beings who are capable of having thoughts, even though an animal motive is of a more instinctive sort when it is at work in animals and small children. When animal motives move and influence a being, the minimum mental state required is a conception of the object or action toward which the being is moved. Conception is the bare minimum mental capacity necessary in order to be moved by animal motives, but animal motives may in many cases require forming beliefs (propositional attitudes), or even forming beliefs about value. Esteem for the wise, for instance, may require believing that the person esteemed is wise, virtuous, or acted in an estimable manner. An animal may have an instinctive kind of admiration or affection for its master, but this does not involve thinking of its master as wise, virtuous, or kind. According to Reid, the mental capacities of animals are more limited than those of humans, and animals consider actions only, and not the intentions of others. A motive such as gratitude, for instance, is observable in the dog who is “kindly affected to him who feeds it” (EAP, 115), even though the person is about to kill it. Humans who are moved by gratitude, however, have in mind the intention of the benefactor, and the gratitude of humans supposes beliefs, such as the belief that the action of the benefactor goes beyond what justice or morality requires (Ibid.).

Reid’s understanding of the nature of beliefs is not always clearly expressed. He often writes of beliefs in the same way he writes of judgments, where “the objects of judgment” are “expressed by a proposition”, but at other times he writes of belief as an attitude (belief, disbelief or doubt) about that judgment or proposition, and which accompanies the judgment (EAP, 347). Reid, furthermore, thinks that animals of the most sagacious kind are capable of having conceptions of states of affairs, but he is unsure whether they are capable of forming beliefs. He writes in one place that he thinks brutes do not have opinions (EAP, 147), but he writes in another passage that brutes have instinctive beliefs, and, he points out: “Whether brutes have anything that can properly be called belief, I cannot say; but their actions shew something that looks very like it” (EAP, 86). In any case, whether or not they are capable of having beliefs, Reid points out, “it will be granted, that opinion in men has a much wider field than in brutes” (EAP, 147). Only the beliefs of human beings, Reid points out, involve evaluations, values, and reasoning about laws and systems. Animal motives (appetites, desires, affections) are present both in animals and in human beings. However, when they move animals and small children they are more simple mental operations, more similar to the mechanical motives, whereas when they are at work in adult human motivation they usually involve higher mental capacities.

Reid understands animal motives as desire/object pairs. He writes that animal motives always involve a desire for some object and sometimes—but not always—a sensation or feeling which is typical of each desire. Reid calls all animal motives animal or natural desires, since they all aim to achieve or attain some object or end.

Furthermore, according to Reid, each natural animal motive or natural desire has its own natural object (EAP, 113). Desires for the good of certain persons, desires for knowledge, desires for power, etc. imply a state or mental act in the agent who desires and some object desired. Reid writes, for example, that when we come to realize that food will satisfy our hunger, the desire and its object “remain through life inseparable. And we give the name of hunger to the principle that is made up of both” (EAP, 93). It is therefore natural, according to Reid, to describe each animal motive either in terms of the desire or mental state involved, or in terms of the end desired.

c. Rational Principles

Reid observes that human beings are not only moved by passions, affections or desires, but also by a third important class of motives, which he calls ‘the rational principles of action.’ Reid further distinguishes between two different rational principles of action: our overall good, and our duty. Reid calls these ‘rational’ because “they can have no existence in beings not endowed with reason, and, in all their exertions, require, not only intention and will, but judgment or reason” (EAP, 154).

The first rational motive, a regard for our good as a whole, which Reid also characterizes as regard for one’s overall interest or happiness, is a rational motive because only beings with reason are able to consider and be moved by what is their overall good. Reid observes that being motivated by our overall happiness requires the ability to compare actions and to determine the possible outcomes and consequences of future courses of action. Directing our actions in order to aim at our overall happiness therefore clearly requires having rational capacities.

The second rational motive, a regard for duty, also involves the rational capacity to form judgments or evaluations of what one ought, morally, to do. Agents must have the rational capacities to consider existing actions and agents, to see that this action is morally good, and that one wrong, and hence to form specific and particular estimations of the moral quality of actions and of persons. Agents must also have the capacity to form general principles or axioms such as “what is in no degree voluntary, can neither deserve moral approbation nor blame” (EAP, 271), or “we ought to act that part towards another, which we would judge to be right in him to act towards us, if we were in his circumstances and he in ours” (EAP, 274). Forming these beliefs requires a minimum level of rational capacities, for Reid, but also, and more importantly, a moral faculty or sense, which he also calls ‘conscience.’ These capacities are required to be able to judge that a future action is one we should perform.

If animal principles are best understood as desire/object pairs, rational motives, rather, are best understood in terms of the end, object or state of affairs, to which they aim rather than as the mental state required in order to be moved toward that end (Yaffe 2004, 108). It is true that Reid often speaks in a language that implies that rational principles are some kind of mental state. After all, Reid often writes that rational principles of action imply a belief or judgment about what is good: prudentially or morally. And it is easy to conclude that the motive is the belief, thought, judgment or conviction itself rather than the end that is pursued. However, as Yaffe points out, when Reid seems to hold such a view it is because he is emphasizing the point that in order to move us, ends must bear a special relation to us. An end is a reason for me only if I think of that end and if I think of my future actions as bearing the relevant kind of relation to me and to the end (Yaffe 2004, 108).

When an agent is moved by the motive of duty, the agent is moved by a future ideal or by a future state of affairs that will instantiate that ideal. The agent, for instance, recognizes that an action that does not yet exist is one he ought to bring about because the action is conducive to duty. Ultimately, for Reid, the motive does not depend on the agent’s desires since an agent might not desire to do what he thinks he ought to do.

For Reid animal motives, or passions in general, serve an important purpose and are good and useful parts of human and animal nature. These motives, when they are not deformed by bad education and bad habits, are often conducive to virtue (Kroeker 2011). However, if there is a conflict between animal and rational motives, the rational ones have the final word, according to Reid. Rational principles are simply better guides to what is wise and virtuous since they require knowing not only what we naturally desire but also what, in the long term, will be better, would fulfill moral laws and requirements, or would bring about that which is valuable. By their moral sense, in fact, human beings recognize the value and worth of their natural animal motives, and they might also consider which course of action, whether with or without the influence of animal motives, would be morally valuable. By adopting such a position, Reid clearly opposes the view of David Hume who holds that all motives are passions, and that ultimate ends are determined by human desires and not by the intrinsic value or worth of future courses of action (T 3.1 and EPM 161).

d. Influence of Motives on the Will

Reid further distances himself from the Humean position and from the position of philosophers such as Anthony Collins and Joseph Priestley (as well as Gottfried Wilhelm Leibniz, Baruch Spinoza and perhaps David Hartley) by claiming that motives do not causally necessitate the agent’s choices, intentions and actions. Rather, Reid argues that motives influence the agent in his or her volitions.

Animal motives and rational motives, Reid observes, influence agents in different ways. Animal motives, or what has commonly been called ‘passion,’ draw us “toward a certain object, without any farther view, by a kind of violence, a violence which indeed, may be resisted if the man is master of himself, but cannot be resisted without a struggle” (EAP, 55-56). These passions move us easily, and counterbalance the defects of our wisdom and virtue, Reid points out. For example, we often eat out of mere hunger, without any thought about what is wise or good for us, or without any thought about the quality of the object. In fact, Reid points out that it is often difficult to answer questions about what we should eat, how much, and how often. If we had to reason about such questions, we would often fail to be moved to eat. These passions, therefore, serve to preserve the human species, and to help us perform tasks (EAP, 52). Passions may be stronger or weaker, but an effort is always required to resist them. Two passions may move a being in contrary ways. In animals and in humans who do not have time to deliberate or think about what they are doing, the strongest passion may prevail (EAP, 53). But human beings are usually able to form some judgment about what they are doing, and to weigh goods and evils. Human beings are therefore usually passive in part, and active in part. When the passion is irresistible or when there is no time to determine, human beings are mostly passive. But when the person is able to deliberate calmly and impartially, and to determine according to distant goods and values (and not only in terms of present gratification), the active power of the person is increased.

Rational principles of action never causally determine an agent’s choices and intentions. They function like advice or like the testimony of various parties in front of a judge. They leave a person completely at liberty to choose and to determine (EAP, 59). By the rational principles of action, Reid writes, we judge “what ends are most worthy to be pursued, how far every appetite and passion may be indulged, and when it ought to be resisted” (EAP, 56). Therefore, both kinds of principles of action move us in different ways. The phenomenology involved is different in both cases. Passions are felt like forces that push us, whereas the rational principles are felt like arguments or advice, which may, at the most, produce a cool conviction of what we ought to do.

4. On Moral Liberty

a. The Notion of Moral Liberty

According to Reid, moral liberty is the agent’s power to act. Moral liberty and active power are therefore identical for Reid (see section 1 for more on active power). Moreover, moral liberty is more than the power an agent has over his or her voluntary actions. It is also power over “the determinations of his own will” (EAP, 196). Here Reid distinguishes between merely voluntary actions and free voluntary actions. Small children and animals, Reid observes, may act voluntarily if their actions are the result of their choices and intentions. However, their voluntary actions are determined by the strongest passion, appetite, affection or habit. Brute animals, Reid writes, lack moral government, which is the control one has to choose according to what one thinks is best or required. These beings, Reid continues, have no conception of a law or of a guide to action by which they could direct or fail to direct their choices and actions (EAP, 197). Instead of being guided by the moral law they are guided, blindly, by the physical laws that govern their constitution, in the same manner as the inanimate creation is governed by physical laws. Hence, since they cannot form the conception of what is best or required, they cannot govern their choices accordingly. Since their voluntary actions are the result of choices that are causally determined by physical laws, Reid concludes that they lack moral liberty.

Moral liberty, Reid contends, is a two-way power to will or not to will something (see 1.b). Reid holds that agents who possess moral liberty must not only be capable of following rules, guidelines, advice and arguments, they must also be capable of disobeying these directives (EAP, 200). One may imagine, Reid writes, some puppet that is endowed with understanding and will, but that has no degree of active power. This puppet, Reid points out, would be an intelligent machine, but it would still be subject to the same laws of motion as inanimate matter (EAP, 222). The puppet, that is, would be incapable of disobeying these laws. But being incapable of disobedience implies, Reid holds, that the puppet is not active in its obedience – it does not even obey in the proper sense of the term. To be free, or to possess active power in Reid’s sense, would mean that the puppets’ obedience is obedience in the proper sense; “it must therefore be their own act and deed, and consequently they must have power to obey or disobey” (EAP, 222). When agents act freely, their actions are truly up to them, and hence they must be capable of willing and of refraining from willing the action they perform. Free actions, for Reid, are the effects of the agent’s exertion (putting into effect) of his active power. He writes that it is “the exertion of active power we call action” (EAP, 13). For Reid, therefore, only actions that are produced by an agent’s exertion of this kind of active power are genuine, free, actions.

b. Of Causes

i. Efficient Causes and First Principles

According to Reid, we all have an idea of productive causes. These productive or efficient causes, for Reid, are causes that have the power to bring about certain events. Efficient causes have the power to produce changes. Efficient causes are therefore very different in nature from events that constantly precede or follow other events in nature. Events in nature, humans come to realize, are passive rather than active. “Instead of moving voluntarily, we find them to be moved necessarily; instead of acting, we find them to be acted upon…” Reid writes (EAP, 207). Moreover, Reid points out that constant conjunction does not link true causes with effects. Priestley and Hume, Reid writes, define a cause as a circumstance that is constantly followed by a certain event (EAP, 205). Reid disagrees—he holds that genuine causes cannot be defined this way. A preceding event does not have the power to produce the effect with which it is constantly conjoined, Reid points out. Smoke is constantly conjoined with fire, but, according to Reid, fire is not the active producer of smoke. And yet we all have a notion of causes that are productive—of causes which actually produce and are responsible for change.

An efficient or productive cause, Reid holds, “is that which has power to produce the effect” (Of Power, 6) and “which produces a change by the exertion of its power” (EAP, 13). To produce an effect, there must be in the cause a power to produce the effect, and the exertion of that power. For Reid, a power that cannot be exerted is no power at all (EAP, 203). The only things which possess the power to produce changes and which can exert this power are beings with will and understanding (see section 1.b.ii). Beings with active power, the will to exert it, and understanding to direct it, are agents. Efficient causes—causes in the proper sense—are therefore agents (EAP, 211).

Furthermore, Reid observes that a principle or belief that appears very early in the mind of human beings is that everything that begins to exist must have a cause (in the proper or real sense of the term) of its existence, “which had power to give it existence” (EAP, 15 and 202). For Reid “that things cannot begin to exist, nor undergo any change, without a cause that hath power to produce that change…is so popular, that there is not a man of common prudence who does not act from this opinion, and rely upon it every day of his life” (EAP, 25).

In examining Reid’s account, three questions must be answered. First, ‘what are first principles?’ Second, ‘is this principle of causality, that whatever begins to exist must have a cause, truly a first principle?’ And third, ‘if it is a first principle, does it require holding a strong notion of causes (causes as efficient causes or as agents)?’

First principles, according to Reid, are propositions “which are no sooner understood than they are believed” (EIP 452). First principles are things human beings believe naturally: “there is no searching for evidence, no weighing of arguments; the proposition is not deduced or inferred from another…” (Ibid.). What is believed, is, for Reid, a first principle when it functions as an axiom for other propositions. Other propositions are discovered through the power of reasoning – either by inductive or by deductive reasoning. Reasoning is the capacity of drawing a conclusion from a chain of premises, Reid writes. But “first principles, principles of common sense, common notions, self-evident truths” (EAP, 452), are words that all express propositions that do not require reasoning from a chain of premises.

For instance, the proposition that the objects we perceive really do exist is a first principle (EIP 476). It is not something we usually hold by deducing it from other premises. Furthermore, it is not inferred from repeated experiences since the proposition is supposed in the experience itself. Another example is that the future will resemble the past (EIP 489). “Antecedently to all reasoning,” Reid writes, “we have, by our constitution, an anticipation, that there is a fixed and steady course of nature” (Inquiry 199). The child, Reid observes, who has once been burnt by fire will continue to shun fire. Repeated experiences and reasoning may help children confirm that fire always burns, but they will believe by nature that the fire which burned them once will burn them again, before reasoning and experience have offered any confirmation of such a fact. One may offer support for first principles, and false first principles may be defeated (EIP 463-467), but first principles are natural and found to be true as soon as they are understood and regardless of any proof in their favor.

The principle of causality, according to Reid, is one of these natural principles. That whatever begins to exist must have a cause which produced it is, according to Reid, a first principle of our human constitution (EIP 497). Reid writes that Hume has convincingly shown that the arguments offered in defense of this principle all take for granted what must be proved (EIP 498). This principle cannot be proved by induction from experience either, according to Reid. Indeed, Reid writes, “in the far greatest part of the changes in nature that fall within our observation, the causes are unknown, and therefore, from experience, we cannot know whether they have causes or not” (EIP 499). Yet, it would be absurd, Reid argues, to conclude that these events do not have causes. The proposition that every change must have a cause that produced it is therefore a first principle, an axiom that humans hold, not as a result of any kind of reasoning or of repeated observations, but as a result of their natural constitution.

Finally, now, even if Reid is correct to think of the principle of causality as a first principle, why should we think that every change or any beginning of existence must have an efficient cause? One of Reid’s strategies is to show that we hold this belief in cases of human actions. Those actions which are truly an agent’s actions, Reid argues, are actions that are up to the agent, that are within the agent’s control, and really produced by the agent. The cause of human actions is an agent with the power to act, with the will to act, with understanding to direct his power, and who exerted his power (sections 1.b and 4.a).

Another strategy to which Reid appeals, in order to show that when we think of causes of some change we have in mind efficient causes, is to focus on what causes the beginning of existence. One natural belief, according to Reid, is the belief that nothing can come into existence without an efficient cause (EAP, 202). The first cause, that which brought things into existence, it is commonly believed, is a cause which had power to bring about existence and understanding to bring about the order of existence. Order and purposiveness (teleology) are effects of something with power and intelligence. Existence and order do not just pop into being, according to Reid. We all believe, in practice if not in words, that existence has a cause. And since existence is ordered and purposeful, we think of the cause as an efficient cause (a substance endowed with active power, will, and understanding). Reid then seems to think that we all believe that the first coming into existence is caused by an efficient cause, and this would explain why we believe that all change is caused (perhaps in the beginning, or by intermediate actions), by an efficient cause. Hence, we hold the belief that we, as human beings, are sometimes efficient causes, and we hold the belief that existence itself, and the course of nature, requires an efficient cause.

ii. Physical Causes

Although the proper sense of the term ‘cause’ is efficiency or productive causality, Reid recognizes that there is also a lax or popular use of the term. We often use the term ‘cause’ to describe events in nature that are constantly conjoined with other events, which we call ‘effects.’ Causes in this sense—in the Humean sense—are not agents but events constantly conjoined with others according to the laws of nature. Only efficient causes are causes per se, for Reid, but events that ‘cause’ others according to laws of nature may be called ‘causes’ or ‘necessary causes’ as long as we bear in mind that this is an improper use of the term.

Originally, however, Reid notices that human beings were prone to think of inanimate beings as agents. In agreement with Hume, Reid writes that human beings tended to think a soul or agent is the cause of any motion that is not accounted for. In anticipation of contemporary cognitive science, Reid recognizes that humans have what is called today a ‘hyperactive agency detection device.’ We are prone to attribute powers to beings, qualities and relations that are inanimate and passive (Of Power, 6 and EAP, 207). Then by reflection and education human beings can come to observe that objects in nature are merely passive, and that they have no productive force or power. As philosophy advances, Reid writes, we find that objects which appeared to be intelligent and active are dead, inactive, passive, and moved necessarily (EAP, 207).

Nonetheless, many human beings still tend to think of laws of nature as powers or as causes in the true sense of having power to initiate change (EAP, 211). They observe constant conjunction, events constantly following others in similar circumstances, such as heat and ice melting. However, they do not perceive the real connection between these events. “Antecedent to experience,” Reid writes, “we should see no ground to think that heat will turn ice into water any more than that it will turn water into ice” (Of Power, 7). And yet the tendency is to attribute to laws of nature behind these events the same productive powers we attribute to agents. But, Reid argues, Newton himself was correct to write that a little reflection will clearly show that laws of nature cannot produce any phenomenon “unless there be some agent that puts the law in execution” (Of Power, 7). It is as absurd to attribute agency to laws of nature as it is to attribute agency to beings who lack will and understanding. However, one may continue to use the word ‘cause’ to refer to events constantly preceding others according to the laws of nature, or to refer to the laws themselves, but one must recognize, Reid insists, that such a use is a popular and improper use of the term.

We do not observe active power in inanimate beings, and we do not perceive such a productive power in the laws of nature. Still, human beings all believe that beginning of existence, order and purpose require an efficient cause (section 4.b.i). Our causal beliefs, Reid argues, are teleological: we assume there is purpose and activity behind events in nature. Our context, education, bad science, or certain kinds of religion or philosophy might then lead us to give up such teleological beliefs. But, for Reid, careful attention to those beliefs that are presupposed behind our arguments and practices will reveal that human beings all have the (legitimate) tendency to think of nature as purposeful.

For Reid, laws of nature express purpose because they are expressions of divine character traits. Every agent who forms resolutions and fixed purposes will tend to act in a regular manner (section 2.b). Since natural events do not have any productive capacity, since the laws of nature are not productive powers, since all change and beginning of existence requires an efficient cause, and since human agents are not these efficient causes, there must be another agent behind the existence and motion of nature: God. Events in nature follow regular courses of action, which, like all regular courses of action, express the character traits or fixed resolutions of an agent: the Author of nature. We do not know exactly how this efficient cause acts in nature, Reid points out (EAP, 210). He might have acted once, or might act in all changes, or act through intermediary agents. But “it is sufficient for us to know, that, whatever the agents may be, whatever the manner of their operation, or the extent of their power, they depend upon the first cause, and are under his control; and this indeed is all that we know; beyond this we are left in the darkness” (EAP, 30). In conclusion, therefore, efficient causes are agents, beings with active power, will and understanding. Events in nature and inanimate objects are not agents, and hence are called ‘causes’ in a popular sense only, but they require an efficient cause, an agent who acts in an orderly and purposeful manner.

iii. Principles of Action and Causation

In opposition to his ‘necessitarian’ opponents, Reid argues that the rational principles of action that motivate human agents in their choices and actions are not causes (in any sense of the term). The opponents Reid confronts directly are David Hume and Joseph Priestley, but he also has in mind philosophers such as Locke, Collins, Leibniz, and Spinoza. All of these philosophers held, according to Reid, that the will is not free, but that human choices and decisions are determined, necessarily, by the strongest, or, under some views, the best, motive.

In Reid’s account, the two great categories of motives or principles of action are the animal and the rational ones. Reid points out animal motives may function as physical causes do, but only in cases where the agent has no power. In cases of torture, or of madness, human beings may lose all power of self-government and of action. In such instances we do not hold them responsible for their actions (although we might hold them responsible for previous actions which were in their power) because we recognize that the animal motive (pain, fear, etc.) was irresistible. For this reason Reid concludes that when a person discloses an important secret by the agony of the rack, for example, we pity him more than we blame him (EAP, 57-58). If the person with strength of mind succeeds in resisting the passion (the fear of pain, for instance), we impute the action to the person. If the person fails after trying, we impute the action more to the passion than to the person, and we blame the person in proportion to a person’s capacity to resist passions (EAP, 59). The extent to which we have power to act is inversely proportional to the influence of animal motives. Animal motives, therefore, may act as physical causes, but never as efficient causes, since events or mental states are not beings with power and understanding.

On the other hand, rational principles of action—which are valuable ends that agents recognize they may instantiate by their actions—never act as physical causes. It is impossible for such motives to function as physical causes, Reid argues. These motives are non-existing ends that the agent considers, and since they are not existing states of affairs there is no way they could function as physical causes (EAP, 214; more on this in Yaffe 2004). They could not function as efficient causes either, since efficient causes are substances with power and understanding. Hence (rational) motives are neither causes nor agents, “they suppose an efficient cause, and can do nothing without it” (EAP, 214). Instead of thinking of rational motives as causes, Reid writes, we should rather think of them as advice or exhortation, which leave agents perfectly free to determine their choices and actions.

Defenders of necessity (Collins, Hume, Kames, Hartley, Priestley, Leibniz) have argued, Reid writes, that the strongest of contrary motives must prevail. And if there is but one motive, they hold, then this motive determines the agent. There is a necessary relation, according to these philosophers, between a person’s motives and his actions. Motives, according to them, function as necessary causes. The strongest or best motive is the motive that causally determines the action.

In response to such views, Reid examines what test these philosophers use to determine the strength of a motive. Reid asks us to consider how we judge which of two motives is the strongest. One way to establish which motive is the strongest is to consider which motive prevails. The strongest motive, according to such a position, is simply the motive that prevails (that wins). Or, one could claim that “by the strength of motive is not meant its prevalence, but the cause of its prevalence…” (EAP, 217). But, Reid points out, this answer simply asserts that the strongest motive is the strongest motive. To prevail or win is identical to being the strongest (Ibid.). This solution, therefore, fails to offer a test or way of determining which motive is the strongest. Moreover, it assumes what must be proved: that the strongest motive, or the cause of the strongest motive, is the necessary cause of action. These accounts all beg the question and assume, Reid argues, that “motives are the causes, and the sole causes of actions” (Ibid.). Reid points out that the test of the strength of motives we use to determine if motives are necessitating causes to action must not assume that motives function as necessitating (physical) causes.

In reaction to his opponents, Reid writes that the strength of animal motives is determined differently from the strength of rational motives. The strongest animal motive, Reid observes, is the one that is most difficult to resist. The strength of animal motives is determined by the effort required to resist them. In a limited number of situations, they are irresistible (in cases of torture, for instance) but in the majority of situations agents are able to resist strong animal motives, Reid notices. Brutes cannot resist the strongest animal motive because they are not capable of forming the notion of ‘ought’ and ‘ought not’ (EAP, 219). Agents, on the other hand, sometimes act according to rational motives, because they are able to recognize that an action is worth performing even if it is contrary to a strong animal motive. They can consider whether the action will be either prudentially or morally valuable, and hence act according to rational motives. The strongest rational motive, Reid observes, is the one that offers the best reason to act. It is not the one which is hardest to resist, but it is “that which it is most our duty and our real happiness to follow” (EAP, 219). The best reason, furthermore, cannot be a necessitating cause to action. Indeed, ends such as moral value or overall well-being are non-existing states of affairs, and they are not substances. Hence they could be neither necessary (physical) causes nor efficient causes.

Finally, Reid points out that it is true that we often reason from men’s actions to their motives. But we do not have enough evidence to be able to conclude, inductively, that human beings are always determined by motives (EAP, 220). Indeed, we do not have empirical data to justify such an inference. The relation strong motive-action is a probable one, but there are too many instances in which humans resist strong motives, or act according to weak motives, or act in light of future values, or act against what they consider to be best. The relation is probable, at best, but far from being certain, Reid concludes. Moreover, in many cases that are unimportant, it is possible for agents to act without any motive at all (EAP, 215; see also Rowe 1991, 171-175).

c. The Problem of Infinite Regress

Several well-known objections could be offered to theories of liberty such as Reid’s. One kind of objection is that theories of action which hold that choices and actions are not causally determined by motives, or that hold that agents have power over their will (their volitions), lead to problems of infinite regress. Some argue that accounts of liberty such as Reid’s presuppose a regress of acts of will. Others argue that they lead to a regress of motives or reasons for action. And some point out that they lead to a regress of exertions of power. All such kinds of infinite regresses are impossible, the objections go, and hence the doctrine of liberty which Reid defends must therefore be false.

Reid offers a response to the first kind of regress. Indeed, he notices that an objection first advanced by Thomas Hobbes is that the claim that we have power over our will (to will this way or that) amounts to saying that we may will it, if we will. Hobbes therefore argues, Reid notes, that in order to act the will must thus be determined by a prior will, which must in turn be determined by a prior will, and so on ad infinitum (EAP, 199). Hobbes, Reid writes, therefore holds that liberty consists only in the power to act as we will, and it does not extend to the will itself.

In response to Hobbes, Reid points out that his account does not imply an infinite regress of acts of will because even though the determinations of the will are effects, Reid writes, which must have a cause, the cause of the determining (the choice or intention) is not some previous choice or intention. If the will is free, the agent is the cause of the agent’s determinations, he writes, and the action is thus imputable to the agent. If another agent is the cause (either immediately or by the interposition of other events), then the determination is imputable to that other agent. There is no regress of acts of will, according to Reid, since the cause is the agent and not another act of will (EAP, 201). Moreover, as explained in section 1.b.ii, Reid argues that power over an end action requires having power over the intermediary actions leading up to the end. Volitions (choices or willings) are internal actions leading up to the end, and hence power over the end implies having power over the volitions (Ibid.).

Another possible regress implied by Reid’s account of liberty (which is often called today a ‘libertarian’ account of freedom, to be distinguished from the political theory) is a regress in reasons or explanations. Under an account like Reid’s, the agent is influenced by different motives, desires, reasons, and the agent then decides to act in line with one motive rather than another. How does the agent make this choice? Leibniz, for instance, writes that if the best motive does not causally necessitate the choice, then the agent must have a further reason or motive to choose one motive rather than another. Leibniz imagines that the agent must listen to different voices, different motives, and that in order to choose between these voices or considerations, the agent must listen to yet another higher-order voice, and choosing this higher-order motive requires considering still higher-order motives. The agent, that is, needs a higher-order motive in order to choose between lower-order motives. But then, Leibniz argues, the agent would need a reason to choose the higher-order motive, and so on ad infinitum (see Leibniz 1985, and Rowe 1991b, 181). Since agents cannot go through an infinite number of reasons, and in order to keep from falling into the regress trap, a more appealing account, according to Reid’s opponents, is the one that holds that agents are determined, necessarily, by the strongest or the best motive.

According to Reid, however, the process of weighing motives and of deliberation does not lead to an infinite regress of motives. Rational motives may be higher-order motives about which animal motive should be acted upon or resisted, or about the overall prudential or moral value of the courses of action to which various motives lead (EAP, 57). But at the moment of decision, deliberation has stopped. Deliberation ends with a final judgment about what one ought to do, and with an intention either to act accordingly, or to refrain from acting according to what is best. “The natural consequence of deliberation on any part of our conduct,” Reid writes, “is a determination how we shall act; and if it is not brought to this issue it is lost labour” (EAP, 64). At the moment of decision, Leibniz is wrong, Reid would contend, to think that the libertarian must revert to an infinite series of reasons. The agent knows which motive he should follow. Deliberation ends with a consideration of which course of action is most valuable. And whether the agent follows the one that is best or not will not depend on yet another reason or motive. Indeed, what other reason could he offer? Evaluations of what one ought to do, about prudence or moral duty, are evaluations about ends, according to Reid. And an end is something that requires no further justification or reason. Therefore, once a person recognizes the end, and which course of action will instantiate the end, no further reason needs to be offered. Now, whether or not the agent in fact acts according to such an end is a question of weakness of will, or of lack of self-government; it is not an issue of providing yet further motives for choosing between higher-order motives (Kroeker 2007).

In the late 20th and early 21st century, several Reid scholars have argued that yet another kind of regress looms for Reid: an infinite regress of exertions of active power. Reid writes that “in order to the production of any effect, there must be in the cause, not only power, but the exertion of power; for power that is not exerted produces no effect” (EAP, 203). The regress arises when we consider the nature of the exertion of active power. Is the exertion an event? If so, then it must have a cause since Reid writes that every event must have an efficient cause. Efficient causes, for Reid, are agents, and when an agent acts freely, that agent is the cause of his or her own action. But if exertions are effects, does this mean that the agent causes the exertion of power? If so, then the agent would have to exert his active power in order to exert his active power. But if this former exertion is also an event caused by the agent, then it requires another previous exertion of active power, and so on.

Reid does not explicitly address the problem of regress of exertions of active power, and scholars disagree about which response he would adopt. Since this problem is well known, it is worth mentioning some suggested responses. One possible answer is simply to accept the regress, and to defend its coherence (Chisholm 1979). Another possible strategy is to argue that some acts of will are acts of mind that are uncaused. William Rowe, for instance, argues that the exertion of active power is an event but it is logically or conceptually impossible for such an event to be the effect of the exertion of active power of any agent (Rowe 1991b). For Rowe exertions are acts by which agents agent-cause their choices and actions. But agents do not cause the exertion itself, Rowe argues.

Another Reidian response to the problem of regress of exertions is that there can be events that an agent causes without bringing about any other event as a means of producing them. Paul Hoffman, for instance, suggests that exertions of power are events, but events of a kind that do not require a previous exertion of active power (this solution was first considered, but not ultimately defended, by Rowe (1987)). On Hoffman’s view, exertions of active power are exempt from being caused by an exertion of active power. These events “do not require a prior activating of a power” (Hoffman 2006, 445).

A further suggested solution is to reject the view that exertions of active power are events. Timothy O’Connor, for instance, argues that exertions are better understood as relations that hold between agents and their effects. Agents bring about their volitions, but not by some further action. They bring about their effects directly. And this irreducible relation is the exertion of active power. Exertions of active power are relations between causes (agents) and effects (the agents’ volitions and actions), and as relations they are not caused, and hence are not events (O’Connor 1994, 621).

A solution to the regress of exertions problem, Gideon Yaffe alternatively suggests, may be found if we consider what is involved in trying to act. For Yaffe, when an agent tries to do something and succeeds, there is just one action. If someone tries to commit a murder and succeeds in committing the murder, for example, we do not think the person performed two different things: trying and killing. The only action performed here is killing. The trying is not a separate action, but it is somehow part of the action of killing. Hence, Yaffe writes that “if the trying is anywhere in cases of success it is ‘in’ the successful action” (Yaffe 2004, 157). Now, Yaffe continues, if exerting power is just like trying, then in cases of success, the exertion is in the action. In cases of failure, on the other hand, the only thing the agent does is try. Trying in cases of failure is an action or event. We might say that in cases of failure the agent exerts his power to try. Moreover, since trying is a successful action, there is no further action of trying to try which would require a further exertion of active power. Hence, when we understand the exertion of power as trying (regardless of whether the trying succeeds and the agent acts, or the agent fails and the agent only tries to act) it is false to say that such exertions lead to a regress of exertions. Yaffe concludes, therefore, that since the relation agent-action in cases of success is direct and non-reducible, Reid’s account is best characterized as an agent-causal view of action.

Finally, James Van Cleve suggests, in response to the infinite regress of exertions objection, that in causing an action, the agent causes his own causing of the action. An agent’s causing some action has a reflexive component: in causing an action, the agent thereby causes the causing of the action. And we need not worry about the further complexity that in causing the causing of the action I also cause the causing of the causing of the action, because this further complexity “is matched by no corresponding complexity in the fact described” (Van Cleve 2015, 432). In causing an action, Van Cleve argues, the agent causes his own causing of the action—the causing of the action is not a further causing of this causing but is the agent himself.

d. First Argument for Moral Liberty

In the fourth essay of the Essays on the Active Powers, Reid focuses on three arguments for the conclusion that adult human beings have active power over their will. According to Reid’s account, moral liberty is the freedom an agent has to choose and to act (see sections 1.b and 4.a). What Reid calls the ‘first argument’ for moral liberty is based on the fact that the belief that we are free and the belief that every event must have a cause which had power to produce it are natural beliefs. For Reid some beliefs are natural beliefs, and natural beliefs are legitimate. They are legitimate in that it is reasonable to hold these beliefs unless proof is brought against them. Reid will show that there is, in fact, no good proof brought against such beliefs, and hence that it is legitimate to hold that humans possess moral liberty.

Natural beliefs are held by all adult human beings who are not under the influence of madness, of drugs, or of circumstances which might diminish their rational and active capacities. They are generated by our innate natural faculties, and, for Reid, we have no good reason to distrust our natural faculties. They all stand on the same footing, and hence we have no reason to trust one faculty and not another, Reid argues. Moreover, he shows that we cannot distrust all of them since we would be using our faculties to offer reasons for distrusting all of them—a position that is self-defeating and untenable (EAP, 229).

Another characteristic of natural beliefs is that they appear very early in the mind of humans even though they are neither a result of induction nor of deduction. Children, for instance, believe very early that they have a degree of active power over their actions, and that every event must have an efficient cause. Human beings do not first form these beliefs by observing external objects. In nature we only perceive events constantly conjoined to others, Reid points out, and not the connection between them (EAP, 229). The notion of power (and of efficient causation), therefore, cannot be formed by observation of external objects (see section 1.a on our ideas of active power; and 4.b.i for more on natural principles). Moreover, as adults we are not directly conscious of our own power (only of the exertions of our power), but yet we are convinced that we do have power when we choose and act. For Reid we only will to perform actions which we believe to be in our power. Hence, in exerting our power (in acting), we also believe that the action is in our power, and hence that we have the power over our willings and external actions.

Furthermore, Reid continues, the conviction that we have power over our choices and actions is implied in all our deliberations. When we deliberate about which action we should perform, according to Reid, we are convinced that the action we are considering performing is in our power. And once we finish deliberating, our resolutions and purposes imply a conviction that we have power to execute what we have resolved to do (EAP, 230). Reid adds that our acts of promising also imply the belief that we have the power to carry out what we have promised, and the human activity of blaming also presupposes a belief that the person acted wrongly. Blame, Reid writes, presupposes a wrongful use of power, as it is absurd to blame a person for yielding to necessity. Finally, the belief that humans often do have power over their will is assumed in the regulations of tribunals.

Some of these examples, however, show that humans believe that they have power over their actions, but they do not imply the conclusion Reid is seeking: that we do have power over our volitions. Reid’s view, in response, is that unless reflection, good philosophy, good science or compelling evidence gives us reason to distrust the propositions that sane adult human beings believe naturally, we are justified in holding them. Natural beliefs are legitimate; they express what is the case. To accept a proposition as a natural principle (as a first principle), it is not enough to simply assert that we all believe it. Reid argues, rather, that there are “ways of reasoning about them, by which those that are just and solid may be confirmed, and those that are false may be detected” (EIP, 463). One must show, for instance, that the belief is acquired without the need of inductive or deductive reasoning, that it is common to the learned and the unlearned, that giving up the conviction would lead to absurdities, that it enjoys the same kind and level of evidence as other similar beliefs which we admit as first principles, and that it appears early in the mind of children (EIP, VI.4). Reid, in his first argument for moral liberty, contends that we have power over our will because it is a proposition that is believed, naturally, and because it meets the requirements for being a first principle. The burden of proof, therefore, falls on the side of those who defend the doctrine of necessity (EAP, 235). This epistemological point, coupled with Reid’s belief that there are no convincing arguments to show that the belief must be rejected, leads Reid to conclude that we have moral liberty.

One might object that many might profess that they do not, in fact, believe that they have power over their will. Reid, in response, argues that what people profess or state is not always in line with what they actually believe. To know what people believe, Reid points out, it is not always best to ask them. It is better to look at their actions and practices. Hence a man might claim that he is not afraid of the dark, and yet refuse to sleep alone and to turn the light off. His actions betray his fear, Reid points out (EAP, 232). The same holds for our beliefs of power over our will and actions. Everyday practices are evidence that we believe we have power not only over our actions but also over our will.

e. Second Argument for Moral Liberty

Reid also argues for moral liberty from the fact that humans are moral and accountable beings. For Reid there can be no “moral obligation and accountableness, praise and blame, merit and demerit, justice and injustice, reward and punishment, wisdom and folly, virtue and vice” if human beings do not have active power (EAP, 240). In civil government, in religion, and in all moral discourse, the obligation that human beings have to do what is right is taken as given. Humans, it is universally recognized, are moral and accountable beings.

In order to be a moral being, Reid argues, a being must know the law and must have the power to obey it. Brute animals, Reid observes, are not moral and accountable beings, because they do not have the capacity to understand the moral law. Human beings, however, if they are not insane or incapacitated, are, according to Reid, capable of recognizing a moral law (what they ought to do). Human beings also have the power to do what they are responsible for (EAP, 237). Reid points out that it is self-evident to all human beings that one cannot “be under an obligation to do what it is impossible for him to do” (Ibid.). A person may be responsible for a previous action, like cutting off his finger, but he cannot then be held responsible for not being able to use his fingers for a particular action that requires dexterity. The fact that humans are held responsible for some of their actions, therefore, implies that they are capable of knowing the moral law, and that they can have the power to act accordingly.

The doctrine of necessity, Reid argues, is not coherent with the moral accountableness of human agents. According to the doctrine of necessity, Reid writes, a man is not free to will or to choose. But if this is the case, then it is impossible for a person to will otherwise than how he in fact wills. But this means that it is impossible for a person who is determined to will to refrain from willing. And hence, according to Reid, the doctrine of necessity implies that persons should not be held responsible, because it is absurd to hold someone responsible for what he could not refrain from choosing. But we do continue to hold humans accountable for their choices and actions. Again, therefore, the burden of proof falls on those who hold that our everyday practices and beliefs are wrong, according to Reid.

f. Third Argument for Moral Liberty

In his third argument for moral liberty Reid argues that human beings have power over their actions and their volitions because they are capable of conceiving plans and of carrying them out, which implies they possess moral liberty. A regular plan of conduct, Reid contends, cannot be contrived without understanding, and cannot be carried into execution without power (EAP, 240).

Reid assumes, from the start, that some persons lay down a plan of conduct, and resolve to pursue that plan. Reid also assumes we will all agree that some have in fact pursued such plans or ends by adopting and performing the proper means.

For Reid, these assumptions imply that the person who carries out the plan has wisdom and understanding. But, he argues, they also demonstrate that the person has a degree of power over his voluntary determinations. For Reid, “understanding without power may project, but it can execute nothing” (EAP, 240). A regular plan of action cannot be contrived without understanding, and it cannot be executed without power.

The question, now, is ‘where is this understanding and power?’ In other words: ‘who is the true cause of the plan?’ If wisdom and understanding are in fact in the person who carried out the plan, Reid writes, then we have the very same evidence that the power which executed it is also in that person.

For Reid’s opponent, the true cause of the plan lies in the motives of the agent, and not in the agent who exerts his active power. But, Reid points out, motives do not have understanding and power. These qualities are characteristics of agents, and only agents have understanding and power over their wills and actions. So, is there an agent other than the person who carried out the plan who in fact arranged the motives (and was the true cause of the plan)? If the cause is an agent other than the person who carried out the plan, Reid writes, then we cannot say the person had a hand in the execution of the plan; it is not even his plan, and, further, we have no evidence that the person is a thinking being. After all, many animals carry out plans such as building beehives or spinning webs. But we do not think of the plans as the effects of intelligence in the animal. The plans here are effects of another agent, of another efficient cause. But in the case of human beings, the plans they carry out really are their plans. And we think of humans as exerting the intelligence required to lay out and pursue their plans of action. Hence, Reid concludes, if the intelligence is really in the person who carried out the plan and whose plan it is, then the power must be in that person as well.

What evidence do we have that our fellow human beings think and reason? For Reid, the evidence that persons who carry out intelligent plans have wisdom and understanding is provided by the persons’ speech and actions. The behavior of human beings, their language, and their bodily movements and expressions are all signs of intelligence (for more on natural signs of mental states, see EAP, 116 and 141). The understanding therefore truly lies in the person who carried out the plan. However, to carry out a plan, one must have power to execute it. Hence, “if the actions and speeches of other men give us sufficient evidence that they are reasonable beings, they give us the same evidence, and the same degree of evidence, that they are free agents” (EAP, 242). Pursuing plans, step by step, therefore, is a sign of both wisdom and active power.

In conclusion, Reid recognizes that although many questions remain to be answered, one thing is clear: in everyday activities and concerns of the present life of human beings, no person is very much affected by the doctrine of necessity (EAP, 269). Even though some vehemently defend necessity and others zealously guard liberty, we see no great difference between them in their everyday conduct. The fatalist, despite his claims, “deliberates, resolves, and plights his faith. He lays down a plan of conduct, and prosecutes it with vigour and industry…” (EAP, 269). He exhorts, commands, blames and praises. In all cases “he sees that it would be absurd not to act and to judge as those ought to do who believe themselves and other men to be free agents” (Ibid.). In everyday life, Reid concludes, all men act in a way that is consistent with his theory of liberty, and inconsistent with the doctrine of necessity. All humans believe, in the everyday, that they possess active power over some of their actions. And what all humans believe is a sign of what is, according to Reid, unless one may offer good arguments to show that the belief is false. In the absence of such good arguments, Reid thinks that the most reasonable position is to claim that humans have moral liberty. And what is most important, according to Reid, is for us “to manage these powers, by proposing to ourselves the best ends, planning the most proper system of conduct that is in our power, and executing it with industry and zeal. This is true wisdom; this is the very intention of our being” (EAP, 5). Powers to act, to direct our actions, to plan, to decide, are all useful powers for Reid, and using them well, for good ends, is the noblest human mission.

5. References and Further Reading

a. Primary Sources

  • Collins, Anthony (1976) ‘A Philosophical Inquiry Concerning Human Liberty’ in Determinism and Free Will, ed. J.O’Higgins SJ. The Hague: Martinus Nijhoff.
  • Hobbes, Thomas (1648) ‘Of Liberty and Necessity’ in The English Works of Thomas Hobbes, iv, accessed November 24, 2015, https://archive.org/details/englishworkstho28hobbgoog.
  • Hume, David (1998) An Enquiry Concerning the Principles of Morals (EPM), Tom L. Beauchamp (ed.). Oxford: Oxford University Press.
  • Hume, David (2007a) A Treatise of Human Nature: A Critical Edition (T), David Fate Norton and Mary J. Norton (eds.). Oxford: Clarendon Press.
  • Leibniz, Gottfried (1985) Theodicy. LaSalle, Il: Open Court.
  • Locke, John (1979) An Essay Concerning Human Understanding. Oxford: Clarendon Press.
  • Priestley, Joseph (1782) Doctrine of Philosophical Necessity Illustrated; being an Appendix to the Disquisitions relating to Matter and Spirit. London.
  • Reid, Thomas (1997) An Inquiry into the Human Mind on the Principles of Common Sense (I), Derek R. Brookes (ed.). Edinburgh, UK: Edinburgh University Press. (Original work published in 1764.)
  • Reid, Thomas (2001) ‘Of Power’, The Philosophical Quarterly 51.202: 3-12 (Original work unpublished manuscript, 1772).
  • Reid, Thomas (2002) Essays on the Intellectual Powers of Man: A Critical Edition (EIP), Derek R. Brookes (ed.). Edinburgh, UK: Edinburgh University Press. (Original work published in 1785.)
  • Reid, Thomas (2010) Essays on the Active Powers of Man: A Critical Edition (EAP), Knud Haakonssen and James A. Harris (eds.). Edinburgh, UK: Edinburgh University Press. (Original work published in 1788.)
  • Spinoza, Baruch (1985) Ethics, in The Collected Works of Spinoza, vol. 1, transl. E. Curley. Princeton: Princeton University Press.

b. Secondary Sources

  • Alvarez, Maria (2010) ‘Thomas Reid’, in A Companion to the Philosophy of Action, Timothy O’Connor & Constantine Sandis (eds.). UK: Wiley-Blackwell: 505-512.
    • Argues that Reid’s endorsement of the thesis that we cause changes by causing volitions gives rise to serious problems.
  • Chisholm, Roderick M. (1979) ‘Objects and Persons: Revisions and Replies’, in Essays on the Philosophy of Roderick M. Chisholm, Ernest Sosa (ed.). Amsterdam, Rodopi: 317-388.
    • Recognizes that his account implies an infinite regress but denies that the regress is vicious.
  • Hatcher, Michael (2013) ‘Reid’s Third Argument for Moral Liberty’, British Journal for the History of Philosophy, 21.4: 688-710.
    • Argues that the argument from design is one of the premises in Reid’s third argument for moral liberty.
  • Harris, James (2003) ‘On Reid’s “Inconsistency Triad”: A Reply to McDermid’, British Journal for the History of Philosophy 11.1: 121-127.
    • Defends Reid against the charge that Reid’s attempt to offer arguments for moral liberty is inconsistent with his view that the proposition that we have power over our wills is a first principle, and hence cannot be proved.
  • Hoffman, Paul (2006) ‘Thomas Reid’s Notion of Exertion’, Journal of the History of Philosophy 44.3: 431-447.
    • Offers a Reidian answer to the infinite regress of exertions problem by showing that exertions of power are events which are exempt from being caused by prior exertions.
  • Jaffro, Laurent (2011) ‘Reid on powers of the mind and the person behind the curtain’, Canadian Journal of Philosophy, 41.sup1: 197-213.
    • Examines Reid’s claim that there is no inert activity; the operations of the understanding involve some degree of activity.
  • Kroeker, Esther (2007) ‘Explaining our Choices: Reid on Motives, Character and Effort’, The Journal of Scottish Philosophy 5.2: 187-212.
    • Offers a Reidian answer to the objection that Reid’s account of moral liberty leads to an infinite regress of reasons and motives, and examines Reid’s criticism of the Principle of Sufficient Reason.
  • Kroeker, Esther (2011) ‘Reid’s Moral Psychology: animal motives as guides to virtue’, Canadian Journal of Philosophy, 41.sup1: 122-141.
    • Examines Reid’s understanding of the animal motives and argues that uncorrupted animal motives are, for Reid, guides to virtue.
  • Lindsay, Chris (2005) ‘Reid on Scepticism about Agency and the Self’, Journal of Scottish Philosophy 3.1: 19-33.
    • Defends Reid’s possible response to Alvarez’s objection that Reid’s theory gives rise to a skeptical worry concerning awareness of one’s actions, and suggests there are still tensions originating from Reid’s dualist metaphysics.
  • Madden, Edward (1983) ‘Commonsense and Agency Theory’, Review of Metaphysics 36.2: 319-341.
    • Presents an overview of Reid’s views on agency and causality, and discusses problems related to Reid’s claim that reasons for action are not causes.
  • McDermid, Douglas (1999) ‘Thomas Reid on Moral Liberty and Common Sense’, British Journal for the History of Philosophy 7.2: 275-303.
    • Presents the essential features of Reid’s theory of liberty, shows how his arguments depend on common sense epistemology, and discusses limitations of Reid’s account.
  • McDermid, Douglas (2010) ‘A Second Look at Reid’s First Argument for Moral Liberty’, in Reid on Ethics, Sabine Roeser (ed.). Great Britain: Palgrave Macmillan: 143-163.
    • Discusses the Reidian claim that the principle that we have moral liberty (that we have some degree of power over our actions and determinations of our will) is a principle of common sense.
  • O’Connor, Timothy (1994) ‘Thomas Reid on Free Agency’, Journal of the History of Philosophy 32.4: 605-622.
    • Argues that the Reidian response to the problem of infinite regress of exertions of power is that it is best to understand exertions as relations that hold between agents and their actions, and that such relations, as opposed to events, are uncaused.
  • Rowe, William (1987) ‘Two Concepts of Freedom’, Proceedings and Addresses of the American Philosophical Association 61: 43-64.
    • Presents a Reidian response to the objection that Reid’s moral theory leads to an infinite regress of exertions of power by arguing that exertions are the kind of events that do not require a previous exertion of power.
  • Rowe, William (1991a) ‘Responsibility, Agent-Causation, and Freedom’, Ethics 101: 237-257.
    • Argues that agent-causation, for Reid, is the concept that plays a central role in the logical connection between moral responsibility and freedom.
  • Rowe, William (1991b) Thomas Reid on Freedom and Morality. Ithaca: Cornell University Press.
    • Presents and evaluates Reid’s theory of freedom.
  • Stalley, R.F. (1998) ‘Hume and Reid on the Nature of Action’, Reid Studies, 12: 33-48.
    • Discusses Reid’s response and criticism of Hume’s claim that we have no clear idea of power or cause.
  • Stecker, Robert (1992) ‘Thomas Reid’s Philosophy of Action’, Philosophical Studies 66: 197-208.
    • Argues that Reid’s theory of causality leads to a regress problem, and if it does not it is still uninformative and mysterious.
  • Van Cleve, James (2015) Problems from Reid. Oxford: Oxford University Press.
    • Examines Reid’s arguments related to a wide range of topics such as perception, sensation, skepticism, conception, epistemology and first principles, as well as action theory.
  • Van Woudenberg, René (2010) ‘Thomas Reid on Determinism,’ in Reid on Ethics, Sabine Roeser (ed.). Great Britain: Palgrave Macmillan: 123-142.
    • Argues that Reid understood ‘determinism’ differently from how it is currently understood, and discusses Reid’s criticism of determinism as he understood it.
  • Weinstock, Jerome (1975) ‘Reid’s Definition of Freedom’, Journal of the History of Philosophy 13.3: 335-345.
    • Examines Reid’s attempt to include power over one’s will in the definition of freedom, and argues that power over the will has to do with possession of reason and a faculty of judgment that enables us to govern ourselves.
  • Yaffe, Gideon (2007) ‘Promises, Social Acts, and Reid’s First Argument for Moral Liberty’, Journal of the History of Philosophy 45.2: 267-289.
    • Presents Reid’s philosophical views about the act of promising and shows that, for Reid, it is because promising is a social act that it presupposes active power.
  • Yaffe, Gideon (2004) Manifest Activity: Thomas Reid’s Theory of Action. Oxford: Oxford University Press.
    • Offers a critical examination and reconstruction of Reid’s arguments in favor of his doctrine of moral liberty.

 

Author Information

Esther Engels Kroeker
Email: Esther.kroeker@uantwerpen.be
University of Antwerp
Belgium

Avicenna (Ibn Sina): Logic        

Avicenna (Ibn Sina) (c. 980-1037 C.E., or 375-428 of Hegira) is one of the most important philosophers and logicians in the Arabic world. His logical works are presented in several treatises. Some of them are commentaries on Aristotle’s Organon, and are presented in al-Shifa al-Mantiq, the logical part of the Encyclopedic Book al-Shifa (The Cure), which also contains a treatise on Metaphysics (al-Ilahiyat) and a treatise on Natural Science (al-tabi‘iyyat); others are edited on their own, such as the books entitled al-Isharat wa-t-tanbihat, al-Najat, and Mantiq al-mashriqiyyin. Among his writings, there is a book written in Persian called Danishnama-yi ʻAla᾿i (The Book of Knowledge for ʻAla al-Dawla).

As a logician, he was mainly influenced by Aristotle’s commentators such as Alexander of Aphrodisias, and by al-Farabi, in the Arabic world, and had himself many followers including al-Ghazali, Nasir-eddine al-Tusi, Afdal al-Din al-Khunaji and Fakhr al-Din al-Razi. He added new concepts and distinctions that are not found in the writings of the ancient authors, or in those of the earlier logicians of the Arabic tradition. He improved the Aristotelian categorical and modal syllogistics, and constructed a whole system of hypothetical logic, different from the Stoic system and far more developed than al-Farabi’s reflections on the same topic. His modal syllogistic in particular is very different from the Aristotelian one, for the conversions involving the modal propositions are different. As to hypothetical logic, it involves the conditional as well as disjunctive propositions and makes connections between them both and between them and the predicative ones.

In his analysis of the syllogisms, Avicenna introduces a new distinction between the iqtirani (translated as “conjunctive”) syllogisms and the istithna᾿i (usually translated as “exceptive”) ones [that is, the usual Stoic kind of syllogisms], the former being a large class including the categorical syllogisms together with one kind of the hypothetical syllogisms. In the former, the conclusion does not occur in the premises; it conveys new knowledge deduced from the two premises, while in the latter, which uses the istithna, that is, detachment, the premises explicitly include either the conclusion or its contradictory.

His analysis of the absolute (or non-modal) propositions is new and original, since he introduces temporal considerations in this type of proposition and renews the oppositional relations between these propositions by introducing perpetual propositions, general absolute propositions and special absolute propositions. This analysis is influenced by semantic and linguistic considerations.

Table of Contents

  1. Life and Works
  2. The Definition of Logic
  3. The Different Kinds of Propositions and their Relations
  4. The Analysis of the Absolute Propositions in al-Shifa, al-Qiyas
  5. The Absolute Propositions in Mantiq al-Mashriqiyyin
  6. The Categorical Syllogistic
  7. The Modal Propositions and Oppositions
  8. The Modal Syllogistic
  9. The Hypothetical Logic
  10. Combining Hypothetical and Categorical Propositions
  11. References and Further Reading
    1. Avicenna’s Logical Treatises and Other Primary Sources
    2. Secondary Sources

1. Life and Works

Abu Ali al-Hussein Abdullah ibn al-Hassan ibn Ali Ibn Sina (called Avicenna in the West) was born in a village near Bukhara in 980 C.E. (375 of Hegira). At a very young age, he was taught the Koran together with much literature. He also learned philosophy, geometry and Indian calculus during his childhood and youth. His teacher, al-Natili, taught him Logic, starting with Porphyry’s Isagoge. He also studied the other treatises of the Aristotelian Organon and Euclidean geometry, which he mastered readily. While turning to Natural Science, he became interested in medicine and read many books related to that topic. At the age of 16, he was so knowledgeable in that domain that he was able to treat and cure people. As he became famous as a physician, he was asked to take care of the Sultan Nuh ibn Mansur and succeeded in curing him of his disease. To thank him, the sultan gave him access to the royal library where he could read many original medical books, together with books on poetry, Arabic grammar and theology. At the age of 18, he was already very knowledgeable in all these disciplines and had read many unknown ancient treatises in various sciences. However, he had some difficulty understanding Aristotle’s Metaphysics, which he read several times without getting the point of it. Only after he had the opportunity to read al-Farabi’s commentary on Metaphysics, entitled Aghradu kitab ma ba‘d at-tabi‘a, could he understand the aim and interest of metaphysics.

After the death of his father, he left Bukhara because of some troubles at that time and started to travel. He went to several places including Khurasan, Jurjan, and other Persian towns searching for subsistence. He started writing, in particular the Kitab al-Qanun (on medicine) and other books in various domains. He then became the minister for Shams ad-Dawlah (the sultan of Hamadan in Persia) after having cured him of his disease. He started working for the sultan during the day and writing his books, in particular al-Shifa, in the evenings. He wrote first al-Tabi‘iyyat (Natural Science) of the Shifa book, after finishing the first volume of the Qanun. His secretary and his brother assisted him by reading and copying these books.

After the death of the sultan and because of some political complications, he went to jail for four months, where he wrote Hayy ibn Yaqdan, among other works; however, he was able to escape and travel with his brother and his secretary to Ispahan, where he stayed the rest of his life. In Ispahan, he became the minister and doctor of the sultan ‘Ala’ ud-Dawla; he also wrote Kitab al-Najat, plus the rest of Kitab al-shifa, in particular the logical part of it, and other books on arithmetic, geometry, music, and biology (anatomy and botany) together with a book on astronomy, at the request of the sultan. He also wrote three books on language and wrote down his medical observations and experiences in his famous Kitab al-Qanun.

Although Avicenna worked very hard, he also enjoyed a good living. However, during an expedition with the sultan he got sick; he tried to cure himself, but despite his efforts, he became very weak, and the intestinal disease did not disappear completely, recurring several times. He died at the age of 53 (428 of Hegira) from the disease, realizing that no medicine could cure him (al-Isharat wa-t-tanbihat 85-97; Mantiq al-Mashriqiyyin; “Avicenna”).

Avicenna’s corpus is rich and varied. He wrote on all sciences, from astronomy, botany, and medicine to metaphysics, logic, physics, chemistry, linguistics and many others. In the field of logic, which is our concern here, he wrote several treatises, such as: al-Shifa al-Mantiq (which contains the correspondence on all Aristotelian treatises), al-Najat (translated in 2011 by Asad Q. Ahmed under the title: Avicenna’s Deliverance: Logic), al-Isharat wa-t-tanbihat, al-Qasida al-muzdawija, and Mantiq al-Mashriqiyyin. The latter, however, is what remains from a much longer book which, apparently, was destroyed in Ispahan in 1034 C.E. and is said to have contained Avicenna’s project on an Eastern philosophy called hikma mashriqiyya (Arnaldez 192).

2. The Definition of Logic

Avicenna defines the field of logic in many of his logical treatises. Some of these definitions include the following:

What is meant by logic, for men, is that it is a regulative (qanuniya) tool whose use prevents his mind from making errors (ʼan yadalla fi fikrihi) (al-Isharat 117).

…for [logic] is the tool that prevents the mind from errors in what men conceive and assent to, and it is what leads to the true convictions by providing their reasons and by following its methods (al-Najat 3).

Logic is said to be what “prevents the mind from errors.” This means, first, that it is a tool—an Organon in classical terms—and second, that its aim is to help reach the truth and avoid fallacies by studying in a specific way the methods used for that purpose. Logic focuses on the methods used in the quest for truth; it is thus different from other sciences in this particular way. Its focus is to study all the relevant methods that can be used to reach the truth in the best and safest way. However, the second definition is more precise than the first, as it explicitly evokes the notions of concepts and assents. As defined by Tusi, logic is “a science by itself (ʻilmun bi nafsihi) and a tool with regard to other sciences…” (al-Isharat 117, note 1). It is, in other words, the science that studies the best tools to reach the truth. This opinion can be found in al-Qiyas, where Avicenna says that these two features of logic are not contrary but compatible (al-Qiyas 11.7).

However, what exactly is the subject matter of logic? In al-Shifa al-Madkhal, Avicenna distinguishes between concepts on the one hand and propositions on the other. Propositions are either true or false, whereas concepts are components of the propositions and are neither true nor false. They are expressed in general terms, which can either be verbs, nouns, or adjectives. These terms may either be subjects or predicates in propositions. Logic thus studies the conceptions and assents as the Arabic authors largely hold. Although Avicenna talks about this view of logic explicitly, he is often said by many commentators to endorse another view, namely that logic is concerned with the study of the secondary intentions (or intelligibles, as they are sometimes called). The view that logic studies secondary intelligibles is expressed in al-Ilahiyyat (the metaphysics of the Shifa) rather than in the logical treatises. Avicenna says in that treatise:

The subject matter of logic, as you know, is given by the secondary intelligible meanings, based on the first intelligible meanings, with regard to how it is possible to pass by means of them from the known to the unknown [emphasis added], not in so far as they are intelligible and possess intellectual existence ([an existence] which does not depend on matter at all, or depends on an incorporated matter). (Metaphysics 7 qtd. by T. Street; the expression “as you know” is contested by one of the referees who prefers “as you have learned” or “as you have known”).

However, in this view, as the emphasized passage shows, the aim of the study is also always to show how one can go from the known to the unknown. Therefore, the real focus and specificity of logic is to pay attention to this passage. What are these secondary intelligible meanings? According to Street, these are second-level concepts, as distinguished from first-level concepts, such as the concepts of animal or human being, which refer to individuals in the real world, that is, the concepts that could have the property of being subjects, predicates, or genus. He thus focuses on the ontological aspect of this subject, as according to him, the secondary intelligibles are a distinct stretch of being (Street 2008, section 2.1.2). This account of Avicenna’s position is very close to A. I. Sabra’s analysis of the subject of logic in Avicenna’s theory, which relates Avicenna’s distinction to “the Porphyrian distinction between terms in first position and terms in second position,” (A. I. Sabra 1980, p. 753) that is, between the terms that refer to external objects and those referring to first position meanings.

However, Sabra’s interpretation of the first and secondary intelligibles has been criticized by W. Hodges, who finds Sabra’s comparison between the Avicennan and the Porphyrian distinctions “unfortunate, because there is an obvious candidate in Madkhal i.2–4 for the notions that Ibn Sina is referring back to, namely things in first and second mode of existence”. Thus, his own interpretation is different from that of both T. Street and A. I. Sabra in that it stresses the formal character of logic in Avicenna’s frame.

According to him, the definition provided in Mantiq al-Mashriqiyin clarifies the distinction between words “in first and second mode of existence” in the best way. This definition is the following: “And the subject [of logic] is the meanings in the context of their being subject to composition through which they reach a point where an idea is made available in our minds which was not in our minds [before]… (wa mawduʻuhu – al-maʻani min haythu hiya mawduʻatun lil-ta᾿lifi alladhi tasiru bihi muwassilatan ᾿ila tahqiqi shay᾿in fi adhhanina laysa fi adhhanina, …)”. Accordingly, “being in second mode of existence is the same thing as being ‘subject to composition’ – or at least being subject to being made a component of proposition”. In this respect, the secondary intelligible meanings can have as features to be subjects or predicates, universal or particular, and so on. These meanings are parts of the propositions, which in turn are the components of a syllogistic mood and have as features to be either premises of a syllogistic mood or a conclusion following from the premises. The conclusion deduced from the premises contains the newly acquired knowledge, which results from the already known premises. Here we find the passage from the known to the unknown stressed by Avicenna in his various definitions of logic.

It is in this sense that logic may be said to be formal, that is, to study the inferences and arguments with respect to their logical structures in order to find out what inferences are conclusive, that is, valid. In Avicenna’s text, the notion of validity might be defined as truth “in all matters” (al-Qiyas 64.10-11), as he shows in that passage that the validity of a syllogistic mood does not and should not depend on the matter of the propositions it contains.

This shows that Avicenna privileges deductive logic over other kinds of logical systems. The logic he defends is demonstrative as he explicitly says in the very beginning of al-Qiyas, where he identifies the aim of logic as follows: “Our aim in the art (sinaʻa) of logic is first and foremost (al-awwal wa bi-dhdhat) to identify the syllogisms and, among them, to study the demonstrative syllogisms. The usefulness of this study for us is to be able, using this instrument, to acquire the demonstrative sciences” (al-Qiyas 3.2-4). Thus, in his frame, logic is deductive, and it is this feature that makes it useful to the other sciences, as it is by applying its inferences and rules that the other sciences can reach the truth. As an instrument or a tool in other sciences, it plays a most important and specific role (al-Qiyas 10.15). Its truths and rules are, because it is deductive, comparable to the truths of mathematics, which are exact, “far from error,” and admitted unanimously and are “well-regulated” (munassaq) and not subject to plurality (al-Qiyas 16.9-10). Likewise, the part of logic that studies proofs and syllogistic inferences, which is the heart of logic, “does not admit plurality if it is rightly understood, for it is of the well-regulated type (min al-qismi al-munassaq)” (al-Qiyas 17.2).

3. The Different Kinds of Propositions and their Relations

Avicenna distinguishes between several kinds of propositions in the different sections of his system. The propositions may be categorical, modal, or hypothetical.

The categorical propositions are predicative: they contain a subject and a predicate, plus a copula and a quantifier in some cases. This class includes singular propositions, indefinite propositions, and quantified propositions, which are either particular or universal. All kinds are either affirmative or negative.

The modal propositions contain a modal operator in addition to all the components of the categorical propositions. The modalities are expressed using the words “necessarily” and “possibly,” to which a negation may be added. They are also singular, indefinite, or quantified and may be affirmative or negative.

The hypothetical propositions are either conditional or disjunctive. The conditional ones contain the expression “if…then,” while the disjunctive ones contain either the words “either…or” or the expression “not…both.” Unlike the categorical propositions, their elements are not terms, but whole propositions, as in the example “If the sun rises, it is daytime” and “either this number is odd or it is even.”

Avicenna says that these hypothetical propositions, like the predicative ones, may also be singular, indefinite, or quantified. When they are quantified, they contain words such as “whenever” (kullama), “always” (da’iman), “never” (laysa al battata), “maybe” (qad yakun), “not (whenever)” (laysa kullama), “not always” (laysa da’iman). However, they differ from the predicative ones in that their elements are propositions related by a logical operator instead of two terms related by a copula. An example of a universal affirmative conditional is the following: AC: “Whenever H is Z, A is B.” The universal affirmative disjunctive, that is, AD, is expressed as follows: “Always either H is Z or A is B.”

Avicenna acknowledges the four relations of the traditional square of opposition between the quantified categorical propositions in al-‘Ibarah (De Interpretatione). However, he does not explicitly talk about a square and does not use geometrical figures at all, although he does precisely define the four relations by means of their truth conditions. These definitions are the following (the commonly used vowels A, E, I, O do not appear in Avicenna’s text; they are used here for convenience):

  • Contradiction (tanaqud): valid when its components are true-false in all matters (A/O and E/I).
  • Contrariety (tadadd): valid when its components are false-false in the possible matter or true-false in the necessary and impossible matters (A/E).
  • Subcontrariety (ma taht at-tadadd): valid when its components are true-true in the possible matter and true-false in the impossible and necessary matters (I/O).
  • Subalternation (called tadakhul, a word close to inclusion): valid when its components are false-true in the possible matter, true-true or false-false in the two other matters (A/I and E/O).
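
The four definitions can be checked mechanically. The following is a minimal sketch in Python, not a reconstruction of Avicenna's own procedure. It assumes (this assignment is not spelled out in the text) that in the necessary matter the predicate holds of every subject, in the impossible matter of none, and in the possible matter of some but not all; the table `TRUTH` and all function names are hypothetical.

```python
# Truth values of the four quantified forms in each "matter".
# Assumption: necessary = predicate holds of all subjects,
# impossible = of none, possible = of some but not all.
TRUTH = {
    "necessary":  {"A": True,  "E": False, "I": True,  "O": False},
    "impossible": {"A": False, "E": True,  "I": False, "O": True},
    "possible":   {"A": False, "E": False, "I": True,  "O": True},
}

def pattern(first, second):
    """Pair of truth values of the two forms in each matter."""
    return {m: (v[first], v[second]) for m, v in TRUTH.items()}

def is_contradiction(x, y):
    # One true and one false in all matters.
    return all(a != b for a, b in pattern(x, y).values())

def is_contrariety(x, y):
    # False-false in the possible matter, one true and one false elsewhere.
    p = pattern(x, y)
    return (p["possible"] == (False, False)
            and p["necessary"][0] != p["necessary"][1]
            and p["impossible"][0] != p["impossible"][1])

def is_subcontrariety(x, y):
    # True-true in the possible matter, one true and one false elsewhere.
    p = pattern(x, y)
    return (p["possible"] == (True, True)
            and p["necessary"][0] != p["necessary"][1]
            and p["impossible"][0] != p["impossible"][1])

def is_subalternation(x, y):
    # False-true in the possible matter, same value in the two other matters.
    p = pattern(x, y)
    return (p["possible"] == (False, True)
            and p["necessary"][0] == p["necessary"][1]
            and p["impossible"][0] == p["impossible"][1])

print(is_contradiction("A", "O"), is_contrariety("A", "E"))
```

Under these assumptions each pair named in the list satisfies exactly the relation assigned to it, while a pair such as A/E fails the test for contradiction because both are false in the possible matter.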

As to the quantified hypothetical propositions, he says that the couples AC / OC and EC / IC, as well as AD / OD and ED / ID, are contradictory. However, the conditional ones do not validate the other relations of the square, given the operators they contain, as their corresponding formulas are as follows (where quantification is made based on situations): 

AC: (∀s)(Ps ⊃ Qs); IC: (∃s)(Ps ∧ Qs); EC: ~(∃s)(Ps ∧ Qs); OC: ~(∀s)(Ps ⊃ Qs).

The disjunctives may be formalized as follows:

AD: (∀s)(Ps ⊻ Qs); ID: (∃s)(Ps ∨ Qs); ED: ~(∃s)(Ps ∨ Qs); OD: ~(∀s)(Ps ⊻ Qs) (S. Chatti 2014a).

These formalizations of the disjunctive propositions validate not only the contradictions AD / OD and ED / ID, but also the contrarieties AD / ED, the subcontrarieties ID / OD, and the subalternations AD / ID and ED / OD.
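
These claims about the formalized hypotheticals can be tested by brute force over small models. The sketch below is illustrative only. It assumes that a model is a non-empty sequence of one or two situations, each fixing the truth values of the two component propositions; the names `MODELS`, `FORMS`, and the relation tests are hypothetical, and the relation tests use the weak readings "never both true", "never both false", and "the first being true forces the second to be true".

```python
from itertools import product

# A model is a non-empty tuple of situations; each situation is a pair (p, q)
# giving the truth values of the two component propositions in it.
PAIRS = list(product([False, True], repeat=2))
MODELS = [m for n in (1, 2) for m in product(PAIRS, repeat=n)]

FORMS = {
    "AC": lambda m: all((not p) or q for p, q in m),      # (∀s)(Ps ⊃ Qs)
    "IC": lambda m: any(p and q for p, q in m),           # (∃s)(Ps ∧ Qs)
    "EC": lambda m: not any(p and q for p, q in m),       # ~(∃s)(Ps ∧ Qs)
    "OC": lambda m: not all((not p) or q for p, q in m),  # ~(∀s)(Ps ⊃ Qs)
    "AD": lambda m: all(p != q for p, q in m),            # (∀s)(Ps ⊻ Qs)
    "ID": lambda m: any(p or q for p, q in m),            # (∃s)(Ps ∨ Qs)
    "ED": lambda m: not any(p or q for p, q in m),        # ~(∃s)(Ps ∨ Qs)
    "OD": lambda m: not all(p != q for p, q in m),        # ~(∀s)(Ps ⊻ Qs)
}

def values(x, y):
    """All pairs of truth values the two forms take across the models."""
    return {(FORMS[x](m), FORMS[y](m)) for m in MODELS}

def contradictory(x, y):
    return values(x, y) <= {(True, False), (False, True)}

def contrary(x, y):        # never both true
    return (True, True) not in values(x, y)

def subcontrary(x, y):     # never both false
    return (False, False) not in values(x, y)

def subaltern(x, y):       # x true forces y true
    return (True, False) not in values(x, y)

print(contrary("AD", "ED"), contrary("AC", "EC"))
```

On this reading the disjunctive forms pass all four tests, whereas the conditional forms pass only the test for contradiction: a model whose single situation makes both components false verifies AC and EC together and falsifies IC.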

4. The Analysis of the Absolute Propositions in al-Shifa, al-Qiyas

However, in al-Shifa al-Qiyas (the correspondent of Prior Analytics, literally: The Syllogism), he provides a more detailed analysis of the categorical propositions, which introduces temporal connotations. In this new analysis, the perpetual propositions (containing the words “always” or “never”) are the real opposites of the general absolute propositions (containing “at some times”). For instance, the real contradiction of “Every S is P (at some times)” is not simply “Some Ss are not P,” but rather “Some Ss are never P,” which is a perpetual proposition. In the same way, each general absolute proposition, whether universal or particular, affirmative or negative, is contradicted by a perpetual one. With these new propositions, one can draw several squares with valid relations but, once again, Avicenna does not himself draw any figure (Chatti 2014b 3).

Furthermore, some propositions are called special absolute propositions because they contain the expression “at some times, but not always.” These are contradicted by complex propositions containing disjunctions and are comparable to the bilateral possible propositions in this respect (Street 2004, Appendix).
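
The oppositions between these temporal propositions can be illustrated with a small model check. This is a sketch under simplifying assumptions that are not in Avicenna's text: time is a finite list of instants, the subjects are a fixed non-empty group of individuals, and the disjunctive contradictory of the special absolute is read as "some S is never P or some S is always P". All names are hypothetical.

```python
from itertools import product

TIMES = 2      # number of instants (illustrative assumption)
SUBJECTS = 2   # number of individuals that are S (illustrative assumption)

# A model gives, for each S-individual, a row of truth values:
# is this individual P at instant t?
ROWS = list(product([False, True], repeat=TIMES))
MODELS = list(product(ROWS, repeat=SUBJECTS))

def every_sometimes(m):             # "Every S is P (at some times)"
    return all(any(row) for row in m)

def some_never(m):                  # "Some Ss are never P"
    return any(not any(row) for row in m)

def some_sometimes_not(m):          # "Some Ss are not P (at some times)"
    return any(not all(row) for row in m)

def every_sometimes_not_always(m):  # "Every S is P (at some times, but not always)"
    return all(any(row) and not all(row) for row in m)

def some_never_or_some_always(m):   # assumed disjunctive contradictory
    return some_never(m) or any(all(row) for row in m)

def contradictory(f, g):
    """True when f and g take opposite truth values in every model."""
    return all(f(m) != g(m) for m in MODELS)

print(contradictory(every_sometimes, some_never))
print(contradictory(every_sometimes, some_sometimes_not))
```

In this toy setting the general absolute is contradicted by the perpetual proposition but not by the plain particular negative, since "Every S is P (at some times)" and "Some Ss are not P (at some times)" are both true whenever each individual is P at one instant and not P at another.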

One may ask why Avicenna introduces these temporal connotations. This is because he believes that most categorical propositions are not true if one does not add some further condition. For instance, “Every man is laughing” is true only when one adds “at some times” (al-Qiyas 82). Thus, the temporal connotation helps determine the truth value of the proposition when it is ordinary, that is, part of everyday discourse.

The temporal conditions are specifically Avicennan, together with the condition “as long as he exists” added in some kinds of propositions. However, Avicenna also uses other conditions such as “as long as it is S,” and “as long as it is P,” which are already present in Theophrastus’ text, as noted by Wilfrid Hodges in several writings. The former conditions make Avicenna’s logic different from Aristotle’s as well as from al‑Farabi’s and Averroes’ systems, which do not contain such temporal precisions.

5. The Absolute Propositions in Mantiq al-Mashriqiyyin

In Mantiq al-Mashriqiyyin, which supposedly contains Avicenna’s original views but survives only in fragments, Avicenna classifies the predicative propositions into six kinds as follows:

  1. Necessary (daruriya): S is P (as long as S exists) (Mantiq 65, 68)
  2. Implicative (lazima) (personal translation): S is P (as long as it is S) (Mantiq 65, this is also called lazima mashruta: implicative with a condition)
  3. Factual (tari’a) (Mantiq 65) (muwafiqa 68-69): S is P (not perpetually)
  4. Determined (mafruda) (Mantiq 65): S is P (at some determined time), e.g. “Every moon eclipses” (Mantiq 68)
  5. Spread (muntashira) (Mantiq 65): S is P (at some undetermined but regular times), e.g. “Every human being breathes” (Mantiq 68)
  6. Temporal (waqtiya) (Mantiq 65) (hadira = present (Mantiq 68)): S is P (at present), e.g. “Every animal is a man,” which could be true “if there were a time where this would be the case” (Mantiq 68)

Some sentences can even be interpreted in different ways, as witnessed by the following example: “Every sick person is weakened” (Mantiq 72), which could either be considered kind (2) (for instance, if the illness is chronic), kind (4) if the weakness occurs at one determined time, or kind (5) if the weakness occurs at undetermined but regular times (Mantiq 72).

These kinds are not really new in Avicenna’s theory, as he addresses kinds (4) and (5) in al-Qiyas and gives an example in that same treatise that is very close to kind (6) by saying: “because it is possible at some time that ‘every B is C’ […], at that time ‘every animal is a person’ will be true” (al-Qiyas 141). As to (1) and (2), they represent two kinds of propositions that were sharply separated in al-Qiyas (1964 21-22) and that are still separated in al-Isharat because (1) is considered substantial, while (2) is descriptional (Street 2004 551).

Avicenna also uses the expression lazima mashruta when he talks about the implicative, leading to the question of why he adds this adjective and whether there is any difference between lazima and lazima mashruta. Taking into account the explanations provided in the text (al-Qiyas 65.5-11), the difference might be related to the truth values of both propositions, as Avicenna analyzes some sentences that could be true only when one adds “as long as it is S” and distinguishes between them and those containing “as long as it exists.” In some cases, the condition “as long as it is S” is absolutely indispensable, which sharply distinguishes these sentences from those, called necessary, which could be true with the condition “as long as it exists.” In the first kind, “S is P only when it is S,” otherwise it would be false; this is why Avicenna says “with a condition,” as this condition is crucial to determine the truth-value of the sentence. The example given in the text clarifies this idea. The sentence “A moving thing changes as long as it exists” is not true because something that can move (such as animals or people) or even that does actually move could not be said to change “as long as it exists,” as this movement and, consequently, the change, does not last the whole time the thing exists. To the contrary, if one says:

  • “Every man is an animal (as long as it is a man),” this sentence is not different in its truth-value from that other sentence:
  • “Every man is an animal (as long as he exists),” both are true, even if the conditions are not the same in each case.

In this example and similar ones, as Avicenna emphasizes in al-Qiyas, “the situation is not [that] different (la yaftariqu al-halu) from saying ‘as long as it exists’ and saying ‘as long as it is white’ [that is, ‘as long as it is S’]” (al-Qiyas 22.6-7), while in other sentences, such as the first example above (“A moving …”), there is a real difference. Therefore, maybe Avicenna adds “with a condition” (mashruta) to emphasize the case where the condition “as long as it is S” makes a real difference with regard to the truth-value of the sentence. The difference is that the implicative is true only in so far as the thing is described as S, whereas the necessary is true during the whole time that the thing described as S exists.

Nevertheless, these two kinds of propositions share at least one thing, which is the continuous link between S and P, expressed in both propositions by “as long as,” which is not obvious in the other kinds.

Kind (3) seems to correspond to the so-called general absolute propositions (containing “at some times”). However, the special absolutes (containing “at some times but not always”), which were analyzed in al-Qiyas and al-Isharat, are not part of this classification.

All these propositions can be modalized by adding the modal words “necessarily” (bi-d-darurati) and “possibly” (bi-l-’imkani), to which one can add negations as follows: “not necessarily” (laysa bi-d-darurati) or “not possibly” (laysa bi-l-’imkani) (al-Qiyas 71). These modal propositions are four-fold (ruba‘iyah) (al-Qiyas 70), as they contain four elements.

6. The Categorical Syllogistic

Avicenna admits three figures and the same valid moods as Aristotle in Prior Analytics. He explicitly rejects the fourth figure because of its unnaturalness (al-Qiyas 107). However, given the refinements added in the analysis of the absolute propositions, some rules such as conversion become invalid for some kinds of propositions, which consequently invalidates the demonstrations that use this rule and the moods that are obtained by means of it. Only some kinds of propositions validate the conversions (of E, I and A). Thus, the general absolute propositions, containing “at some times,” do not validate E-conversion given that “No man is laughing (at some times)” does not lead to “No laughing thing is a man (at some times),” as “it is impossible to negate the predicate ‘man’ from what is laughing in effect” (al-Qiyas 82). Consequently, since conversion is often used in the reduction of syllogisms of the second and third figures, these syllogisms are not valid when they contain the non-convertible propositions. As a matter of fact, the valid moods admitted by Avicenna contain quantified propositions containing the condition “as long as it is S.” These admit E-conversion, as well as the other syllogistic rules, as it can be deduced, for instance, from the sentence “Nothing that sleeps wakes while sleeping” that “Nothing that wakes, sleeps while awake” (Street 2004 551). Thus Cesare [that is, the 1st mood of the 2nd figure in Avicenna’s wording] from the second figure is stated as follows: “Every C is B (as long as it is C); No A is B (as long as it is A); therefore, no C is A (as long as it is C)” (al-Qiyas 114-115). Note also that, like all other Arabic logicians, he starts the syllogism with the minor premise, and the major premise is the second one. However, the places of the terms are correct because, for all moods, the minor term is the subject of the conclusion, while the major term is its predicate.
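
The contrast between the two kinds of E-proposition with respect to conversion can be made concrete with a small search for counter-models. The sketch below rests on an assumed semantics that is only one possible reading: “No S is P (at some times)” is taken to mean that whatever is ever S is, at some time, not P, and “No S is P (as long as it is S)” to mean that nothing is P at a time when it is S. The model sizes and all names are hypothetical.

```python
from itertools import product

INDIVIDUALS, TIMES = 2, 2                        # small illustrative sizes (assumption)
CELLS = list(product([False, True], repeat=2))   # (is S, is P) at one instant
HISTORIES = list(product(CELLS, repeat=TIMES))   # one individual across time
MODELS = list(product(HISTORIES, repeat=INDIVIDUALS))

def swap(model):
    """Exchange the roles of S and P, turning a proposition into its converse."""
    return tuple(tuple((p, s) for s, p in h) for h in model)

def e_general(model):
    # "No S is P (at some times)": whatever is ever S is, at some time, not P.
    return all(any(not p for _, p in h)
               for h in model if any(s for s, _ in h))

def e_descriptional(model):
    # "No S is P (as long as it is S)": nothing is P at a time when it is S.
    return all(not (s and p) for h in model for s, p in h)

def converts(form):
    """True when the converse holds in every model where the form holds."""
    return all(form(swap(m)) for m in MODELS if form(m))

print(converts(e_general), converts(e_descriptional))
```

A counter-model to the first conversion is an individual that is always S and only sometimes P, which mirrors the example of the man who laughs at some times but never ceases to be a man.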

In addition, he explicitly states some rules that govern the different figures and the valid moods including, for instance, the following general rule: “the conclusion (natija) follows the least (akhass) premise with regard to quantity and quality, but not with regard to modality” (al-Qiyas 108; al-Najat 33). As a consequence, in the third figure, only particulars are deducible (al-Qiyas 108) and in the second figure, only negatives are deducible, while in the first figure, all kinds of propositions are deducible.

Other rules are stated, such as the following:

  •  In the second figure, no syllogism is possible with two affirmative premises, as the middle can be predicated of two opposite subjects, e.g. ‘body’ can be predicated of ‘men’ and ‘stones’ (al-Qiyas 111).
  •  In all syllogisms of the first and second figures, the major premise must be universal.
  •  In the syllogisms of the third figure, the minor premise must be affirmative.
  •  No syllogism is conclusive when the minor is negative and the major is particular.
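
The figure rules stated above can be checked against a brute-force enumeration of valid moods. The following sketch is only an extensional approximation: it ignores the temporal conditions such as “as long as it is C,” treats terms as non-empty subsets of a four-element universe, and uses the standard arrangement of terms in the three figures (C minor, A major, B middle). It therefore also returns the weakened moods with a particular conclusion drawn from universal premises. All names are hypothetical.

```python
from itertools import product

N = 4                      # universe size (assumption: enough for counter-models)
SETS = range(1, 2 ** N)    # non-empty subsets encoded as bitmasks

FORMS = {
    "A": lambda x, y: (x & ~y) == 0,   # every X is Y
    "E": lambda x, y: (x & y) == 0,    # no X is Y
    "I": lambda x, y: (x & y) != 0,    # some X is Y
    "O": lambda x, y: (x & ~y) != 0,   # some X is not Y
}

# (subject, predicate) of the major and of the minor premise in each figure.
FIGURES = {
    1: (("B", "A"), ("C", "B")),
    2: (("A", "B"), ("C", "B")),
    3: (("B", "A"), ("B", "C")),
}

def valid(figure, major, minor, conclusion):
    """A mood is valid when no model makes the premises true and the conclusion false."""
    (maj_s, maj_p), (min_s, min_p) = FIGURES[figure]
    for a, b, c in product(SETS, repeat=3):
        t = {"A": a, "B": b, "C": c}
        if (FORMS[major](t[maj_s], t[maj_p])
                and FORMS[minor](t[min_s], t[min_p])
                and not FORMS[conclusion](t["C"], t["A"])):
            return False   # counter-model found
    return True

# Each entry is (figure, major premise, minor premise, conclusion).
VALID = [(f, maj, mnr, con)
         for f in FIGURES
         for maj, mnr, con in product("AEIO", repeat=3)
         if valid(f, maj, mnr, con)]

for mood in VALID:
    print(mood)
```

With this list in hand, each rule becomes a one-line check over `VALID`, for instance that every second-figure entry has an E or O conclusion, or that no entry combines a negative minor with a particular major.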

In three of the first figure moods (Darii, Barbara, and Celarent), the conclusion may be converted, which gives rise to other (imperfect) moods. Therefore, these moods admit two conclusions: the first one, obtained as usual from the two premises, and a second one, obtained by conversion from the first conclusion (al-Qiyas 110.4-6).

The singulars are treated as universals (al-Qiyas 109.12-13). For instance, the following Barbara syllogism contains only singular propositions: “Zayd is the father of Abdullah; the father of Abdullah is the brother of ‘Amr; therefore, Zayd is the brother of ʻAmr” (al-Qiyas 109.13-14).

Avicenna uses all kinds of proofs in demonstrating the moods of the second and third figures. He systematically provides two (or more) proofs for each mood, a direct one by conversion or by ekthesis and an indirect one, by reductio ad absurdum (bi-l-khalf). For instance, the mood Cesare above, that is, “Every C is B (as long as it is C); no A is B (as long as it is A); therefore, no C is A (as long as it is C),” in addition to its proof by conversion (al-Qiyas 114.5-8), is also proven by reductio ad absurdum as follows: “Suppose the conclusion is false, then ‘Some Cs are A (as long as they are C)’ is true; but we have ‘No A is B (as long as it is A)’; so by Ferio, we deduce ‘Not every C is B (as long as it is C)’; but this contradicts the first premise, that is, ‘Every C is B (as long as it is C)’, which is not acceptable” (al-Qiyas 114-115). The negation of the conclusion is thus not compatible with the first premise, which indirectly proves the validity of the whole syllogism.

The other moods are also proven in two or more ways. For instance, in the second figure, Camestres is proven by the conversion of the minor and the conclusion and by reductio ad absurdum, Festino is proven by the conversion of the major and by reductio ad absurdum, and Baroco is proven by reductio ad absurdum and by ekthesis. In the third figure, Darapti is proven by ekthesis, by the conversion of the minor and by reductio ad absurdum; Felapton is proven by ekthesis, by the conversion of the minor, and by reductio ad absurdum; Datisi is proven by the conversion of the minor; Disamis is proven by ekthesis, by the conversion of the major and of the conclusion, and by reductio ad absurdum; Bocardo is proven by ekthesis and by reductio ad absurdum; and Ferison is proven by the conversion of the minor and by reductio ad absurdum (al-Qiyas, pp. 114-119).

These proofs are inspired by Aristotle’s, but some of them cannot be found in Aristotle’s texts. These are, for instance, the proofs by ekthesis of the second figure mood Baroco (al-Qiyas 118.10-12) and of the third figure mood Bocardo (al-Qiyas 119.1-2). However, Avicenna is not the first philosopher in the Arabic tradition to provide proofs of these two moods by ekthesis. Before him, al-Farabi also proved them both by ekthesis in his Kitab al-Qiyas (al-Farabi 1986a 25.15-26.4, 28.16-29.1). Avicenna’s proof of Bocardo is exactly the same as al-Farabi’s, but his proof of Baroco is different from al-Farabi’s. To see the difference, let us state them both.

Baroco itself is the following:

Every A is B [“B belongs to every A” in one of al-Farabi’s phrasings]

Some C’s are not B [“B is not in some C”]

Therefore Some C’s are not A [“A is not in some C”]

The proof by ekthesis relies on the assumption that since “B is not in some C,” then B is negated by “all this part”; therefore, “suppose that this part is designated on its own and let us call it D” (al-Farabi, al-Qiyas 1988 131; 1986a 25.17-18). We then have the following steps in al-Farabi’s proof and Avicenna’s proof:

Al-Fārābī’s proof:
1. B is not in some C
2. B belongs to no D (assumption)
3. B belongs to every A (major premise)
4. D belongs to no B (from 2 by conversion)
5. D belongs to no A (from 3, 4 by Celarent)
6. A belongs to no D (from 5 by conversion)
7. D is some C (assumption)
Therefore A is not in some C (from 6, 7 by Ferio)
(Al-Farabi, al-Qiyas 1988 131.9-17; Kitab al-Qiyas 1986a 25.15-26.4)

Avicenna’s proof:
1. Some C’s are not B
2. No D is B (assumption)
3. Every A is B (major premise)
4. No D is A (from 2, 3 by Camestres)
5. Some C is D (assumption)
6. Therefore Not every C is A (from 4, 5 by Ferio)
(Avicenna, al-Qiyas 116.10-12)

Avicenna’s proof is shorter, as it applies Camestres directly to the assumption and the major premise to obtain the crucial premise “No D is A,” while al-Farabi, although he notes that steps 2 and 3 are the premises of Camestres, does not apply Camestres directly; rather, he converts 2 to obtain 4 and applies Celarent to arrive at the premise “A belongs to no D,” which is necessary to deduce the conclusion.

However, both proofs share the same difficulty: the assumption “Some C is D” is not warranted by the premises of Baroco, because, given that O does not have existential import, C could be empty, in which case the assumption would be false. This difficulty is raised by Wilfrid Hodges, who considers that Avicenna could not have missed it but probably judged that it did not make the ekthetic proof of Baroco illegitimate.
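
The difficulty can be made concrete with a small enumeration. The sketch below is an illustration under assumptions, not a rendering of Hodges’ own analysis: terms are subsets of a three-element domain, A-propositions have existential import, O is read as the bare contradictory of A (so it is true when its subject is empty), and the function name ekthesis_witness is hypothetical.

```python
from itertools import combinations, product

DOMAIN = (0, 1, 2)  # assumed small universe, for illustration only
SUBSETS = [frozenset(c) for r in range(len(DOMAIN) + 1)
           for c in combinations(DOMAIN, r)]

def A(s, p): return bool(s) and s <= p   # Every S is P (with import)
def E(s, p): return not (s & p)          # No S is P
def I(s, p): return bool(s & p)          # Some S is P
def O(s, p): return not A(s, p)          # Not every S is P (no import)

def ekthesis_witness(c, b):
    """Return a term D with 'No D is B' and 'Some C is D', or None."""
    for d in SUBSETS:
        if E(d, b) and I(c, d):
            return d
    return None

# Baroco: Every A is B; some C's are not B; therefore some C's are not A.
baroco_valid = all(O(c, a)
                   for a, b, c in product(SUBSETS, repeat=3)
                   if A(a, b) and O(c, b))

# Models of the premises in which no suitable D can be exposed.
failures = [(a, b, c)
            for a, b, c in product(SUBSETS, repeat=3)
            if A(a, b) and O(c, b) and ekthesis_witness(c, b) is None]
only_empty_c = all(len(c) == 0 for _, _, c in failures)

print(baroco_valid, len(failures) > 0, only_empty_c)
```

Under these readings Baroco itself has no countermodel, while the step “Some C is D” fails in exactly those models of the premises where C is empty, which is the case the text singles out.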

As to al-Farabi, he preceded Avicenna in using ekthesis in the proof of Baroco. He was also the first logician in the Arabic world to defend the idea that quantified negative propositions do not have existential import, whereas the affirmatives do: affirmatives such as “every man is white” “are false when the subject does not exist” (Kitab al-Maqulat 1986b 124.14), whereas their negative contradictories are true in that case (124.15). Nevertheless, he did not seem to find the proof of Baroco problematic and did not provide any other proof for it. His justification of the use of ekthesis is that conversion is not applicable in that case. Al-Farabi did not mention the difficulty above, and the concrete example he provides involves non-empty terms, namely A: horse, B: whinnying, C: animal, D: man. Thus he says:

If we consider that the animals from which we have denied the whinnying are men, for example, we then have ‘every horse is whinnying’ and ‘No man is whinnying.’ It follows ‘No man is a horse’ as we showed above. And ‘men are some animals,’ therefore ‘Some animals are not horses’ (27.10-12).

Since in this example C is not empty, we might consider that al-Farabi did not pay immediate and sufficient attention to the fact that C could be empty in some cases, as the premise O in Baroco does not rule out that case.

7. The Modal Propositions and Oppositions

As we said above, Avicenna expresses the modal propositions by adding the words “necessarily” and “possibly” to all kinds of propositions. He negates the modality to obtain the contradictory of the affirmative modal proposition, whether singular, indefinite, or quantified. This syntactic device works when the modality is external (that is, at the beginning of each proposition); this seems to be the strategy used by Avicenna in his analysis of the modal oppositions (al-Qiyas 49-50), as he just added, in that part of the text, the word laysa (‘no’ or ‘not’) in front of the modal propositions, even the quantified ones, to obtain their negations. This differs from al-Farabi’s style, as the latter always puts the modal word in front of the predicate. However, Avicenna also uses internal formulations by putting the modal word at the end of the proposition, in particular in his modal syllogistic.

Avicenna provides three definitions of possibility, which are the following: 1. the unilateral possible, 2. the bilateral possible, and 3. what is neither actual, nor necessary, nor impossible. The last of these is related more specifically to the future. However, he privileges the second meaning, that is, the bilateral possible. In addition, he also provides the negation of the bilateral possible for all kinds of propositions, including the quantified ones, whether the modality is internal or external. He presents the entailments and equivalences between the modal propositions in his al-Shifa al-‘Ibarah (the counterpart of De Interpretatione) and shows in particular that the possible in Tables I and III rejected by Aristotle (De Interpretatione 22a14–22a32) should be interpreted as bilateral because, in that case, all the entailments become valid (Chatti 2014b 9-13).

As to necessity, it is the dual of possibility because “□ ≡ ~◊~” and “◊ ≡ ~□~.” The bilateral possible is expressed as “◊α ∧ ◊~α,” while its negation can be formalized as “□α ∨ □~α.” When the propositions are quantified, the following couples of contradictories result: □A / ◊O, □E / ◊I, □I / ◊E, □O / ◊A. For the bilateral possible, the contradictories are as follows when the possibility is external: ◊A ∧ ◊O / □O ∨ □A and ◊E ∧ ◊I / □I ∨ □E, and as follows when it is internal: A◊ ∧ E◊ / I□ ∨ O□ and “Some Ss are ◊~P and ◊P” / “Every S is □~P or □P.”
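
The duality of the two operators and the form of the negated bilateral possible can be verified by enumeration. The following Python sketch is only illustrative and rests on an assumption not made in the text: it uses a deliberately simple model with three worlds in which “necessarily” means truth in all worlds and “possibly” means truth in at least one. The names box, dia, neg, and bilateral are chosen here for convenience.

```python
from itertools import combinations

WORLDS = (0, 1, 2)  # assumed non-empty set of worlds, all mutually accessible
ALL = frozenset(WORLDS)

# A proposition is modelled as the set of worlds at which it is true.
PROPOSITIONS = [frozenset(c) for r in range(len(WORLDS) + 1)
                for c in combinations(WORLDS, r)]

def box(alpha): return all(w in alpha for w in WORLDS)   # necessarily alpha
def dia(alpha): return any(w in alpha for w in WORLDS)   # possibly alpha
def neg(alpha): return ALL - alpha                       # not alpha

def bilateral(alpha):
    """Bilateral possibility: possibly alpha and possibly not alpha."""
    return dia(alpha) and dia(neg(alpha))

# Duality: box is not-dia-not, and dia is not-box-not.
duality_ok = all(box(a) == (not dia(neg(a))) and dia(a) == (not box(neg(a)))
                 for a in PROPOSITIONS)

# Negation of the bilateral possible: necessarily alpha or necessarily not alpha.
bilateral_negation_ok = all((not bilateral(a)) == (box(a) or box(neg(a)))
                            for a in PROPOSITIONS)

print(duality_ok, bilateral_negation_ok)
```

Both checks hold for every proposition in this toy model, which matches the equivalences stated above; the quantified pairs such as □A / ◊O follow the same pattern once A and O are taken as contradictories.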

The necessity operator may be added to all the assertoric propositions above, which contain various conditions. For instance, one may say: “Necessarily Zayd writes (as long as he writes),” or “Necessarily the moon eclipses (at some determined time).” When the proposition is necessary but does not contain any condition, the necessity is said to be absolute (Lagerlund 233). This absolute necessity is very rarely used, as most of the necessary propositions contain some condition, in particular the existential condition (that is, “as long as S exists”).

The oppositional relations between the modal quantified propositions may be represented by means of a dodecagon (a figure with 12 vertices), where the other oppositions of the square can be added to the contradictions already mentioned. Avicenna provides all the contradictions and some of the subalternations, contrarieties, and subcontrarieties; the remaining ones can easily be demonstrated in his system by means of the very relations he himself admits. The figure representing the modal singular (and indefinite) propositions is a hexagon, where all the relations are given by Avicenna, except two subcontrarieties, which are missing in his text but could be easily added (Chatti 2014b 10).

8. The Modal Syllogistic

Avicenna’s modal syllogistic differs in some points from Aristotle’s and from Averroes’ syllogistic, which is very Aristotelian. As we saw above, Avicenna primarily uses two kinds of possibility, with the third one being mentioned but not given much importance. This influences the modal syllogistic rules and, consequently, the validity of the syllogisms. According to Avicenna, the conversion holds for necessary E and leads to necessary E as well. However, necessary A does not convert to necessary I; it converts rather to possible I. For instance, from the necessary proposition “Every laughing [thing] is necessarily a man” (or “is a man necessarily”), one cannot deduce “Some men are necessarily laughing”; rather, the conversion leads to “Some men are possibly laughing” (al-Isharat 336), as the sentence “Some men are necessarily laughing” is not true, while the initial A proposition is necessarily true. The same can be said about necessary I (al-Isharat 335-336; Street 2008). It can be noted here that the example taken by Avicenna is precisely the one that illustrates the invalidity of conversion when it is applied to the general absolute propositions (see sections 4 and 6 above). As to possible E and possible O, they do not convert because if “Possibly no man is a writer” is true, “Possibly no writer is a man” is false and “Possibly some writers are not men” is also false (al-Isharat 338). This is so because the predicate “man” cannot be denied of the subject “writer,” even in a possible proposition, given that a writer is necessarily a man and cannot be anything else.

By contrast, possible A and possible I do convert. However, if the possible is narrow (=bilateral), the conversion leads to a general possible proposition (=unilateral), that is, the general possible I (al-Isharat 339). According to Avicenna, “if every C is possibly B” (where the possibility is bilateral and internal), or “if some C is possibly B” (by the bilateral kind of possibility), then “Some B is possibly C” (where the possibility is unilateral and internal), “Otherwise, it [would] not [be] possible for a thing that is B to be also C” (al-Isharat 339). An example can show this: if we say “Every human being is possibly a writer” (and, where applicable, “possibly not a writer” too), we can deduce that “Some writers are possibly human beings”; otherwise, it would be impossible for any writer to be a man, and this of course is not plausible. Naturally, in the second proposition, the possibility is unilateral, as it would be false to say “Some writers are possibly not human beings.” These conversions also hold with internal modalities.

As a matter of fact, in stating the modal syllogisms, Avicenna uses internal modalities as noted by Tony Street (Street 2008, section 2.3.1) who explains that, according to Avicenna, necessity has to do, above all, with being: “It depends on how things are and not on how things are described” (Street 2008, section 2.3.1). Consequently, the necessary propositions used in his syllogistic should contain “as long as it exists,” and the necessity operator is internal, as it occurs most of the time at the end of the proposition.

As to the moods held, there are several analyses of the modal syllogistic in the literature. One of them is the analysis provided by Tony Street in his article “An Outline of Avicenna’s Syllogistic” (2002). According to this author, the moods held in the first figure are the following: “[AXA], [ALA], XXX, XLX, LXL, LLL, MMM, MXM, MLM” to which he adds “two imperfect mixes: LML, XMM.” In the second figure, he says that the following are held valid: “LLL, XLL, LXL, MLL, LML” and in the third figure, the following are admitted: “XXX, LLL, LXL, XLX, MMM, XMM, MXM, LML, MLM” (Street 2002 160), where A: perpetual (containing the word “always”), X: absolute, M: possible, L: necessary.

Another analysis is provided by Paul Thom, who also uses the same kinds of propositions, that is, Perpetual (=P), General Absolute (=X), Possible (=M) and Necessary (=L) propositions. He states the universal affirmatives of each kind as follows: “1. X, the universal affirmative general absolute “every j is b”: jM ⊂ bm, [= every possible j is sometimes b] 2. P, the universal affirmative perpetual “every j is always b”: jM ⊂ bm [= every possible j is always b], 3. L, the universal affirmative necessity-proposition “every j is necessarily b”: jM ⊂ bL [every possible j is necessarily b], 4. M, the universal affirmative possibility-proposition “Every j is possibly b”: jM ⊂ bM [= every possible j is possibly b]” (Thom 2008 363, explanations inside brackets added following Thom’s interpretations of the subscripts, 363.9-10). In this interpretation, which Thom calls the “simple de re reading” (Thom 2008 363.12), the moods held are the following: “(i) the LLL, PLP, XLX, and MLM syllogisms of Fig 1, along with (ii) the LPL, PPP, XPX and MPM syllogisms, and also (iii) the LXL, PXP, XXX, and MXM syllogisms of the same Figure” (Thom 2008 364.5-7), plus the MMM and XMM moods (Thom 2008 364). These moods are validated by the semantics that Thom presents in his article. In the second figure, Thom says that the following moods are validated by the same reading and semantics: “LML-2, MLL-2” (Thom 365), plus the “XPL and PXL syllogisms” which are said to be “equivalent to XMX-1 and PMP-1” (Thom 2008 365). In the third figure, the following moods are validated: “XMX, LML, MMM, PMP and XMX” (Thom 2008 365).

Note that in both accounts the perpetual is added to the usual modalities, which could be justified by the fact that Avicenna uses the word da᾿iman (=always or perpetually) when he talks about the modal syllogistic and that he sometimes uses both words together by saying “necessarily and always” [bi-al-darurati da᾿iman] (al-Qiyas 128.5, 7).

However, this addition of the perpetual as a “separate class” of propositions has been criticized by Wilfrid Hodges, who considers that the perpetual sentences are nothing more than those called “necessary” by Avicenna himself, as they have the same logical behavior. Therefore, although he agrees with Street’s list of the modal moods, he contests those containing the perpetual propositions. His analysis of the Avicennan modal syllogistic uses a two-dimensional framework that quantifies over times in addition to the usual quantification over objects. An example of this two-dimensional framework can be found, for instance, in the article “The move from one to two quantifiers” (Hodges 2015).

In these accounts, the absolute proposition seems to be interpreted as a general absolute one, that is, as a proposition containing “at some times.” However, Avicenna explicitly says in several places that this kind of proposition does not convert and that the absolute propositions used in the syllogistic moods should be convertible. The convertible absolutes contain the condition “as long as it is S” as we saw above (section 6). Therefore, the absolutes used in the different moods should contain this condition, even if it is a first figure mood, as the conversion should lead to a proposition of the same kind. For instance, E-conversion leads from “No C is B (as long as it is C)” to “No B is C (as long as it is B)”; it does not lead to “No B is C (at some times).” The discussion of Barbara XLL where the absolute is interpreted as “Everything described as B is A at some times and this time is the one where it is described as B (kull ma yusafu bi [B] yakunu lahu [A] waqtan ma, wa dhalika al-waqtu huwa kawnuhu mawsufan bi [B])” (al-Qiyas 128.5-6) confirms this opinion, given that Avicenna clearly explains exactly what he means by this absolute. This mood is illustrated by the following concrete example: “All snow is white by necessity, and every white thing dissociates the eye as long as it is white; therefore, all snow always dissociates the eye” (al-Qiyas 129. 1-2). This example is translated as follows by Wilfrid Hodges:

(a-d) All snow is coloured white throughout its existence.

(a-ℓ) Everything coloured white dissociates the eye so long as it is coloured white.

(a-d) Therefore all snow dissociates the eye throughout its existence.

This translation shows that Wilfrid Hodges, following Avicenna’s explanations, uses an ℓ sentence (that is, a sentence containing “as long as it is S”) in this mood, which means that the major proposition is not a general absolute (containing “at some times”), but rather what Avicenna calls an implicative (lazima) in Mantiq al-Mashriqiyin, which, unlike the former, is convertible. The mood itself is a Barbara XLL since Avicenna uses the word da᾿iman (always) in the conclusion. It therefore seems that Barbara XLL is admitted when the major is descriptional. This is confirmed by the list of first figure moods admitted by Avicenna, which contains, among other possibilities, the ℓdd and ℓℓℓ moods, and which Hodges presents in “The move from one to two quantifiers” (Hodges 2015, section 5).

The moods above are different from Aristotle’s, as noted by these authors. For instance, Barbara LML is not valid in Aristotle’s modal logic, as Aristotle has Barbara LMM instead (Pr. A. 35b37-36a2). Avicenna differs also from Aristotle with regard to the conversion of necessary A, which leads to necessary I in Aristotle’s theory (Pr. A. 25a31-33). However, as we saw above, it leads to possible I in Avicenna’s theory. This makes his theory different also from that of Averroes, who tries to validate all the moods held by Aristotle.

9. The Hypothetical Logic

Hypothetical logic is the part of the system that deals with the conditional and disjunctive propositions as they are stated, for instance, in Stoic logic. The hypothetical syllogism in general is called qiyas sharti. The logicians of the Arabic world, such as al-Farabi, include these propositional syllogisms in their works corresponding to the Prior Analytics and sometimes in those corresponding to the Categories. However, unlike al-Farabi and Averroes, who just present the Stoic indemonstrables and some of their variants, Avicenna not only presents these same indemonstrables at the end of al-Qiyas (389-407) but also develops a whole hypothetical syllogistic where the valid moods contain conditional as well as disjunctive propositions, and he even combines such propositions with the categorical ones. The Stoic-style syllogisms are called istithna᾿i (translated as “exceptive” in Street 2004, for instance), whereas the syllogisms of the hypothetical syllogistic are called iqtirani (usually translated as “conjunctive”). The iqtirani / istithna᾿i distinction involves all kinds of syllogisms, whether categorical or hypothetical, as the class of iqtirani syllogisms includes the categorical syllogisms and one kind of hypothetical one, whereas the istithna᾿i syllogisms are the usual Stoic ones. The difference between the two kinds is that the premises of an istithna᾿i syllogism include either the conclusion or its contradictory, while in the iqtirani ones, the conclusion is not included in the premises. The hypothetical syllogistic with the conditional propositions is almost the exact duplicate of the categorical syllogistic, as it contains three figures and moods corresponding to the usual categorical ones. When the disjunctive propositions are introduced, many moods are added that do not necessarily correspond to the categorical ones.

The sharti (hypothetical) propositions are of two kinds: those containing “if…then” and those containing “either…or.” The former are the conditional propositions, and they express either what Avicenna calls the luzum, that is, the strong implication (the relation of “following from”), or what he calls the ittifaq. The latter are the disjunctive propositions, which express either a strong or a less strong separation. Here too, he sometimes speaks of ittifaq. In the luzum, or real implication, the consequent necessarily follows the antecedent so that if the antecedent is true, the consequent must be true because there is a semantic or causal link between them. An instance is “If the sun is up, then it is daytime.” Here, the antecedent is the cause of the consequent, and the consequent “follows the antecedent in the reality (fi al-wujudi) and rationally (fi al-ʻaqli)” (al-Qiyas 233.16). The causality relation may be involved in several ways: either the antecedent is the cause of the consequent, as when someone says “If the sun is up, then it is daytime,” or the antecedent is itself “caused [by] and not separated (ghair mufariq)” from the consequent, or both the antecedent and the consequent are caused by the same thing, as with lightning and thunder, which are caused by “the movement of the wind in the clouds” (al-Qiyas 234.4). In all these examples, there is a natural link between the antecedent and the consequent, which are related in all situations. This natural or semantic link may be present and make the conditional true even when both propositions are false, as when one says: “If men are stones, then they are inert” (al-Qiyas 261.1-2), or when the antecedent is false while the consequent is true, as when one says: “If five is even, then five has a half” (al-Qiyas 260.14). In both examples, the entailment is due to the semantic link between the antecedent and the consequent.

However, in the ittifaq, which is translated as “chance connection” by N. Shehaby and evokes either the notion of accident or agreement, there is no such natural, semantic, or causal link, as shown by the following example: “If men exist, then horses exist too” (al-Qiyas 234.14). Here, the existence of men is not the cause of the existence of horses, nor is it caused by it, nor are the two propositions related to each other in any way, whether semantically or causally. Each is true on its own and neither needs the other to be true in order for it to be true itself. Therefore, we could talk of some kind of concomitance because both are true. However, the truth of the consequent is not due to the truth of the antecedent; they just happen to be true together (ittafaqa ittifaqan) (al-Qiyas 234.15).

Elsewhere, Avicenna talks about al-muwafaqa fi al-sidqi (al-Qiyas 265.11), which means “agreement” in truth, that is, the fact that both propositions are true. Therefore, muwafaqa or ittifaq (which amount to the same, as both words come from the same root) mean in that case agreement in truth, which could be rendered simply by the word “concomitance.” In addition, he offers the sentence “if men are talking, then donkeys are braying” as an example and says that “here, it suffices that the consequent is true, for this reason the truth of this proposition is clear” (al-Qiyas 265.13-14). This idea that the truth of the consequent alone makes the whole conditional proposition true is also evoked by Wilfrid Hodges, who says “[i]n this passage it seems that Ibn Sina understands an ittifaqi sentence (a, mt)(p, q) to be one which is taken to be true on the basis that ‘Q’ is always true” (Hodges 2014 237). Thus, as Wilfrid Hodges suggests against Shehaby’s interpretation based on the notion of chance, ittifaq is perhaps best rendered by the notion of agreement (or accordance) of the consequent (or of both propositions) with reality. This may be shown by another example provided by Avicenna, which is: “If every donkey is talking, then every man is talking,” which is true “by means of concordance or agreement (ʻala maʻna al-muwafaqa)” (al-Qiyas 270.10). In this example, the antecedent is false and it is the truth of the consequent that makes the whole proposition true. This is confirmed by the following text, which sounds like a definition: “Agreement (muwafaqa) is nothing but (laysa illa) the configuration in which the consequent is true (wa al-muwafaqa laysa illa nafsu tarkib al-tali ʻala annahu haqqun)” (al-Qiyas 279.15).

In all these cases, whether both propositions are true or the consequent alone is true, the common feature is that the truth of the sentence is not due to the link between the antecedent and the consequent, as there is no strong (semantic or causal) link that could make us deduce the consequent from the antecedent. If the sentence is true, it is because either all its elements or its consequent alone are in accordance with reality. This accordance is perhaps what Avicenna means by ittifaq or muwafaqa, which he sometimes associates with the word mutabaqa, that is, correspondence (al-Qiyas 265.10).

The next issue to address is how Avicenna analyzes the hypothetical inferences and what syllogisms and moods are admitted.

In his study of the exceptive (istithna᾿i) syllogisms, Avicenna does not evoke the Stoics at all, nor does he cite his sources. However, he says very often that these syllogisms are common or commonly known (mashhur) (al-Qiyas 390, 391, 395). He once evokes a “man who has advanced knowledge in the science of medicine” (al-Qiyas 398.12) and “people who strongly defend the first teacher (that is, Aristotle)” (al-Qiyas 398.14). These remarks suggest that the physician he refers to might be Galen, since Galen presented the Stoic indemonstrables in his Institutio Logica, as Lukasiewicz notes in his 1934 article on Stoic propositional logic (Lukasiewicz 1934 qtd. in Largeault 1972 16), while the defenders of Aristotle might be the Peripatetics in general. Al-Farabi is also evoked and even criticized in this part of the system.

The exceptive syllogisms presented are the Modus Ponens, Modus Tollens, Modus Tollendo Ponens, Modus Ponendo Tollens, and the syllogism where the first premise is a negated conjunction, which Avicenna expresses by using a disjunction of two negative propositions.

Note that when the ittifaq is concerned, the deductions by means of these indemonstrables cannot be made, as from the two premises:

If every man is speaking, then every donkey is braying

But not every donkey is braying

one cannot deduce “Not every man is speaking” (al-Qiyas 267.1-11) by Modus Tollens, because “Every donkey is braying” does not follow from “Every man is speaking.”

Avicenna, however, criticizes al-Farabi, who distinguishes between a complete implication (which is convertible, and is thus an equivalence) and an incomplete one (not convertible). His objection is that such a distinction relies on the content of the propositions, which should not be taken into account in these syllogisms, since only the forms of the propositions should be considered. According to al-Farabi, with the complete implication, from “If p then q” and “q,” one may deduce “p” (al-Maqulat 1986b 79, symbols added). According to Avicenna, by contrast, “When we say: ‘If A is B, then J is D’, and if this is a premise of our syllogism, we must consider what this means by considering its form, and decide what follows from this form. Saying that its consequent may be converted with its antecedent depends on something that is not the form of the premise; rather it has to do with the matter of the premise. This is like asking if the predicate of the universal affirmative is identical with its subject or not” (al-Qiyas 391.16-17, 392.1-3). This means that the notion of form plays a fundamental role in Avicenna’s logic, whether with regard to the propositions or with regard to the inferences. The form of the conditional premise is shown by the order of the elements within the proposition.

As to the study of the iqtirani syllogisms, which are part of the hypothetical syllogistic, it is made possible by the fact, already mentioned in section 3, that the hypothetical propositions, whether conditional or disjunctive, are also quantified by Avicenna. These quantifications have been interpreted in two ways in the literature: some, like Nicholas Rescher (1963), say that Avicenna quantifies using times, while others, like Zia Movahed (2012), privilege the quantification of situations. As a matter of fact, the latter solution is preferable in that it is more general and closer to Avicenna’s examples.

As stated above (section 3), Avicenna uses the words “whenever” and “never” to express the two universal conditional propositions and the words “maybe” (qad yakun) and “not whenever” (laysa da’iman) to express the two particular conditionals. Consequently, the conditional quantified propositions can be formalized as follows:

 AC: whenever P, then Q = In all situations, if … then …: (∀s)(Ps ⊃ Qs)

 EC: never (if P then Q) = In all situations, if… then ~… : (∀s)(Ps ⊃ ~Qs) = ~(∃s)(Ps ∧ Qs)

 IC: maybe (if P then Q) = Not (never if…then): ~(∀s)(Ps ⊃ ~Qs) = (∃s)(Ps ∧ Qs)

 OC: not (whenever P, then Q) (laysa da’iman) (al-najat, p. 45) = ~(∀s)(Ps ⊃ Qs) = (∃s)(Ps ∧ ~Qs).
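
The equivalences and contradictory pairs among these four formulas can be checked by brute force. The sketch below is an illustration only: it assumes a toy universe of three situations, models P and Q as the sets of situations in which they hold, and reads ⊃ truth-functionally inside each situation; the function names AC, EC, IC, and OC simply mirror the labels above.

```python
from itertools import combinations, product

SITUATIONS = (0, 1, 2)  # assumed finite set of situations, for illustration
SUBSETS = [frozenset(c) for r in range(len(SITUATIONS) + 1)
           for c in combinations(SITUATIONS, r)]

# P and Q are modelled as the sets of situations in which they are true.
def AC(p, q):  # whenever P, then Q
    return all((s not in p) or (s in q) for s in SITUATIONS)

def EC(p, q):  # never (if P then Q)
    return all((s not in p) or (s not in q) for s in SITUATIONS)

def IC(p, q):  # maybe (if P then Q)
    return any((s in p) and (s in q) for s in SITUATIONS)

def OC(p, q):  # not whenever (P, then Q)
    return any((s in p) and (s not in q) for s in SITUATIONS)

PAIRS = list(product(SUBSETS, repeat=2))

# EC in its universal form agrees with the negated existential form.
ec_forms_agree = all(
    EC(p, q) == (not any((s in p) and (s in q) for s in SITUATIONS))
    for p, q in PAIRS)

# AC/OC and EC/IC are contradictories: exactly one of each pair is true.
contradictories_ok = all(AC(p, q) != OC(p, q) and EC(p, q) != IC(p, q)
                         for p, q in PAIRS)

print(ec_forms_agree, contradictories_ok)
```

Replacing situations by times would leave the code unchanged, which reflects the point that the choice between Rescher’s and Movahed’s readings concerns what the variable ranges over rather than the logical form.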

As to the disjunctive propositions, they are expressed using the words “always, either … or…,” “never, either … or…,” “maybe, either…or…,” and “not always, either… or….”

Consequently, they can be formalized as follows:

 AD: Permanently (always) either P or Q: (∀s) (Ps ⊻ Qs)

 ED: Never either P or Q (al-Qiyas 283): ~ (∃s) (Ps ∨ Qs)

 ID: Maybe either P or Q … (p. 288) (al-Qiyas 290): (∃s) (Ps ∨ Qs)

 OD: Maybe not either P or Q (maybe not = not always): ~(∀s) (Ps ⊻ Qs) (S. Chatti 2014a 190).

These formalizations differ from those presented by Nicholas Rescher in his “Studies in the History of Arabic Logic” (1963), which are the following:

The universal affirmative: (∀t) (Pt ∨ Qt)

The universal negative: (∀t) ~ (Pt ∨ Qt)

The particular affirmative: (∃t) (Pt ∨ Qt)

The particular negative: (∃t) ~ (Pt ∨ Qt) (Rescher 233).

Apart from the quantification of times, some of the above formulas might not be satisfying because Rescher uses a unique symbol for the disjunction in all his formulas. If this symbol represents the inclusive disjunction, then the proposition AD can be true when its elements are both true, which does not conform to Avicenna’s examples, since these examples involve incompatible propositions and the disjunction is clearly intended to be exclusive. If it represents the exclusive disjunction, then AD is rendered well but not ID, which would not be different from AD if one considers only one situation. As to ED, its formalization is not obvious.

The literature provides another formalization of the disjunctive quantified sentences, offered by Wilfrid Hodges, which seems promising. This interpretation is the following (where ‘mn’ means munfasil: disjunctive):

(a, mn) At all times t, at least one of p and q is true at t

(e, mn) At all times t, if p is true at t, then q is true at t

(i, mn) There is a time at which p is true and q is not true

(o, mn) There is a time at which neither p nor q is true

These formulas differ from Rescher’s with regard to ED and ID, but AD and OD are rendered in the same way and do not account for the exclusive character of AD. In any case, all the above formalizations need to be tested against all the moods admitted by Avicenna in this part of his logic, a task that remains to be done and that is the subject of ongoing research.

Avicenna combines the disjunctive propositions with the conditional ones to state many new syllogisms that do not all correspond to the known categorical syllogisms. He states a clear correspondence between the conditional and disjunctive propositions, as in his theory, “p → q” is equivalent to “~p ∨ q” (al-Qiyas 251. 16-17) while “p ⊻ q” is equivalent to “p ≡ ~q,” “~(p ≡ q)” and “~p ≡ q” (al-Qiyas 248). These equivalencies make it possible to express any conditional proposition with a disjunctive one. However, some syllogisms such as Darapti and Felapton of the third figure require the A proposition(s) they contain to have a true antecedent; otherwise, they are not valid. Consequently, to validate these syllogisms, the A premise has to be expressed in this way: “(∃s)Ps ∧ (∀s)(Ps ⊃ Qs),” which becomes: “p ∧ (p ⊃ q)” when we consider only one situation. However, then the conditional proposition is no longer equivalent to a disjunctive one, as “p ∧ (p ⊃ q)” is not equivalent to “~p ∨ q”. This introduces some confusion in the definition of the conditional, which does not seem to be formally satisfying. One has to note, however, that the conditional in Avicenna’s theory should not be interpreted as a material conditional, as he does not give it the truth conditions of the material conditional. It seems thus to be an intensional implication, which deserves a more detailed examination.
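
The propositional equivalences just mentioned, and the failure of equivalence once a true antecedent is required, can be seen in a four-row truth table. The Python sketch below reads “→” and “⊃” as the material conditional purely for the sake of this check, which is an assumption: as noted above, Avicenna’s conditional should not ultimately be interpreted that way. The names MATERIAL, xor, and iff are introduced here for illustration.

```python
from itertools import product

ROWS = list(product([False, True], repeat=2))  # all (p, q) valuations

# Material conditional given by its truth table (assumed reading).
MATERIAL = {(True, True): True, (True, False): False,
            (False, True): True, (False, False): True}

def xor(p, q): return p != q   # exclusive disjunction
def iff(p, q): return p == q   # biconditional

# "p -> q" matches "~p v q" on every row.
cond_as_disj = all(MATERIAL[(p, q)] == ((not p) or q) for p, q in ROWS)

# Exclusive "p or q" matches "p iff ~q", "~(p iff q)" and "~p iff q".
xor_equivalences = all(
    xor(p, q) == iff(p, not q) == (not iff(p, q)) == iff(not p, q)
    for p, q in ROWS)

# With a true antecedent required, "p & (p -> q)" no longer matches "~p v q".
rows_where_they_differ = [
    (p, q) for p, q in ROWS
    if (p and MATERIAL[(p, q)]) != ((not p) or q)]

print(cond_as_disj, xor_equivalences, rows_where_they_differ)
```

The two readings come apart exactly on the rows where p is false, which is why strengthening the A premise for Darapti and Felapton breaks the correspondence with the disjunctive form.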

10. Combining Hypothetical and Categorical Propositions

First of all, what is worth noting in Avicenna’s hypothetical logic is that, in his system, the hypothetical propositions are expressed as couples of propositions, each containing a subject and a predicate and related by a logical operator, that is, either a conditional or a disjunction. For instance, “If H is Z, then A is B,” or “Either H is J or A is B,” or “Whenever H is Z, then A is B” (al-Qiyas 305). Therefore, the elements of the hypothetical propositions are themselves categorical propositions of the SP (subject/predicate) kind. Therefore, it is not astonishing to find some mixed syllogisms containing both hypothetical and categorical propositions and where the subjects and the predicates are themselves related in some way to the premises and the conclusion. An example of such syllogisms is the following:

 A is either B or C or D

 Every B and [every] C and [every] D is H

 Therefore A is H (al Isharat 438)

In this syllogism, the first premise contains two disjunctions (which must be inclusive in order for the syllogism to be valid), while the second premise contains three universal categorical propositions related by conjunctions. This combination leads to a categorical proposition containing a subject and a predicate. The validity of the syllogism is therefore due to the logical relations between the terms A, B, C, D and H, and not only to the relations between the different propositions taken as wholes. The syllogism could be part of the usual [categorical] syllogistic if the disjunction were used in that theory. The whole syllogism shows that Avicenna uses the inclusive disjunction in his system, even if he does not say it explicitly.

Another syllogism looks more like a hypothetical one, as it has the following structure:

  If A is B, then every C is D

  Every D is H

  Therefore if A is B, then every C is H (al Isharat 441)

Here, the first premise and the conclusion contain conditionals, but the second premise is clearly a universal categorical proposition. The validity of the syllogism is due to the logical links between the terms and also the propositions as wholes. The terms are involved because “every C is H” follows from both “every C is D” and “every D is H” by the transitivity of the implication (that is, by a Barbara syllogism). The whole hypothetical syllogism says that if “A is B” implies “every C is D,” then, given “every D is H,” it also implies what follows from it, that is, “every C is H.” This can be formalized as follows (where p stands for “A is B”): “{[p → (∀x)(Cx ⊃ Dx)] ∧ (∀x)(Dx ⊃ Hx)} → [p → (∀x)(Cx ⊃ Hx)].”

This formalization shows the combination between the hypothetical logic and the categorical one.
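One way to gain some confidence in the formula above is to search small finite models for a counterexample. The Python sketch below is only an illustration under stated assumptions: both “→” and “⊃” are read as the material conditional, the predicates C, D and H are interpreted as tuples of truth values over a domain of at most three individuals, and all function names are hypothetical. Finding no countermodel on such domains is a sanity check, not a proof of validity.

```python
from itertools import product

BOOLS = [False, True]

def implies(a, b):
    # Material reading of the conditional (an assumption of this sketch).
    return (not a) or b

def every(X, Y):
    """'Every X is Y' read as (∀x)(Xx ⊃ Yx) over the finite domain."""
    return all(implies(x, y) for x, y in zip(X, Y))

def interpretations(domain_size):
    """Yield (p, C, D, H): a truth value for p and three predicate extensions."""
    extensions = list(product(BOOLS, repeat=domain_size))
    for p in BOOLS:
        for C, D, H in product(extensions, repeat=3):
            yield p, C, D, H

def holds(p, C, D, H, use_categorical_premise=True):
    """Evaluate {[p → every C is D] ∧ every D is H} → [p → every C is H]."""
    hypothetical = implies(p, every(C, D))
    categorical = every(D, H) if use_categorical_premise else True
    conclusion = implies(p, every(C, H))
    return implies(hypothetical and categorical, conclusion)

def find_countermodel(max_domain_size=3, use_categorical_premise=True):
    """Return the first falsifying interpretation found, or None."""
    for n in range(1, max_domain_size + 1):
        for interp in interpretations(n):
            if not holds(*interp, use_categorical_premise=use_categorical_premise):
                return interp
    return None

print(find_countermodel())  # None
```

Calling `find_countermodel(use_categorical_premise=False)` does return a falsifying interpretation, which illustrates that the categorical premise “every D is H” is doing real work alongside the hypothetical one.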

Therefore, Avicenna’s logic combines the usual syllogistic and his own hypothetical syllogistic. However, the latter is still very close to the usual syllogistic because even in this kind of logic he relies heavily on term variables and does not really express the elements of the conditional or disjunctive propositions by single variables. Instead, he represents them with expressions such as “H is Z” or “A is B” and the like. These expressions represent the propositions related by the propositional operators, but they contain only term variables; thus, the hypothetical propositions are not represented by propositional variables as they are in modern propositional logic. Nevertheless, when talking about the conditional propositions or the disjunctive ones at a meta-level, he qualifies these elements by using the words “the antecedent” and “the consequent.” This shows that he treats them as wholes at that level, but he does not use single variables within the conditional or disjunctive propositions.

Avicenna’s hypothetical system is thus very closely related to his syllogistic system, and it would be hard to separate them sharply by considering the former as some kind of propositional logic and the latter as a predicate logic in the modern sense. In addition, the system is only partially formalized, which makes it difficult to determine with enough clarity and accuracy the validity of the hypothetical syllogisms and even the definitions of the logical constants used. However, it cannot be judged without entering into all the details of the syllogisms held valid, and this deserves a separate study whose aim would be to precisely determine the improvements provided by this Avicennian hypothetical logic.

11. References and Further Reading

a. Avicenna’s Logical Treatises and Other Primary Sources

  • Al-Farabi, Abu Nasr. “Kitab al-Qiyas.” Rafik Al Ajam. Ed. al-Mantiq ‘inda al-Farabi. Vol. 2. Beirut: Dar el Machrik, 1986a, 11-64.
  • Al-Farabi, Abu Nasr. “Kitab al-Maqulat.” Rafik Al Ajam. Ed. al-Mantiq ‘inda al-Farabi. Vol. 1. Beirut: Dar el Machrik, 1986b, 89–132.
  • Al-Farabi, Abu Nasr. Al-Mantiqiyat li-al-Farabi. Vol. 1. Texts published by Mohamed Teki Dench Proh. Edition Qom. 1988.
  • Aristotle. Prior Analytics. The Complete Works of Aristotle. Revised Oxford Edition. Ed. Jonathan Barnes. Vol. 1. 1991.
  • Avicenna. Mantiq al Mashriqiyyin. Ed. M. al Khatib and A. al Qatlane. Cairo: al maktaba al-salafiyya, 1910, 2-83.
  • Avicenna. An-Najat. Ed. Sabr el Kordi, Mohieddine. 2nd ed. Cairo: Library Mustapha al Babi al Hilbi, 1938.
  • Avicenna. Al-Shifa’, al-Mantiq 2: al-Maqulat. Ed. G. Anawati, M. El Khodeiri, A.F. El-Ehwani, S. Zayed. Cairo: Wizarat al thaqafa wa-l-Irsad al-Qawmi, 1959, 3-273.
  • Avicenna. Al-Shifa’, al-Mantiq 4: al-Qiyas. Ed. S. Zayed. Cairo: Wizarat al thaqafa wa-l-Irsad al-Qawmi, 1964, 3-580.
  • Avicenna. Al-Shifa’, al-Mantiq 3: al-‘Ibarah. Ed. M. El Khodeiri. Cairo: Dar al Kitab al arabi lil tab’ wa-Nashr, 1970, 1-131.
  • Avicenna. Al-Isharat wa l-tanbihat, with the commentary of N. Tusi, intr by Dr Seliman Donya, Part 1, 3rd ed. Dar el Maʻarif: Cairo, 1971.

b. Secondary Sources

  • Arnaldez, R. “Avicenne.” Dictionnaire des philosophes. PUF. 2nd ed. Vol 1(A-J). 1993, 191-199.
  • Ahmed, Asad Q. Avicenna’s Deliverance: Logic. Oxford University Press: Oxford, 2011.
  • Bäck, A. “Avicenna’s Conception of the Modalities.” Vivarium XXX, 2. 1992.
  • Bobzien, S. “Ancient Logic.” Stanford Encyclopedia of Philosophy. Ed. Edward N. Zalta. 2006.
  • Chatti, S. “Syncategoremata in Arabic Logic, Avicenna and Averroes.” History and Philosophy of Logic, 35, 2, 2014a, 167-197.
  • Chatti, S. “Avicenna on Possibility and Necessity.” History and Philosophy of Logic. 2014b, 1-22.
  • Garson, J. “Modal Logic.” Stanford Encyclopedia of Philosophy. Ed. Edward N. Zalta. 2009. http://plato.stanford.edu/entries/logic-modal/
  • Hodges, W. “The Move from One to Two Quantifiers.” The Road to Universal Logic: Festschrift for 50th Birthday of Jean-Yves Beziau. Vol. 1. Ed. Arnold Koslow and Arthur Buchsbaum. Birkhäuser: Basel, 2015, 221-240.
  • Lagerlund, H. “Avicenna and Tusi on Modal Logic.” History and Philosophy of Logic, 30:3, 2009, 227-239.
  • Lukasiewicz, J. “Contribution à l’histoire de la logique des propositions,” tr. J. Largeault. Logique Mathématique. Armand Colin: Paris, 1972.
  • Movahed, Z. “A Critical Examination of Ibn Sina’s Theory of the Conditional Syllogism.” Sophia Perennis, Vol 1.1. Available at: www.ensani.ir/storage/Files/20120507101758-9055-5.pdf
  • Rescher, N. Studies in the History of Arabic Logic, tr. M. Mahrane. Cairo: 1963.
  • Sabra, A. I. “Avicenna on the Subject Matter of Logic,” The Journal of Philosophy, 77, 1980, 746-764.
  • Shehaby, N. The Propositional Logic of Avicenna, tr. Kluwer, D. Reidel, Dordrecht: 1973.
  • Street, T. “An Outline of Avicenna’s Syllogistic.” Archiv für Geschichte der Philosophie, 84 (2), 2002, 129-160.
  • Street, T. “Arabic Logic.” Handbook of the History of Logic. Ed. Dov Gabbay, John Woods. Vol. 1. Elsevier BV: 2004.
  • Street, T. 2008. “Arabic and Islamic Philosophy of Language and Logic.” Stanford Encyclopedia of Philosophy. (Fall 2008 Edition), Ed. Edward. N. Zalta. Available at: http://plato.stanford.edu/entries/arabic-islamic-language/.
  • Thom, P. 2008. “Logic and Metaphysics in Avicenna’s Modal Syllogistic.” The Unity of Science in the Arabic Tradition: Science, Logic, Epistemology and their Interactions. Ed. S. Rahman, T. Street, H. Tahiri. Dordrecht: 2008.

 

Author Information

Saloua Chatti
Email: salouachatti@yahoo.fr
Tunis University
Tunisia

Richard M. Gale (1932—2015)

Richard Gale was an American philosopher known for defending the A-theory of time against the B-theory. The A-theory implies, for example, that tensed predicates are not reducible to tenseless predicates. Gale also argued against the claim that negative truths are reducible to positive ones. He created a new modal version of the cosmological argument for God’s existence, which he later refined with Alexander Pruss. The argument generated considerable interest in the philosophical community. He produced some interesting and sometimes controversial interpretations of both William James and John Dewey. In The Divided Self of William James, he argued that James is a Promethean pragmatist who attempts to ground all truth in ethics. In John Dewey’s Quest for Unity, he represented Dewey as a Promethean pragmatist who holds that human beings are creators of meaning. Gale argued that Dewey attempts to combine this with his own type of mysticism, while never achieving a successful synthesis.

Gale worked at New York University, Hunter College, and Vassar before joining the University of Pittsburgh in 1964 where he remained until retiring in 2003. He was the editor of three books, the author of eight, and he published over one hundred philosophical articles, critical studies, and book reviews.

Table of Contents

  1. Biography
  2. The Philosophy of Time
  3. Negation and Non-Being
  4. The Philosophy of Religion
    1. The Nature and Existence of God
    2. The Gale-Pruss Cosmological Argument
  5. Pragmatism
  6. References and Further Reading
    1. Primary Sources
      1. Books Edited
      2. Books Authored
      3. Articles
      4. Book Reviews
      5. Critical Studies
    2. Secondary Sources

1. Biography

Richard Gale was born on July 13, 1932 and raised on the Upper West Side of Manhattan, where he attended public schools. Gale describes an uneventful childhood in which he was an average student. However, his older sister Zelda was a brilliant student, a fact of which his teachers never tired of reminding him and which, he opines, may have initiated the distrust of school teachings that lasted throughout his life. He recalls a strange sense of detachment during that period, feeling as if he were a visiting cultural anthropologist from Arcturus observing the habits of strange earthlings. This, Gale remarks, prepared him for the detached viewpoint of the philosopher, “the spectator of all time and eternity, bent on trying to understand a world that was not worth taking seriously.” Gale remarks that outward observers would not have felt he was so “out of it” because he dutifully followed all the rituals of “the tribe”, reciting the Pledge of Allegiance, singing the national anthem, and attending Sunday school at the Park Avenue Synagogue which, he wryly notes, “wasn’t on Park avenue”. Although Gale did not have any personal interest in religious questions after his teenage years, one positive experience in his youth was conversing with Rabbi Milton Steinberg, “a very great man”. His main interest at this time was athletics. Gale describes himself as a great player in practice who “stank” in the games because he could not relax. He liked philosophy because it did not require one to perform well in the crucial moment. One can think and rethink it in private until one gets it right. It is fortunate, Gale remarks, that he did not become a brain surgeon.

Gale’s other interest in his youth was music, especially jazz. His father owned the Savoy Ballroom and managed Chick Webb, Ella Fitzgerald, Cab Calloway, and the Ink Spots. He booked many leading black jazz musicians of the day, including Dizzy Gillespie, Charlie Parker, Lester Young, Art Tatum, Erroll Garner, and Sarah Vaughan, many of whom Gale knew as a teenager. He studied a version of the Schillinger system of musical composition and arranging with Edgar Sampson and organized a six-piece combo in high school in which he played piano and did the arrangements. Gale remarks that “after almost every gig we had to change our name so we could get another job”. He began with “Richard Gale and His Manhattan Serenaders,” followed by “Richard Gale and His Rhumba Orchestra,” and, eventually, “Richard Gale and His Society Orchestra.” His “typical gig” was a bar mitzvah in Brooklyn where they were not paid but could eat as much as they wanted. “We were,” Gale admits, “a very oral bunch of guys.”

Gale did not want to go to college but gave in to his parents. Given his expectation that he would go into his father’s music business, he pursued a Bachelor of Music degree at Ohio Wesleyan University, where he soon discovered that he had “everything that it took to be a great composer except talent”. His strategy was to write twelve-tone musical compositions, reasoning that he would not be criticized for them since no one would expect to understand them. When one of his compositions was performed at the college, the best his friends could say was “Very interesting,” which, Gale inferred, meant they hated it.

Since most of Gale’s time was put into his music classes and the Air Force Reserve Officer Training Corps, he had little time for other subjects. Recognizing that his career as a great composer was not on the cards, and, having no idea what philosophy was, he took a philosophy course taught by William Quillian, Jr. Here, he finally found what he was hungering for, something one does for its own sake, but he “did not know what it was”. Gale became an outstanding student in philosophy classes though he still laboured to get B’s in his other courses. Several of his junior year essays received prizes, enabling him to “step out of my sister’s shadow and … believe in myself”. He credits Lloyd Easton, who specialized in American Hegelianism, as his best teacher. A guest lecture on Dewey by George Raymond Geiger, whose “brilliance and humour blew me away,” inspired him to write his senior thesis on Kant’s and Dewey’s aesthetic theories under Easton and visiting professor G.P. Conger. He found Conger, best known for his Theories of Macrocosms and Microcosms in the History of Philosophy, a “delightful and fascinating man.” Years later Gale described his senior thesis as “the work of a mad dog Deweyite that made Kant look like he should have had his tenure revoked.”

Upon graduation in 1954 Richard served two years as an intelligence officer in Japan, which he “loved”. After leaving the Air Force, he joined his father’s music business at BMI as a “song plugger”. His job was to “romance” disc jockeys to “get them to play our songs.” The company had many big hits at that time including Elvis Presley’s “Don’t Be Cruel” and “All Shook Up.” Richard soon discovered that he “hated” the business because to be successful one had to manipulate people into doing one favours and he feared that he was “turning into the sort of person who is successful in this business”.

During this time he took a two-semester night course at NYU on existentialism with William Barrett, where reading Kierkegaard’s Fear and Trembling became “the most important event in my life.” Fear and Trembling was “the moral goose I needed to give me the strength and courage to leave the music business” and “do what I really loved.” He was fascinated by Barrett’s portrait of Kierkegaard’s “knight of infinite faith who can achieve authenticity only through making” a decision that appears “absurd to the world.” The music business “was my Regina!” His parents were not amused. Gale left the music business and began graduate studies at NYU in 1957. During this period he met his wife Maya Mori, who had just graduated with a degree in fine arts from Nihon University in Tokyo and was studying interior design in the United States. Two and a half years later they married.

Although perhaps not the usual experience, Gale found graduate school to be “paradise”: “Every day I had to pinch myself to believe that one could really make a living doing philosophy …” Gale published several of his term papers in respected professional journals but admits that he had a big advantage over his peers. Since normal graduate students had not suffered the horror of the music business, they were at a motivational disadvantage. Gale later described his early publications as proof of the danger of rushing into premature publication, remarking that Walter Stace, for whom he had written one of them, told him that his courage exceeded his ability.

Gale chose NYU because he was “an ardent Deweyite” and wanted to study with Sidney Hook, “the high priest of pragmatism.” His master’s thesis is titled Dewey and the Paradox of the Alleged Futurity of Yesterday. However, he wrote a paper for visiting professor Antony Flew on McTaggart’s views on time and made a “decisive turn to ordinary language philosophy.” This began his passion for the problem of time that never left him.

He wrote term papers on Aristotle’s theory of time, Kant’s theory of time, Husserl’s theory of time, and so forth, one after another in every subsequent course. He wrote his Ph.D. thesis, The Concept of Time in 20th Century Analytic Philosophy, under Milton Munitz, “a saintly man whose egoless love of philosophy was inspiring.”

Gale entered the job market in January of 1961 and discovered that he had a peculiar knack for “blowing” job interviews because his aggressive quality “scared people.” In a panic, he changed strategy and took his wife Maya along to an interview at Vassar. He was offered the job. According to Gale, the better part of his success in life can be credited to Maya. His major accomplishment was “marrying a woman who was a much better person than I.” After he met her at a party in New York City in 1957, while he was still in the music business, she became “my moral and spiritual vitamin pills, the centre of gravity for my life.” In the acknowledgments in The Divided Self of William James he profusely thanks Maya for her strength, courage, loyalty, spirituality, and sweetness, and adds “in the most heartfelt words I ever wrote [that it] is no accident that animals and birds are attracted to her. They know something.” With his relationship with his wife on a firm footing it was no surprise when children began appearing in quick succession. Andrew was born in 1961, Laurence in 1963, and Julia in 1965.

Soon after becoming an instructor at Vassar, Gale wanted to relocate. Although the students were good and his colleagues “pleasant enough” he needed to get to a top department where his colleagues and graduate students “would beat on me and from whom I could learn.” “I had to publish my way out of Vassar” because Vassar was filled with great teachers “who thought that if you published it proved you didn’t love your students.”

Eventually Gale received offers from Santa Barbara and Pittsburgh. His reaction to Pittsburgh was: “love at first sight.” Pittsburgh was “the space-time capital of the world.” In addition to praising the distinguished Pitt faculty he describes how the “hot shots” among the graduate students “put the fear of God in me” (high praise indeed, since God himself had failed to do so). Pitt’s philosophy “hot shots” engendered “panic attacks” on his way to his graduate classes. He audited Sellars’ courses and describes Sellars as “the master tease”. A typical Sellars undergraduate lecture would consist of about thirty minutes of brilliant discussions of the history of ideas, followed by about twenty minutes of criticisms of someone’s views, but when it came time to deliver the goods the “bell would ring” and Sellars “would issue a promissory note that never got cashed.” Gale remarks that Sellars, “like the Shadow” had “the ability to cloud men’s minds and make them see what he wanted them to see,” a remark reminiscent of the criticisms Nietzsche levelled at Socrates. Gale later audited classes by Hempel, Salmon, Glymour, Earman, and numerous international visitors at Pitt’s world-famous Center for Philosophy of Science. Gale opines that he received “the equivalent of a second Ph.D. in philosophy” at Pitt.

Although Gale had not had any personal interest in religion since his youth, he became interested in the philosophy of religion through his teaching assignments, for he loved to argue and “no area is more loaded with foxy arguments than the philosophy of religion.” Gale made it his vocation to be the “fly in the ointment” of the analytically-oriented contemporary defenders of theism, Plantinga, Alston, Adams, and Swinburne. It therefore came as a surprise to him when he came up with a new version of the cosmological argument for God’s existence, which he believed works, “sort of”.

Gale called his friend Adolf, “an atheist’s atheist,” and laid out his new argument. “What’s the catch?,” Adolf asked, and Gale answered “There isn’t any: It works.” After a long silence Adolf asked, “Why did you do it, Richard?” Even though Gale passionately defended his new cosmological argument, he remained a non-believer: “Philosophy has never entered into his personal life.” When asked if he is an atheist, agnostic, or theist, he replied, “None of the above.” “If one needs to ask what is the meaning and purpose of life … one has lost one’s way”. “The examined life isn’t worth living”. “The value of doing philosophy professionally is that it enables one to live unphilosophically.”

Gale’s reservations about the value of philosophy were not simple anti-intellectualism. In response to those who feel that being a good person requires that one engage in philosophical reflections on what sorts of reasons should guide one’s life, Gale replied that he is “incapable of that kind of self-reflection.” The one thing that “always shall be and always should be a mystery to us is ourselves,” a view once again reminiscent of remarks by Nietzsche. Gale admitted one exception to keeping philosophy out of his personal life. He reported telling his friend Alexander Pruss that his defence of his cosmological argument is the first time he became personally involved in a philosophical argument, to which Pruss replied that this is because it is his only argument with a true conclusion.

Richard Gale was a great teacher and a warm and caring person, helpful to students and colleagues, always ready to talk philosophy, and possessed of a wicked sense of humour that brought much levity into the world. After a remarkably varied and satisfying life, Gale passed away in his sleep on July 19, 2015 in Knoxville, Tennessee.

2. The Philosophy of Time

Richard Gale published his widely used anthology, The Philosophy of Time with Anchor in 1967. He recalls requesting that a picture of “the river of time be put on the cover” with an “observer on the bridge shining [a] spotlight of presentness onto the water, illuminating one of the events in [their] history floating on the river’s surface” and was “floored” by the “brilliantly astute” question from Anchor’s art department: “What age should the observer be?” This, Gale remarks, discloses the way in which the observer on the bridge has to be “a transcendent spook, similar to Vonnegut’s [timeless] Tralfamadorians.” Gale asked whether Anchor could put a Picasso-like multi-dimensional depiction of the observer to capture the observer at every age in their life. Anchor declined but put a picture of a pocket watch reflected in several mirrors on the cover, illustrating the difficulty in obtaining a non-paradoxical portrait of the observer’s perspective on their own temporal history.

The distinction between the A and B theories of time was first made by McTaggart in 1908. In 1968 Gale published The Language of Time in which he gives an “impassioned defence” of the A-Theory of time. The competing B-theory is the view that time is nothing but a temporal series of events running from earlier to later, where the distinctions between past, present and future are reducible to temporal relations between the events that are in time. Gale’s A-Theory holds that tense-distinctions are irreducible. B-Theorists typically reply that if tensed distinctions are irreducible it is because they are subjective. It is, however, worth noting that later B-theorists, such as Mellor and Smart, admit that the translation of A-sentences into B-sentences often doesn’t succeed, but hold that one can explain the truth conditions of any tensed declarative sentence without appeal to tensed facts, and then use Occam’s razor to get rid of the offending tensed sentences. See “Are There Essentially Tensed Facts?”

B-Theorists often argue for the subject-dependency of the “now” or the present, and thus of the past and future, by an analogy between the “now” and the “here”. B-theorists argue that since everyone admits that there is not an objective “here” (everywhere can be “here” to someone), and since the logic of “now” is analogous to the logic of “here”, the “now” is as subjective as the “here”. Gale replies that there are deep disanalogies between the “now” and the “here” that suggest that the “now” is objective in a way that the “here” is not because our temporal perspectives are “imposed on us” in a way that our spatial perspectives are not. S, currently in Pittsburgh, calls Pittsburgh “here,” but could easily hop a plane to Singapore, whereupon Singapore would be “here,” and then, hop another plane back to Pittsburgh whereupon Pittsburgh would again be “here.” By contrast, S, in 2016, cannot hop into a device that transports S to 100 B.C. and then return via the device, to 2016 again.

Gale (1964c) argues that agential-based asymmetries between the past and the future are rooted in our concept of causation for which there are no spatial analogies. We can bring about events in the future but not the past. S, currently in Pittsburgh, can, theoretically, causally impact events at any point in space, for example, in Singapore, but cannot causally impact events that happened in 100 B.C. Space and time, from the perspective of the causal-agent, are just different.

Gale stresses that he is not arguing for the objective reality of a “queer” entity, the present or the “now”—which is disclosed by that transcendent spotlight of presentness on the River of Time. That “queer” view of temporal becoming leads to a contradiction. Rather, “The present” and the “now” are rigid designators: The proposition expressed by “Now might not be now” is necessarily false. However, if some time passes, and the present shifts to a later time, “then now will not be now – this very moment of time – at some later time, in violation of the necessity of identity between individuals designated by rigid designators.” As Gale puts it, “entiative theories about the present”, theories that see time as some queer sort of entity, confuse “the what with the how of reference.” There is nothing mysterious about what is denoted by a use of “now”. If Sue tells her partner “now’s the time,” no one need consult a philosopher about what is meant. What is mysterious is the how in such references because “we are prisoners of time [with] a unique temporal perspective … imposed upon us.”

Finally, Gale discusses the disagreement between the A- and B-theorists regarding whether there is “a bifurcation between man and nature.” Smart and Grünbaum argue that science abstracts from personal indexical expressions because they are subjective and not needed for a complete objective description of the world. It is not important to science that someone is now rolling balls down inclined planes with certain results. It is only important that at some time t1 certain balls were rolled down an inclined plane with certain results. Gale calls this the “error theory of tensed temporal perspectives”. Presupposing his views about the objective disanalogies between spatial and temporal indexical expressions, Gale rejects this as a “scientistic” bias that needs only to be stated clearly to be refuted.

3. Negation and Non-Being

Gale believes that his critique of B-theories of time prepared him for his next major interest in negation and non-being. Just as the early B-Theorists of time attempt to translate all A-propositions into B-propositions, some philosophers aspire to translate all negative statements into positive ones (to eliminate irreducibly negative truths about the world). Gale attempts to escape this reduction by devising adequate criteria for the legitimate use of negative propositions. Gale (1970a) argues that otherness and incompatibility analyses fail to reduce negative to positive propositions. One cannot analyse “This is not green” into the conjunction of “Every positive property of this is other than greenness” and “There is some positive property of this that is incompatible with greenness.”

Although Gale believes that the attempt to reduce negative propositions to positive incompatibility propositions fails, he does credit it with helping to demystify negative facts and events. The problem with negative facts is that we do not possess adequate identity-conditions for them. There is no non-arbitrary criterion for answering questions like “How many forest fires did not occur yesterday?” Even though Gale rejects the attempt to reduce negative to positive propositions, he admits (1972) that incompatibility and otherness analyses do give useful extensional criteria of identity for negative events which satisfy Parmenides’ injunction against referring to that which is not (because these sorts of analyses only quantify over positive properties and existent individuals). If asked how many forest fires did not occur yesterday one can reply, “One did not occur in the Allegheny National Forest, another did not occur in the Mendocino National Forest, another did not occur …, and so on” (where the “and so on” is not problematic because it only alludes to existing forests).

4. The Philosophy of Religion

a. The Nature and Existence of God

Although it may seem that analytical philosophy is inherently antithetical to religious belief (1991a, 2), a new breed of analytically-oriented philosophers, notably Plantinga, Alston, Adams, and Swinburne, has emerged to defend the rationality of theism, the view that an omnipotent, omniscient, and omnibenevolent creator of the world exists. Gale’s On the Nature and Existence of God is the fruit of his aim to be the “fly in the ointment” of these analytical theists. Gale’s project is positive in that it aims to improve the theistic concept of God. It is negative in that it argues that the grounds for theism are often shaky (1991a, 2-3). The spirit of Hume’s Philo imbues the book (1991a, 2). Gale does not pretend to answer the question of God’s existence because he does not consider arguments from beauty or design (1991a, 1). The first part of the book, “Atheological Arguments,” considers arguments that attempt to show that there is a logical inconsistency in the theist’s concept of God (1991a, 15). “Blasphemy aside,” this part of the book helps one “redesign” the concept of God (1991a, 3-4). The second part, “Theological Arguments,” critically examines the main traditional arguments for theism. Gale divides these into epistemological arguments (the ontological argument, the cosmological argument, and the argument from religious experience) and “prudential arguments,” which purport to establish the prudential or moral benefits of believing in the theist’s God. Gale is primarily concerned with the former and only deals with prudential arguments in Chap. 9 (1991a, 3).

Since the sort of “historical-cum-indexical” account often given for fixing the reference of ordinary proper names does not work for supernatural beings, and since different theists have widely divergent views about their God, Gale, in the introduction, attempts to show how “we” and Abraham can refer to the same God (1991a, 4-10). Gale’s positive account, which does not imply that God exists, distinguishes between “hard” and “soft core” features of God (1991a, 7). A hard core feature, which is important in determining reference, is one without which God could not be God (1991a, 10). Soft core features can be abandoned without change of reference. Absolute simplicity is given as an example of a soft core feature and being supremely great as a hard core feature (1991a, 8). The hard core features are high level “emergent” properties, for example, that God is eminently worthy of worship is an “emergent” property that supervenes on God’s lower-level properties of omnipotence, omniscience, and so forth. Gale does not provide an account of this kind of emergence, claiming that the emergence-connection is “loose” (1991a, 8). Gale’s positive account of how “we” fix the reference of “God” involves both an historical-causal theory and the requirement that the co-referring speakers share the same form of religious life (1991a, 9-10). What counts as the same religious community over time is “a deep issue” that Gale does not pretend to solve (1991a, 10-11).

Chap. 3 deals with “The Omniscience-Immutability Argument.” That is, if God is omniscient He must know everything. That means that He must know temporal facts in the A-series (involving irreducible indexical expressions like “It is now 3 PM”). However, if God is not in time, He cannot know A-series facts because He has no “now” to know (1991a, 58). But if He is in time then He is not immutable. For if He knows what is now true then what is true of Him differs from moment to moment. Gale considers whether one might escape this dilemma by restricting God’s omniscience to B-series facts (A-series facts being beneath His timeless knowledge) (1991a, 60). However, it is religiously important for God to know A-series facts (1991a, 72-73), for example, for God to know that Abraham is now preparing to sacrifice Isaac. Gale concludes that one must eliminate timelessness and immutability from the concept of God (1991a, 95-97).

Chap. 4, the longest chapter in the book, discusses the deductive problem of evil. Gale argues that Mackie’s arguments either fail or depend on disputable premises. He reformulates this as the problem of what “morally exonerating excuses” God might have to allow evil in the world (1991a, 32, 105). If no morally exonerating reasons are found then one concludes on inductive grounds that theism is incompatible with the existence of evil. If, however, theists can show that God has morally exonerating reasons for allowing evil then theism is consistent with the existence of evil. Gale judges that Plantinga’s version of the “free will defence,” the view that the existence of free will outweighs the evil produced by free will, “is a thing of beauty” (1991a, 113). Plantinga’s key claim is that it is logically possible that a maximally perfect God is not able to create a world in which there are persons that are free to do good or evil but that always do the good. Since a world with free persons who do more good than evil is better than any world without free persons, it seems reasonable to infer that it is logically possible that God has a morally exonerating excuse for permitting evil by creating such a world. Gale rejects most of the common objections to Plantinga’s free-will defence. However, Gale argues that Plantinga’s free-will defence eliminates human freedom and makes God not only responsible for people’s evil deeds but morally blameworthy for them.

First, Gale argues that Plantinga’s free will defence makes God the cause (in the sense of moral responsibility and blame) of the actions of those created free persons. Since God knows that S will freely do evil then, in creating S, God is both responsible and blameworthy for the evil that S does (1991a, 153-156). Second, Gale argues that since Plantinga’s God has “middle knowledge” of His created human beings, that is, knowledge of what possible things would do in different circumstances (1991a, 131), they are not really acting freely (1991a, 133). God does not merely know that possible person J is constructed in a certain way. God knows that if J is put in a certain circumstance C then J will behave in a certain way W. Gale then urges an analogy between God’s creating J while possessing “middle knowledge” of what J will do in C and cases where someone programs J’s psychological makeup so that J will act in a certain way in C. If one induces J, who has an amorous nature, to call Alice for a date by telling J that Alice is interested in him, one does not cancel J’s freedom: one does not drug, hypnotize, or brainwash J (1991a, 157). However, the nature of God’s control over His created beings is freedom-cancelling (1991a, 122, 153ff). Gale does not claim this analogical argument is conclusive but only that it casts reasonable doubt on Plantinga’s free-will defence (1991a, 158-160).

In the second part of the book Gale considers several of the traditional arguments for the rationality of belief. Chap. 6 considers the ontological argument, which comes in two forms, Anselm’s original form and the updated modal version. Since Gale agrees with Plantinga’s critique of Anselm’s argument (2003a), most of his discussion is concerned with Plantinga’s modal argument. Plantinga’s argument requires the premise that there is some possible world that contains a being that necessarily exists and is maximally perfect. Following modal theorem S5 (If it is possible that it is necessary that p then it is necessary that p), Plantinga infers that God necessarily exists. Gale replies that many atheists would not agree that there is some possible world in which a necessarily existing maximally perfect being exists (1991a, 226). Indeed, since in S5 possibly necessary means necessary, accepting that possibility-premise amounts to accepting the conclusion, so asking the atheist to accept it begs the question.

Chap. 8 considers the argument from religious experience. Gale questions whether powerful religious experiences are cognitive and whether they provide evidence for the existence of God (1991a, 286). The argument that religious experiences are cognitive is usually premised on an alleged analogy to sensory experiences (1991a, 316). Gale argues that arguments by Alston, Gutting, Swinburne, and Wainwright fail to show that religious experiences are relevantly analogous to sensory experiences (1991a, 326). Since God does not occupy space and time, there can be no veridical (non-sensory) experience of God, and if an experience is non-veridical it cannot constitute evidence for the existence of its “object.”

The most famous of the “prudential” arguments is Pascal’s wager. Gale first objects that “Pascal’s wager” is not really a wager (1991a, 353). If God exists and one bets wrong, one does not lose one’s earthly life. Gale also raises the “many Gods” objection against Pascal. That is, formulations of the “wager” presuppose that either there is a God who rewards believers with infinite bliss or there is no God at all. However, many kinds of God are possible. Imagine a God who rewards people who step on every third sidewalk crack and condemns those who do not to infinite punishment (1991a, 350). The logic of the “wager” gives one as much reason to believe in this bizarre God as it does to believe in the theist’s God.

Gale argues against James’ “Will to Believe” argument, claiming that James fails to show that one can have sufficient moral reasons for “self-inducing an epistemically unsupported belief” (1991a, 283). Rather, to believe propositions unsupported by evidence violates one’s duty as a rational person and undermines one’s own personhood (1991a, 372, 376, 382-383).

b. The Gale-Pruss Cosmological Argument

Gale (1999c) published a new modal version of the cosmological argument. Gale and his former student, Alexander Pruss, jointly published a simplified and strengthened modal version of the argument (hereafter GP) titled “A New Cosmological Argument.” GP does not purport to prove the existence of Anselm’s God but does purport to prove that “there [necessarily] exists in the actual world a very powerful and intelligent supernatural designer-creator of this world’s universe.” Call this being, similar but not identical to Anselm’s God, “G”. GP also holds that G is a self-explaining being in the sense that there is a successful ontological argument for its existence, even if one is not able to provide it.

GP presupposes the notion of a possible world. GP also employs the notions of contingent and necessary propositions. Contingent propositions are both possibly true and possibly false (true in some possible worlds and false in others). Necessarily true propositions are true in all possible worlds. GP holds that a possible world is “a maximal, compossible conjunction of abstract propositions”. It is maximal in the sense that for every proposition q, either q or not-q is a conjunct in this conjunction. Every possible world is compossible in the sense that it is conceptually or logically possible that all of the conjuncts in the conjunction that specifies that world can be true together. There are no impossible (contradictory) worlds. There are also possible worlds that have different laws of nature than those in the actual world. Further, GP treats necessary truths as facts. It is a “fact” in the actual world that it is not both raining and not raining (at the same time in the same place).

GP admits that it would be unfair to assume the strong Principle of Sufficient Reason (PSRs) that there is a sufficient explanation for any fact in the world because that is close to assuming that there is a sufficient first cause for these facts, which most opponents of the cosmological argument would reject. GP only asserts a weak version of PSR (PSRw), that it is possible that such facts have an explanation. Opponents of the cosmological argument cannot reasonably deny that a sufficient explanation for such facts is possible. Finally, GP assumes the modal axiom S5 that if it is possible that it is necessary that p then it is necessary that p. For example, since it is possible that 2 + 2 = 4 is necessarily true then, by S5, it is necessarily true that 2 + 2 = 4.
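The contrast between the two principles can be set out in modal notation. The rendering below is only a sketch and not GP’s own formulation: the predicate E(q, p), read “q explains p,” and the quantification over propositions are notational assumptions introduced here for illustration.

```latex
% Assumes amsmath and amssymb. E(q,p) ("q explains p") is a hypothetical
% predicate introduced for this sketch; p and q range over propositions.
\begin{align*}
\mathrm{PSR}_s &: \quad \forall p\,\bigl(p \rightarrow \exists q\, E(q,p)\bigr) \\
\mathrm{PSR}_w &: \quad \forall p\,\bigl(p \rightarrow \Diamond\,\exists q\, E(q,p)\bigr)
\end{align*}
```

On this rendering the two principles differ only by the possibility operator in the consequent, which is why the weak version looks so much harder to deny.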

The simplified core of GP is this: (1) If it is possible that a necessary supernatural creator of the actual world, G, exists, then it is necessary that a supernatural creator of the world, G, exists. (2) It is possible that a supernatural creator of the world, G, exists. Therefore, (3) it is necessary that a supernatural creator of the world, G, exists.
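Schematically, the core argument is a modus ponens on an instance of the S5 principle. In the sketch below, g is a label introduced here for “a supernatural creator of the world, G, exists,” and premise (2) is read as asserting the possibility of a necessary creator, which is what the antecedent of premise (1) requires.

```latex
% Assumes amsmath and amssymb. The letter g is a label introduced for this sketch.
\begin{align*}
(1)\quad & \Diamond\Box g \rightarrow \Box g && \text{instance of S5} \\
(2)\quad & \Diamond\Box g && \text{possibility premise} \\
(3)\quad & \Box g && \text{from (1) and (2) by modus ponens}
\end{align*}
```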

The first premise is an instantiation of S5. If one accepts S5, the second premise is the key and most of GP is concerned with the second premise. GP defines the Big Conjunctive Fact (BCF) of a possible world as the maximal compossible conjunction of all propositions that would be true of that world if it were actualized. The BCF of the actual world consists in all the propositions (both necessary and contingent) that are actually true.

Since all possible worlds share the same set of necessary propositions, the distinction between the different possible worlds resides in their different contingent propositions that would be true in those worlds if they were actualized. A contingent proposition (or being) is one that possibly, in a broadly conceptual or logical sense, is true (or existent) and possibly is false (or nonexistent). GP accordingly defines the Big Conjunctive Contingent Fact (BCCF) of a possible world, which contains the contingent propositions that would be true in that world, were it actualized. The different possible worlds are thus individuated by their different BCCFs. Since every possible world is maximal, then, for every contingent proposition p, either p or not-p is a conjunct in its BCCF. Thus, no two possible worlds can have the same BCCF.

GP calls the BCCF of the actual world “p”. Thus, p includes the existence of and non-existence of all contingent beings in the world and the occurrence and non-occurrence of all contingent events in the actual world, but it also includes all the contingent acts of any necessary beings that may exist in the actual world. GP argues that p has a sufficient explanation in the actual world. Although many philosophers will deny that p does or must have a sufficient explanation, they cannot deny that it is possible that there is some possible world w1 that contains p but also contains q and the proposition that q explains p. But this hypothetical possible world w1 turns out to be identical with w, the actual world. Suppose there were some difference between w and w1. Then, since possible worlds are maximal, there must be some proposition r true in w1 but not true in w. But if r is true in w1 and not in w, then, by the law of bivalence, ~r is true in w. Since, however, by hypothesis, everything actually true in w is also true in w1 (because w1 contains p), ~r is true in w1 as well. But this means that r and ~r are true in w1, which is a contradiction, and that cannot be, because w1 is supposed to be a possible world. Since the assumption that w1 is different from w implies a contradiction, it follows that w1 is the same as w. Thus, the actual world w contains q and the proposition that q explains p. Thus, the actual world w contains a sufficient explanation of its p.
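The step from “w1 contains p” to “w1 is w” can be illustrated with a toy model. The sketch below is not part of GP: it assumes a finite stock of three hypothetical atomic contingent propositions (a, b, c), treats a world as a complete truth-value assignment to them, and ignores necessary truths as well as q and the explanation relation. All function names are introduced here for illustration.

```python
from itertools import product

# Hypothetical atomic contingent propositions; a real BCCF would not be finite.
ATOMS = ("a", "b", "c")


def all_worlds(atoms=ATOMS):
    """Enumerate every maximal world: each atom is assigned True or False."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))]


def bccf(world):
    """Return the world's contingent 'fact' as a set of (atom, truth value) pairs."""
    return frozenset(world.items())


def contains(world, conjunction):
    """Check that every conjunct of the conjunction holds in the world."""
    return all(world[atom] == value for atom, value in conjunction)


def worlds_containing(conjunction, atoms=ATOMS):
    """List the maximal worlds in which the whole conjunction is true."""
    return [w for w in all_worlds(atoms) if contains(w, conjunction)]


if __name__ == "__main__":
    actual = {"a": True, "b": False, "c": True}  # stand-in for the actual world w
    p = bccf(actual)
    # Any world w1 that contains p must be w itself.
    print(worlds_containing(p) == [actual])
```

Because each world settles every atom one way or the other, a world that agrees with w on the whole of p has no room left to differ from w, which mirrors the role maximality plays in the argument above.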

What is the nature of this q? GP claims that it is a conceptual truth that the only sorts of explanations we can conceive of are scientific and personal explanations. Call this claim SPE. A scientific explanation explains why some proposition is true by reference to some conjunction of law-like propositions and at least one contingent proposition that reports a state of affairs at some time. A personal explanation elucidates why some proposition is true by reference to the intentional action of an agent. GP admits that there might be types of explanation that are beyond our ken, but, “in philosophy we ultimately must go with what we can make intelligible to ourselves ….” One is left with the choice between scientific and personal explanations. But q cannot be a scientific explanation. Since scientific propositions are contingent, this would mean that q is part of w’s BCCF and that would mean that the explanation of w’s BCCF is a part of that BCCF. Since “law-like propositions cannot explain themselves,” q can only, by disjunctive syllogism, be a personal explanation that reports the intentional actions of some being.

However, q cannot report the actions of a contingent being. If that being were contingent, the proposition that states its existence would be part of the BCCF of w and, therefore, “q itself is not able to explain why the contingent being it refers to exists, since a contingent being’s intentional action” presupposes “and hence cannot explain, that being’s existence.” Thus, q refers to the intentional actions of a necessary being.

One might assume that q is a necessary proposition because it reports the action of a necessary being. But appearances are deceiving. GP argues that “q is a contingent proposition that reports the free intentional action of a necessary being.” GP’s most favored argument for the claim that q is a contingent proposition is a reductio based on the assumption that q is necessarily true. If q is necessarily true, then q is a conjunct in the BCF of all possible worlds. Since, however, q entails p (the BCCF of w), and since a possible world is individuated by its BCCF, it follows that every possible world is identical with w, which means that there is only one possible world. Since, despite protests from Spinoza and Leibniz, this is absurd, q is a contingent proposition.
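The reductio can be laid out step by step. The sketch below is a reconstruction under stated assumptions, not GP’s own presentation: it restates the collapse into a single possible world as a clash with the contingency of p, and it assumes that p has at least one contingent conjunct, so that p is itself contingent.

```latex
% Assumes amsmath and amssymb. p is the BCCF of the actual world w;
% q is the proposition that explains p.
\begin{align*}
(1)\quad & \Box q && \text{assumption for reductio} \\
(2)\quad & \Box(q \rightarrow p) && \text{$q$ entails $p$} \\
(3)\quad & \Box p && \text{from (1) and (2), distributing $\Box$ over $\rightarrow$} \\
(4)\quad & \Diamond\neg p && \text{$p$ is contingent} \\
(5)\quad & \neg\Box p && \text{from (4), duality of $\Box$ and $\Diamond$} \\
(6)\quad & \bot && \text{from (3) and (5)} \\
(7)\quad & \neg\Box q && \text{from (1)--(6), discharging the assumption}
\end{align*}
```

Step (3) says that p is true in every possible world, which is the formal counterpart of the claim that every world would be identical with w. Since q is true in the actual world and, by (7), not necessary, q is contingent.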

Since G is a necessary being it satisfies one key component of the traditional notion of a theistic God. But some versions of traditional theism, for example, Leibniz’s, found it hard to account for the apparent contingency in the world. That is, even though Leibniz condescends to call certain propositions “contingent”, he is committed to hold that from God’s point of view (the only truly adequate one) all propositions are necessary (Russell, 1967, 60-61). Thus, Leibniz cannot use GP because there is, for Leibniz, really no such thing as a BCCF. The ingenuity of GP is that it finds genuine roles, which Leibniz could not accept, both for the contingent and the necessary. The genuine contingency in the world rests on the free choices of a necessary G.

Oppy (2000) counters that GP is committed to PSRs. Since accepting PSRw commits the opponents of theism to accepting PSRs, GP is committed to accepting the PSRs that it hoped to avoid. Gale and Pruss (2002) accept that PSRw entails PSRs but hold that though PSRs is entailed by their argument, it is not a premise in their argument. Thus, the only reason opponents of GP have for rejecting PSRw is that it leads to the theistic view that they reject.

Despite the fact that Gale has fiercely defended GP, he remains an “unbeliever”. So “why did you do it Richard?” The answer is that Gale is still the “detached … spectator of all time and eternity” following the logic wherever it leads, even to a proof of God’s existence, in which he is personally uninterested, but which, he believes, works, “well sort of”.

5. Pragmatism

 In the early 1990’s Gale found himself drawn back to his “old flames”, James and Dewey, and “fell madly in love with James again”. Gale’s book, The Divided Self of William James, attempts to formulate James’ ethics based on his “Promethean pragmatism”. James’ Promethean pragmatism consists in three propositions: 1.) We are always morally obligated to maximize desire satisfaction over desire dissatisfaction, 2.) Belief is an action, therefore, 3.) We are always morally obligated to believe in such a manner that maximizes desire satisfaction over desire dissatisfaction (1999a, 11, 25). James’s aim is to ground, not just beliefs, but also meaning, reference, and truth, in ethics. Working from his basic empiricism, James attempts to determine the ontological status of ethical terms by analysing one’s experiential reasons for predicating them, which leads him to reject the Platonic view that the good is determined by the Form of the Good prior to the existence of sentient beings. Gale approvingly quotes James’ remark that “neither moral relations nor the moral law can swing in vacuo” (1999a, 27). Gale argues that one of the reasons for the failure to appreciate James is that Dewey co-opted him for his own ends by naturalizing James’ views, thereby eliminating the mystical and spiritual dimensions that had motivated James (1999a, 335).

Gale’s book, John Dewey’s Quest for Unity, is an attempt to come to terms with his other early love. Gale argues that Dewey attempts to combine Prometheanism with his own unique type of mysticism, but never achieves a successful synthesis. Gale’s Dewey sees human beings as Promethean creators of meaning via action in nature where artistic creation is the paradigm of creative synthesis (but something similar holds for the creation of knowledge and moral action). The problem, according to Gale, is that the remnants of the absolute Idealism from Dewey’s early Hegelianism extend into his mature period (2010a, 163). Though Dewey constantly appeals to experience, he has two notions of experience, one being the ordinary common sense notion of experience that results from the interaction between an organism and its environment, the other being a Plotinian-Hegelian “Absolute, the only true individual, with everything emanating out of it” (2010a, 163).

Before beginning the Dewey-book, Gale feared that he would not be able to write a very positive book on Dewey. Many critics agree, arguing that Gale greatly misunderstands Dewey. Dewey’s problem, Gale holds, is that he tries to ground his grand normative vision in a “misbegotten metaphysics” (2010a, 16). Gale admits that his vision of a Hegelian Absolute experience “goes well beyond the letter of [Dewey’s] text” (2010a, 162). Gale replies that he is trying to free Dewey’s view, with which he obviously has some sympathy, from the remnants of an obscure Hegelian metaphysics that distorts Dewey’s potentially very valuable philosophy. Indeed, Gale states that if the greatness of a philosopher consists in the beneficial effects their writings have upon the reader, then “Dewey must be reckoned the greatest philosopher of all time” (2010a, 9).

6. References and Further Reading

a. Primary Sources

i. Books Edited

  • Gale, Richard, (ed.). The Philosophy of Time. New York: Anchor Doubleday Books, 1967
    • Gale contributed five ten-page introductions to the five sections of the book.
  • Gale, Richard, (ed.). The Blackwell Companion to Metaphysics Oxford: Blackwell, 2002
  • Gale, Richard, with Pruss, Alexander, ed’s. The Existence of God International Research Library of Philosophy Dartmouth Publishing, 2003
    • This anthology of articles by numerous authors provides a useful companion to Gale’s On the Nature and Existence of God and contains a “mammoth” introduction by Gale and Pruss.

ii. Books Authored

  • Gale, Richard. The Language of Time London: Routledge & Kegan Paul, 1968
    • Gale’s primary articulation and defence of the A-theory of time against the B-theory
  • Gale, Richard. Negation and Non-being, American Philosophical Quarterly Monograph No. 10. Oxford: Blackwell, 1976
    • Gale claimed that this book fell “stillborn from the presses” because it went un-reviewed but holds that it was at the time the only systematic historical and philosophical discussion of these issues.
  • Gale, Richard. On the Nature and Existence of God London: Cambridge University Press, 1991a
    • Gale’s critical response to a variety of different analytical arguments, both for and against theism
  • Gale, Richard. The Divided Self of William James London: Cambridge University Press, 1999a
    • Considered by many to be Gale’s best book, Gale describes James as a synthesis of ‘Promethean’ Pragmatism, which holds that language and concepts are a means for controlling nature, and a mystic who believes that ultimate reality is inaccessible to conceptualization.
  • Gale, Richard. The Philosophy of William James: an Introduction. London: Cambridge University Press, 2004
    • An accessible introduction to Gale’s view of James
  • Gale, Richard. On the Philosophy of Religion Boston: Wadsworth, 2007.
    • A useful clearly written textbook on central issues in the philosophy of religion
  • Gale, Richard. John Dewey’s Quest for Unity: The Journey of a Promethean Mystic. Amherst: Prometheus Press, 2010
    • Argues that Dewey’s philosophy is valuable because it attempts, unsuccessfully, to synthesize the Promethean view that human beings are creators of meaning with Dewey’s own brand of mysticism
  • Gale, Richard. God and Metaphysics Amherst: Prometheus Press, 2010
    • A collection of Gale’s seminal articles in the various areas of philosophy, including God, time, non-being, and pragmatism

iii. Articles

  • Gale, Richard. “Russell’s Drill Sergeant, Bricklayer and Dewey’s Logic,” Journal of Philosophy 56 (1959): 401-406
  • Gale, Richard. “Natural Law and Human Rights,” Philosophy and Phenomenological Research 20 (1960a): 521-531
  • Gale, Richard. “Mysticism and Philosophy,” Journal of Philosophy 57 (1960b): 471-481
  • Gale, Richard. “Endorsing Predictions,” Philosophical Review 70 (1961a): 376-385
  • Gale, Richard. “Professor Ducasse on Determinism,” Philosophy and Phenomenological Research 22 (1961b): 92-96
  • Gale, Richard. “Tensed Statements,” Philosophical Quarterly 12 (1962a): 53-59
  • Gale, Richard. “Can a Prediction ‘Become True’?” Philosophical Studies 13 (1962b): 43-46
  • Gale, Richard. “Dewey and the Problem of the Alleged Futurity of Yesterday,” Philosophy and Phenomenological Research 22 (1962c): 501-511
  • Gale, Richard. “A Reply to Smart, Mayo, and Thalberg on ‘Tensed Statements’,” Philosophical Quarterly 13 (1963a): 351-356.
  • Gale, Richard. “Some Metaphysical Statements about Time,” Journal of Philosophy 60 (1963b): 225-237
    • Argues that the paradox in various metaphysical statements about the unreality of time is revelatory
  • Gale, Richard. “Is It Now Now?” Mind 73 (1964a): 97-105.
    • Attempts to exhibit the three necessary conditions for tensed communication. Argues that the Present and the ‘Now’ are rigid designators and therefore are not subjective
  • Gale, Richard. “A Reply on the ‘Alleged Futurity of Yesterday’,” Philosophy and Phenomenological Research 24 (1964b): 421-422
    • Discusses the question whether there is a paradox in the pragmatist view that the meaning of any statement is the sum-total of experiential consequences that can be found in our experiential future
  • Gale, Richard. “The Egocentric Particular and Token-Reflexive Analyses of Tense,” Philosophical Review 73 (1964c): 213-228
    • Argues against the B-theorist’s view that the seemingly irreducible features of tensed propositions are subjective
  • Gale, Richard. “On Believing What Isn’t the Case,” Proceedings of the XIIIth International Congress of Philosophy (1964d)
  • Gale, Richard, with Douglas McGee and Frank Tillman. “Ryle on Use, Usage, and Utility,” Philosophical Studies 15 (1964): 57-60
    • Argues that Ryle’s distinction between linguistic use and usage, in order to distinguish the practice of philosophers from that of grammarians and philologists, fails
  • Gale, Richard. “Falsifying Retrodictions,” Analysis 26 (1964e): 6-9
    • Disposes of several counterexamples to the view that there is no way for us to act now to falsify retrodictions
  • Gale, Richard, and Irving Thalberg. “The Generality of Predictions,” Journal of Philosophy 62 (1965a): 195-210
    • Argues that there are certain crucial logical asymmetries between past and future that are deeply entrenched in the way we speak about the past and the future
  • Gale, Richard. “Why a Cause Cannot Be Later than Its Effect,” Review of Metaphysics 19 (1965b): 209-234.
  • Gale, Richard. “Existence, Tense, and Presupposition,” The Monist 50 (1966a): 98-108
    • Argues that “exists (existed, will exist)” is not a predicate of things and that “is present (past, future)” is not a predicate of events or states of affairs
  • Gale, Richard. “McTaggart’s Analysis of Time,” American Philosophical Quarterly (1966b): 145-152
    • Argues for Gale’s A-theory of time
  • Gale, Richard. “Pure and Impure Descriptions,” Australasian Journal of Philosophy 45 (1967a): 32-43
  • Gale, Richard. “Propositions, Judgments, Sentences, and Statements,” Encyclopedia of Philosophy, vol. 6, Paul Edwards, (ed.) New York, Macmillan (1967b): 494-505
  • Gale, Richard. “Indexical Signs, Egocentric Particulars and Token-Reflexive Words,” Paul Edwards (ed.). Encyclopedia of Philosophy, vol. 4. New York: Macmillan, 1967c
  • Gale, Richard. “Hook’s Views on Metaphysics,” Sidney Hook and the Contemporary World, Paul Kurtz, (ed.). J. Day Co., 1968
  • Gale, Richard. “‘Here’ and ‘Now’,” The Monist 53 (1969a): 396-409
    • Argues that there are certain crucial asymmetries between ‘here’ and ‘now’
  • Gale, Richard. “A Note on Personal Identity and Bodily Continuity,” Analysis 30 (1969b): 193-195
  • Gale, Richard. “Do Performative Utterances Have any Constative Function?” Journal of Philosophy 67 (1969c): 117-121
  • Gale, Richard. “Negative Statements,” American Philosophical Quarterly 7 (1970a): 206-217
    • Discusses how to construct a reasonably clear criterion between positive and negative statements
  • Gale, Richard. “Strawson’s Restricted Theory of Referring,” Philosophical Quarterly 20 (1970b): 162-165.
    • Argues that Strawson’s restricted (relative to Russell’s) theory of referring leads to absurdities
  • Gale, Richard. “Has the Present Any Duration?” Noûs 5 (1970c): 39-47
    • Argues that the durational present, which can be interpreted as the punctual present, has, except for the irreducible reference to now, the same characteristics as the corresponding instant in the B-series
  • Gale, Richard. “The Fictive Use of Language,” Philosophy 46 (1971): 324-340
    • Employs Austin’s trilogy of locutionary, illocutionary, and perlocutionary acts to argue that the difference between fictive and non-fictive uses of language is fundamentally pragmatic
  • Gale, Richard. “On What There Isn’t,” Review of Metaphysics 25 (1972): 459-488
    • Argues that negative facts cannot be reduced to positive ones
  • Gale, Richard. “O’Connor on the Identity of Indiscernibles,” Philosophical Studies 24 (1973a): 412-415
  • Gale, Richard. “Bergson’s Analysis of the Concept of Nothing,” Modern Schoolman 51 (1973b): 269-300
    • Applies Gale’s analyses of negative propositions to the question whether absolute nothingness is possible
  • Gale, Richard. “Could Logical Space Be Empty?” Essays on Wittgenstein in Honor of G.H. von Wright, Jaakko Hintikka and G.H. von Wright, (ed’s). Acta Philosophica Fennica, vol. 28, Thos. 1-2, 1976
    • Explores the question whether in Wittgenstein’s Tractatus all positive atomic propositions could be false
  • Gale, Richard. “A Reply to Oaklander,” Philosophy and Phenomenological Research (1977): 234-238.
    • Defends his claim (that there can be a B-series of events only if these events also form an A-series) against Oaklander’s objection to that claim from The Language of Time
  • Gale, Richard. “Wiggins’ Thesis D (x),” Philosophical Studies 45 (1984): 239-245
    • Argues that Wiggins’ principle of Sortal Dependency easily leads to counterexamples
  • Gale, Richard. “William James and the Ethics of Belief,” American Philosophical Quarterly 17 (1985): 1-14
    • Gale describes this paper as an attempt to capture the “spirit and thrust” of James’ The Will to Believe.
  • Gale, Richard. “Omniscience-Immutability Arguments,” American Philosophical Quarterly 23 (1986a): 319-335.
    • Argues for Gale’s A-theory of time
  • Gale, Richard. “A Priori Arguments from God’s Abstractness,” Noûs 20 (1986b): 531-543.
    • Argues that the a priori arguments based on God’s abstractness that God necessarily exists are uncompelling
  • Gale, Richard. “Parfit’s Arguments Against Partially Relativized Theories of Rationality,” Analysis 47 (1986c): 230-236.
  • Gale, Richard. “Freedom vs. Unsurpassable Greatness,” International Journal for Philosophy of Religion 23 (1988): 65-75.
    • Argues against Plantinga’s new version of the ontological argument
  • Gale, Richard. “Lewis’ Indexical Argument for World-Relative Actuality,” Dialogue 28 (1989): 289-304
  • Gale, Richard. “Freedom and the Free Will Defence,” Social Theory and Practice 16 (1990): 397-423
  • Gale, Richard. “Becoming,” Handbook. Philosophia Verlag, 1991b
  • Gale, Richard. “On Some Pernicious Thought-Experiments,” Thought Experiments in Science and Philosophy, G. Massey and T. Horowitz, (ed’s). Rowman & Littlefield Publishers, 1991c
  • Gale, Richard. “Pragmatism versus Mysticism: The Divided Self of William James,” Philosophical Perspectives, vol. 5, James Tomberlin, (ed.) Atascadero: Ridgeview, 1991d
  • Gale, Richard; Pruss, Alexander. “Cosmological and Design Arguments,” Oxford Handbook of the Philosophy of Religion Oxford: Oxford University Press, 1991e
  • Gale, Richard. “Reply to Paul Helm,” Religious Studies 29 (1993): 257-263.
    • Argues that religious experiences are conceptualizable in a way that we do not adequately understand but that they do not provide evidence for the existence of the accusative
  • Gale, Richard. “William James on Self-Identity Over Time,” Modern Schoolman 71 (1994a): 165-189.
  • Gale, Richard. “The Overall Argument of Alston’s Perceiving God,” Religious Studies 30 (1994b): 135-149
    • Argues against Alston’s pragmatic and epistemic arguments that it is rational to believe in God’s existence
  • Gale, Richard. “Swinburne’s Argument from Religious Experience,” Alan G. Padgett (ed.), Reason and the Christian Religion Oxford: Clarendon, 1994c
    • Argues against Swinburne’s argument from religious experience
  • Gale, Richard. “McTaggart, John McTaggart Ellis,” Encyclopedia of Time, Samuel Macey, (ed.) New York: Garland, 1994d
  • Gale, Richard. “Russell, Bertrand Arthur William,” Encyclopedia of Time. Samuel Macey, (ed). New York: Garland, 1994e
  • Gale, Richard. “Analytic Philosophy,” Encyclopedia of Time. Samuel Macey, (ed). New York: Garland, 1994f
  • Gale, Richard. “Why Alston’s Mystical Doxastic Practice is Subjective,” Philosophy and Phenomenological Research 54 (1994g): 869 – 875.
    • Argues against Alston that alleged direct perceptions of God are subjective and unreliable
  • Gale, Richard. “Non-Being and Nothing,” Oxford Companion to Philosophy, Ted Honderich, (ed.). Oxford: Oxford University Press, 1995
  • Gale, Richard, and Earman, John. “Time,” Cambridge Dictionary of Philosophy, Cambridge: Cambridge University Press, 1995
  • Gale, Richard. “John McTaggart Ellis McTaggart,” A Companion to Metaphysics, J. Kim and E. Sosa, (ed’s.). Blackwell-Wiley, 1996a
  • Gale, Richard. “Negation,” Blackwell’s Companion to Metaphysics, Jaegwon Kim and Ernest Sosa, (ed’s). Hoboken: Blackwell-Wiley, 1996b
  • Gale, Richard. “Nothingness,” Blackwell’s Companion to Metaphysics, Jaegwon Kim and Ernest Sosa, (ed’s). Hoboken: Blackwell-Wiley, 1996c
  • Gale, Richard. “Some Difficulties in Theistic Treatments of Evil,” The Evidential Argument from Evil, Daniel Howard-Snyder, (ed.), (pp. 206-218). Bloomington: Indiana University Press, 1996d
  • Gale, Richard. “William James’s Quest to Have It All,” Transactions of the Charles S. Peirce Society 32 (1996e): 568-596.
  • Gale, Richard. “Disanalogies Between Space and Time,” Process Studies 25 (1996f): 72-89.
    • Argues that there are major disanalogies between space and time and ‘here’ and the ‘now’
  • Gale, Richard. “John Dewey’s Naturalization of William James,” The Cambridge Companion to James, Ruth Anna Putnam, (ed.). Cambridge: Cambridge University Press, 1997a
  • Gale, Richard. “William James’s theory of Freedom,” Modern Schoolman 74 (1997b): 227-247.
  • Gale, Richard. “From the Specious to the Suspicious Present: The Jack Horner Phenomenology of William James,” Journal of Speculative Philosophy 11 (1997c): 163-189.
    • Argues that James’ account of how we experience time is based on a faked phenomenology that distorts the way we experience time
  • Gale, Richard. “William James’s Semantics of ‘Truth’,” Transactions of the Charles S. Peirce Society 33 (1997d): 863-898
    • Argues that James’ moralization of epistemology requires him to reject Tarski’s principle that a sentence “s” is true if and only if s and that this means that James’ pragmatic theory of meaning can only supply the conditions under which a belief is epistemically warranted
  • Gale, Richard. “William James’ Ethics of Prometheanism,” History of Philosophy Quarterly 15 (1998a): 245-269
  • Gale, Richard. “Ich bin Ein Realist: James’s Attempt to Placate Realism,” International Studies in Philosophy 30 (1998b): 1-17
    • Critically analyses James’ view that we are all morally obligated to maximize desire satisfaction over all other available options
  • Gale, Richard. “Robert M. Adams’s Theodicy of Grace,” Philo 1 (1998c): 36-44.
  • Gale, Richard. “William James and the Willfulness of Belief,” Philosophy and Phenomenological Research 59 (1999b): 71-91.
    • Argues that there is a salvageable core to James’ view that we can believe at will
  • Gale, Richard. “Santayana’s Bifurcationist Theory of Time,” Bulletin of the Santayana Society 17 (1999c): 1-13
  • Gale, Richard. “A New Argument for the Existence of God: One That Works, Well Sort Of,” The Rationality of Theism: Essays in the Philosophy of Religion. G. Bruntrup & R.K. Tacelli, (ed’s). Dordrecht: Kluwer, 1999d
    • Gale’s first statement of his new cosmological argument
  • Gale, Richard, Pruss, Alexander. “A New Cosmological Argument,” Religious Studies 35 (1999): 461-476.
    • Gale’s and Pruss’s improved version of Gale’s new modal cosmological argument
  • Gale, Richard. Introduction to the re-issue of William James and Henri Bergson by Horace Kallen London: Thoemmes Press, 2001
  • Gale, Richard. “Time, Temporality, and Paradox,” Blackwell Guide to Metaphysics, Richard Gale, (ed.). Oxford: Blackwell, 2002a
  • Gale, Richard. “The Metaphysics of John Dewey,” Part I and Part II, Transactions of the Charles S. Peirce Society 38 (2002b): 477-519.
    • Explains why Dewey was not the best philosopher of all time but is the best person ever to have been a philosopher.
  • Gale, Richard. “Divine Omniscience, Freedom, and Backward Causation,” Faith and Philosophy 19 (2002c): 85-88.
    • Attempts to deduce a contradiction from the proposition that God is omniscient and immutable and that there are true temporal indexical propositions
  • Gale, Richard. “A Challenge for Interpreters of Varieties,” Streams of William James. The William James Society 4 (2002d): 32-33
  • Gale, Richard. “A Thomist Metaphysics,” Blackwell Guide to Metaphysics. Richard Gale, (ed.) Oxford: Blackwell, 2002e
  • Gale, Richard, Pruss, Alexander. “A Response to Oppy, and to Davey and Clifton,” Religious Studies 38 (2002): 89-99
    • Argues that the fact that the weak principle of sufficient reason entails the strong principle of sufficient reason does not damage their argument and gives one a further reason to accept the weak principle
  • Gale, Richard. “Why Traditional Cosmological Arguments Don’t Work and a Sketch of a New One that Does,” Contemporary Debates in Philosophy of Religion, Michael Peterson, (ed.) Oxford: Blackwell, 2003a
  • Gale, Richard. “God Eternal and Paul Helm,” Reason, Faith and History: Essays in Honour of Paul Helm, Martin Stone, (ed.). London: Ashgate, 2003b
  • Gale, Richard, and Pruss, Alexander. “A Response to Almeida and Judisch,” International Journal for Philosophy of Religion 53 (2003): 65-72.
    • Denies that their argument leads to a contradiction but acknowledges the need to clarify the nature of their conclusion
  • Gale, Richard. “The Ecumenicalism of William James,” William James and The Varieties of Religious Experience: Centenary Celebration. Jeremy Carrette, (ed.). London: Routledge, 2004a
  • Gale, Richard. “William James and John Dewey: the Odd Couple,” Midwestern Studies in Philosophy 28 (2004b): 149–167.
  • Gale, Richard. “The Still Divided Self of William James: A Response to Pawelski and Cooper,” Transactions of the Charles S. Peirce Society 40 (2004c): 153-170.
  • Gale, Richard, and Pruss, Alexander. “Cosmological and Teleological Arguments,” Oxford Companion to the Philosophy of Religion, William Wainwright, (ed.), 2005
  • Gale, Richard. “Response to My Critics,” Philo 6 (2003): 132-165
  • Gale, Richard. “John Dewey’s ‘Time and Individuality’,” Modern Schoolman, 82 (2005a): 175-192
  • Gale, Richard. “On the Cognitivity of Mystical Experiences,” Faith and Philosophy 22 (2005b): 426-441
  • Gale, Richard. “Comments on the Will to Believe,” Social Epistemology 20 (2006a): 35-39
  • Gale, Richard. “The Problem of Ineffability in Dewey’s Theory of Inquiry,” Southern Journal of Philosophy 44 (2006b): 75-90.
  • Gale, Richard. “The Failure of Traditional Theistic Arguments,” Cambridge Companion to Atheism, M. Martin, (ed.) Cambridge: Cambridge University Press, 2007a
  • Gale, Richard. “Evil and Alvin Plantinga,” Alvin Plantinga, Deane-Peter Baker, (ed.) Cambridge: Cambridge University Press, 2007b
  • Gale, Richard. “Relations,” Encyclopedia of American Philosophy, (ed’s). J. Lachs and R. Talisse. Nashville: Vanderbilt University Press, 2007c
  • Gale, Richard. “Healthy-Mindedness,” Encyclopedia of American Philosophy, (ed’s). J. Lachs and R. Talisse. Nashville: Vanderbilt University Press, 2007d
  • Gale, Richard. “The Influence of William James,” Encyclopedia of American Philosophy, (ed’s). J. Lachs and R. Talisse Nashville: Vanderbilt University Press, 2007e
  • Gale, Richard. “God,” Encyclopedia of American Philosophy. Nashville: Vanderbilt University Press, 2007f
  • Gale, Richard. “Time,” Encyclopedia of American Philosophy. Nashville: Vanderbilt University Press, 2007g
  • Gale, Richard. “Timothy Sprigge: The Grinch that Stole Time”, Consciousness, Reality and Value: Essays in Honour of T.L.S. Sprigge. Pierfrancesco Basile & Leemon B. McHenry (eds.). Frankfurt: Ontos Verlag, 2007h
  • Gale, Richard. “The Deconstruction of Traditional Philosophy by William James,” Pragmatism,” Looking Toward Last Things: Pragmatism 100 Years After James’s,” John Stuhr, (ed.). Bloomington: Indiana University Press, 2009
  • Gale, Richard. “The Naturalism of John Dewey,” Cambridge Companion to John Dewey, (ed.) Molly Cochran, Cambridge: Cambridge University Press, 2010
  • Gale, Richard. “The Problem of Evil,” Routledge Companion to the Philosophy of Religion, Chad Meister and Paul Copan, (ed’s). London: Routledge, 2012
  • Gale, Richard. “James,” Twentieth-Century Philosophy of Religion: The History of Western Philosophy of Religion. New York: Routledge, 2013
  • Gale, Richard. “Autobiography” (unpublished)

iv. Book Reviews

  • Gale, Richard. What Is Political Philosophy? by Leo Strauss Philosophy and Phenomenological Research, 1961
  • Gale, Richard. The Natural Philosophy of Time by G. J. Whitrow Philosophy and Phenomenological Research, 1961
  • Gale, Richard. Time and the Physical World by Richard Schlegel Philosophy and Phenomenological Research, 1963
  • Gale, Richard. Analytical Philosophy of History by Arthur Danto Foundations of Language, 1968
  • Gale, Richard. Essays on Wittgenstein’s Tractatus by Irving Copi and Robert Beard, (ed’s). Philosophy and Phenomenological Research, 1968
  • Gale, Richard. Joint review of “The Problem of Future Contingencies” by Richard Taylor and “Present Truth and Future Contingency” by Rogers Albritton Journal of Symbolic Logic, 1969
  • Gale, Richard. Kant’s Theory of Time by Al-Azm, Studies in History and Philosophy of Science, 1971
  • Gale, Richard. Remarks on Colours by Ludwig Wittgenstein, Review of Metaphysics, 1981
  • Gale, Richard. Time: A Philosophical Analysis by T. Chapman. Dialogue 23 (1984): 153
  • Gale, Richard. Eternity by Brian Leftow, International Studies in Philosophy, 1995
  • Gale, Richard. The Correspondence of William James, v. 4, Ignas Skrupskelis, (ed.), Transactions of the Charles Sanders Peirce Society, 1996
  • Gale, Richard. Judaism and the Doctrine of Creation by Norbert Samuelson CCAR Journal: A Reform Jewish Quarterly, 1999
  • Gale, Richard. Heaven’s Champion by Ellen Suckiel Journal of Value Inquiry, 1999
  • Gale, Richard. William James and the Metaphysics of Experience by David Lamberth Philosophy and Phenomenological Research, 2000
  • Gale, Richard. Questions of Time and Tense by Robin Le Poidevin, (ed.). Philosophical Books, 2000
  • Gale, Richard. Dewey’s Empirical Theory of Knowledge and Reality by John Shook Transactions of the Charles Sanders Peirce Society, 2001
  • Gale, Richard. Truth, Rationality, and Pragmatism: Themes from Peirce by Christopher Hookway. Philosophical Quarterly, 2001
  • Gale, Richard. A. J. Ayer: A Life by Ben Rogers. Free Inquiry, 2003
  • Gale, Richard. The Unknown God by Anthony Kenny Religious Studies, 2004
  • Gale, Richard. God and Philosophy by Antony Flew. Transactions of the Charles Sanders Peirce Society, 2005
  • Gale, Richard. God, the Best, and Evil by Bruce Langtry Notre Dame Philosophical Reviews: An Electronic Journal, 2009
  • Gale, Richard. William James at the Boundaries: Philosophy, Science, and the Geography of Knowledge by Francesca Bordogna Journal of the History of Philosophy 48 (2010): 252-253

v. Critical Studies

  • Gale, Richard. Studies in Metaphilosophy by Morris Lazerowitz Philosophical Quarterly, 1965
  • Gale, Richard. Referring by Leonard Linsky Journal of Philosophy, 1969
  • Gale, Richard. The Philosophy of Space and Time by Richard Swinburne Journal of Philosophy, 1969
  • Gale, Richard. The Cement of the Universe by J.L. Mackie Modern Schoolman, 1976
  • Gale, Richard. The Concept of Identity by Eli Hirsch Journal of Philosophy, 1983
  • Gale, Richard. The Epistemology of Religious Experience by Keith Yandell Faith and Philosophy, 1995
  • Gale, Richard. Ontological Arguments by Graham Oppy Cambridge: Cambridge University Press. Philosophy and Phenomenological Research 58 (1999): 715-19
  • Gale, Richard, and Pruss, Alexander. Atheism and Theism by J. J. C. Smart and John Haldane Faith and Philosophy 16 (1999): 106-113
  • Gale, Richard. 1999. Providence and the Problem of Evil by Richard Swinburne, Religious Studies 36 (2):209-219

b. Secondary Sources

  • Aikin, Scott. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Transactions of the Charles S. Peirce Society 46 (2010): 656-659
  • Alston, William. Review of On the Existence and Nature of God by Richard Gale. The Philosophical Review 102 (1993): 433-435
  • Alston, William. Perceiving God: The Epistemology of Religious Experience. Ithaca: Cornell, 2014
  • Beilby, James. “Plantinga’s Model of Warranted Christian Belief,” Alvin Plantinga. Deane-Peter Baker, (ed.) Cambridge: Cambridge University Press, 2007
  • Bird, Graham. Review of The Divided Self of William James by Richard Gale. MIND 111 (2002): 100-103
  • Boyle, Deborah. “William James’s Ethical Symphony,” Transactions of the Charles S. Peirce Society, 34 (1998): 977-1003
  • Butler, Clark. “Motion and Objective Contradictions,” American Philosophical Quarterly 19 (1981): 131-139
  • Cocchiarella, Nino. Review of The Language of Time by Richard Gale. Journal of Symbolic Logic 37 (1972): 170-172
  • Craig, W.L. “Divine Timelessness and Personhood,” International Journal for Philosophy of Religion, 43 (1998): 109-124
  • Craig, W.L. Time and Eternity. Wheaton, Illinois: Crossway Books, 2001
  • Craig, W.L. The Tensed Theory of Time: A Critical Examination. New York: Springer, 2000
  • Craig, W.L. The Tenseless Theory of Time: A Critical Examination. New York: Springer, 2013
  • Davey, Kevin, Clifton, Robb. “Insufficient Reason in the ‘New Cosmological Argument’,” Religious Studies 37 (2001): 485-490
  • Dowden, Bradley. “Time” Internet Encyclopedia of Philosophy URL: https://iep.utm.edu/time/#SH9e
  • Dyke, Heather. Review of Blackwell Guide to Metaphysics by Richard Gale, (ed.) Australasian Journal of Philosophy 81 (2003): 620-621
  • Evans, Charles. “Timeless Truth,” The Philosophical Review 71 (1962): 241-242
  • Ford, Lewis. “The Duration of the Present,” Philosophy and Phenomenological Research 35 (1974): 100-106
  • Franks, Christopher. “Passion and the Will to Believe,” The Journal of Religion 84 (2000): 431-449
  • Gellman, Jerome, 2000. “Prospects for a sound stage 3 of cosmological arguments,” The Existence of God, Richard Gale and Alexander Pruss, (ed’s). Farnham, UK: Ashgate
  • Goldman, Loren. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Education and Culture 29 (2013): 135-139
  • Goodman, Russell B. Review of The Divided Self of William James by Richard Gale. Religious Studies 36 (2000): 227-245
  • Helm, Paul. “Gale on God,” Religious Studies 29 (1993): 245-255
  • Hobbs, Charles. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. The Journal of Speculative Philosophy 25 (2011): 28-430
  • Howard-Snyder, Daniel, (ed.). The Evidential Argument from Evil Bloomington: Indiana University Press, 1996
  • Humber, James. “Response to Gale,” Social Theory and Practice: Proceedings from the Georgia State University Conference on Human Freedom 16 (1990): 425-433
  • Iannone, Pablo. Review of The Philosophy of William James: An Introduction by Richard Gale. Review of Metaphysics 59 (2005): 173-174
  • King-Farlow, John. “The Positive McTaggart on Time,” Philosophy 49 (1974): 169-178
  • Khatchadourian, Haig. “Do Ordinary Spatial and Temporal Expressions Designate Relations?” Philosophy and Phenomenological Research 34 (1973): 82-94
  • Lamberth, David C. Review of The Divided Self of William James by Richard Gale. Journal of the American Academy of Religion 68 (2000): 890-893
  • Leftow, Brian. “Time, Actuality and Omniscience,” Religious Studies 26 (1990): 303-321
  • McDonough, Richard. “The Gale–Pruss cosmological argument: Tractarian and advaita Hindu objections,” Religious Studies, 2016. http://journals.cambridge.org/abstract_S0034412516000123.
  • McHugh, Christopher. “A Refutation of Gale’s Creation-Immutability Arguments,” Philo 6 (2003): 5-9
  • McTaggart, J.M.E. “The Unreality of Time,” Mind 17 (1908): 457-474
  • Mellor, D. H., Matters of Metaphysics, Cambridge: Cambridge University Press, 1991
  • Meyers, Gerald. Review of The Divided Self of William James by Richard Gale. Philosophy and Phenomenological Research 64 (2002): 491-494
  • Meyers, William, and Pappas, Gregory. “Dewey’s Metaphysics: A Response to Richard Gale,” Transactions of the Charles S. Peirce Society 40 (2004): 679-700
    • Argues that Gale’s analytical outlook leads him to fundamentally misunderstand Dewey’s philosophy
  • “Moe Gale Dies” Sept. 3, 1964. New York Times
  • Newton-Smith, W. Review of The Language of Time by Richard Gale. The British Journal for the Philosophy of Science 20 (1969): 281-283
  • Oaklander, L. Nathan. 1977. “The Timelessness of Time,” Philosophy and Phenomenological Research 38 (1977): 228-233
    • Argues that Gale’s purported reduction of B-relations to A-determinations fails because it cannot account for the timelessness of time
  • Oaklander, L. Nathan, and Smith, Quentin. The New Theory of Time. New Haven: Yale University Press, 1994
    • An excellent anthology on the philosophy of time that includes interesting discussion of both the older and the newer versions of the A and B theories of time.
  • Oppy, Graham. Ontological Arguments and Belief in God Cambridge: Cambridge University Press, 1995
  • Oppy, Graham. “On ‘A New Cosmological Argument,’” Religious Studies 36 (2000): 345–353
    • Argues that Gale’s and Pruss’s weak principle of sufficient reason entails the strong version of the principle of sufficient reason
  • Oppy, Graham. Arguing about Gods Cambridge: Cambridge University Press, 2006
  • Oppy, Graham. Describing Gods: An Investigation of Divine Attributes Cambridge: Cambridge University Press, 2014
  • O’Connor, David. God and Inscrutable Evil: In Defence of Theism and Atheism. Lanham, MD: Rowman and Littlefield, 1996
  • Padgett, Alan. “God and Time: Toward a New Doctrine of Divine Timeless Eternity,” Religious Studies 25 (1989): 209-215
  • Pawelski, James. “William James’ Divided Self and the Process of its Unification: A Reply to Richard Gale.” Transactions of the Charles Sanders Peirce Society 39 (2003): 645-656
    • Argues that Gale’s view that James is a mystic and that this produces a self that is divided from its pragmatism is wrong
  • Plantinga, Alvin. Warranted Christian Belief New York: Oxford University Press, 2000
  • Plumer, Gilbert. “Detecting Temporalities,” Philosophy and Phenomenological Research 47 (1987): 451-460
  • Post, John F. Review of On the Nature and Existence of God by Richard Gale. Philosophy and Phenomenological Research 53 (1993): 950-954
  • Plecha, James. “Tenselessness and the Absolute Present,” Philosophy 59 (1984): 529-534
    • Argues contra Gale that one can acknowledge that language can be detensed while accepting the absolute present and rejecting the block universe
  • Prior, A.N. Review of The Language of Time by Richard Gale. Mind 78 (1969): 453
  • Pruss, Alexander. “The Hume-Edwards Principle and the Cosmological Argument,” International Journal for Philosophy of Religion 43 (1998): 149-165
  • Pruss, Alexander. “A Restricted Principle of Sufficient Reason and the Cosmological Argument,” Religious Studies 40 (2004): 165-179
  • Pruss, Alexander. The Principle of Sufficient Reason: A Reassessment Cambridge: Cambridge University Press, 2010
  • Pruss, Alexander. Actuality, Possibility and Worlds London: Bloomsbury Academic, 2011
  • Rankin, K.W. Review of The Language of Time by Richard Gale. The Philosophical Quarterly 19 (1969): 176-177
  • Raposa, Michael. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. American Journal of Theology & Philosophy 31 (2010): 275-278
  • Reichenbach, Bruce R. The Cosmological Argument: A Reassessment Springfield, Illinois: Charles C. Thomas, 1972
  • Robison, John. “A Note on ‘Falsifying Retrodictions’,” Analysis 26 (1965): 9-11
  • Rowe, William. Review of On The Nature and Existence of God by Richard Gale. Journal of the American Academy of Religion 63 (1995): 592-595
  • Rutten, Emmanuel. A Critical Assessment of Contemporary Cosmological Arguments: Towards a Renewed Case for Theism. Doctoral Thesis: Vrije University. Amsterdam, 2012
  • Saka, Paul. “Pascal’s Wager and the Many Gods Objection,” Religious Studies 37 (2001): 321-341
  • Sanford, David H. “McTaggart on Time,” Philosophy 43 (1968): 371-378
  • Schellenberg, J. L. Review of On the Nature and Existence of God by Richard Gale Review of Metaphysics 46 (1992): 402-404
  • Schlesinger, George. “The Stillness of Time and Philosophical Equanimity,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 30 (1976): 145-159
  • Schlesinger, George. “The Reduction of B-Statements,” Philosophical Quarterly 28 (1978): 162-165
  • Schlesinger, George. Aspects of Time Indianapolis: Hackett, 1980
  • Sherry, Patrick. Review of On the Nature and Existence of God by Richard Gale. Philosophy 67 (1992): 563
  • Shook, John. Review of The Philosophy of William James: An Introduction by Richard Gale. Philosophy in Review 25 (2005): 179-181
  • Slattery, Michael. “More on what there Isn’t,” Review of Metaphysics 26 (1972): 344-348
  • Smart, J.J.C. “Tensed Statements: A Comment,” Philosophical Quarterly 12 (1962): 264-265
  • Smith, Quentin. “The Co-Reporting Theory of Tensed and Tenseless Sentences,” Philosophical Quarterly 40 (1990): 213-222
  • Smith, Quentin. “Sentences about Time,” Philosophical Quarterly 37 (1987): 37-53
  • Stein, Howard. Review of The Language of Time by Richard Gale. Journal of Philosophy 66 (1969): 350-355
  • Stephens, Matthew. Review of The Divided Self of William James by Richard Gale. Philosophy in Review 21 (2001): 113-115
  • Suckiel, Ellen Kappy. Review of The Divided Self of William James by Richard Gale. Transactions of the Charles S. Peirce Society 36 (2000): 161-168
  • Suckiel, Ellen Kappy. “The Authoritativeness of Mystical Experience: An Innovative Proposal from William James,” International Journal for Philosophy of Religion 52 (2002): 175-189
  • Swinburne, Richard. “Reply to Richard Gale,” Religious Studies 36 (2002): 221-225
    • Swinburne responds to Gale’s criticism that his thesis is ambiguous and makes clear that he intends a strong version of the thesis that God is justified in allowing evil to occur.
  • Swinburne, Richard. Review of On the Nature and Existence of God by Richard Gale. The Journal of Theological Studies, NEW SERIES 43 (1992): 784-788
  • Talisse, Robert. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Philosophical Quarterly 61 (2011): 863-864
  • Tiles, J.E. 2010. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Notre Dame Philosophical Reviews 5 (2010)
  • Tooley, Michael. Time, Tense and Causation Oxford: Clarendon Press, 1997
  • Van Inwagen, Peter. “Reflections on the Chapters by Draper, Russell, and Gale,” The Evidential Argument from Evil, Daniel Howard-Snyder, (ed.), Bloomington: Indiana University Press, 1996
  • Watson, Justin. Review of The Divided Self of William James by Richard Gale. Religion & Literature 31 (1999): 124-126
  • White, David A. “Can Alston Withstand the Gale?” International Journal for Philosophy of Religion 39 (1996): 141-149
    • Argues that Gale’s argument that God is not the sort of being that can be pinned down and baptised in an act of naming is wrong.
  • Williams, Clifford. “The Metaphysics of A- and B- Time,” Philosophical Quarterly 46 (1996): 371-381
    • Argues that the alleged difference between A- and B- time remains undescribed
  • Williams, Clifford. “Bergsonian Approach to A- and B- Time,” Philosophy 73 (1998): 379-393
  • Zimmerman, Dean. “Richard Gale and the Free Will Defence,” Philo 6 (2003): 78-113

 

Author Information

Richard McDonough
Email: rmm249@cornell.edu
Arium School of Arts & Sciences
Singapore

Wittgenstein: Epistemology

Although Ludwig Wittgenstein is generally better known for his work on logic and on the nature of language, throughout his philosophical journey he also reflected extensively on epistemic notions such as knowledge, belief, doubt, and certainty. This interest is most evident in his final notebook, published posthumously as On Certainty (1969, henceforth OC), where he offers a sustained, if at least apparently fragmentary, treatment of epistemological issues. The work was written under the direct influence of G. E. Moore’s A Defense of Commonsense (1925, henceforth DCS) and Proof of an External World (1939, henceforth PEW). Given its ambiguity and obscurity, the recent literature contains a number of competing interpretations of OC. This article first presents the uncontentious aspects of Wittgenstein’s views on skepticism, that is, his criticisms of Moore’s use of the expression “to know” and his reflections on the artificial nature of the skeptical challenge. It then introduces the elusive concept of “hinges,” central to Wittgenstein’s epistemology and his views on skepticism, and offers an overview of the dominant “Wittgenstein-inspired” anti-skeptical strategies along with the main objections raised against these proposals. Finally, it briefly sketches the recent applications of Wittgenstein’s epistemology in the contemporary debate on skepticism.

Table of Contents

  1. Wittgenstein on Radical Skepticism: A Minimal Reading
  2. The Therapeutic Reading
  3. The Epistemic Reading
  4. The Contextualist Reading
  5. The Non-epistemic Reading
  6. The Non-propositional Reading
  7. The Framework Reading
  8. Concluding Remarks
  9. References and Further Reading

1. Wittgenstein on Radical Skepticism: A Minimal Reading

The characteristic claim of Cartesian-style skeptical arguments is that we cannot know certain empirical propositions (such as “Human beings have bodies,” or “There are external objects”) because we may be dreaming, hallucinating, deceived by a demon, or be “brains in a vat” (BIVs), that is, disembodied brains floating in a vat, connected to supercomputers that stimulate us in just the same way that normal brains are stimulated when they perceive things in a normal way. Since we are unable to refute these skeptical hypotheses, we are also unable to know propositions that we would otherwise accept as true if we could rule out these scenarios.

Cartesian arguments are extremely powerful because they rest on the Closure principle for knowledge. According to this principle, knowledge is “closed” under known entailment. Roughly speaking, if an agent knows a proposition (for example, that she has two hands), and competently deduces from it a second proposition that it entails (for example, that she is not a BIV), then she also knows the second proposition (that she is not a BIV). More formally:

The “Closure” Principle:

If a subject S knows that p, and S knows that p entails q, then S knows that q.

Let’s take a skeptical hypothesis, SH, such as the BIV hypothesis mentioned above, and M, an empirical proposition such as “Human beings have bodies” that would entail the falsity of a skeptical hypothesis. We can then state the structure of Cartesian skeptical arguments as follows:

(S1) I do not know not-SH

(S2) If I do not know not-SH, then I do not know M

(SC) I do not know M
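
The logical form of this argument can be made explicit in a short sketch. The notation is an assumption introduced here for illustration only: K(x) abbreviates “I know that x,” and the entailment from M to not-SH is taken to be known. On these assumptions, Closure yields (S2) by contraposition, and the conclusion follows from (S1) by modus ponens:

```latex
\begin{align*}
&\text{Closure (instance):}
  && \bigl(K(M) \land K(M \rightarrow \neg SH)\bigr) \rightarrow K(\neg SH) \\
&\text{Assumed known entailment:}
  && K(M \rightarrow \neg SH) \\
&\text{(S2), by contraposition:}
  && \neg K(\neg SH) \rightarrow \neg K(M) \\
&\text{(S1):}
  && \neg K(\neg SH) \\
&\text{(SC), by modus ponens:}
  && \neg K(M)
\end{align*}
```

The sketch shows that the argument is valid, so that any response to it must reject (S1), reject (S2) and with it Closure, or question the intelligibility of the premises themselves.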

Since we can repeat this argument for each and every one of our empirical knowledge claims, the radical skeptical consequence we can draw from this and similar arguments is that empirical knowledge is impossible.

One way of dealing with “Cartesian-style” skepticism is to affirm, contra the skeptic, that we can know the falsity of the relevant skeptical hypothesis; for instance, in DCS and PEW, G. E. Moore famously argued that we can have knowledge of the “commonsense view of the world,” that is, of statements such as “Human beings have bodies,” “There are external objects,” or “The earth existed long before my birth,” and that this knowledge would offer a direct response to skeptical worries.

Moore himself (1942) was not fully convinced by the anti-skeptical strength of DCS and PEW, which have engendered a huge debate that it is impossible to summarize here (see Malcolm, 1949; Clarke, 1973; Stroud, 1984). Nonetheless, it is important to notice that Moore’s affirmation that he knows for certain the “obvious truisms of the commonsense” is pivotal to his anti-skeptical strategy; his knowledge-claims would allow him to refute the skeptic.

But, argues Wittgenstein, to say that we know “obvious truisms” such as “There are external objects” is misleading for a number of reasons. First, in order to claim “I know,” one should be able, at least in principle, to produce evidence or offer compelling grounds for one’s beliefs. That is to say, the “language-game” of knowledge involves and presupposes the ability to give reasons, justifications, and evidence; but crucially (OC 245), Moore’s grounds are not stronger than what they are supposed to justify. In other words, according to Wittgenstein, if a body of evidence is to count as compelling grounds for our belief in a certain proposition, then that evidence must be more certain than the belief itself; this cannot happen in the case of Moore’s “obvious truisms” because, at least in normal circumstances, nothing is more certain than the fact that we have two hands or a body (OC 125).

Just imagine, for instance, that one attempted to legitimate one’s claim to know that p by using the evidence that one has for p (that is, what one sees, what one has been told about p and so on). Now, if the evidence we adduce to support p is less certain than p itself, then this same evidence would be unable to support p.

However, Wittgenstein argues that even if it would be somewhat odd to claim that we know Moore’s “obvious truisms,” they still cannot be an object of doubt. For instance, if someone is seriously pondering whether she has a body or not, we would not investigate the truth-value of her affirmations; rather, we would question her ability to understand the language she is using or her sanity, for such a false belief would probably be the result of a sensory or mental disturbance (OC 526).

Moreover, for Wittgenstein, the kind of never-ending doubt put forward by a proponent of radical skepticism, far from being a legitimate intellectual enterprise, will prevent its proponents from engaging in any intellectual activity at all; to support his point, Wittgenstein gives the example (OC 310) of a pupil who constantly interrupts a lesson, questioning the existence of things or the meanings of words. The pupil’s doubts will lack any sense, and at most they will lead him to a sort of epistemic paralysis; he will just be unable to learn the skill or subject we are trying to teach him (OC 315).

More generally, for Wittgenstein, any proper epistemic inquiry presupposes that we take something for granted; if we start doubting everything, there will be no knowledge at all. As he remarks at one point:

If you are not certain of any fact, you cannot be certain of the meaning of your words either […] If you tried to doubt everything you would not get as far as doubting anything. The game of doubting itself presupposes certainty (OC 114–115).

That is to say, the questions that we raise and our doubts depend on the fact that some propositions are exempt from doubt, are as it were like hinges on which those turn […] But it isn’t that the situation is like this: We just can’t investigate everything, and for that reason we are forced to rest content with assumption. If I want the door to turn, the hinges must stay put (OC 341–343).

Neither knowable nor doubtable, for Wittgenstein, Moore’s “obvious truisms of the commonsense” are “hinges” (OC 341–343): apparently empirical contingent beliefs which perform a different, basic role in our epistemic practices.

2. The Therapeutic Reading

The “therapeutic reading” of OC (Conant, 1998) stems from the remarks in which Wittgenstein talks about Moore’s “misuse” of the expression “to know”:

Now, can one enumerate what one knows (like Moore)? Straight off like that, I believe not. For otherwise the expression “I know” gets misused. And through this misuse a queer and extremely important mental state seems to be revealed (OC 6).

I know that a sick man is lying here? Nonsense! I am sitting at his bedside, I am looking attentively into his face. So I don’t know, then, that there is a sick man lying here? Neither the question nor the assertion makes sense. Any more than the assertion “I am here”, which I might yet use at any moment, if suitable occasion presented itself. […] And “I know that there’s a sick man lying here”, used in an unsuitable situation, seems not to be nonsense but rather seems matter-of-course, only because one can fairly easily imagine a situation to fit it, and one thinks that the words “I know that…” are always in place where there is no doubt, and hence even where the expression of doubt would be unintelligible (OC 10).

According to the proponents of the “therapeutic reading,” we should read these passages in light of the theory, pivotal in later Wittgensteinian thought, that the meaning of a word consists in its use in ordinary situations. As he writes in his Philosophical Investigations (1997, henceforth PI):

For a large class of cases—though not for all—in which we employ the word “meaning” it can be defined thus: the meaning of a word is its use in the language (PI 43).

This would allow us to reconstruct Wittgenstein’s treatment of skepticism as follows. Moore fails to mean anything in particular by stating his “obvious truisms” outside of a language-game, that is, outside any of our everyday epistemic practices; thus, in the circumstances in which they are actually uttered, it is not clear what has been said, if anything.

At the same time, a radical skeptic fails to recognize the role (or, using Wittgenstein’s expression, the use) that expressions such as “knowledge” and “doubt” play in our ordinary epistemic practices. Since in our everyday life there is nothing similar to the kind of general investigation pursued by the radical skeptic (OC 209 “A doubt without an end is not even a doubt.”), the skeptical challenge would be, strictly speaking, senseless (Conant 1998, 241–248).

Following the therapeutic reading of OC, then, both Cartesian skepticism and Moore’s anti-skeptical strategy would be based on a misunderstanding of how language works; it is not clear what Moore and the skeptic are doing with their words and therefore, even if apparently they have a meaning, they lack any sense.

This rendering of Wittgenstein’s anti-skeptical position has appeared to many commentators (see, for instance, Salvatore, 2013) to be simply too crude.

Consider the following entries:

The statement “I know that here is a hand” may then be continued: “for it’s my hand that I am looking at”. Then a reasonable man will not doubt that I know. Nor will the idealist; rather he will say that he was not dealing with the practical doubt which is being dismissed, but there is a further doubt behind that one. That this is an illusion has to be shewn in a different way (OC 19).

But is it an adequate answer to the skepticism of the idealist, or the assurances of the realist, to say that ‘There are physical objects’ is nonsense? For them after all it is not nonsense. It would, however, be an answer to say: this assertion, or its opposite, is a misfiring attempt to express what can’t be expressed like that. And that it does misfire can be shewn; but that isn’t the end of the matter. We need to realize that what presents itself to us as the first expression of a difficulty, or of its solution, may as yet not be correctly expressed at all. Just as one who has a just censure of a picture to make will often at first offer the censure where it does not belong, and an investigation is needed in order to find the right point of attack for the critic (OC 37).

These remarks alone seem to suggest that Wittgenstein was well aware that simply showing that the skeptic was using terms such as “knowledge” and “doubt” outside of any ordinary practice was not enough to dismiss skeptical worries. On the contrary, he seems to concede that nothing prevents us from thinking that the skeptic and his opponent are engaged in a “language-game,” namely philosophical inquiry, in which the expressions “to know” and “to doubt” are used in a way that is at odds with their everyday usage but is still, at least apparently, meaningful and legitimate.

Also, these passages show that Wittgenstein would not consider a rebuttal of skepticism on the basis of pragmatic considerations alone to be satisfactory. Recall that on Conant’s reading, Wittgenstein would dismiss skeptical doubts as they are at odds with our ordinary practices, thus unintelligible on closer inspection; on the contrary, he stresses the necessity for a philosophical analysis of the hidden assumptions that make the skeptical “doubt” so apparently compelling.

3. The Epistemic Reading

A very influential reading of OC is built on Crispin Wright’s notion of “rational entitlement” (2004a/2004b), which stems from his famous diagnosis of Moore’s Proof (1985). In DCS, as we have already seen, Moore argued that we can have knowledge of the “commonsense view of the world,” that is, of very general “obvious truisms” such as “I am a human being,” “Human beings have bodies,” “The earth existed long before my birth,” and so on. In PEW, he famously maintained that even an instance of everyday knowledge such as “This is a hand” can offer a direct response to skeptical worries. Moore’s Proof is standardly rendered as follows:

(MP 1) Here is a hand

(MP 2) If there is a hand here, then there are external objects

(MP C) There are external objects

As per Wright, we can reconstruct PEW as follows:

  • I. It perceptually appears to me that there are two hands
  • II. There are two hands
  • III. Therefore, there are material objects

In other words, to state I) amounts to saying that there is a proposition that correctly describes the relevant aspects of Moore’s experience in the circumstances in which the Proof was given; in the case of the Proof, for instance, I) will sound like “I am perceiving (what I take to be) my hand.” Then, from I) follows II) and from II) follows III), since “a hand” is a physical object; and given that the premises are known, so is the conclusion.

But, argues Wright, the passage from I) to II) is highly problematic: if Moore were the victim of a skeptical scenario such as the “Dream hypothesis” and thus were just dreaming his hand, II) would no longer follow from I). More generally, I) can ground II) only if we already take for granted that our experience is caused by our interaction with material objects; thus, sensory experience can warrant a belief about empirical objects only if we already assume that there are material objects.

Hence, we need to already have a warrant for III) in order to justifiably go from I) to II); and this is why Moore’s Proof would be question begging or epistemically circular: in order to consider the premises of Moore’s Proof true, we are implicitly assuming the truth of its conclusion.
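Wright’s diagnosis can be put schematically. What follows is only a sketch: the operator W (“there is warrant for”) and the labels E, P, and C for I), II), and III) are notational assumptions introduced here for illustration, not Wright’s own symbols, and the block assumes the amsmath package.

```latex
% E = I)   "It perceptually appears to me that there are two hands"
% P = II)  "There are two hands"
% C = III) "There are material objects"
% W(x): there is warrant for x (notation assumed for illustration only)
\begin{align*}
\text{Moore's intended route:}\quad
  & W(E),\ E \text{ supports } P \;\Rightarrow\; W(P);
  \qquad W(P),\ P \vdash C \;\Rightarrow\; W(C) \\
\text{Wright's diagnosis:}\quad
  & E \text{ supports } P \ \text{ only if }\ W(C) \text{ is already in place} \\
\text{Hence:}\quad
  & W(C) \text{ is a precondition of the step from } E \text{ to } P,
    \text{ not a product of it}
\end{align*}
```

On this rendering, the inference from P to C remains deductively valid; what fails is the claim that running the argument could give anyone a warrant for C that they did not already need in order to get started.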

Thus, Moore’s Proof would lead to another, more subtle form of skepticism that Wright calls Humean; while Cartesian-style skepticism goes from uncongenial skeptical scenarios to show that we cannot know any of our empirical beliefs, Humean skepticism argues that anytime we make an empirical knowledge claim, we are already assuming that, so to say, things outside of us are already the way we take them to be and, more generally, that there are material objects.

Again, in order to go from I) to II) to III), we need to have an independent warrant to believe that III) is true; and as we do not have this independent warrant, the argument fails to provide warrant for its conclusion. This is a phenomenon that Wright calls “failure of transmission of warrant” (or transmission failure for short). However, Wright argues, in many cases our inquiries are based on commitments or presuppositions that cannot be justified, but that nonetheless we take for granted whenever we are involved in an epistemic practice; and this is what happens with Moore’s “obvious truisms of the commonsense.” On Wright’s account, “hinges” are beliefs whose rejection would rationally necessitate extensive reorganization, or the complete destruction, of what should be considered as empirical evidence or more generally of our epistemic practices.

This reading of OC leads Wright to propose the following “Wittgenstein-inspired” anti-skeptical account. Each and every one of our ordinary inquiries would then rest on ungrounded presuppositions, “hinges”; but still, argues Wright, since the warrant to hold Moore’s “obvious truisms” is acquired in an epistemically responsible way, we cannot dismiss them simply because they are groundless as this would lead to a complete cognitive paralysis (2004a, 191).

As per Wright, then, Cartesian skepticism would only show that every process of knowledge-acquisition rests on ungrounded presuppositions. However, a system of thought, purified of all liability to hinges, would not be that of a rational agent; and because rational agency is a basic way for us to act, we therefore have a default rational basis, an entitlement, to believe in “hinges” and thus to know them, even if in an unwarranted way (hence “epistemic reading”).

Wright’s rational entitlement has engendered a huge debate that would be impossible to summarize here (see Pryor 2000, 2004, 2012; Davies 2003, 2004; Coliva, 2009a, 2009b, 2015). This article presents only the main objections raised against the plausibility of the “epistemic reading” and its anti-skeptical strength.

A first issue (Salvatore, 2013) is that following this account, the Cartesian skeptical challenge, even if ultimately illegitimate, would nonetheless have the merit of highlighting a constitutive limit of our epistemic practices, namely that they rest on ungrounded presuppositions. On the contrary, far from revealing the structure of our epistemic practices, for Wittgenstein, Cartesian-style skepticism will undermine the very notion of what an epistemic practice is. For once we doubt a hinge such as “There are external objects,” expressions like “evidence,” “justification,” and “doubt” will radically alter if not completely lose their meaning. Wittgenstein stresses this point in many entries of OC, as in the following remark:

If, therefore, I doubt or am uncertain about this being my hand (in whatever sense), why not in that case about the meaning of these words as well? (OC 456).

That is to say, once we assume ex hypothesi that we could be victims of a skeptical scenario, it would be hard to understand what could count as evidence for what, as each and every one of our perceptions would be the result of constant deception. Thus, to doubt a hinge would put in question the very meaning of the words with which we are expressing our doubts.

Another objection against Wright’s proposal goes as follows. Recall that, for Wright, we are rationally entitled to believe in the truth of hinges such as “Human beings have bodies” or “There are external objects.” A consequence of this thought (Pritchard, 2005) is that following this strategy, it would be possible to know the denials of skeptical hypotheses, even if in an unwarranted way: an anti-skeptical move which is excluded by Wittgenstein in many remarks of OC, as in the following entries:

Moore has every right to say he knows there’s a tree there in front of him. Naturally he may be wrong. (For it is not the same as with the utterance “I believe there is a tree there”.) But whether he is right or wrong in this case is of no philosophical importance. If Moore is attacking those who say that one cannot really know such a thing, he can’t do it by assuring them that he knows this and that […] (OC 520).

Moore’s mistake lies in this—countering the assertion that one cannot know that, by saying “I do know it” (OC 521).

That is to say, to claim against a Cartesian skeptic that we know Moore’s “obvious truisms of the commonsense” would be at the same time misleading and unconvincing.

First, as we have already seen, “hinges” cannot be evidentially grounded, for any evidence we could adduce to support a proposition p such as “I have a hand” would be less secure than p itself. As Wittgenstein writes at one point:

One says “I know” when one is ready to give compelling grounds. “I know” relates to a possibility of demonstrating the truth. Whether someone knows something can come to light, assuming that he is convinced of it. But if what he believes is of such a kind that the grounds he can give are no surer than his assertion, then he cannot say that he knows what he believes (OC 243).

Also, and more importantly, to claim that we “know” Moore’s “obvious truisms of the commonsense” on the basis of pragmatic considerations would simply miss the point of the skeptical challenge.

Moreover (Pritchard, 2005; Jenkins, 2007), on Wright’s account it is entirely rational to set aside skeptical concerns whenever we want to pursue a given epistemic practice; but here Wright seems to conflate practical and epistemic rationality. That is to say, a Cartesian skeptic can well agree that we have to dismiss skeptical concerns whenever we are involved in a given epistemic inquiry, as not doing so would lead to cognitive paralysis. But even conceding that it would be practically rational to set aside skeptical worries in order to achieve cognitive results, what is at issue in the skeptical challenge is the epistemic rationality of trusting our senses when skeptical hypotheses are in play. In other words, a skeptic can grant that we have to rule out skeptical worries when we need to form true beliefs about the world in our everyday life; still, she can argue that the fact that we need true beliefs about the world does not make our acceptance of “hinges” epistemically rational as long as we cannot rule out skeptical scenarios. Wittgenstein himself was well aware of this point; consider OC 19:

The statement “I know that here is a hand” may then be continued: “for it’s my hand that I am looking at”. Then a reasonable man will not doubt that I know. Nor will the idealist; rather he will say that he was not dealing with the practical doubt which is being dismissed, but there is a further doubt behind that one. That this is an illusion has to be shewn in a different way.

4. The Contextualist Reading

Another influential reading of OC is Michael Williams’ “Wittgensteinian Contextualism” which he has proposed in his book Unnatural Doubts and in a number of other more recent works (1991, 2001, 2004a, 2004b, 2005). A first formulation of a contextualist interpretation of OC can be found in Morawetz (1978). (For a general introduction to epistemic contextualism, along with an overview of other “non Wittgensteinian” contextualist anti-skeptical proposals, see here.)

Recall that in some passages of OC (OC 114, 115, 315, 322), Wittgenstein argues that any proper inquiry presupposes certainty, that is, some unquestionable prior commitment. In these remarks, Wittgenstein also alludes to the importance of the context of inquiry, suggesting that without a precise context, there is no possibility of raising a sensible question or a doubt. Williams generalizes this part of Wittgenstein’s argument as follows: in each context of inquiry, there is necessarily a set of “hinge” beliefs (which he names methodological necessities), which will hold fast and which are therefore immune to epistemic evaluation in that context.

A motivation for this reading is the extreme heterogeneity of the “hinges” mentioned by Wittgenstein. Along with Moore’s “obvious truisms,” in fact, throughout OC he considers as “hinges” propositions whose certainty is indexed to a historical period (“No man has ever been on the moon.”) together with basic mathematical truths (“12 × 12 = 144”) and contingently empirical claims (“This is a hand.”). As per Williams, this would be a way to stress a basic feature of our inquiries: namely, they all would rest on unsupported presuppositions that can nevertheless be questioned or dismissed when new questions arise or when we are switching from one context of inquiry to another. For instance, a historical inquiry about whether, say, Napoleon won at Austerlitz presupposes “hinge commitments” such as “The world existed long before my birth”; all our everyday epistemic practices presuppose hinges such as “There are external objects” or “Human beings have bodies,” and so on.

Crucially, for Williams, taking for granted the “methodological necessities” of a given epistemic practice is not only a matter of practical rationality as in Wright’s “entitlement strategy”; rather, it is a condition of possibility of any sensible enquiry. That is to say, while on Wright’s account we have to rest on “hinges” mostly because it is the only practical alternative, for Williams our confidence in Moore’s “obvious truisms” would highlight the constitutively “local” and “context-dependent” nature of all our epistemic practices.

Thus, in ordinary contexts, it would be illegitimate to doubt hinges such as “Human beings have bodies” or “There are external objects”; but once these are brought into focus, for instance by running skeptical arguments, we are simply switching from one context of inquiry to another, that is, from the everyday context into the philosophical one.

Therefore, by doubting the “hinges” of our most common epistemic practices, the skeptic is simply leading us from a context in which it is legitimate to hold these hinges fast toward a philosophical one in which everything can be doubted.

Nonetheless, the skeptical move cannot affect our everyday knowledge claims, which are made by taking for granted “methodological necessities” such as “The earth existed long before my birth” or “Human beings have bodies.” At most, what the Cartesian skeptic is able to show us is that, in the more demanding context of philosophical reflection, we do not know, strictly speaking, anything at all.

A consequence of this thought is that, even if legitimate and constitutively unsolvable at a philosophical level, the Cartesian skeptical challenge would not affect our everyday epistemic practices as they belong to different contexts, with completely different methodological necessities or “hinges.” Moreover, the same propositions that we cannot claim to know at a philosophical level are known to be true, albeit tacitly, in other contexts even if they lack evidential support. Evidential support is something that they cannot constitutively possess, insofar as any hinge has to be taken for granted whenever we are involved in an epistemic practice.

There are many problems that Williams’ “Wittgensteinian contextualism” has to face in order to be considered a plausible interpretation of Wittgenstein’s thought. First, on his account, the skeptical enterprise is both completely legitimate and constitutively unsolvable. That is to say, in the context of our ordinary epistemic practices, it would be illegitimate to doubt “obvious truisms” such as “Human beings have bodies” or “This is a hand”; nonetheless, “hinges” would still be doubtable and dismissible in the more demanding context of philosophical inquiry.

Even if in some passages of OC, as we have seen while discussing the “therapeutic reading,” Wittgenstein seems to concede that a skeptic might be using the expressions “to know” and “to doubt” in a specialized and, so to say, “philosophical” way, this does not lead him to admit that the skeptic would be somewhat “right,” even if only in the philosophical context. Rather, throughout OC, Wittgenstein stresses that there is no context in which we can rationally hold a doubt about Moore’s “obvious truisms”; as we have seen supra, to seriously doubt a “hinge” would look more similar to a sign of mental illness than to a legitimate philosophical inquiry:

In certain circumstances a man cannot make a mistake. (Can here is used logically, and the proposition does not mean that a man cannot say anything false in those circumstances.) If Moore were to pronounce the opposite of those propositions which he declares certain, we should not just not share his opinion: we should regard him as demented (OC 155).

If I now say “I know that the water in the kettle in the gas-flame will not freeze but boil”, I seem to be as justified in this “I know” as I am in any. “If I know anything I know this”. Or do I know with still greater certainty that the person opposite me is my old friend so-and-so? And how does that compare with the proposition that I am seeing with two eyes and shall see them if I look in the glass? I don’t know confidently what I am to answer here. But still there is a difference between cases. If the water over the gas freezes, of course I shall be as astonished as can be, but I shall assume some factor I don’t know of, and perhaps leave the matter to physicists to judge. But what could make me doubt whether this person here is N.N., whom I have known for years? Here a doubt would seem to drag everything with it and plunge it into chaos (OC 613, my italics).

Thus, even if Wittgenstein seems somewhat to concede a prima facie plausibility to skeptical hypotheses, he nonetheless denies that Moore’s “obvious truisms of the commonsense” can be sensibly doubted or denied, even if only in the context of philosophical inquiry (cf. OC 231, 234).

Also, for Williams (2004a), Wittgenstein’s treatment of Moore’s “obvious truisms” would differ significantly throughout OC; while the “hinges” listed by Moore would be “methodological necessities,” a statement such as “There are external objects,” namely the conclusion of PEW, would be plain nonsense (2004a, 86–87).

OC would then present two different anti-skeptical strategies, influenced respectively by Moore’s PEW and DCS. The first 60 entries of OC would be concerned with Moore’s Proof. On Williams’ reading, in these remarks, Wittgenstein would consider the skeptical challenge and Moore’s anti-skeptical strategy as constitutively senseless; both Moore and the skeptic would, in fact, treat “There are external objects” as a hypothesis that can be either confirmed by evidence or dismissed. But “There are external objects” is not an empirical hypothesis that can be tested or doubted; the very fact that we think, talk, and make judgments about the world shows that “There are external objects” and so any attempt to prove or to doubt this “proposition” would be misguided.

Williams also motivates this division as a way to make sense of Wittgenstein’s saying that “There are external objects” is nonsense. However, as Moyal-Sharrock points out (2004, 91–92), Wittgenstein does not necessarily use the term nonsense in a derogatory way. Just consider this passage of the Philosophical Grammar (1974, henceforth PG):

[…] when we hear the two propositions, “This rod has a length” and its negation “This rod has no length”, we take sides and favor the first sentence, instead of declaring them both nonsense. But this partiality is based on a confusion; we regard the first proposition as verified (and the second as falsified) by the fact “that rod has a length of 4 meters” (PG 129).

For Wittgenstein, then, nonsense is not only what violates sense, but also what defines or elucidates it. Thus, Wittgenstein calls “There are physical objects” nonsense; still, this does not amount to saying that it is unintelligible or senseless. Also, while it is undeniable that, for Wittgenstein, “There are external objects” cannot be treated as a hypothesis, there is no clear suggestion that he would consider other “hinges” as open to doubt or verification. Consider the following entry:

It is clear that our empirical propositions do not all have the same status, since one can lay down such a proposition and turn it from an empirical proposition into a norm of description. Think of chemical investigations. Lavoisier makes experiments with substances in his laboratory and now he concludes that this and that takes place when there is burning. He does not say that it might happen otherwise another time. He has got hold of a definite world-picture—not of course one that he invented: he learned it as a child. I say world-picture and not hypothesis, because it is the matter-of-course foundation for his research and as such also goes unmentioned (OC 167, my italics).

5. The Non-epistemic Reading

For Wright and Williams, then, it is possible to know hinges such as “There are external objects” or “Human beings have bodies,” whether out of practical considerations or in the context of our ordinary epistemic practices. According to the proponents of the “non-epistemic reading” of OC, by contrast, “hinges” are, strictly speaking, unknowable; still, this will not lead to skeptical conclusions.

This reading of OC was first proposed by Strawson (1985; for similar proposals, see also Wolgast 1987, Conway 1989). According to Strawson, for Wittgenstein skeptical doubts are neither meaningless nor irrational but simply unnatural. Radical skeptical doubts are raised with respect to propositions that we find it natural to take for granted (such as “There are external objects” or “Human beings have bodies”); given our upbringing within a community that collectively holds them fast, we cannot help accepting Moore’s “obvious truisms of the commonsense,” even while lacking reasons and grounds in their favor. Thus, following this interpretation of OC, skeptical hypotheses should not be rebutted by argument, but simply recognized as idle and unreal, as they call into doubt what we cannot help believing, given our shared “form of life.”

A similar account has been proposed by Stroll (1994). According to Stroll, “hinges” lie at the foundation of our language-games with ordinary empirical propositions; as such, they cannot be said to be propositions in the ordinary sense, as they are not subject to truth and falsity or to verification and control.

Thus, the certainty of “hinges” such as “There are external objects” or “Human beings have bodies” has a non-propositional, pragmatic, or even animal nature that is not subject to any epistemic evaluation.

This reflection on the “animal,” pre-rational certainty of “hinges” is the starting point of another “non-epistemic reading” of OC, proposed by Moyal-Sharrock (2004, 2005). As per Moyal-Sharrock, our confidence in Moore’s “obvious truisms of the commonsense” such as “There are material objects” or “Human beings have bodies” is not a theoretical or presuppositional certainty but a practical certainty that can express itself only as a way of acting (OC 7, 395); for instance, a “hinge” such as “Human beings have bodies” is the disposition of a living creature, which manifests itself in her acting in the certainty of having a body (Moyal-Sharrock, 2004, 67), that is, in her acting embodied (walking, eating, not attempting to walk through walls, and so forth).

Accordingly, Cartesian skeptical arguments, even if prima facie compelling, rest on a misleading assumption: the skeptic is simply treating “hinges” as empirical, propositional knowledge-claims, while on the contrary, they express a pre-theoretical animal certainty, which is not subject to epistemic evaluation of any sort.

Due to this category mistake, a proponent of Cartesian skepticism conflates physical and logical possibility (2004, 170). That is to say, skeptical scenarios such as the BIV one are logically possible but just in the sense that they are conceivable; in other words, we can imagine skeptical scenarios, then run our skeptical arguments, and thus conclude that our knowledge is impossible. Still, the mere hypothesis that we might be disembodied BIVs has no strength against the objective, animal certainty of “hinges” such as “There are material objects” or “Human beings have bodies,” just as merely thinking that “human beings can fly unaided” has no strength against the fact that human beings cannot fly without help.

Therefore, skeptical beliefs such as “I might be a disembodied BIV” or “I might be the victim of an Evil Deceiver” are nothing but belief-behavior (2004, 176), as the skeptic is doubting objectively certain “hinges”; thus, we should simply dismiss skeptical worries, for a skeptical scenario such as the BIV one does not and cannot have any consequences whatsoever on our epistemic practices or, more generally, on our “human form of life.”

This reading of OC has attracted several criticisms (see, for instance, Salvatore 2013, 2016). While, on the one hand, Moyal-Sharrock stresses the conceptual, logical indubitability of Moore’s “truisms,” she nonetheless seems to grant that the certainty of “hinges” stems from their function in a given context, to the extent that they can be sensibly questioned and doubted in fictional scenarios where they can “play the role” of empirical propositions. But crucially, if “hinges” are “objective certainties” because of their role in our ordinary life, a skeptic can still argue that in the context of philosophical inquiry, Moore’s “commonsense certainties” play a role which, similar to the role they play in fictional scenarios, is both at odds with our “human form of life” and still meaningful and legitimate.

Moreover, despite Moyal-Sharrock’s insistence on the conceptual, logical indubitability of Moore’s “truisms of the commonsense,” her rendering of Wittgenstein’s strategy seems to resemble Williams’ proposal, thus incurring the objections we have already encountered against this reading. As is argued throughout this article, to simply state that Cartesian skepticism has no consequence for our “human form of life” sounds like too much of a pragmatist response to the skeptical challenge. This is so because a skeptic can well agree that skeptical hypotheses have no consequence for our everyday practices or that they are just fictional scenarios; also, she can surely grant that Cartesian-style arguments cannot undermine the pre-rational confidence with which we ordinarily take for granted Moore’s “obvious truisms of the commonsense.” But crucially, and as Wittgenstein was well aware, a skeptic can always argue that she is not concerned with practical doubt (OC 19) but with a, so to speak, purely philosophical one.

Also, and more importantly, even if we agree with Moyal-Sharrock on the “nonsensical” nature of skeptical doubts, this nonetheless has no strength against Cartesian-style skepticism. Recall the structure of Cartesian skeptical arguments: take a skeptical hypothesis SH such as the BIV one and M, a mundane proposition such as “This is a hand.” Now, given the Closure principle, the argument goes as follows:

(S1) I do not know not-SH

(S2) If I do not know not-SH, then I do not know M

Therefore

(SC) I do not know M

In this argument, whether an agent is seriously doubting if she has a body or not is completely irrelevant to the skeptical conclusion, “I do not know M.” Also, a proponent of Cartesian-style skepticism can surely grant that we are not BIVs, or that we are not constantly deceived by an Evil Genius and so on. Still, the main issue is that we cannot know whether we are victims of a skeptical scenario or not; thus, given Closure, we are still unable to know anything at all.
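The role that Closure plays in this argument can be made explicit. The following is a sketch under a stated assumption: the text does not spell the principle out, so one common formulation is used here (if an agent knows p, and knows that p entails q, then she knows q). The operator K (“the agent knows that”) is notation introduced for illustration, and the block assumes the amsmath package.

```latex
% K(x): the agent knows that x   (notation assumed for illustration only)
% SH:   a skeptical hypothesis, e.g. "I am a disembodied BIV"
% M:    a mundane proposition, e.g. "This is a hand"
\begin{align*}
\text{Closure (assumed form):}\quad
  & \bigl(K(M) \wedge K(M \rightarrow \neg SH)\bigr) \rightarrow K(\neg SH) \\
\text{Granting } K(M \rightarrow \neg SH):\quad
  & K(M) \rightarrow K(\neg SH) \\
\text{By contraposition, (S2):}\quad
  & \neg K(\neg SH) \rightarrow \neg K(M) \\
\text{(S1):}\quad
  & \neg K(\neg SH) \\
\text{By modus ponens, (SC):}\quad
  & \neg K(M)
\end{align*}
```

On this rendering the argument is formally valid, and none of its steps mentions what the agent actually doubts or how she behaves; this is why pointing to the practical idleness of skeptical doubt leaves the conclusion untouched.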

6. The Non-propositional Reading

Wittgenstein’s reflections on the structure of reason have influenced a more recent “Wittgenstein-inspired” anti-skeptical position, namely Pritchard’s “hinge-commitment” strategy (2016b), for which “hinges” are not beliefs but rather arational, non-propositional commitments, not subject to epistemic evaluation.

To understand his proposal, recall the following remarks we have already quoted supra:

If you are not certain of any fact, you cannot be certain of the meaning of your words either […] If you tried to doubt everything you would not get as far as doubting anything. The game of doubting itself presupposes certainty (OC 114–115).

The question that we raise and our doubts depend on the fact that some propositions are exempt from doubt, are as it were the hinges on which those turn [….] that is to say, it belongs to the logic of our scientific investigations that certain things are indeed not doubted […] If I want the door to turn, the hinges must stay put (OC 341–343).

As per Pritchard, here Wittgenstein would claim that the very logic of our ways of inquiry presupposes that some propositions are excluded from doubt; and this is not irrational or based on a sort of blind faith but, rather, belongs to the way rational inquiries are put forward (see OC 342). As a door needs hinges in order to turn, any rational evaluation would then require a prior commitment to an unquestionable proposition, or set of “hinges,” in order to be possible at all.

A consequence of this thought (2016b, 3) is that any form of universal doubt such as the Cartesian skeptical one is constitutively impossible; there is simply no way to pursue an inquiry in which nothing is taken for granted. In other words, the very generality of the Cartesian skeptical challenge is based on a misleading way of representing the essentially local nature of our enquiries.

This maneuver helps Pritchard to overcome one of the main problems facing Williams’ “Wittgensteinian Contextualism.” Recall that, following Williams, the Cartesian skeptical challenge is both legitimate and unsolvable, even if only in the more demanding philosophical context. On the contrary, argues Pritchard, as per Wittgenstein, there is simply nothing like the kind of universal doubt employed by the Cartesian skeptic, both in the philosophical and in the, so to say, non-philosophical context of our everyday epistemic practices. A proponent of Cartesian skepticism looks for a universal, general evaluation of our beliefs; but crucially, there is no such thing as a general evaluation of our beliefs, whether positive (anti-skeptical) or negative (skeptical), for all rational evaluation can take place only in the context of “hinges” which are themselves immune to rational evaluation.

Each and every one of our epistemic practices rests on “hinges” that we accept with certainty, a certainty which is the expression of what Pritchard calls “‘über-hinge’ commitment.” This would be an arational commitment toward our most basic beliefs (such as that “There are external objects” or “Human beings have bodies”) that, as we mentioned above, is not itself open to rational evaluation, but that importantly is not a belief.

To understand this point, just recall Pritchard’s criticism toward Wright’s rational entitlement. As we have seen, Wright argues that it would be entirely rational to claim that we know Moore’s “obvious truisms of the commonsense” whenever we are involved in an epistemic practice which is valuable to us; but, Pritchard argues, in order to know a proposition, we need reasons to believe that proposition to be true. And as, following Wright, we have no reason to consider “hinges” true other than the fact that we need to take them for granted, then we cannot have knowledge of them either.

With these considerations in mind, we can come back to Pritchard’s “‘über-hinge’ commitment.” As we have seen, this commitment would express a fundamental arational relationship toward our most basic certainties, a commitment without which no knowledge is possible. Crucially, our basic certainties are not subject to rational evaluation; for instance, they cannot be confirmed or disconfirmed by evidence and thus they would be non-propositional in character (that is to say, they can be neither true nor false). Accordingly, they are not beliefs at all; rather, they are the expression of arational, non-propositional commitments. Thus, the skeptic is somewhat right in saying that we do not know Moore’s “obvious truisms of the commonsense”; but this will not lead to skeptical conclusions, for our “hinge commitments” are not beliefs so they cannot be objects of knowledge. Therefore, the skeptical challenge is misguided in the first place.

Pritchard’s account is concerned first and foremost with the psychology of our inquiries, and not with the epistemic status of the “hinges”; thus, his reflections on the structure of reason are just meant to stress the local nature of our epistemic practices, for which we have to rule out general doubts such as the skeptical one. But (Salvatore, 2016) even if, following his strategy, we are able to retain our knowledge of “mundane” propositions, the skeptic will still be able to undermine our confidence in the rationality of our ways of inquiry; under skeptical scrutiny, we will be forced to admit that all our practices rest on unsupported, ungrounded arational presuppositions that are not, and crucially cannot be, rationally grounded.

7. The Framework Reading

Another reading of OC is the “framework reading” (McGinn, 1989; see also Coliva, 2010) according to which “hinges” are “judgments of the frame,” that is, conditions of possibility of any meaningful epistemic practice.

This reading stems from the passages in which Wittgenstein highlights the analogy between Moore’s “obvious truisms of the commonsense” and basic mathematical truths:

But why am I so certain that this is my hand? Doesn’t the whole language-game rest on this kind of certainty? Or: isn’t this “certainty” (already) presupposed in the language-game? […] Compare with this 12×12=144. Here too we don’t say “perhaps”. For, in so far as this proposition rests on our not miscounting or miscalculating and on our senses not deceiving us as we calculate, both propositions, the arithmetical one and the physical one, are on the same level. I want to say: The physical game is just as certain as the arithmetical. But this can be misunderstood. My remark is a logical and not a psychological one (OC 446–447).

I want to say: If one doesn’t marvel at the fact that the propositions of arithmetic (e.g. the multiplication tables) are “absolutely certain”, then why should one be astonished that the proposition “This is my hand” is so equally? (OC 448).

According to McGinn, we should read Wittgenstein’s remarks on “hinges” in light of his views about mathematical and logical truths. In the Tractatus Logico-Philosophicus (henceforth TLP), Wittgenstein held what we might call an “objectivist” account of logical and mathematical truths, for which they were a description of the a priori necessary structure of reality. In the later phase of his thinking, Wittgenstein completely dismissed this view, suggesting instead that we should think of logical and mathematical truths as constituting a system of techniques originating and developed in the course of the practical life of human beings. What is important in these practices is not their truth or falsity but their technique-constituting role; so, the question about their truth or falsity simply cannot arise. Quoting Wittgenstein:

The steps which are not brought into question are logical inferences. But the reason is not that they “certainly correspond to the truth”, or something of the sort; no, it is just this that is called “thinking”, “speaking”, “inferring”, “arguing”. There is not any question at all here of some correspondence between what is said and reality; rather is logic antecedent to any such correspondence; in the same sense, that is, as that in which the establishment of a method of measurement is antecedent to the correctness or incorrectness of a statement of length (RFM, I, 156).

That is to say, logical and mathematical truths define what “to infer” and “to calculate” is; accordingly, given their “technique-constituting” role, these propositions cannot be tested or doubted, for to accept and apply them is a constitutive part of our techniques of inferring and calculating.

If logical and mathematical propositions cannot be doubted, this is also the case for Moore’s “obvious truisms of the commonsense.” Even if they resemble empirical, contingent knowledge claims, all these “commonsense certainties” play a peculiar role in our system of beliefs; namely, they are what McGinn calls “judgments of the frame” (1989, 139).

As mathematical and logical propositions define and constitute our techniques of inferring and calculating, “hinges” such as “This is a hand,” “The world existed long before my birth,” and “I am a human being” would then define and constitute our techniques of empirical description. That is to say, Moore’s “obvious truisms of the commonsense” would show us how to use words: what “a hand” is, what “the world” is, what “a human being” is and so on (1989, 142).

Both Moore and the skeptic misleadingly treat “hinges” such as “Human beings have bodies” or “There are external objects” as empirical propositions, which can be known or believed on the basis of evidence. But Moore’s “obvious truisms” are certain, their certainty being a criterion of linguistic mastery; in order to be considered a full participant in our epistemic practices, an agent must take Moore’s “obvious truisms” for granted.

Even though the “framework reading” has generally been considered a more viable interpretation of Wittgenstein’s thought (see Coliva, 2010), its anti-skeptical strength has been the focus of some serious analysis. First, following this account, to take “hinges” for granted is a condition of possibility of our epistemic practices (1989, 116-120); still (Minar, 2005, 258), a skeptic can nonetheless argue that the indubitability of Moore’s “obvious truisms of the commonsense” is nothing but a fact about what we do. That is to say, “hinges” such as “Human beings have bodies” or “There are external objects” are presupposed by our ordinary linguistic exchanges and constitute what McGinn calls our “framework judgments”; but once these “obvious truisms” are brought into focus, the skeptic will find that their not being up for questioning is simply what happens in normal circumstances. As we have already seen while presenting Conant’s and Wright’s readings of OC, the very fact that in our ordinary life we have to rule out skeptical hypotheses has no strength against Cartesian skepticism; again, what is at issue in the skeptical challenge is not the practical, but the epistemic rationality of setting aside skeptical concerns when we cannot rule out Cartesian-style scenarios.

8. Concluding Remarks

Given the elusive nature of Wittgenstein’s remarks on skepticism, there is still little to no consensus on how they should be interpreted or, more generally, whether Wittgenstein’s remarks alone can represent a valid response to radical skepticism. Pritchard’s (2016b, c) reflections on “hinges,” for example, are just one part of a more complex anti-skeptical framework that he calls epistemological disjunctivism (Pritchard, 2016b) and that would be impossible to summarize here. Coliva (2010, 2015) has recently proposed a version of the “framework reading” in which “hinges,” even if propositional, have a normative role, and their acceptance is a “condition of possibility” of any rational enquiry (on Coliva’s reading of OC and its anti-skeptical implications, see Moyal-Sharrock, 2013, and Pritchard & Boult, 2013). By contrast, Salvatore (2015, 2016) has used the analogy drawn by Wittgenstein between “hinges” and “rules of grammar” in order to argue for the nonsensicality of skeptical hypotheses, which would be nonsensical combinations of signs excluded by our epistemic practices (defined and constituted by “hinges” such as “Human beings have bodies” or “There are external objects”).

9. References and Further Reading

  • Clarke, T. (1972), “The Legacy of Skepticism,” The Journal of Philosophy, Vol. 69, No. 20, Sixty-Ninth Annual Meeting of the American Philosophical Association Eastern Division, 754–769.
  • Conway, G. D. (1989), Wittgenstein on Foundations, Atlantic Highlands, N.J., Humanities Press.
  • Coliva, A. (2015), Extended Rationality: A Hinge Epistemology, Palgrave Macmillan.
  • Coliva, A. (2010), Moore and Wittgenstein. Scepticism, Certainty and Common Sense, Palgrave Macmillan.
  • Coliva, A. (2009a), “Moore’s Proof and Martin Davies’ epistemic projects,” Australasian Journal of Philosophy.
  • Coliva, A. (2009b), “Moore’s Proof, Liberals and Conservatives. Is There a Third Way?” in A. Coliva (ed.) Mind, Meaning and Knowledge. Themes from the Philosophy of Crispin Wright, OUP.
  • Conant, J. (1998), “Wittgenstein on Meaning and Use,” Philosophical Investigations.
  • Malcolm, N. (1949), “Defending Common Sense,” Philosophical Review.
  • Minar, E. (2005), “On Wittgenstein’s Response to Scepticism: The Opening of On Certainty,” in D. Moyal-Sharrock and W.H. Brenner (eds.), Readings of Wittgenstein’s On Certainty, London, Palgrave, 253–274.
  • Moore, G. E. (1925), “A Defense of Common Sense,” in Contemporary British Philosophers, 1925, reprinted in G. E. Moore, Philosophical Papers, London, Collier Books, 1962.
  • Moore, G. E. (1939), “Proof of an External World,” Proceedings of the British Academy, reprinted in Philosophical Papers.
  • Moore, G. E. (1942), “A Reply to My Critics,” in Paul Arthur Schilpp (ed.), The Philosophy of G. E. Moore. Open Court.
  • Moyal-Sharrock, D. (2013), “On Coliva’s Judgmental Hinges,” Philosophia, Vol. 41, No. 1, 13–25.
  • Moyal-Sharrock, D. (2004), Understanding Wittgenstein’s On Certainty, London, Palgrave Macmillan.
  • Moyal-Sharrock, D. and Brenner, W. H. (2005), Readings of Wittgenstein’s On Certainty, London, Palgrave.
  • McGinn, M. (1989), Sense and Certainty: A Dissolution of Scepticism, Oxford, Blackwell.
  • Morawetz, T. (1978), Wittgenstein & Knowledge: The Importance of “On Certainty,” Cambridge, MA, Harvester Press.
  • Pritchard, D. H. (2016a), Epistemic Angst. Radical Scepticism and the Groundlessness of Our Believing, Princeton University Press.
  • Pritchard, D. H. (2016b), “Wittgenstein on Hinges and Radical Scepticism in On Certainty,” Blackwell Companion to Wittgenstein, H.-J. Glock & J. Hyman (eds.), Blackwell.
  • Pritchard, D. H. and Boult, C. (2013), “Wittgensteinian anti-scepticism and epistemic vertigo,” Philosophia, Vol. 41, No. 1, 27–35.
  • Pritchard, D. H. (2005), “Wittgenstein’s On Certainty and Contemporary Anti-skepticism,” in Readings of Wittgenstein’s On Certainty, D. Moyal-Sharrock and W.H. Brenner (eds.), London, Palgrave, 189–224.
  • Salvatore, N. C. (2016), “Skepticism and Nonsense,” Southwest Philosophical Studies.
  • Salvatore, N. C. (2015), “Wittgensteinian Epistemology and Cartesian Skepticism,” Kriterion-Journal of Philosophy, Vol. 29, No. 2, 53–80.
  • Salvatore, N. C. (2013), “Skepticism, Rules and Grammar,” Polish Journal of Philosophy, Vol. 7, No. 1, 31–53.
  • Strawson, P. F. (1985), Skepticism and Naturalism: Some Varieties, London, Methuen.
  • Stroll, A. (1994), Moore and Wittgenstein On Certainty, Oxford, Oxford University Press.
  • Stroud, B. (1984), The Significance of Philosophical Scepticism, Oxford, Oxford University Press.
  • Williamson, T. (2000), Knowledge and Its Limits, Oxford, Oxford University Press.
  • Williams, M. (2004a), “Wittgenstein’s Refutation of Idealism,” in Wittgenstein and Skepticism, D. McManus (ed.), London, New York, Routledge, 76–96.
  • Williams, M. (2004b), “Wittgenstein, Truth and Certainty,” in Wittgenstein’s Lasting Significance, M. Kolbel, B. Weiss (eds.), Routledge, London.
  • Williams, M. (2005), “Why Wittgenstein isn’t a Foundationalist,” in Readings of Wittgenstein’s On Certainty, D. Moyal-Sharrock and W. H. Brenner (eds.), 47–58.
  • Wright, C. (1985), “Facts and Certainty,” Proceedings of the British Academy, Vol. 71, 429–472.
  • Wittgenstein, L. (2009), Philosophical Investigations, revised 4th edn. edited by P. M. S. Hacker and Joachim Schulte, tr. G. E. M. Anscombe, P. M. S. Hacker and Joachim Schulte, Wiley-Blackwell, Oxford.
  • Wittgenstein, L. (1979), Wittgenstein’s Lectures, Cambridge 1932–35, from the Notes of Alice Ambrose and Margaret MacDonald, ed. Alice Ambrose, Blackwell, Oxford.
  • Wittgenstein, L. (1974), Philosophical Grammar, ed. R. Rhees, tr. A. J. P. Kenny, Blackwell, Oxford.
  • Wittgenstein, L. (1969), On Certainty, ed. G. E. M. Anscombe and G. H. von Wright, tr. D. Paul and G. E. M. Anscombe, Blackwell, Oxford.
  • Wolgast, E. (1987), “Whether Certainty is a Form of Life,” The Philosophical Quarterly, Vol. 37, 161–165.
  • Wright, C. (2004a), “Warrant for nothing (and foundation for free)?” Aristotelian Society Supplement, Vol. 78, No. 1, 167–212.
  • Wright, C. (2004b), “Wittgensteinian Certainties,” in Wittgenstein and Skepticism, D. McManus (ed.), 22–55.

 

Author Information

Nicola Claudio Salvatore
Email: n162970@dac.unicamp.br
University of Campinas – UNICAMP
Brazil

Socialism

Socialism is both an economic system and an ideology (in the non-pejorative sense of that term). A socialist economy features social rather than private ownership of the means of production. It also typically organizes economic activity through planning rather than market forces, and gears production towards needs satisfaction rather than profit accumulation. Socialist ideology asserts the moral and economic superiority of an economy with these features, especially as compared with capitalism. More specifically, socialists typically argue that capitalism undermines democracy, facilitates exploitation, distributes opportunities and resources unfairly, and vitiates community, stunting self-realization and human development. Socialism, by democratizing, humanizing, and rationalizing economic relations, largely eliminates these problems.

Socialist ideology thus has both critical and constructive aspects. Critically, it provides an account of what’s wrong with capitalism; constructively, it provides a theory of how to transcend capitalism’s flaws, namely, by transcending capitalism itself, replacing capitalism’s central features (private property, markets, profits) with socialist alternatives (at a minimum social property, but typically planning and production for use as well).

How, precisely, socialist concepts like social ownership and planning should be realized in practice is a matter of dispute among socialists. One major split concerns the proper role of markets in a socialist economy. Some socialists argue that extensive reliance on markets is perfectly compatible with core socialist values. Others disagree, arguing that to be a socialist is (among other things) to reject the ‘anarchy of the market’ in favor of a planned economy. But what form of planning should socialists advocate? This is a second major area of dispute, with some socialists endorsing central planning and others proposing a radically decentralized, participatory alternative.

This article explores all of these themes. It starts with definitions, then presents normative arguments for preferring socialism to capitalism, and concludes by discussing three broad socialist institutional proposals: central planning, participatory planning, and market socialism.

Two limitations should be noted at the outset. First, the article focuses on moral and political-philosophical issues rather than purely economic ones, discussing the latter only briefly. Second, little is said here about socialism’s rich and complicated history. The article emphasizes the philosophical content of socialist ideas rather than their historical development or political instantiation.

Table of Contents

  1. Socialism and Capitalism: Basic Institutional Contrasts
    1. Ownership: Some Preliminaries
    2. Private, State, and Social Ownership
    3. Economic Systems as Hybrids
  2. Socialism vs. Communism in Marxist Thought
  3. Why Socialism? Economic Considerations
  4. Why Socialism? Democracy
    1. Scope
    2. Influence
  5. Why Socialism? Exploitation
    1. Exploitation as Forced, Unpaid Labor
    2. Eliminating Exploitation
  6. Why Socialism? Freedom and Human Development
    1. Formal Freedom
    2. Effective Freedom
  7. Why Socialism? Community and Equality
    1. Why Produce? Communal vs. Market Reciprocity
    2. Justice, Inequality, Community
  8. Institutional Models of Socialism for the 21st Century
    1. Central Planning
    2. Participatory Planning
      1. Parecon: Basic Features
      2. Allocation in Parecon: Economic Coordination Through Councils
      3. Evaluating Parecon
    3. Market Socialism
      1. Schweickart’s “Economic Democracy”
      2. Evaluating Economic Democracy
  9. References and Further Reading

1. Socialism and Capitalism: Basic Institutional Contrasts

Considered as an economic system, socialism is best understood in contrast with capitalism.

Capitalism designates an economic system with all of the following features:

  1. The means of production are, for the most part, privately owned;
  2. People own their labor power, and are legally free to sell it to (or withhold it from) others;
  3. Production is generally oriented towards profit rather than use: firms produce not in the first instance to satisfy human needs, but rather to make money; and
  4. Markets play a major role in allocating inputs to commodity production and determining the amount and direction of investment.

An economic system is socialist only if it rejects feature 1, private ownership of the means of production, in favor of public or social ownership. But must an economic system reject any of features 2-4 to count as socialist, or is rejection of private property sufficient as well as necessary? Here, socialists disagree. Some, often called “market socialists”, hold that socialism is compatible, in principle, with wage labor, profit-seeking firms, and extensive use of markets to organize and coordinate production and investment. Others, sometimes called “orthodox” or “classical” socialists, contend that an economic system with these features is scarcely distinguishable from capitalism; true socialism, on this view, requires not merely social ownership of the means of production but also planned production for use, as opposed to “anarchic”, market-driven production for profit.

This section explores the core socialist commitment to social ownership of the means of production. Other important aspects of socialism—for instance, its stance towards markets and planning—are discussed in later sections (especially section 8).

a. Ownership: Some Preliminaries

Consider a society’s instruments of production: its land, buildings, factories, tools, and machinery. Consider also its raw materials: its oil and timber and minerals and so on. Together, these instruments and these materials constitute society’s means of production. To whom should these means of production belong: to society as a whole, or to private individuals or groups of individuals? This is the central question dividing capitalists and socialists, with capitalists advocating extensive rights of private ownership of the means of production and socialists advocating extensive social or public ownership of these means.

Notice that the capitalist/socialist dispute does not concern the desirability of private property in items unrelated to production. The issue between socialists and capitalists is not whether individuals should be able to own “personal property” (for example, toothbrushes, houses, clothing, and other articles of everyday use) but whether they should be able to own “productive property” (for example, stores, factories, raw materials, and other productive assets).

But what does it mean to own something? Standardly, to own something is to enjoy a bundle of legally enforceable rights and powers over that thing. These rights and powers typically include the right to use, to control, to transfer, to alter (at the limit, even to destroy), and to generate income from the thing owned, as well as the right to exclude non-owners from interacting with the owned thing in these ways. Because these rights admit of gradations, so too does ownership, which is scalar—a matter of degree—rather than dichotomous. In general, the wider one’s rights of use, control, and so on over an object, and the fewer restrictions one faces in exercising these various rights, the fuller one’s ownership of that object. Ownership, notice, may be narrowed and restricted without ceasing to be ownership. Limited ownership is not an oxymoron.

Another important distinction here is that between legal and effective ownership. These can go together, as when a person owns her car both in law and in fact: she not only has the title, but also possesses actual powers of use, control, and so on over the vehicle. But so too can they come apart. “The means of production belong to all the people,” proclaimed the Soviet Union’s constitution, but these were just words, for in reality democratically unaccountable bureaucrats and party officials grasped all the important economic levers. Something similar could be said of the relationship between shareholders in large capitalist corporations, on the one hand, and management and executives on the other: the former have “paper” ownership, but it is the latter that really exercise control. In general, it is effective rather than merely legal or formal ownership that is of interest in the present context. Capitalists and socialists alike want to realize their preferred patterns of ownership not just on paper, but also in reality.

b. Private, State, and Social Ownership

To understand socialism, one must distinguish between three forms of ownership. Under private ownership, individuals or groups of individuals (for example, corporations) are the primary agents of ownership; it is they who enjoy the various rights of use, control, transfer, income generation, and so on discussed above. Under state ownership, the state retains for itself these rights, and is thus the primary agent of ownership. Both of these forms of ownership should be familiar to anyone who has frequented a business or driven on an interstate highway.

Much less familiar is the key socialist idea of social ownership. Social ownership of an asset means that “the people have control over the disposition of that asset and its product” (Roemer, A Future for Socialism 18). Social ownership of the means of production, then, obtains to the degree that the people themselves have control over these means: over their use and over the products that eventuate from that use. This is a conceptually simple idea, but it can be difficult to grasp its practical implications. How, in concrete terms, could social control over the means of production be realized?

Historically, socialists have struggled to answer this question, and for good reason: it is not at all obvious how meaningful control over something as massive and complex as a modern economy might be shared across tens or even hundreds of millions of people. Broadly speaking, socialists have identified two main strategies of socialization. The first seeks to socialize the economy by nationalizing it. The second seeks the same end by radically decentralizing and democratizing economic power. These strategies will be investigated in greater detail below (see section 8), but for now a few orienting remarks are in order.

First, regarding nationalization: state ownership functions as a vehicle for socialization only to the extent that the people are themselves in control of the state. Otherwise nationalization amounts to little more than statism, not socialism; it constitutes economic rule by state officials rather than by society as a whole. Any genuinely socialist program of nationalization, then, must adhere to a two-part recipe: nationalize the economy, but also democratize the state, thereby putting the people in control of the economy at one remove.

This second step has proven rather elusive in practice. It was not accomplished—indeed, it was not even really attempted—by the so-called “socialist” authoritarianisms of the 20th century such as the Soviet Union and China. And certainly considerable barriers to genuine democratization exist even in countries with longstanding liberal democratic traditions, such as the United States. These barriers include the awesome influence of special interests and concentrated wealth on the political process, corporate domination of political media, voter ignorance and apathy, and so on. Democracy—popular control over the state—is, in short, an ideal easier praised than implemented, even under favorable conditions. However, these considerable practical problems aside, there seems to be nothing incoherent in principle with the idea of a genuinely socialist—because genuinely democratic—program of nationalization.

Or is there? Many socialists argue that state ownership can never fully realize socialism’s promise, no matter how democratic the relationship between the people and the state. This is because real social ownership involves not only control-at-a-remove, so to speak, but also active involvement and participation. On this conception, it is not enough for democratically accountable politicians and bureaucrats to steer the economy in your name; rather, you must do (or at least have the real opportunity to do) some of the steering yourself. The core idea here is well expressed by Michael Harrington:

Socialization means the democratization of decision making in the everyday economy, of micro as well as macro choices. It looks primarily but not exclusively to the decentralized, face-to-face participation of the direct producers and their communities in determining the matters that shape their social lives (197).

In a socialist society, average, everyday people must be active rather than passive, empowered rather than subordinated, involved rather than excluded. But if this is what genuine socialization requires, then socialism is

not a formula or a specific legal mode of ownership, but a principle of empowering people at the base, which can animate a whole range of measures, some of which we do not yet even imagine (Harrington, 197).

The point is not that nationalization can never play a role in making socialism real, but that it cannot play the outsized role often assigned to it.

But if socialists should not rely exclusively on nationalization, to what else should they appeal? Different socialists will answer this question in different ways, as we will see in section 8. But most would recommend leavening democratically controlled state ownership with sizable helpings of workplace democracy (as found, for instance, in the Mondragon and La Lega cooperatives in Spain and Italy, respectively), social control over investment, and various other measures to economically empower local communities and individuals (for instance, the “participatory budgeting” process found in Porto Alegre, Brazil, through which citizens meet in popular assemblies to decide how the city’s resources should be spent). By knitting together nationalization of major industries with these and other programs and initiatives, socialists hope to bring to fruition the “truly audacious project of empowering people to take command of their everyday lives” (Harrington, 197).

c. Economic Systems as Hybrids

In principle, an economy could be wholly capitalist, statist, or socialist. An economy would be wholly capitalist just in case all its productive assets were privately controlled; wholly statist, provided all such assets were state-controlled; and wholly socialist, provided all such assets were socially-controlled. While these are coherent theoretical possibilities, they have not been implemented in practice. In reality, all economies are hybrids that blend together private, social, and state ownership. It is better, then, to think of capitalism, statism, and socialism “not simply as all-or-nothing ideal types of economic structures, but also as variables” (Wright, 124). According to this analysis, an economy can be more or less capitalist, socialist, or statist, depending on the particular balance it strikes between the three forms of ownership.

For example, even in the United States—widely seen as a bastion of capitalism—the state plays a considerable role in controlling economic activity and in distributing the proceeds thereof. Does this mean it is a statist or perhaps even a socialist economy? No. Economies should be individuated with reference to their dominant mode of ownership. Since capitalist ownership dominates the United States’ economy—most of its productive assets being privately owned—it should be thought of as capitalist, albeit with some non-capitalist aspects. Similarly, an economy within which most productive assets are socially controlled should count as socialist, even if (as would almost certainly be the case) it also included statist or capitalist elements.

2. Socialism vs. Communism in Marxist Thought

Although this article focuses on socialism rather than Marxism per se, there is an important distinction within Marxist thought that warrants mention here. This is the distinction between socialism and communism.

Both socialism and communism are forms of post-capitalism. Both feature social rather than private ownership of the means of production. Both, within Marxist orthodoxy, reject market production for profit in favor of planned production for use. But beyond these important similarities lie significant differences. In the Critique of the Gotha Program, Marx’s fullest discussion of these matters, he divides post-capitalism into two parts, a “lower phase” (later called “socialism” by followers of Marx) and a “higher phase” (communism). The lower phase follows immediately on the heels of capitalism, and so resembles it in certain ways. As Marx memorably puts this point, socialism is “in every respect, economically, morally and intellectually still stamped with the birth marks of the old society from whose womb it emerges” (Critique of the Gotha Program 614). These capitalist “birth marks” include:

  • Material scarcity. Like capitalism, socialism does not overcome scarcity. Under socialism, the social surplus increases, but it is not yet sufficiently large to cover all competing claims.
  • The state. Socialism transforms the state but does not do away with it. What was a “dictatorship of the bourgeoisie” under capitalism becomes a “dictatorship of the proletariat” under socialism: a state controlled by and for the working class. (Since workers make up the vast majority, this is less authoritarian than it sounds.) Workers must seize the state and use it to implement, deepen, and secure the socialist transformation of society. Only after this transformation is complete can the state “wither away”, and the “government of people” be replaced by the “administration of things”.
  • The division of labor. Socialism, like capitalism, will feature occupational specialization. Having developed under capitalist educational and cultural institutions, most people will have been socialized to fit narrow, undemanding productive roles. They are not, therefore, “all around individuals” capable of performing a wide variety of complex productive tasks. Accordingly, socialism must feature a broadly inegalitarian occupational structure. As under capitalism, there will be janitors and engineers, nurses’ aides and surgeons, factory workers and planners.
  • Capitalist attitudes. Finally, under socialism many people will retain certain capitalist attitudes about production and distribution. For example, they will expect compensation to vary with contribution. Since contributions will differ, so too will rewards, leading to unequal standards of living. Turning from distribution to production, many socialist producers will, like their capitalist predecessors, regard work as merely a “means to life” rather than “life’s prime want”. They will work, in short, to get paid, rather than to develop and apply their capacities or to benefit their comrades.

So in all of these ways, the “lower phase” of post-capitalism resembles its capitalist predecessor. Over time, however, these capitalist “birth marks” fade, all traces of bourgeois attitudes and institutions vanish, and humanity finally achieves the “higher phase” of post-capitalist society, full communism.

What would “full communism” be like? Marx never answered this question in detail—and indeed, he disparaged as utopian those socialists who focused excessively on “drawing up recipes for the kitchens of the future”—but from his brief remarks about communist society certain broad outlines can be discerned. Perhaps his most famous description of communism comes in the following passage from the Critique of the Gotha Program:

In a higher phase of communist society, after the enslaving subordination of the individual to the division of labor, and therewith also the antithesis between mental and physical labor, has vanished; after labor has become not only a means of life but also life’s prime want; after the productive forces have also increased with the all-round development of the individual, and all the springs of cooperative wealth flow more abundantly—only then can the narrow horizon of bourgeois right be crossed in its entirety and society inscribe on its banner: from each according to his ability, to each according to his needs (615).

Unpacking this passage, we see that Marx makes all of the following claims about communism:

  • It has done away with the division of labor, especially that between mental and physical labor;
  • Attitudes towards work have changed (perhaps in part because work itself has changed). Communist producers regard work as both instrumentally and intrinsically valuable: they see work not merely as a means to life, but also as “life’s prime want”;
  • Human beings themselves have changed under communism, becoming “all-around”, highly developed individuals (rather than the stunted, one-sided creatures that so many of them were under capitalism and perhaps even under socialism);
  • Material scarcity has been eliminated or at least greatly attenuated, as “the productive forces have increased” and thus “all the springs of cooperative wealth flow more abundantly”;
  • As a result of all these changes, communist society is able to conform to the famous principle: from each according to his ability, to each according to his needs—thus severing the link (found in communism’s “lower phase”) between contribution and reward.

Not only will communism (unlike socialism) do away with class, material scarcity, and occupational specialization, it will also do away with the state. As noted above, the state begins to wither away under socialism. But this process is not completed until the “higher phase” of full communism, for it is only in that phase that lingering class antagonisms are finally eradicated. With these antagonisms cleared away, the state has nothing to do—no class conflict to manage, no further function to perform—and so, like a vestigial limb, it gradually atrophies from disuse. Or, as Engels famously puts this point in Socialism: Utopian and Scientific,

State interference in social relations becomes, in one domain after another, superfluous, and then dies out of itself; the government of persons is replaced by the administration of things, and by the conduct of processes of production. The state is not “abolished”. It withers away (91).

In sum, within Marxist theory socialism and communism are very different indeed. Although both eradicate private property and profits, only the latter also eliminates the division of labor, the state, material scarcity, and perhaps even conflict itself. It is only under communism that mankind completes its ascent from the “kingdom of necessity” into the “kingdom of freedom” (Engels 95).

3. Why Socialism? Economic Considerations

Is socialism worthy of allegiance, and if so, why?

The standard normative argument for socialism is comparative. Socialists typically single out certain moral and political values, argue that these values are poorly served under capitalism, and then support socialism by contending that these values would fare better—not necessarily perfectly, but better—under socialism. Values drawn upon by socialists vary, but usually include democracy, non-exploitation, freedom (both formal and effective), community, and equality. Sections 4–7 discuss these values and their alleged connections with socialism.

But before turning to these explicitly normative arguments, a word should be said about the purely economic case for socialism. (Since this article’s focus is normative rather than economic, this section will be brief.) Capitalism, many socialists hold, is wild and wasteful, prone to great booms and tremendously destructive busts. The argument goes like this: capitalist competition greatly augments society’s forces of production. Each firm, merely to stay in business, must innovate. As a result, productivity soars. Ever more output can be produced for ever fewer inputs, labor included. Abundance looms.

But this very abundance, paradoxically, is an economic problem. Gluts drive down prices as supply overwhelms demand. Profits decline. Firms, forced to cut costs, sack workers and slash wages. As unemployment and economic insecurity mount, demand plummets still further: people simply don’t have much money to spend. With reduced demand come reduced opportunities for profits, hence, reduced production. What was a boom has turned into a bust, and society faces the absurd spectacle of idle farms next to hungry people; empty shoe factories beside shoeless workers; foreclosed houses alongside the homeless.

Capitalism, then, makes possible universal abundance. But its central features—market competition, the pursuit of profits, and private property—ensure that this possibility will never be realized. In Marxist language, there is a deep “contradiction” between capitalism’s “forces of production” and its “relations of production”, a contradiction that nothing short of socialist revolution can solve. Society must overthrow capitalist productive relations, replacing anarchic market production for profit with planned production for use. Only then will humanity eliminate the ridiculous concatenation of vast productive potential alongside vast unmet needs. Or so the socialist argument goes.

Socialists find further economic faults with capitalism. Capitalism misallocates resources towards producing what is profitable rather than what is needed. True, what is needed can sometimes be profitable. But often the two categories come apart. Think, the socialist will say, of the vast resources spent producing luxuries for the rich, while the needy go without. Or consider the underproduction of critical, but unprofitable, antibiotics, even as “lifestyle drugs” (like Propecia, for baldness) roll off the production line.

Capitalism is also inefficient in its use of human labor power. Capitalism functions best when there exists a “reserve army of the unemployed,” in Marx’s phrase. The credible threat of unemployment reduces workers’ salary demands and increases their work effort. But unemployment means idle workers: able-bodied people, willing to work, who cannot find an outlet for their productivity. This is a waste, and it would not exist under socialism (or so it is claimed).

Further, capitalism allows an entire segment of the (able-bodied) population to live without working: namely, the independently wealthy, who can simply live off investment income. This, again, is wasteful; were these people recruited into the labor process, labor time for the rest could decline. Finally, capitalism misdirects the labor of many of those it does employ. Just think, the socialist will say, of the legions of lawyers, advertisers, marketers, and financial workers. Such workers (and others besides) perform no real productive function. Their jobs are necessary only within the framework of capitalism itself. In a socialist economy, there is no need for marketing, financial speculation, or lawyers specializing in mergers and acquisitions. Socialism would free people currently doing these tasks to apply their talents in a more useful way. Marketers could become teachers; financiers, farmers. And we would all be the better for it.

In sum, socialists seek to upend the common-sense view of capitalism. Most people take it for granted that whatever its normative flaws, at the very least capitalism ‘delivers the goods,’ so to speak. Not so, replies the socialist. Because it is prone to economic crises, and is wasteful and inefficient in its use of the means of production (including human labor), capitalism’s economic bona fides must be questioned.

4. Why Socialism? Democracy

The article turns now to the normative case against capitalism and in favor of socialism, starting with democracy.

Democracy means rule by the people, as opposed to rule by the rich, or rule by the excellent, or, more generally, rule by any part of the people over the rest. Systems plausibly claiming to be democratic can vary along at least three dimensions. They can bring a broader or a narrower range of issues under democratic jurisdiction; their members can be more or less directly involved in the exercise of political power; and they can insist upon greater or lesser equality of influence (or perhaps opportunity for influence) over political processes. Call these the scope, involvement, and influence dimensions, respectively.

Other things being equal, as involvement, scope, and equality of influence increase, so too does democracy. Thus it can make sense to say that one democratic system is “more democratic” than another. So too, it is possible to compare different democratic ideals in terms of their “democratic-ness”. A principle or ideal that insists upon maximal equality of influence, for instance, is (other things equal) more democratic than a principle or ideal that does not.

Socialists are radical democrats. They do not merely profess rule by the people; they also interpret that ideal in a highly democratic way, opting for maximalist or near-maximalist positions along all three of the just-mentioned dimensions. They want democracy to have very broad scope; they want citizens to be highly involved in democratic processes; and they want citizens to have roughly equal opportunities to influence these processes. And they typically argue, further, that the democratic ideal, understood in this rich and demanding way, militates against capitalism and in favor of socialism. This article will focus on the scope and influence dimensions.

a. Scope

To see this argument, consider first the scope dimension of democracy, which concerns the question: where should the boundary between public and private, between politics and civil society, be drawn? Which issues should be subject to democratic choice? Many socialists endorse something like the following principle:

All Affected Principle: People affected by a decision should enjoy a say over that decision, proportional to the degree to which they are affected.

It is a rather short step, or so say socialists, from this intuitively plausible principle to the radical conclusion that economics should be subordinated to democracy, that large swathes of economic life should be politicized and brought under popular control. All that is required to make that leap is the seemingly incontrovertible premise that many economic issues affect the public. When a local business fires 20% of its workers, this affects the public. When financiers withdraw support for a new shopping center, this affects the public. When society’s productive assets are deployed to make yachts for millionaires rather than affordable housing, this affects the public. When corporations pull up roots and relocate production to greener pastures, this affects the public.

In all of these cases (and many others besides), people’s lives are affected—indeed, often profoundly affected—by economic decisions. Do they get a say in these decisions, as required by the All Affected Principle? Not under capitalism, which grants extensive control over such matters to holders of private property rights. Where private property reigns, owners rather than affected parties decide, for example, whether to hire or fire, to invest, to relocate, and so on. From the socialist point of view, this is a serious offense against democracy. Capitalism, socialists claim, depoliticizes what should remain political; it cedes far too much control over common affairs to private parties. It is, in this way, insufficiently democratic.

But if the root cause of this democratic deficit is private control over productive assets, then the solution, or so socialists argue, must be social control over the same. Social property brings into the democratic domain what private property improperly removes. What touches all must be decided by all; economic matters touch all; therefore economic matters must be decided by all. This is the simple but powerful democratic syllogism at the heart of one major argument for socialism, for social rather than private control of the economy. What might social control over the economy look like in practice? Section 8 explores competing answers to this question.

b. Influence

Socialists find further grounds for rejecting capitalism in democracy’s influence dimension. Standardly, democracy is held to require not merely that all citizens have a say, but that they have an equal say. But what does this really mean? To clarify, suppose that A and B have equal voting rights, but A, being rich, educated, and leisured, has a greater chance to influence the political process than B, who is poor, uneducated, and short on free time (he must work long hours to make ends meet). Do A and B have an “equal say”, in the sense required by democracy?

Nearly all socialists, and indeed, many non-socialists, would say “no”; they would detect a democratic deficit in this scenario, for they typically see democracy as requiring not merely formal equality of opportunity for political influence but also substantive or fair equality of opportunity for political influence. On this view, it is not enough for A and B to enjoy identical legal protections to vote, to run for office, to engage in political speech, and so on. Instead, genuine democracy requires (over and above this merely legal equality) that equally talented and motivated citizens have roughly equal prospects for winning office and/or influencing policy, regardless of their economic and social circumstances—or something along these lines.

Now, capitalism clearly can implement formal political equality. Many capitalist societies grant their citizens equal rights to vote, to run for office, and so on. But can capitalism implement substantive political equality?

Many socialists think not. Capitalism, they point out, generates steep economic inequalities, dividing society into rich and poor. But in a variety of ways, the rich can translate their economic advantages into political ones. This translation can occur relatively directly, as when the rich buy political influence through campaign contributions, or when they hire lobbyists to steer legislative priorities (sometimes going so far as to draft laws themselves). Or it can occur relatively indirectly, as when the wealthy use their ownership of media to shape public opinion (and thus the political process), or when capitalists threaten to take their money out of the country in response to disliked (usually leftist) policies, thereby limiting what government can do.

But whether moneyed interests affect politics directly or indirectly, the net result is the same: capitalism amplifies the voices of the rich, enabling their concerns to dominate the political process. Indeed, some socialists, pressing this objection to its logical conclusion, contend that “democracy” under capitalism is really little more than oligarchy—rule by the rich—covered by a democratic fig leaf. Or, as Vladimir Lenin put this point: “Democracy for an insignificant minority, democracy for the rich—that is the democracy of capitalist society” (79).

Sophisticated defenders of capitalism respond by arguing that capitalism’s democratic deficits can be repaired within a fundamentally capitalist framework. Campaign finance reform, regulation of lobbying, restrictions on corporate domination of media, even limitations on the movement of capital across borders would, together, do much to restore or preserve political equality amidst capitalist economic inequality, and yet none of them are incompatible with capitalism per se. It follows (capitalists argue) that there is no need to throw out the baby of capitalism with the bathwater of political inequality. Sufficiently reformed, capitalism can indeed realize not just formal political equality but also substantive political equality.

The question, socialists would reply, is whether these reforms would ever be chosen by political elites under capitalism. Will capitalist oligarchs willingly undercut the very basis of their rule by socializing control over mass media, installing real campaign finance reform, limiting capital flows, and so on?

Would socialism perform any better than capitalism on this “influence” dimension of democracy? Would it enable equally talented and motivated citizens to have roughly equal prospects for influencing politics? Socialists argue that it would. Because it eliminates class, socialism eliminates the major threat to substantive political equality. (Of course, other forms of exclusion, such as racism and sexism, must also be overcome.) Wealthy property owners will not dominate the political process at the expense of the poor and unpropertied because the latter will be an empty set. Everyone will be a wealthy property owner, in the sense that everyone will share control over the means of production and will have access to a dignified standard of living. Everyone will therefore have roughly equal economic resources to bring to bear on the political process.

Put differently, whereas capitalism attempts to secure political equality despite massive economic inequalities, socialism attempts to secure political equality in large part by eliminating these inequalities.

5. Why Socialism? Exploitation

According to many socialists, one of capitalism’s central moral failings is that it is exploitative. Socialism, by contrast, would not be exploitative—or so these socialists allege—and this is one of the main reasons for preferring it to capitalism.

But what is exploitation? Is capitalism truly exploitative? And would socialism really eliminate exploitation? This section explores socialist answers to these questions.

a. Exploitation as Forced, Unpaid Labor

Although there is no universally accepted account of exploitation, Jeffrey Reiman’s Marx-inspired suggestion that exploitation is “a kind of coercive prying loose of unpaid labor” provides a good framework for discussion (3). On this account, a person is exploited if and only if she is forced to work for free. Feudal serfs, for example, were exploited because they were legally and physically compelled, at sword-point if necessary, to spend part of their working time toiling in the lord’s fields for nothing in return. This was forced, unpaid labor of the most obvious sort, and it constituted a serious form of exploitation.

But are capitalist employees exploited? At first glance, it would appear not. Workers get paid wages, so it doesn’t seem as if they are working for free. Nor does it appear that workers are forced to work. Capitalism, being a system of “free labor”, grants workers ownership over their labor power and entitles them to sell it, or not, as they please. So where is the force supposedly inherent in the capitalist/worker relationship?

Take the issue of force first. In general, a person is forced to do something X whenever she has no reasonable alternative to doing X. Workers, then, are forced to sell their labor power to capitalists just in case they have no reasonable alternative to doing so. But of course they don’t have a reasonable alternative, or so some socialists contend. Their argument is simple. Everyone must make a living. There are, under capitalist property relations, only two main ways to do this: one can live off of investment or property income, or one can live off of wages. By definition, workers cannot pick this first option; they don’t own means of production, so they can’t live off of income generated by such ownership. This leaves wage labor as the only acceptable option. True, workers are formally free to decline capitalist employment, but this does not represent a reasonable option since its consequences are so dire: starvation or, in more enlightened circumstances, life on the dole. Workers therefore have no minimally reasonable choice but to sell their labor power to owners of means of production.

It follows that workers are forced to work for capitalists, even if they are not so forced by capitalists (or indeed, by anyone else). The forcing in question is structural rather than agential; as Reiman explains, it is “an indirect force built into the very fact that capitalists own the means of production and laborers do not.” Or, as Marx puts this point, it is “the dull compulsion of economic relations” rather than “direct force” that “completes the subjection of the laborer to the capitalist” (Capital Vol. I, 737).

Not all socialists accept this argument. G.A. Cohen, for example, suggests that individual workers do have a reasonable alternative to selling their labor power: they can become capitalists themselves (“The Structure of Proletarian Unfreedom”). Not overnight, perhaps, but with enough scrimping and saving, is it not possible for an individual worker to start a business of her own? Cohen concludes that individual workers are not forced to sell their labor power. (He also argues that workers are “collectively unfree”—unfree as a class—since not all, or even many, workers can escape their class at the same time; the economy can absorb only so many small business owners at any given moment. But this alleged collective unfreedom of workers, though interesting and important, is peripheral to our present topic and so must be set aside.)

In response, some socialists question whether opening a small business really represents a reasonable option for most workers. For one thing, many workers simply can’t save enough to open such a business: their wages are just too small relative to the cost of living. For another, even if a worker is able, through years of thrift, to open her own business, most businesses fail, often leaving the owner much worse off financially than she would have been had she simply remained a wage laborer. Pulling together these ideas, one critic of Cohen concludes that “escaping into the petty bourgeoisie…is a reasonable alternative only for a tiny minority of workers. Thus the vast majority of working-class individuals are forced to sell their labor power to earn a living” (Peffer, 152).

But even if Cohen is wrong, and individual workers are forced to sell their labor power, notice that it does not yet follow that workers are exploited. For forced labor alone does not exploitation make. Exploitation, as described above, involves forced, unpaid labor. Let us turn, then, to the issue of compensation, and in particular, to the question of whether workers toil (at least in part) for free.

Again, surface appearances cut against the socialist position. Wage laborers standardly receive an hourly wage. If they work, say, eight hours, they get eight hours’ pay. It certainly seems, then, that workers receive full compensation for their toil. Perhaps this compensation is unfairly low, but that is a different issue: the exploitation charge, standardly construed, is that workers are forced to work for no pay, not that they are forced to work for low pay.

But probe more deeply, some socialists contend, and the unpaid nature of much work under capitalism becomes clear. To see their argument, it helps to start with an easier case: feudal production. Under feudalism, serfs spent part of their working time working in their own fields and the rest working in their lord’s fields. They kept whatever they could grow on their own plots, and surrendered whatever they grew on the lord’s. Put differently, serfs received compensation for part of their working time, but no compensation at all for the rest of it. A great deal of their work, then, was wholly unpaid: a fact that was very obvious to all involved, given the physical separation between paid work (on the serf’s fields) and unpaid work (on the lord’s).

Marxists argue that precisely the same division between paid and unpaid work exists under capitalism. Workers spend the first part of their working day working, in effect, for themselves. This is the part of the day during which they produce the equivalent of their wages. Marx calls this “necessary labor time”. But the working day does not stop there. Indeed, it cannot stop there, for if it did, there would be no “surplus product” for the capitalist to appropriate, and thus no reason for the capitalist to hire the worker in the first place. So the capitalist requires the worker to perform “surplus labor”, which is just labor beyond “necessary labor”: labor beyond what is required to produce value equivalent to the worker’s wage. The value produced during surplus labor time, Marx calls “surplus value”. Crucially, this surplus value belongs to the capitalist rather than the worker, and is the source of all profits.

To illustrate, consider a worker who produces 1 widget per hour over the course of an eight-hour shift, thus yielding eight widgets in total. Her boss takes these widgets, sells them, and then returns part of the proceeds to the worker in the form of a wage. But this wage must be less than what the capitalist reaped by selling the widgets. Otherwise the capitalist would have nothing left over as profit. To fix ideas, suppose that the worker’s daily wage is equivalent to the value of 2 widgets. To produce this value, she had to toil for 2 hours (at 1 widget per hour). Yet her shift lasts 8 hours. It follows that she spent 2 hours working for herself, and 6 hours working for her boss: which is to say, 6 hours working for free.
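The arithmetic of this illustration can be restated as a short sketch. The function and parameter names below are hypothetical, chosen only for this example; the figures (an 8-hour shift, 1 widget per hour, a wage worth 2 widgets) are the ones used above.

```python
from fractions import Fraction


def split_working_day(shift_hours, widgets_per_hour, wage_in_widgets):
    """Split a shift into necessary and surplus labor time, in hours.

    Hypothetical helper for the widget illustration only: necessary labor
    time is the time needed to produce value equal to the wage; surplus
    labor time is the remainder of the shift.
    """
    if widgets_per_hour <= 0:
        raise ValueError("widgets_per_hour must be positive")
    # Hours needed to produce widgets equal in value to the wage.
    necessary = Fraction(wage_in_widgets) / Fraction(widgets_per_hour)
    if necessary > shift_hours:
        raise ValueError("wage exceeds the value produced during the shift")
    # Whatever remains of the shift is surplus (on this account, unpaid) labor.
    surplus = Fraction(shift_hours) - necessary
    return necessary, surplus


necessary, surplus = split_working_day(
    shift_hours=8, widgets_per_hour=1, wage_in_widgets=2
)
print(necessary, surplus)  # 2 6
```

On these assumptions the worker spends 2 hours producing the equivalent of her wage and the remaining 6 hours producing surplus for her boss. Changing the wage or the hourly output changes the split, but some surplus remains so long as the wage is worth less than the day's output.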

We can now appreciate Marx’s remark that “the secret of the self-expansion of capital [that is, the secret of profit] resolves itself into having the disposal of a definite quantity of other people’s unpaid labor” (Capital Vol. I, 534). Profits, on Marxist analysis, are possible only through the extraction of unpaid surplus labor from workers. Wage workers toil gratis no less than serfs. That the division between paid and unpaid labor under capitalism is temporal rather than physical or spatial (as under serfdom) makes this division harder to see, but it does not in any way diminish its reality—or so the socialist argument goes.

b. Eliminating Exploitation

How exactly is socialism supposed to eliminate exploitation? Notice that it would not eliminate work itself. As Marx writes, “Just as the savage must wrestle with Nature to satisfy his wants, to maintain and reproduce life, so must civilized man, and he must do so under all social formations and under all modes of production” (Capital Vol. III, Ch. 48). So even under socialism, work must be done.

However, it does not follow that people must be forced to do it. Society could eliminate the compulsion to labor by partly decoupling income (or access to basic resources more broadly) from work. Philippe van Parijs’s “unconditional basic income” represents one way to achieve this decoupling. On his proposal, which has attracted significant support from socialist quarters, each citizen, no matter how rich or how poor, would be paid a monthly income, set as high as possible, and in any case sufficient to live with dignity. This income would come without any strings attached. In particular, it would not be conditional on working, seeking work, or training for future work. It would go to all members of the political community: leisured surfers off of Malibu no less than industrious steelworkers in Pittsburgh.

The economic feasibility of such a proposal may be questioned. But for present purposes, the important thing to appreciate is the way in which a UBI (as it is known) gives each person the “real freedom” to drop out of the paid labor force, thereby eliminating both the compulsion to work and (therefore) exploitation.

From a socialist perspective, there are at least two potential problems with this way of eliminating exploitation.

First, a UBI enables people to live off the hard work of others—no reciprocation required. Again, surfers get the check no less than people with paid employment. But socialists complain when capitalists live off the work of others; shouldn’t they complain when surfers (and so forth) behave similarly?

Second, there is nothing uniquely socialist about a UBI. Capitalist no less than socialist societies can implement a UBI, thereby enabling everyone to live decently without working. A defender of capitalism might therefore insist that when it comes to exploitation, capitalism and socialism are on all fours: both are equally susceptible to exploitation and equally able to enact the policies needed to eliminate it.

In response, socialists might point to the second necessary feature of exploitation, non-compensation. Notice that compensation takes many forms. Acquiring exclusive control over a sum of money, or over a bundle of resources, is one of them. But so too is acquiring a share of control over resources. Say that you and I work to build a tree house which we then jointly control. Neither of us has exclusive say over the tree house. And yet it would be wrong to conclude that our labors have gone uncompensated. We have been compensated; it’s just that our compensation comes in the form of common rather than private property.

This is precisely the sense in which all labor is compensated under socialism. Workers own the means of production together; they (therefore) own the surplus generated by these means. True, they do not own this surplus privately. They share control over its disposition and use. But shared control can be a form of compensation no less than private control.

Under capitalism, workers have private ownership over their wages (and the things these wages buy) but no ownership at all over most of what they produce. This is the sense in which most of their laboring activity goes uncompensated. Workers produce a surplus, hand it over to capitalists, and are then cut out of the picture; their bosses are free to do with the surplus whatever they like: consume it, invest it, burn it, and so forth. Under socialism, by contrast, workers have private ownership over their wages (or, in a money-less economy, over resources for personal use) and collective ownership over the social surplus they produce. They both make the surplus and share control over how to use this surplus. At no point, then, are socialist producers toiling ‘for free’, since their labors go towards building an economy that is shared and controlled by all. It’s as if everyone made a gigantic tree house that everyone is then free to use and to help govern.

So, contrary to the capitalist objection raised above (that capitalism and socialism are equally able to eliminate exploitation), it seems that socialism is uniquely well positioned to eliminate exploitation. Both socialism and capitalism could, in principle, eliminate forced labor by attenuating the link between income and work. But only socialism can ensure that all work is compensated through common ownership of the social surplus. Thus socialism expunges exploitation from economic life even absent something like a UBI, whereas the same cannot be said of capitalism.

Against this argument, critics might reply that the kind of ‘compensation’ for surplus labor promised by socialism is wholly inadequate. Under capitalism, the worker’s surplus is appropriated by the capitalist; under socialism, the worker’s surplus is appropriated by society. From the worker’s point of view, this may seem a distinction without a difference. Both appropriations rob the worker of effective control over the fruits of her labor. True, under socialism the worker is a member of the group doing the appropriating, but, as merely one of millions of such members, her individual influence over that group is infinitesimal. Is it plausible to regard her tiny sliver of decision-making power over the surplus as ‘compensation’ for her surplus labor? Arguably not, in which case socialism does not actually eliminate exploitation.

6. Why Socialism? Freedom and Human Development

Many socialists point to considerations of freedom, broadly understood, to support socialism over capitalism.

Freedom comes in many varieties. This article will discuss two. Formal freedom involves the absence of interference. Effective freedom involves the presence of capability. A person who is unable to walk has the formal freedom to ascend a steep flight of steps—assuming that no one will interfere with her attempt—but lacks the effective freedom to do so.

a. Formal Freedom

It is sometimes suggested that socialism fares poorly with respect to formal freedom. There are two main grounds for this contention, one historical, the other conceptual.

Historically, many countries claiming to be socialist trampled basic liberties such as freedom of expression and religion. They imprisoned and killed political dissidents and other ‘enemies of the people’. Far from being free societies, they were deeply oppressive ones.

Some critics of socialism suggest that this historical correlation between socialism and oppression was no accident. Rather, it reflects a deep flaw in socialism’s design. Socialism concentrates economic and political power in the hands of the state. Abuse is inevitable under such conditions. Milton Friedman, building off of this insight, famously posited a necessary connection between capitalism (which, unlike socialism, disperses economic power rather than concentrating it) and freedom: not all capitalist societies are free, but all durably free societies must be capitalist.

Socialists concede the heart of Friedman’s point, but argue that it does not undermine their position. Friedman, they say, was right to warn against excessive centralization of power. But he was wrong to suggest that socialism necessarily requires said centralization. The contemporary socialist ideal is profoundly democratic and decentralized; it seeks to disperse economic power, not concentrate it. It aspires to an economy and a society controlled from the broad bottom, not the narrow top. So the kind of socialism that contemporary socialists embrace is simply different from the kind of ‘socialism’ targeted by Friedman’s critique. Put differently, Friedman’s worry attacks a view held by very few socialists today, or so it might be argued.

Turning to a different objection, it is sometimes suggested that on purely conceptual grounds socialism is a more restrictive society than capitalism. The argument for this claim is simple. Capitalism permits private ownership of productive assets; socialism does not. Socialism therefore provides less formal freedom than capitalism. It interferes with various economic activities that capitalism allows. Thus, if what you value is formal freedom, then you should prefer capitalism to socialism.

The trouble with this argument, as pointed out by G.A. Cohen, is that it “see[s] the freedom which is intrinsic to capitalism [but not] the unfreedom which necessarily accompanies capitalist freedom” (“Capitalism, Freedom, and the Proletariat” 150). Capitalism does indeed allow some things that socialism forbids: for example, opening a business. But the converse is also true. To use Cohen’s example: I am free to pitch a tent on common land. I am not free to pitch a tent on land that you own privately. Should I try, the state will interfere, thereby reducing my formal freedom. Private property’s effects on formal freedom, then, are not uniformly positive, but mixed. Private property extends formal freedom to owners even as it withdraws it from non-owners. As Cohen writes, “To think of capitalism as a realm of freedom is to overlook half its nature” (“Capitalism, Freedom, and the Proletariat” 152).

Of course, precisely the same can and indeed must be said of socialism. All systems of property, whether capitalist or socialist, exert complex effects on formal freedom; all such systems necessarily distribute both freedom and unfreedom. But in light of this complexity, our guiding question here—which system, capitalism or socialism, provides more formal freedom?—is probably unanswerable. All we can say with confidence is that these systems provide differently shaped zones of formal freedom; each extends formal freedom in some ways while restricting it in others. However, it seems extremely difficult, perhaps impossible, to determine which of these zones is ‘larger’ overall. At the very least, defenders of capitalism must say a great deal more to establish that capitalism is, a priori, a freer society than socialism.

Socialists score this particular fight a draw.

b. Effective Freedom

Whereas socialists tend to play defense regarding formal freedom, they go on offense when discussing effective freedom.

Effective freedom, again, involves the capacity to accomplish one’s ends. This implies but goes beyond formal freedom. Say that my goal is to complete a marathon. One way I can fail to accomplish this goal is by meeting with agential interference. If you physically restrain me from participating in the race, you undermine my effective freedom by undermining my formal freedom. However, effective freedom usually requires much more than the mere absence of interference. I can actually complete a marathon, for example, only if a host of further conditions are in place. Some are broadly social: I must live in a society in which marathons occur. Others are broadly economic: I must be able to afford all the costs associated with training for the race, traveling to the race, entering the race, and so forth. And there are physical or “internal” factors as well. I can’t finish the marathon unless I have sufficient mobility and endurance. All of which is to say that effective freedom depends upon a wide range of factors, many of which have nothing to do with human interference per se.

Now, in a good and just society, which effective freedoms—which “capabilities,” as they are sometimes called—would people have? The typical socialist response runs as follows. At a minimum, everyone must have the effective freedom to meet their basic needs for food, shelter, health care, and so on. With these capabilities in place, people are able to survive. This is a crucial accomplishment, and one demanded by minimal standards of justice and decency. However, a truly good society must set its sights higher; it must enable people not merely to survive, but also to flourish.

And what is human flourishing? Socialists standardly accept a broadly Marxist/Aristotelian account according to which the good life centrally involves not just the passive pleasures of consumption (watching TV, eating tasty food, and so on) but also the more active and enduring satisfactions of “self-realization”, which can be defined as the “development and application of a person’s talents in a way that gives meaning to his or her life” (Roemer, A Future for Socialism 3). Mastering an instrument, playing a sport, solving a physics problem, writing an article, building a shed: these are all examples of potentially self-realizing activities.

Such activities typically have a steep “learning curve” that makes them frustrating at first, but deeply gratifying over the long haul. In this, they contrast sharply with consumption activities, which have the opposite hedonic profile: watching TV is immediately gratifying, but its charms wane with repetition. This contrast is one reason why self-realization is (according to many socialists) more important to human flourishing than consumption. A life replete with consumption but lacking in self-realization becomes stale, cramped, unsatisfying. Indeed, at the limit, it veers towards meaninglessness. It is only through the development and exercise of one’s higher powers and talents that one leads a truly human existence—or so the socialist view has it.

So a genuinely good and just society, then, is one in which “the free development of each is the condition of the free development of all,” as Marx and Engels declare in the Communist Manifesto: it is one in which each person has real access to the material, social, and political preconditions for human development and self-realization. But how, precisely, does any of this amount to an argument for socialism? The answer is that socialists typically see capitalism as a serious barrier to self-realization, a barrier that nothing short of socialism can remove. As Jon Elster puts it, capitalism “offers [the opportunity for self-realization] to a few but denies it to the vast majority” (Introduction to Karl Marx 43). Socialism, by contrast, would democratize self-realization, putting it within reach of average, everyday people for the first time in human history—or so it is claimed.

To fill in these claims, consider the material and social preconditions for self-realization. To develop and apply one’s talents in a way that gives meaning to life, one must, at a minimum, have one’s basic needs met. People who are sick, hungry, or homeless are simply not in a good position to develop and exercise their higher talents. However, capitalism reliably leads to poverty via frequent busts, structural unemployment, downward pressure on wages, and so on (or so socialists will claim), and it therefore reliably depresses access to self-realization for a significant portion of the population. Socialism, by contrast, would eliminate poverty and thus would eliminate this potent material barrier to self-realization.

Suppose, however, that basic needs are met: what else is required for self-realization? Time. Now, under capitalism, most people are forced, through lack of private property, to perform wage-labor for a living (see section 5.a). Their days are thus divided into two parts: working time and leisure time. But time spent in a capitalist workplace is, for the vast majority of people, hardly time for self-realization. Capitalist jobs are oriented around the demands of profit, not self-realization. And quite often, the most profitable way to organize work is to “deskill” it: to make it simple, routine, even stultifying (Braverman).

Granted, there are exceptions. Some workers, such as doctors, engineers, college professors, and carpenters, have challenging, complex, autonomous, engaging jobs that help bring self-realization and meaning to their lives. But these are the lucky few. More typical is the experience of, say, assemblers, fast food workers, cashiers, poultry-plant operators, secretaries, human resource clerks, and so on and so forth. Saddled with “alienating” jobs like these, workers work merely to live; as Marx writes, they “feel themselves at home only when they are not working, and when they are working they do not feel at home” (Economic and Philosophic Manuscripts). This is not to demean the people occupying these roles, nor is it necessarily to deny the social importance of these jobs. Rather, it is only to point out that these jobs offer little opportunity to develop and exercise complex talents in a way that brings meaning to life. If capitalism’s armies of cashiers, clerks, and so on are to experience self-realization, then, it will have to be off the clock, during their free time.

Yet here we hit upon a further capitalist barrier to self-realization: its unwillingness to expand what Marx called the “realm of freedom,” leisure, by shrinking the “realm of necessity,” work. Despite ever-rising productivity—more output per unit of labor input—working time rarely declines under capitalism. This is, on its face, rather puzzling. After all, there are, in principle, two ways an enterprise could respond to an increase in productivity. It could keep working hours constant while increasing output, or it could keep output constant while cutting working time. Yet capitalist firms consistently choose the first option over the second; they choose to produce more stuff rather than reduce the working day.
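The two options can be made concrete with a toy calculation. The following Python sketch is purely illustrative: the function name, the 40-hour week, the baseline productivity figure, and the 25 percent gain are all assumptions chosen for the example, not figures drawn from Marx or Cohen.

```python
def respond_to_productivity_gain(hours, productivity, gain):
    """Return the two responses a firm could make to a productivity gain.

    hours        -- weekly working time before the gain (assumed figure)
    productivity -- output per hour before the gain (assumed figure)
    gain         -- fractional rise in productivity, e.g. 0.25 for 25 percent
    """
    new_productivity = productivity * (1 + gain)
    baseline_output = hours * productivity

    # Option 1: keep hours constant and let output rise.
    more_output = {"hours": hours, "output": hours * new_productivity}

    # Option 2: keep output constant and cut working time.
    more_leisure = {
        "hours": baseline_output / new_productivity,
        "output": baseline_output,
    }
    return more_output, more_leisure


# Hypothetical numbers: a 40-hour week, 10 units per hour, a 25 percent gain.
more_output, more_leisure = respond_to_productivity_gain(40, 10, 0.25)
print(more_output)   # same hours, higher output
print(more_leisure)  # same output, shorter week
```

With these assumed numbers, the first option yields 500 units from the same 40 hours, while the second holds output at 400 units and shortens the week to 32 hours. The socialist complaint is that capitalist firms systematically take the first branch.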

What explains this “output bias”? Profits (Cohen, Karl Marx’s Theory of History Ch. XI). Firms do not make more money by reducing working time; they make more money by increasing output. And so we get, under capitalism, a society chronically short on leisure but drowning in consumer goods; we get the familiar harried rat race, albeit with iPhones.

Now, this mountain of stuff must be sold. This requires spending enormous resources on the “sales effort”—marketing, advertising, and so on—so as to stimulate demand. The result is a highly consumerist society in which many people identify the good life with the life of consumption rather than self-realization. This widespread consumerism may be further promoted by a “sour grapes effect”. Denied self-realization by the alienating nature of work and insufficient free time, the capitalist worker decides self-realization isn’t worth having to begin with. Like the fox in Aesop’s fable, he rejects the tasty fruit he cannot have (self-realization) for the blander fruit within reach (consumption).

In sum, for a variety of interconnected reasons, having to do with its tendency to produce poverty, deskill work, provide inadequate free time, and promote a consumerist orientation, capitalism undermines self-realization and therefore human flourishing. Not, admittedly, for all. But for the vast majority, capitalism renders a rich and meaningful life difficult if not impossible to achieve. Or so the socialist argument goes.

How would things differ in a socialist economy? We have already seen that socialism, by (allegedly) eliminating poverty, would eliminate that particular material barrier to self-realization. Regarding work and leisure, socialists argue that because their system places human beings rather than anarchic market forces in control of the economy, it empowers us to prioritize self-realization and expanded leisure in the design and organization of work. Since we control production, we can tailor it to suit our preferences. If we want better, non-alienating work and more free time, we can get it. Admittedly, this would probably result in lower output. With reduced hours and more engaging labor processes, less stuff would be produced. But from the socialist point of view, this is no great tragedy. Past a certain point, more stuff contributes very little to human flourishing. Once a decent standard of living has been secured, self-realization hinges mainly on access to meaningful work and adequate free time. If the price of securing these things is less stuff, so be it. Fewer iPhones in exchange for more meaningful jobs and no rat race: this is a tradeoff that socialists heartily recommend.

7. Why Socialism? Community and Equality

Capitalism is competitive and cut-throat; socialism is cooperative and harmonious. Capitalism divides; socialism unites, or so many socialists have argued. The crucial value in play in these arguments is “community”.

The concept of “community” admits of at least two different interpretations. The first concerns producers’ motivations: what drives people to wake up each day and go to work? Is it fear and greed, or a desire to serve one’s fellows? The latter is the motivation consistent with community, yet it is relentlessly undermined by capitalism (or so socialists claim). The second sense of community concerns limitations on material inequality. When inequalities in living conditions grow too steep, mutual incomprehension results. People dwell in different worlds. This undermines community (in this second sense), or so it may be argued. These two senses of community, and their fates under capitalism and socialism, will be explored more deeply in what follows.

a. Why Produce? Communal vs. Market Reciprocity

As Adam Smith observed, under capitalism “it is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest” (Book 1, Ch 2). The baker hands over a loaf only because you pay him. Remove the payment, and he removes the bread. So it goes in a market society, for as G.A. Cohen argues, in such a society productive activity is motivated “not on the basis of commitment to one’s fellow human beings and a desire to serve them while being served by them, but on the basis of cash reward” (Why Not Socialism? 39). Market participants operate on the maxim “serve-to-be-served”; they serve only in order to receive service in return. They strive to give as little as possible while getting as much as they can—“buy low and sell high,” as the saying goes. Indeed, the best-case scenario (by market logic’s lights) is to give nothing and get everything.

Market logic thus locks us into deeply anti-social relations. The marketeer, looking at humanity, sees not comrades or brothers and sisters, but customers and competitors. Predator-like, he sees mere “sources of enrichment” and “threats to success”. The former are to be fleeced, the latter crushed. Yet these are horrible ways to relate to other people. Market society may deliver the goods, but it does so only by bringing out some of the worst aspects of human nature. Or so some socialists argue.

But is there an alternative? Cohen asks us to consider how people behave on a camping trip. If A needs help setting up her tent, does B use her need strategically as a means to self-enrichment? Does he ‘drive a hard bargain’, making his support contingent on receiving something in return? No; that’s not how decent people behave in such a context. Rather, in the standard case, B helps A simply because A needs help. Service in response to need: this is what motivates productive activity on a camping trip.

Now, this is not to say that B’s assistance comes entirely without strings attached. B needn’t be a sucker, so to speak; he needn’t continue to help A if A consistently fails to return the favor. On a camping trip, one reasonably expects some degree of reciprocity. Campers thus occupy a sweet spot between anti-social market predation on the one hand and self-denying altruism on the other. Cohen labels this sweet spot “communal reciprocity,” and describes its content as “serve-and-be-served”. Campers acting on this motivation value both sides of the conjunction. They regard it as intrinsically desirable to serve each other, yet they also do expect some degree of reciprocation. As Cohen explains, “the relationship between us under communal reciprocity is not the market-instrumental one in which I give because I get, but the non-instrumental one in which I give because you need, or want, and in which I expect a comparable generosity from you” (Why Not Socialism? 43). Cohen recognizes an important caveat here: the responsibility to reciprocate is conditional upon ability. Thus, there’s no problem with disabled people receiving support without making a contribution in return.

Such behavior is entirely normal and functional on a camping trip. Communal reciprocity clearly “works” in such a context. But can it work on a massive, society-wide scale? Can millions or billions of strangers serve each other, with tolerable economic results, out of fraternity and benevolence rather than greed and fear?

Skeptics cite two main grounds for doubt. The first is human nature: surely people are simply too selfish, greedy and tribal for communal reciprocity to work on a massive scale. Treating your actual brothers as brothers is one thing; treating total strangers as brothers is quite another.

Socialists reply that human nature is complex. We are indeed greedy and competitive, but so too are we generous and cooperative. Economic context powerfully influences which of these traits predominates. Edward Bellamy, an influential 19th-century American socialist novelist and thinker, compares human nature to a rosebush (Ch. 26). Put a rosebush in a swamp, and it will appear sickly and ugly. One might conclude that rosebushes are, by ‘nature’, noxious little shrubs. But this would be a mistake. We know that rosebushes are capable of great beauty, given the right developmental conditions. Yet surely, argues Bellamy, the same goes for human beings. Shaped by capitalism, people appear greedy, cramped, and fearful. But this is only because we’re mired in a metaphorical swamp. Put us in the more hospitable soil of socialism and we, like the rosebush, would blossom; we would display all the fellow-feeling, generosity, and cooperative instincts socialism requires.

Human nature, in short, poses no serious obstacle to socialism. ‘Socialist man’ dwells within all of us, waiting only for the right social conditions to emerge.

But these social conditions simply cannot emerge, for they are infeasible: this is the second skeptical objection. Without markets, economies simply do not function tolerably well—witness the failure of Soviet-style planning. In response, Cohen argues that this is just one data point. It would be overly hasty to write off all non-market alternatives simply on the basis of one failed experiment. He admits that socialists face a “design” problem. They do not now know how to power an economy on generosity and fraternity rather than greed or fear. But design problems often turn out to be solvable with enough ingenuity and attention. Non-market socialists do not currently have the answers. But in the fullness of time, they might—or so Cohen argues.

Before turning to a different community-based critique of capitalism, a powerful challenge to Cohen’s argument should be noted. Jason Brennan points out that socialism cannot lay claim to communal reciprocity by definitional fiat. Socialism is just communal ownership of the means of production. Whether this particular way of structuring property fosters communal reciprocity, leading to generosity and a “serve-and-be-served” mentality on a wide scale, is an entirely empirical question that cannot be answered from the ‘philosopher’s armchair,’ as it were. Yet when we turn to the empirical record, we find little support for Cohen’s position.

If Cohen were right, then we should expect to see an inverse relationship between markets and various pro-social attitudes and behaviors. We should expect to see greater levels of greed, mistrust, and so on as markets expand and deepen. The most marketized societies should also be the most anti-social. But this is not at all what we find. In fact, we find precisely the opposite. Studies cited by Brennan suggest that market exchange promotes various pro-social attitudes such as trust, fairness, and reciprocity. Brennan concludes that Cohen has it backwards: if we wish to spread camping trip values across society, we should embrace markets, not reject them.

Notice that Brennan’s critique (if sound) damages only non-market varieties of socialism. It does not undermine (and indeed actually provides some support for) market versions of the same. (Market socialism is discussed further in 8.c.)

b. Justice, Inequality, Community

This article has not said very much about equality as a socialist ideal. This may surprise some readers. Isn’t equality of condition one of socialism’s central aims? Indeed, socialism’s allegedly uncompromising egalitarianism is sometimes used as the basis for a reductio ad absurdum of the socialist position. The reductio runs like this: according to socialism everyone must be equal; one way to do this is to ‘level down’ the better off; but this is morally repugnant; so socialism must be rejected. One thinks here of Kurt Vonnegut’s famous anti-egalitarian tale “Harrison Bergeron”, in which an equality-obsessed government knocks the noggins of the more intelligent, bringing them into alignment with their IQ-disadvantaged peers.

The reductio fails because socialists do not advocate equality of condition, at least not in any straightforward sense. Much light has been shed on this issue by the now-voluminous philosophical literature on egalitarianism. Of particular import is the work by philosophers like Richard Arneson and G.A. Cohen on the question: “equality of what”? Insofar as leftists seek equality, what is it that they wish to equalize? Standard options include 1) resources, 2) welfare, 3) opportunities for resources, and 4) opportunities for welfare.

Most philosophers agree that the first two options are non-starters. Equalizing outcomes (as 1 and 2 would do) improperly ignores personal choice and responsibility. The point is nicely illustrated by Aesop’s fable of the grasshopper and the ant. Stipulate that both bugs know that winter is coming, and that both have the capability, that is, the effective freedom, to build a house and to gather adequate supplies. That is to say, both have equal opportunity to provision themselves. Yet only industrious ant chooses to use this opportunity; carefree grasshopper decides to dance and play instead. Fast forward to winter: there sits ant in his house, warm and well-fed, while grasshopper shivers hungrily outside. Now, no matter which metric we use—resources or welfare—ant is clearly much better off than grasshopper. Between the two bugs, a very significant inequality of condition obtains. But does this inequality constitute an injustice?

Interestingly, many socialists would answer ‘no’ to this question. These socialists hold a “responsibility-sensitive” form of egalitarianism sometimes called “luck egalitarianism”. On this view, outcome inequalities (whether measured in resources, welfare, or some other metric) are just if and only if they “reflect the genuine choices of parties who are initially equally placed and who may therefore reasonably be held responsible for the consequences of those choices” (Cohen, Why Not Socialism? 26). By luck egalitarian lights, then, the grasshopper/ant inequality is perfectly just since it reflects divergent choices rather than differences in unchosen circumstances. (Circumstantially, the bugs were identically placed. Both could have prepared for winter. But only ant chose to do so.)

Notice that the luck egalitarian would reach a different verdict if grasshopper and ant were unequal because (say) grasshopper was born without legs, and thus couldn’t provision himself. Then it would be unjust for him to go without food or shelter. For that outcome would reflect factors beyond his control, namely, his unchosen disability, in violation of the luck egalitarian standard.

In sum, contemporary socialist “luck egalitarians” have a nuanced view on equality and justice. Opportunities (for resources, welfare, or whatever) must be equal. But outcomes may be unequal provided that these inequalities are due to choices rather than circumstances. In a socialist society, then, grasshopper’s shivering does not necessarily signal injustice.

It might, however, signal a different moral defect: namely, a breach of community or compassion. Socialists aspire to a social world within which people care about and (when necessary) care for one another. Dramatically different living conditions put this regime of mutual comprehension, concern, and caring in jeopardy. Condemned to the wintry cold, grasshopper would face trials beyond ant’s understanding. The two bugs would come to dwell in different worlds. Whatever fellow-feeling or mutual concern previously marked their relations would vanish, leaving only a gulf of indifference and estrangement. This is no way for socialist comrades to live: not because it would be unjust (by hypothesis, it would not) but because it would be insufficiently fraternal and compassionate. Cohen concludes that “certain inequalities that cannot be forbidden in the name of [justice] should nevertheless be forbidden, in the name of community” (Why Not Socialism? 37). On this line, it would be just, but not justified, for ant to bar his door.

Are we back to the Harrison Bergeron reductio, then? Does socialism implausibly require absolute equality of condition after all? No, for two reasons.

First, not all inequalities undermine community. Perhaps ant must, in the name of community, provide grasshopper with some of his food and shelter. But does community require him to split his possessions down the middle? Surely not. The point is that while extreme inequalities may place community under strain, more modest ones might not.

Second, Cohen declares, without much argument, that the demands of community trump those of justice. But this ranking may be contested. Why shouldn’t justice trump community, at least occasionally? Perhaps just inequalities should sometimes be allowed to stand even if they undermine community.

8. Institutional Models of Socialism for the 21st Century

What, in practice, would a socialist society actually look like? What concrete institutions and policies—political, economic, and social—would it use to organize, motivate, and direct economic activity? It is difficult to assess the desirability of socialism without answering these questions. The normative case for socialism depends, at least in part, on the attractiveness and feasibility of its institutional vision. More prosaically: even if one is convinced of the abstract philosophical arguments canvassed in section 4, one still has to know what socialism would really be like in order to tell whether one wants it.

This section discusses three broad institutional models of socialism for the 21st century: central planning, participatory/democratic planning, and market socialism. All three models, being socialist, reject private ownership of the means of production in favor of social ownership. But beyond this important point of commonality, many significant differences emerge, especially concerning a) whether planning should be centralized or decentralized, and b) the appropriate role of markets in a socialist economy.

a. Central Planning

Throughout the 20th century, the standard socialist answer to the question “if not capitalism, then what?” was centrally-planned socialism.

Under central planning, “production is organized and coordinated within an administrative hierarchy, with decisions being made at the center and passed down through intermediate levels of the hierarchy to the production units” (Devine 55). Political authorities at the top of this hierarchy decide on broad economic objectives—build up heavy industry, satisfy consumer preferences, develop a backward region, and so on. Central planners then generate a concrete plan to achieve these objectives. To this end, they first gather a massive amount of information. Tens of thousands of enterprises inform planners of their productive capabilities and input requirements; millions of consumers communicate their consumption preferences. With this information in hand, planners, through a complex, multi-stage, “iterative” process, arrive at an overall plan for the economy that sets specific production targets for each enterprise. (Factory A, produce X shoes; factory B, produce Y amount of steel, and so on.) The center sends these orders to enterprise managers, who then devise more specific labor processes through which their workers produce the ordered goods in the right way at the right time. To the extent that the overall plan is fulfilled, sufficient resources are produced to meet whatever broad objectives political authorities have chosen, and the economy ‘works’.

What is to be said in favor of central planning? In theory, quite a bit. Central planning replaces capitalism’s anarchic market production for profit with planned production for use. It therefore promises to eliminate all those problems that socialists associate with private property, markets, and the pursuit of profits—problems like economic instability and poverty, class conflict and exploitation, various barriers to “real freedom” and self-realization, such as alienating labor and inadequate free time, lack of community and solidarity, and unjust economic inequalities. Freed of these capitalist pathologies, a centrally planned society would be classless, prosperous, and harmonious; it would be a society in which “the free development of each is the condition of the free development of all” (Marx and Engels Ch. 2).

Or so the story goes. However, critics allege that in practice central planning performs poorly. There are two problems worth pulling apart here.

The first is economic. Although centrally planned economies eliminate the worst forms of poverty, they do not produce generalized affluence. Under central planning, innovation is sluggish. Product quality is low. Shortages and hoarding are common. Work effort is lacking. These defects stem from underlying information, calculation, and incentive problems. Central planners, critics argue, cannot know what people want, or what producers are able to produce, with sufficient accuracy; nor, even if they could, would they be able to use this massive quantity of information to calculate a coherent overall plan; nor, even if they could calculate such a plan, would they be able to incentivize managers and workers to follow it faithfully.

The second problem with central planning is normative. Central planning, critics say, does not lead to an egalitarian, classless utopia, but to an authoritarian, undemocratic society dominated by a “coordinator class” of political elites, planners, and enterprise managers. Indeed, the basic logic of the system guarantees that central planning is a “road to serfdom” (in Hayek’s famous phrase) rather than a route to democratic empowerment. As one critic explains, “Central planners gather information, calculate a plan, and issue ‘marching orders’ to production units. The relationship between the central agency and the production units is authoritative rather than democratic, and exclusive rather than participatory” (Albert 52). Information flows up the hierarchy; orders flow down. Central planners decide; everyone else obeys. This seems rather far from the “radical empowerment” envisioned by many socialists.

Indeed, central planning’s economic and normative failings are related; the latter compound the former. It is partly because central planning alienates and disempowers workers that it performs so poorly qua economic system. Workers, so treated, expend little effort at work, ignore orders, under-report their productive capabilities, over-report their output, and so on.

Persuaded by these objections, most socialists today reject central planning, holding that it simply doesn’t work sufficiently well and that it comes at too steep a cost to democratic empowerment and freedom. But if they reject central planning, what do they propose to put in its place? There would seem to be only two options: either socialists rehabilitate planning by decentralizing and democratizing it, or they make peace with the market. The first route leads to some form of “participatory planning”; the second, to “market socialism”. The next two subsections explore these models in greater detail.

b. Participatory Planning

Perhaps the problem with central planning has to do with centralization rather than planning: this is the core thought behind “participatory” or “democratic” planning. Advocates of this approach include Pat Devine, Michael Albert, and Robin Hahnel. Because Albert and Hahnel’s model, called “participatory economics,” or “parecon” for short, is especially well developed, this article shall take it as representative of the broader participatory planning approach.

i. Parecon: Basic Features

Parecon rests on five main institutional proposals:

  1. Social ownership of the means of production
  2. Democratic workplaces
  3. Balanced “job complexes”
  4. Remuneration according to effort, sacrifice, and need
  5. Economic coordination based on comprehensive participatory planning, using a complex system of nested “worker” and “consumer” councils

We may move quickly through the first and fourth of these proposals. Social ownership means simply that nobody in particular owns the means of production; rather, “we all own [them] equally, so that ownership has no bearing on the distribution of income, wealth or power” (Albert 9). What does bear on the distribution of income in a parecon is effort and sacrifice (112-117). The underlying rationale here is luck egalitarian (see 7.b). People, Albert and Hahnel argue, should be rewarded or penalized only for those things under their control. But the only thing that people control is their level of effort and sacrifice. Therefore, they should be rewarded and penalized only for their level of effort and sacrifice. Those who work harder or longer should enjoy greater consumption opportunities than those who work less hard and/or less long. There is an exception here: people who are unable to work will be provided with an average income.

Proposals 2, 3, and 5 require more extensive discussion.

Democratic workplaces. Parecon takes as one of its core values the idea of “democratic self-management,” which implies that “each actor in the economy should influence economic outcomes in proportion to how those outcomes affect him or her”. In other words, “Our say in decisions should reflect how much they affect us” (Albert 40). This norm implies that decisions affecting only a given individual should be left entirely to that individual. But other decisions have broader consequences, and are therefore appropriate objects of democratic choice. Many workplace decisions fall into this “other-affecting” category. Albert and Hahnel propose a complex system of nested “councils” for handling such decisions. Some workplace decisions will be entirely up to individual workers; others, assigned to work-teams; still others, to the enterprise as a whole. Indeed, since some workplace decisions affect people beyond the workplace’s four walls, such as consumers, some method for granting an appropriate degree of influence to these affected external actors must be found. Albert and Hahnel propose democratic “consumer councils” and “industry councils”. More will be said about this system of democratic council coordination below.

Balanced job complexes. Parecon proposes to radically remake the division of labor by creating “balanced job complexes” in which “the combination of tasks and responsibilities each worker has would accord them the same empowerment and quality of life benefits as the combination every other worker has” (Albert 10). This is, of course, very far from how occupations are structured currently. Under capitalism, considerations of profit and class power largely determine the way in which different productive tasks are bundled into jobs. The result is a division of labor that assigns routine, boring, disempowering (but profitable) work to the many, while reserving varied, complex, empowering work for the privileged few.

Parecon rejects this inequitable division. It does so on grounds of fairness: why should interesting and enjoyable work be hoarded by some rather than shared by all? It also objects on democratic grounds: unequal division of empowering work “inexorably destroys participatory potentials and creates class differences” (Albert 104). Any workplace with, say, janitors and managers will be a de facto hierarchy, even if it is, on paper, democratically organized. In the name of fairness and democracy, then, we must transform the division of labor so that “every individual [will] be regularly involved in both conception and execution tasks, with comparable empowerment and quality of life circumstances for all” (Albert 111).

ii. Allocation in Parecon: Economic Coordination Through Councils

This brings us to feature 5: economic coordination through councils. Every economy must decide what gets produced and consumed, and in what quantities. This is the problem of allocation. Market systems solve this problem through decentralized, voluntary, self-interested competition and exchange between buyers and sellers. Recall Adam Smith’s baker, who makes bread not because someone tells him to, but because by making bread, he can make money through trade. Centrally-planned economies solve the allocative problem by handing it over to a small group of economic elites, who craft a comprehensive plan for the entire economy and issue binding instructions for realizing it. Moscow decides that X amount of shoes will be produced, Y amount of steel, and so on, and enforces these demands on lower levels in the economic hierarchy.

Parecon rejects both approaches. In place of markets and central planning, it proposes a system of nested worker and consumer councils, through which individuals cooperatively generate an overall plan for production and consumption. Albert and Hahnel call this system “decentralized participatory planning.”

Simplifying greatly, its basic gist is this. To figure out what people want to consume, we ask them. To figure out what they are willing to produce, we ask them. We then aggregate all these responses and compare proposed supply with proposed demand. If they don’t match, we close the gap through democratic negotiation conducted on a footing of equality. Through such negotiation, we eventually reach, say, five feasible plans. We put them to a vote and implement the winner. Decentralized participatory planning thus promises to solve the allocative problem without hierarchy or markets.

The system features several key participants: first, worker councils (and federations thereof, such as a “software industry council,” a “farming council,” and so forth); second, consumer councils (and federations thereof, such as neighborhood councils, city-level councils, state-level councils, and so on); and third, the “iteration facilitation board,” or IFB. The IFB initiates the planning process by announcing provisional prices for all inputs and outputs. Importantly, these prices should reflect the “full social costs and benefits” associated with these inputs and outputs, including opportunity costs and externalities, whether positive or negative. Albert and Hahnel see this as a key difference with market systems. In a parecon, prices will accurately track the true social costs of production. Prices rarely do this in market societies. Think, for instance, of the absurdly low price of gasoline in the United States, despite the ecological and economic costs of its widespread production and use.

With these provisional prices in hand, each economic actor—individuals and councils and federations of councils—proposes both a) a consumption plan and b) a production plan. The former specifies what the actor would like to consume during the period being planned (the upcoming year, say); the latter specifies which outputs the actor proposes to produce, and the inputs they will require to do this. Plans will go to appropriate councils for approval. Thus, a family might submit their consumption plan to the neighborhood consumption council, while a worker might submit her plan to her work-team or to the larger workplace council.

On what basis are proposals approved or rejected? Individual consumption proposals should be approved if the person’s income is equal to or greater than the total cost of the goods requested. Income, remember, is a function of one’s effort and sacrifice at work. Higher-level consumption proposals (for a neighborhood, say) should be approved if the group’s income (minus the costs of members’ personal consumption) suffices to cover the costs of the requested items. The underlying thought here is that a community’s overall consumption should correspond to the amount of effort its members expend producing goods and services for society: the harder the community works, the more it is entitled to consume. Turning to production proposals, these are evaluated by comparing the estimated social benefits of the goods and services produced with the estimated social costs of producing them. If benefits exceed costs (that is, if the ratio of benefits to costs is greater than one), then the production proposal should be accepted; if costs exceed benefits, then the proposal does not represent a responsible use of societal resources, and it is sent back for revision.
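The approval rules just described can be restated as a short Python sketch. The function names, the dictionary layout for requests and prices, the sample figures, and the reading of the production test as "estimated benefits exceed estimated costs" are all illustrative assumptions rather than anything specified by Albert and Hahnel.

```python
def total_cost(requested, prices):
    """Cost of a consumption request at the current provisional prices."""
    return sum(prices[good] * quantity for good, quantity in requested.items())


def approve_individual_consumption(income, requested, prices):
    """Approve if the person's income covers the total cost of the request."""
    return income >= total_cost(requested, prices)


def approve_group_consumption(group_income, members_personal_cost, requested, prices):
    """Approve if group income, net of members' personal consumption, covers the request."""
    return group_income - members_personal_cost >= total_cost(requested, prices)


def approve_production(estimated_social_benefit, estimated_social_cost):
    """Approve if estimated social benefits exceed estimated social costs (assumed reading)."""
    return estimated_social_benefit > estimated_social_cost


# Hypothetical provisional prices and requests, for illustration only.
prices = {"socks": 2.0, "books": 10.0, "park_bench": 150.0}
family_request = {"socks": 5, "books": 3}      # costs 5*2 + 3*10 = 40
neighborhood_request = {"park_bench": 2}       # costs 2*150 = 300

family_ok = approve_individual_consumption(50.0, family_request, prices)
# The neighborhood has 1000 in income, of which 800 is already claimed by
# members' personal consumption, leaving 200 for collective requests.
neighborhood_ok = approve_group_consumption(1000.0, 800.0, neighborhood_request, prices)
production_ok = approve_production(estimated_social_benefit=120.0,
                                   estimated_social_cost=100.0)
```

In this toy case the family's request is approved, the neighborhood's is sent back (300 exceeds the 200 left over), and the production proposal passes.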

The IFB aggregates all approved proposals and determines whether projected supply matches projected demand. Barring a miracle, it won’t, not at this stage. So the IFB recalculates prices in light of the mismatch between supply and demand, raising prices for goods with excess demand, lowering prices for those with excess supply, and sends the plans back to their originators for revision. Using the new prices, consumers and producers tweak their proposals, perhaps shifting demand to lower-priced goods and increasing supply of goods with high prices. They then send these revised proposals to the relevant councils, which evaluate them as before. Eventually, all approved proposals make their way to the IFB, which recalculates overall supply and demand to see if they match. If they do, then the process is over; if they don’t, then another round of revisions is required. If the process ends with multiple feasible plans, then society votes to determine the winner.
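The iterative loop run by the IFB can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the linear price-adjustment rule, the `step` and `tolerance` parameters, and the toy demand and supply responses are all hypothetical, and the sketch leaves out council approval and the final vote among feasible plans.

```python
def run_planning_rounds(prices, propose_demand, propose_supply,
                        step=0.1, tolerance=0.01, max_rounds=1000):
    """Adjust prices round by round until proposed supply and demand roughly match.

    Returns (final_prices, rounds_used), or (last_prices, None) if no match
    is reached within max_rounds. The adjustment rule is an assumption.
    """
    prices = dict(prices)  # work on a copy so the caller's prices are untouched
    for round_number in range(1, max_rounds + 1):
        demand = propose_demand(prices)
        supply = propose_supply(prices)
        gaps = {good: demand[good] - supply[good] for good in prices}
        if all(abs(gap) <= tolerance for gap in gaps.values()):
            return prices, round_number
        for good, gap in gaps.items():
            # Excess demand (gap > 0) raises the price; excess supply lowers it.
            prices[good] = max(0.01, prices[good] + step * gap)
    return prices, None


# Toy responses (hypothetical): consumers request less as the price rises,
# while producers offer more.
def propose_demand(prices):
    return {"shoes": max(0.0, 10.0 - prices["shoes"])}


def propose_supply(prices):
    return {"shoes": prices["shoes"]}


final_prices, rounds_used = run_planning_rounds(
    {"shoes": 2.0}, propose_demand, propose_supply
)
```

With these toy responses the price of shoes drifts upward from 2 toward 5, the point at which proposed supply and demand coincide. Whether real proposals would converge this smoothly is exactly what critics of parecon dispute.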

iii. Evaluating Parecon

This is, to be sure, an incredibly complex procedure, indeed, much more complex than this brief sketch indicates. Even Albert and Hahnel admit that it will take multiple rounds of negotiation, and no small amount of paperwork, to arrive at a feasible plan. But the hope is that “as every individual or collective worker or consumer participant negotiates through successive rounds of back and forth exchange of their proposals with all other participants, they alter their proposals to accord with the messages they receive, and the process converges” (Albert 128). And this, notice, without markets or central planning:

There is no center or top. There is no competition. Each actor fulfills responsibilities that bring them into greater rather than reduced solidarity with other producers and consumers. Everyone is remunerated appropriately for effort and sacrifice. And everyone has proportionate influence on their personal choices as well as those of larger collectives and the whole society (Albert 128-129).

The absence of hierarchy is worth emphasizing. Although there is an IFB, and although one’s consumption and production proposals must be approved by others, the overall distribution of power is web-like rather than hierarchical. No one occupies any special position of authority. Economic decisions are not dominated by the wealthy (as under capitalism) or the politically connected (as under central planning). Instead, all economic decision-making is radically democratic and open to negotiation: each person has a say over decisions that affect him or her, proportional to the degree to which he or she is affected. Parecon may have important flaws, but inadequate respect for democratic values would not seem to be among them. Indeed, it is hard to imagine a system more faithful to the core socialist commitment to bottom-up, democratic control over the economy. This, surely, is parecon’s chief virtue from the socialist perspective.

But would it work? Some commentators are skeptical. Erik Olin Wright writes:

The information complexity of the iterated planning process described in Parecon might in the end simply overwhelm the planning process. Albert is confident that with appropriate computers…this would not be a problem…Perhaps he is right. But he may also be horribly wrong. As described…the information process seems hugely burdensome (264).

Defenders of parecon reply to such worries in several ways. First, they argue that critics overestimate the amount of planning that parecon requires. Setting up one’s initial consumption proposal may well take lots of time and energy. But with that initial investment made, planning in subsequent years should be much quicker. One can simply base future plans on the original one, tweaking here and there as necessary.

Nor need one specify in great detail what one wants to consume. “For planning purposes,” writes Albert, “we need only request types of goods, even though later everyone will pick an exact size, style, and color to actually consume” (130). One’s consumption proposal must “express preferences for socks, but not for colors or type of socks; for soda, books, and bicycles, but not for flavors, titles, or styles of each” (217). (Of course, critics may find new grounds for concern here. Because consumption preferences tend to be rather fine grained—a person wants to read a dystopian science fiction novel, not some generic book; she wants wild-caught salmon, not “food” or even “fish”—parecon seemingly faces a dilemma. If people do not request specific items, then many desired items will be in short supply, and consumer satisfaction will be low. If, on the other hand, people do request specific items, we’re thrown back on the original worry: isn’t this planning process unworkably cumbersome?)

A second reply to the infeasibility worry is that other economic systems require paperwork and planning, too. Under market socialism (and capitalism) consumers must make budgets, do their taxes, pay bills, go shopping, and so on. Enterprises must decide what they will make and in what quantities. They must also make various personnel decisions, deciding who will work with whom, for how long, on which projects, and so on. Added up over the course of the year, the amount of time spent on such activities is far from trivial. Indeed, one might argue that total planning time will be roughly constant across market- and participatory-planning-based systems.

Finally, suppose that parecon does require a substantial amount of time and energy, or perhaps, more time and energy than alternative systems. Still, these costs must be judged against the potential benefits. Parecon promises a more equal, fraternal, just, democratic society. Is it even remotely reasonable to reject such a society on the grounds that it requires too much paperwork?

In sum, defenders of parecon argue 1) that their proposal won’t prove nearly as burdensome in practice as critics fear; indeed, 2) one may reasonably doubt that it is any more burdensome than other systems; and 3) even if parecon does prove burdensome both absolutely and comparatively, surely the sacrifice is worth the result.

c. Market Socialism

Suppose one rejects central planning, but also doubts the feasibility (or perhaps even the desirability) of parecon-style participatory planning. Must one therefore reject socialism? No, not according to “market socialists” such as John Roemer, David Schweickart, David Miller, Erik Olin Wright, Tom Malleson, and others.

On the traditional view, socialists must, by definition, be opposed not simply to private property, but also to markets. Market socialists disagree. On their view, socialism requires only a certain form of ownership, namely, social rather than private ownership. About markets, socialists should be open-minded. Markets are just tools for communicating information and motivating economic activity. Like any tool, they should be evaluated instrumentally. Do they work better than alternatives? If so, then socialists should embrace them.

And indeed, market socialists characteristically argue that markets do work better than the alternatives: just look at the economic record. This is not to say that markets are perfect, nor is it to say that they should be allowed to operate ‘freely,’ without any constraints. Market regulations are integral to the market socialist vision. Market socialists are no kind of market fundamentalists. Rather, they view themselves as pragmatists. They see the evils of capitalism, but they also see the problems with planning-based socialist alternatives. The way forward, they argue, is to take the good parts of capitalism and combine them with the good parts of socialism. This will displease fundamentalists on both sides, but what alternative is there? Capitalism is a moral disaster. Central planning was worse. Participatory planning is a pipe dream. 21st century leftists must fuse socialism with markets. There is no other way. Or so market socialists argue.

i. Schweickart’s “Economic Democracy”

To further explain the market socialist position, this article will present David Schweickart’s market socialist model, “Economic Democracy” (ED for short). (For a recent proposal very similar in spirit to Schweickart’s, see Malleson 2015. For other important developments of market socialism, see Roemer 1994, Miller 1989, and Carens 1981.) Boiled down to essentials, ED has three main features: worker self-management, the market, and social control of investment.

Worker self-management: “Each productive enterprise is controlled by those who work there” (After Capitalism 49). Workers together decide all aspects of production: what to make, how to make it, workplace policies, compensation, and so on. This does not preclude the use of managers or experts. In large firms especially, some delegation of authority will almost certainly prove necessary. Schweickart suggests that workers might elect a workers’ council which will then appoint executives, managers, and so on.

The market. In stark contrast with planning-based forms of socialism, ED solves the problem of allocation using market competition between profit-seeking enterprises. ED’s enterprises start with a sum of money (M). They use this money to buy productive inputs on the market, which they transform into commodities. They then compete with other enterprises to sell these commodities to consumers or other enterprises, thereby ending up with a new amount of money (M’). (Prices are determined mainly by market forces of supply and demand, although price regulations may sometimes be appropriate: again, Schweickart is no market fundamentalist.) In the normal case M’ > M, indicating that the enterprise has turned a profit. Indeed, turning a profit is the immediate aim of production in ED: enterprises produce to make money, not (primarily) to satisfy human needs. As Schweickart says, “profit is not a dirty word in this form of socialism” (After Capitalism 51).

This may sound rather close to capitalism, but in fact there is an important difference here. Under capitalism, profits go to owners, not workers, who receive wages. Under ED, by contrast, there are no wages; rather, “workers get all that remains once nonlabor costs…have been paid” (After Capitalism 51). Precisely how workers divvy up the enterprise’s surplus is up to them. In theory they could split it equally. But given the need to outcompete other enterprises (and hence to attract and retain skilled labor), some degree of inequality is likely to be chosen. More productive workers, or workers with skills in higher demand, will almost certainly earn more than their fellows. Notice the difference here with parecon, which links income to effort rather than contribution or other “morally arbitrary” factors beyond the agent’s control. Empirical evidence suggests that self-managed firms (like those in the Mondragon cooperative in Spain) opt for a 4 or 5:1 ratio between the incomes of the highest- and lowest-paid employees: quite a dramatic difference from the 300:1 spread typical in large capitalist corporations.
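A small worked example may make the arithmetic concrete. The Python sketch below assumes a hypothetical enterprise whose workers have voted to divide the surplus by "shares"; the share scheme, the job titles, and every figure are invented for illustration and do not come from Schweickart.

```python
def distributable_income(revenue, nonlabor_costs):
    """What remains for workers once nonlabor costs have been paid."""
    return revenue - nonlabor_costs


def split_by_shares(pool, shares):
    """Divide the pool in proportion to democratically chosen shares (hypothetical scheme)."""
    total_shares = sum(shares.values())
    return {worker: pool * share / total_shares for worker, share in shares.items()}


def pay_ratio(incomes):
    """Ratio of the highest income to the lowest."""
    return max(incomes.values()) / min(incomes.values())


# Hypothetical enterprise: all figures are made up for illustration.
pool = distributable_income(revenue=1_000_000, nonlabor_costs=400_000)
shares = {"engineer": 4, "machinist": 2, "apprentice": 1, "cleaner": 1}
incomes = split_by_shares(pool, shares)
ratio = pay_ratio(incomes)
```

Here the 600,000 pool is split so that the highest-paid worker earns four times what the lowest-paid worker earns, in line with the 4:1 spread mentioned above.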

Social control over investment. This is the most clearly “socialist” piece of Schweickart’s model. In an ED, the means of production belong to all members of society, not to the enterprises that happen to deploy them. To reflect this social rather than sectional or private ownership, all enterprises must pay a capital assets tax. “This tax,” writes Schweickart, “may be regarded as a leasing fee paid by the workers of the enterprise for use of social property that belongs to all” (After Capitalism 52). Revenues from this tax constitute the national investment fund, which is the sole source of investment money in ED. By tweaking the tax rate, society can determine the size of the national investment fund—hence, the amount of money available for investment, and thus the overall level of economic growth and development.

Note the contrast with capitalism: under capitalism, most investment comes from private rather than public sources. Both the amount and direction of economic development therefore depend on the whims and abilities of private investors. This leads to the boom and bust cycle discussed in section 3, as well as other pathologies such as excessive growth, ecological devastation, underdeveloped regions alongside overdeveloped ones, unemployment, poverty, and all the rest. Under ED, by contrast, investment is democratically controlled by all members of society. In theory, this should enable “more rational, equitable, and democratic development than can be expected under capitalism” (After Capitalism 52)—a point that will be explained further below.

How, specifically, should social control over investment be institutionalized? There are many options. At one extreme, society might opt for a planning-heavy system in which a democratically accountable planning board draws up a plan for all new investment, which Schweickart estimates would constitute about 15% of GDP, and allocates funds accordingly. For example, the planning board might decide to prioritize renewable energy, or consumer goods, or whatever. At the other extreme, society might prefer a laissez-faire model in which funds are channeled through public banks to enterprises using essentially the same criteria that capitalist banks use: namely, profitability. In this version of ED, market forces would largely determine the pattern of investment.

Schweickart himself proposes something in the middle of these two options. Funds should go to regions (for example, Texas) and communities (for example, Fort Worth) on a per capita basis. If the Fort Worth region has the same population as Silicon Valley, then it will receive the same amount of investment. In ED, then, there will be no economic backwaters, no regions or communities left behind. Nor will regions or communities be locked into a destructive neoliberal “race to the bottom” to attract investment dollars. Each community receives its “fair share” no matter how business-friendly (or unfriendly) its policies.
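The per capita rule is simple enough to state as a short Python sketch. The flat tax rate, the asset values, the region names, and the population figures below are hypothetical; Schweickart's text does not fix any of them.

```python
def national_investment_fund(enterprise_capital_assets, tax_rate):
    """Revenue from a capital assets tax (a single flat rate is an assumption)."""
    return tax_rate * sum(enterprise_capital_assets)


def allocate_per_capita(fund, populations):
    """Give each region a share of the fund proportional to its population."""
    total_population = sum(populations.values())
    return {region: fund * people / total_population
            for region, people in populations.items()}


# Hypothetical figures, for illustration only.
fund = national_investment_fund([2_000_000, 3_000_000, 5_000_000], tax_rate=0.05)
populations = {"Region A": 1_000_000, "Region B": 1_000_000, "Region C": 2_000_000}
allocation = allocate_per_capita(fund, populations)
```

Regions A and B have equal populations and so receive equal amounts, while Region C, with twice the population, receives twice as much. The same function could be applied a second time to divide a region's allocation among its communities.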

Once distributed to regions and communities on a per capita basis, investment funds are channeled to regional and community enterprises by public banks. Enterprises in need of investment (say, to expand production) apply to area banks for funds. Banks assess applications on the basis of a) profitability, b) job creation, and c) any other democratically chosen criteria, such as ecological impact. This mixed standard implies that while profitability matters, it is not all that matters. Projects that further socially chosen goals may be chosen over more profitable, but less socially desirable alternatives.
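One way to picture this mixed standard is as a weighted score. The source does not say how banks would combine the criteria, so the weighted sum below, the weights themselves, the 0-to-1 scoring scale, and the two sample applications are all assumptions made purely for illustration.

```python
def score_application(application, weights):
    """Weighted sum of criterion scores; the weights stand in for democratically chosen priorities."""
    return sum(weights[criterion] * application[criterion] for criterion in weights)


def rank_applications(applications, weights):
    """Return application names ordered from highest to lowest score."""
    return sorted(applications,
                  key=lambda name: score_application(applications[name], weights),
                  reverse=True)


# Hypothetical weights and applications, each criterion scored from 0 to 1.
weights = {"profitability": 0.5, "job_creation": 0.3, "ecological_impact": 0.2}
applications = {
    "solar_coop": {"profitability": 0.6, "job_creation": 0.7, "ecological_impact": 0.9},
    "luxury_goods": {"profitability": 0.9, "job_creation": 0.3, "ecological_impact": 0.2},
}
ranking = rank_applications(applications, weights)
```

In this toy case the less profitable project ranks first because it scores better on job creation and ecological impact, which is the kind of outcome the mixed standard is meant to permit.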

Summing up, Schweickart’s model strategically transplants certain core elements of capitalism into a broadly socialist framework. We get markets and profit-seeking enterprises, but also workplace democracy and social control over investment. The result, Schweickart argues, is an economy that outperforms all rivals—whether socialist or capitalist—in terms of values dear to socialist hearts, such as equality, economic stability, human development, democracy, and environmentalism. ED thus promises to deliver “socialism that would really work,” to quote the title of one of Schweickart’s early articles on the topic.

ii. Evaluating Economic Democracy

Perhaps it would really work, but would it be socialism? This, in essence, is the main criticism of Schweickart’s model (and of market socialism more generally).

That his proposal would work—that it is feasible—seems relatively uncontroversial. Markets work. Self-managed enterprises work, as illustrated by decades of empirical evidence. Social control over investment is the only truly novel piece of Schweickart’s model, but its basic mechanisms—the capital assets tax, the national investment fund, the system of public banks allocating funds on the basis of profitability as well as other socially chosen considerations—raise no obvious feasibility worries. Granted, neoclassical economists will complain that because ED regulates and interferes with markets in various ways, it sacrifices efficiency. But “less than perfectly efficient” does not mean “infeasible”. And efficiency isn’t the only thing we want from an economy anyway. Better to sacrifice some efficiency, Schweickart would argue, for gains in employment, more equitable development across regions, greater democratic empowerment at work, and so on. So all things considered, market socialism seems eminently feasible. This is perhaps its greatest selling point.

But is it desirable? Critics right and left will argue that it is not. Those on the right will complain that ED limits basic economic freedoms, such as the formal freedom to own the means of production, to hire wage labor, and to run a business in an undemocratic fashion. Market socialists will reply that not all formal freedoms are worth protecting. They will further suggest that ED will enhance effective economic freedom for the vast majority, even if this means diminishing economic freedom, both formal and effective, for those elites who would, absent ED, enjoy greater workplace control and authority. Under capitalism, most workers control no productive property and enjoy no real say over their work. Economic power is monopolized by a tiny class of owners. Under ED, by contrast, economic hierarchies are flattened. Economic power within the enterprise is distributed equally to all workers on a one worker, one vote basis. Consequently, everyone has the effective freedom to shape workplace decisions. Seen from this angle, ED enhances rather than reduces economic freedom.

Market socialism attracts critical fire from the left as well as the right. It is a strange form of socialism indeed, leftist critics will argue, that features anarchic market production for profit rather than planned production for use. With markets and profits come competition, greed, fear, and the diminution of community; with markets and profits come consumerism, ever-expanding hours of work, and the ecologically insane desire for never-ending economic growth.

Schweickart replies that these worries are overblown. Yes, ED features competition; yes, there will be advertising and some degree of consumerism; yes, enterprises may, under certain circumstances, seek to grow. But the details make a difference. Competition, consumerism, and economic growth are all held in check in ED by countervailing forces. Social control over investment means that we can democratically determine the overall rate and direction of economic growth. We can prioritize environmental aims, for instance, over the rapacious quest, so characteristic of capitalism, for additional output at whatever cost. Workplace democracy means that we can choose shorter working hours in exchange for reduced consumption opportunities. Moreover, because democratic firms seek to maximize profit per worker (rather than total profit, as do capitalist firms), they will not expand as aggressively as their capitalist counterparts. And reduced expansion means less output that needs to be sold, which, in turn, reduces demand for advertising and marketing. In short, for all of these reasons ED is absolutely compatible with the socialist vision of a less-consumerist, more leisurely, ecologically sane world, or so defenders of market socialism would argue.
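The point about firm size can be illustrated with a toy calculation. The sketch below is not drawn from Schweickart’s own presentation: the revenue function, the fixed cost, and the wage are hypothetical figures, chosen only to show how maximizing total profit and maximizing profit per worker can recommend different firm sizes when returns to labor diminish.

```python
import math

# Hypothetical figures, chosen only for illustration.
FIXED_COST = 200.0   # non-labor costs the firm must cover
WAGE = 10.0          # wage a capitalist firm pays each worker

def revenue(workers: int) -> float:
    """Toy revenue function with diminishing returns to labor."""
    return 120.0 * math.sqrt(workers)

def total_profit(workers: int) -> float:
    """Capitalist objective: revenue minus the wage bill and fixed cost."""
    return revenue(workers) - WAGE * workers - FIXED_COST

def profit_per_worker(workers: int) -> float:
    """Democratic objective: net income shared equally among members."""
    return (revenue(workers) - FIXED_COST) / workers

sizes = range(1, 101)
capitalist_size = max(sizes, key=total_profit)
democratic_size = max(sizes, key=profit_per_worker)

print(capitalist_size, democratic_size)
```

With these assumed numbers the capitalist firm settles at 36 workers and the democratic firm at 11. Different figures would give different sizes; the example shows only that the two objectives need not coincide.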

Indeed, market socialists would draw a more general lesson here. From the fact that markets in a capitalist context lead to undesirable effects X, Y, or Z, we cannot automatically infer that they would lead to X, Y, or Z in the dramatically different political-economic framework of market socialism. Maybe they would, but maybe they wouldn’t. The only way to tell, insist market socialists, is to work carefully through the details.

9. References and Further Reading

  • Albert, Michael. Parecon: Life After Capitalism. London: Verso, 2003.
    • Presents Albert and Hahnel’s participatory planning model of socialism.
  • Albert, Michael, and Robin Hahnel. Looking Forward: Participatory Economics for the Twenty First Century. South End Press, 1991.
    • An early statement of Albert and Hahnel’s participatory planning model of socialism.
  • Arneson, Richard. “Equality and Equal Opportunity for Welfare.” Philosophical Studies 56 (1), 77-93, 1989.
    • A canonical statement of the “luck egalitarian” position.
  • Bellamy, Edward. Looking Backward. Dover, 1996 [1888].
    • A utopian novel, widely acclaimed in its day, depicting political, economic and social arrangements in socialist Boston, some 100 years after a successful revolution.
  • Braverman, Harry. Labor and Monopoly Capital: The Degradation of Work in the Twentieth Century. 25th Anniversary Edition. New York: Monthly Review Press, 1998 [1974].
    • Important Marxist analysis of work, according to which the imperatives of profit-maximization force capitalists to simplify and routinize labor processes, thereby degrading work.
  • Brennan, Jason. Why Not Capitalism? New York: Routledge, 2014.
    • A sharp parody of and rejoinder to G.A. Cohen’s Why Not Socialism? that defends capitalism on moral (rather than pragmatic) grounds.
  • Carens, Joseph. Equality, Incentives, and the Market: An Essay in Utopian Politico-Economic Theory. Chicago: University of Chicago Press, 1981.
    • Describes a market socialist economic system that—unlike capitalist and non-market socialist alternatives—fully realizes the values of equality, freedom, and economic efficiency.
  • Cohen, G.A. “The Structure of Proletarian Unfreedom.” Philosophy and Public Affairs, Vol. 12, No. 1, 3–33, 1983.
    • Argues that workers are individually free (since they are not forced to work for capitalists) but not collectively free (since few workers can escape proletarian status at any given time).
  • Cohen, G.A. History, Labour, and Freedom: Themes From Marx. Oxford: Clarendon Press, 1988.
    • Collection of Cohen’s essays on Marxist themes.
  • Cohen, G.A. “On the Currency of Egalitarian Justice.” Ethics 99 (4), 906-944, 1989.
    • Important statement of luck egalitarianism.
  • Cohen, G.A. Karl Marx’s Theory of History: A Defence. Expanded edition. Princeton, NJ: Princeton University Press, 2000.
    • Cohen’s classic reconstruction and qualified defense of Marx’s theory of history, “historical materialism”. Widely regarded as a founding text of the so-called “Analytical Marxism” movement.
  • Cohen, G.A. Why Not Socialism? Princeton: Princeton University Press, 2009.
    • Argues that—bracketing issues of feasibility—socialism is morally desirable, but concedes that socialists do not know whether socialism is feasible.
  • Cohen, G.A. “Capitalism, Freedom, and the Proletariat.” In G.A. Cohen, On The Currency of Egalitarian Justice and Other Essays. Princeton: Princeton University Press, 2011.
    • Analyzes freedom under capitalism, arguing that private property restricts formal freedom in underappreciated ways.
  • Devine, Pat. Democracy and Economic Planning. Cambridge: Polity Press, 1988.
    • Rich, detailed, economically sophisticated statement of a democratic alternative to central planning, with especially interesting ideas about the division of labor.
  • Elster, Jon. An Introduction to Karl Marx. Cambridge: Cambridge University Press, 1986.
    • An often-critical reconstruction of central Marxist themes by one of the central figures in the Analytical Marxism movement.
  • Elster, Jon. “Self-Realization in Work and Politics: The Marxist Conception of the Good Life.” Social Philosophy and Policy, Vol. 3, No. 2, 1986.
    • Analytically crisp discussion of self-realization and the prospects for achieving it under capitalism and socialism.
  • Engels, Frederick. Socialism: Utopian and Scientific. Pathfinder Press, 2008 [1880].
    • Important overview of historical materialism and the socialist critique of capitalism by Marx’s intellectual partner; arguably more accessible to beginners than anything by Marx himself.
  • Friedman, Milton. Capitalism and Freedom. 40th Anniversary Edition. Chicago: University of Chicago Press, 2002 [1962].
    • Friedman’s classic defense of libertarian capitalism on moral grounds.
  • Gilabert, Pablo. “The Socialist Principle ‘From Each According to Their Abilities, To Each According to Their Needs’.” Journal of Social Philosophy, Vol. 46, No. 2, 197-225, 2015.
    • Interesting recent paper that brings the needs/abilities principle into dialogue with other positions in distributive justice.
  • Harrington, Michael. Socialism: Past and Future. New York: Little, Brown & Co, 1989.
    • Historically learned, empirically informed overview of socialism’s development and future trajectory by an important figure in American socialist politics.
  • Hayek, Friedrich. The Road to Serfdom: Text and Documents—The Definitive Edition. Chicago: University of Chicago Press, 2007.
    • Hayek’s celebrated broadside against socialist planning and the creeping threat to freedom that it represents.
  • Holmstrom, Nancy. “Exploitation.” Canadian Journal of Philosophy, Vol. 7, No. 2, 353-369, 1977.
    • Early, analytically sharp defense of the view that exploitation is forced, uncompensated labor, the products of which producers do not control.
  • Lenin, Vladimir. The State and Revolution. New York: Penguin, 2009 [1918].
    • Argues, to give one example, that genuine democracy is impossible under capitalism.
  • Levine, Andrew. Arguing for Socialism. London: Verso, 1988.
    • Rigorous, subtle work that mounts a qualified case for socialism using tools of contemporary moral and political philosophy.
  • Malleson, Tom. After Occupy: Economic Democracy for the 21st Century. New York: Oxford University Press, 2015.
    • Empirically and philosophically rich development of a broadly market-socialist position with an especially interesting defense of workplace democracy.
  • Marx, Karl. Capital: A Critique of Political Economy, Vol. 1. New York: Vintage Books, 1977 [1867].
    • Marx’s masterwork lays bare capitalism’s “laws of motion”, but says little about alternatives.
  • Marx, Karl. Critique of the Gotha Program. In David McLellan (Ed.), Karl Marx: Selected Writings, second edition. Oxford: Oxford University Press, 2000 [1875].
  • Marx, Karl, and Frederick Engels. The Communist Manifesto. London: Verso, 1998 [1848].
    • An enormously influential political pamphlet outlining core elements of the Marxist theory of history, critique of capitalism, and program for a socialist future.
  • Miller, David. Market, State, and Community: Theoretical Foundations of Market Socialism. Oxford: Oxford University Press, 1989.
    • Important, philosophically sophisticated statement of market socialist ideas.
  • Ollman, Bertell, ed. Market Socialism: The Debate among Socialists. New York: Routledge, 1998.
    • Brings together leftist critiques and defenses of market socialism.
  • Peffer, Rodney. Marxism, Morality, and Social Justice. Princeton: Princeton University Press, 1991.
    • Accessible reconstruction of Marxist themes, using techniques of analytic philosophy, that brings Marxism into dialogue with liberal egalitarians like John Rawls.
  • Reiman, Jeffrey. “Exploitation, force, and the moral assessment of capitalism: Thoughts on Roemer and Cohen.” Philosophy and Public Affairs, Vol. 16, No. 1, 3-41, 1987.
    • Argues that exploitation is forced, unpaid labor, and further contends—contrary to Cohen—that individual workers are indeed forced to work for capitalists.
  • Roemer, John. “Should Marxists Be Interested in Exploitation?” Philosophy and Public Affairs Vol. 14, No. 1, 30-65, 1985.
    • His answer is no: Marxists should focus on distributive justice rather than exploitation.
  • Roemer, John. A Future for Socialism. Cambridge, MA: Harvard University Press, 1994.
    • Important statement of market socialism by a leading figure in the Analytical Marxist movement.
  • Schweickart, David. “Economic Democracy: A Worthy Socialism That Would Really Work.” Science & Society, Vol. 56, No. 1 (Spring), 9-38, 1992.
    • A capsule presentation of Schweickart’s market socialist model, “economic democracy”.
  • Schweickart, David. “Nonsense on Stilts: Michael Albert’s Parecon.” Schweickart’s website. Posted January, 2006.
    • Argues that Albert and Hahnel’s “participatory economics” can’t work, and wouldn’t be desirable even if it did.
  • Schweickart, David. After Capitalism. Second edition. Lanham, MD: Rowman & Littlefield, 2011.
    • Argues for a heterodox form of socialism that blends profits and markets with workplace democracy and social control over investment.
  • Smith, Adam. The Wealth of Nations: Books 1-3. New York: Penguin, 1982 [1776].
    • Smith’s classic discussion of early capitalism.
  • Van Parijs, Philippe. Real Freedom For All. Oxford: Oxford University Press, 1997.
    • Defends a “basic income” on “real libertarian” grounds.
  • Wright, Erik Olin. Envisioning Real Utopias. London: Verso, 2010.
    • Drawing on a vast fund of research from social science and philosophy, reimagines socialism for the 21st century.

 

Author Information

Samuel Arnold
Email: s.arnold@tcu.edu
Texas Christian University
U. S. A.

Egalitarianism

Are all persons of equal moral worth? Is variation in income and wealth just? Does it matter that the allocation of income and wealth is shaped by undeserved luck? No one deserves the family into which they are born, their innate abilities, or their starting place in society, yet these have a dramatic impact on life outcomes.

Keeping in mind the extreme inequality in many countries, is there some obligation to pursue greater equality of income and wealth? Is inequality inherently unjust? Is equality a baseline from which we judge other distributions of goods? Do inequalities have to be justified by people somehow deserving what they have, or by inequality somehow improving society?

As a view within political philosophy, egalitarianism has to do both with how people are treated and with distributive justice. Civil rights movements reject certain types of social and political discrimination and demand that people be treated equally. Distributive justice is another form of egalitarianism that addresses life outcomes and the allocation of valuable things such as income, wealth, and other goods.

The proper metric of equality is a contentious issue. Is egalitarianism about subjective feelings of well-being, about wealth and income, about a broader conception of resources, or some other alternative? This leads us to the question of whether an equal distribution of the preferred metric deals with the starting gate of each person’s life (giving everyone a fair and equal opportunity to compete and succeed) or with equality of life outcomes. Egalitarianism also raises a question of scope. If there is an obligation to pursue distributive equality, does it apply only within particular states or globally?

See also Moral Egalitarianism.

Table of Contents

  1. What is Egalitarianism?
  2. Equality of What?
    1. Welfare
    2. Resources
    3. Capabilities
    4. Democratic/Social Equality
    5. Primary Goods
    6. Luck Egalitarianism
  3. Equality of Opportunity
  4. Anti-Egalitarianism
    1. Sufficiency vs. Equality
  5. Domestic or Global?
  6. References and Further Reading

1. What is Egalitarianism?

Consider three different claims about equality:

  1. All persons have equal moral and legal standing.
  2. In some contexts, it is unjust for people to be treated unequally on the basis of irrelevant traits.
  3. When persons’ opportunities or life outcomes are unequal in some important respect, we have a reason to lessen that inequality. (This reason is not necessarily decisive.)

All of these claims express a commitment to equality, and each is more egalitarian than the one before it. Understanding the difference between these claims, their normative implications, and the various ways the content of the third claim can be further specified is crucial to understanding the disparate collection of philosophical views that compose egalitarianism.

Claim (1) entails claim (2), and therefore captures part of contemporary egalitarianism. If all persons are equal, then there are political constraints on how they can be treated unequally. Disenfranchisement and differential rights violate the equality affirmed in (1). (3) is even stronger than (2), because it is committed not only to treating people equally but also to ensuring that people have equal amounts of some important good. It is controversial whether (1) entails, is merely compatible with, or is incompatible with (3).

The descriptive thesis found in claim (1) affirms the equality of all persons. This must not be the plainly false assertion that for any given trait, all persons are equal. We differ in our abilities, resources, opportunities, preferences, and temperaments. The claim must be about something more specific. All persons have equal moral worth or equal standing. The United States Declaration of Independence famously states that “all men are created equal.” Jeremy Bentham’s dictum “each to count for one, none to count for more than one,” is another expression of the descriptive thesis. While the conditions in which people live, their wealth and income, their abilities, their satisfaction, and their life prospects may radically differ, they are all morally equal. In moral and political deliberation, each person deserves equal concern. All should have equal moral and legal standing.

If all persons are equal in this way, then some forms of unequal treatment must be unjust. The descriptive thesis, applied within a particular state, at least entails equal rights and equal standing. Therefore (1) constrains how a just political society can be structured because it entails some degree of support for claim (2). The degree is debatable in terms of which contexts require equal treatment, what types of institutions must treat people equally, and so on. At least in terms of basic political rights, discrimination on the basis of gender, ethnicity, and caste is prohibited. Many would also extend these to commerce and the wider public sphere: businesses should not be able to refuse service on the basis of race, gender, or sexual orientation. The descriptive thesis must entail some commitment to equal treatment, but the scope of that commitment is disputed.

Claim (3) (let us call it the egalitarian thesis) is closely related to the descriptive thesis. (1) is taken by some as ground for affirming (3). Denying (1) is grounds for rejecting the imperative in (3). Yet the two theses are distinct. A commitment to (1) does not obviously entail a commitment to (3), because (3) is more robust and has wider scope. (1) may entail (3), but establishing this requires a substantive argument. The descriptive thesis’ extension into the social standing, well-being, wealth, income, and life outcomes of citizens is controversial. Unlike (3), (1) is not on its face opposed to radical inequalities in income, wealth, capabilities, welfare, life prospects, or social standing. If those inequalities arise within legitimate political institutions that respect the equal standing of all persons, they may be just.

The egalitarian thesis addresses more than the moral worth of persons. It expresses an obligation to pursue distributive equality. Deviations from equality are prima facie unjust. But along which dimension ought we pursue greater equality? Candidate metrics include resources, income, wealth, welfare, and capabilities to perform certain functions. The obligation to pursue equality along some such dimension makes (3) fully egalitarian in the contemporary sense of the term. (1) does not necessarily prohibit dramatic inequalities, whether they are deserved or undeserved, due to hard work or luck, recent or hereditary. Absent further argument, the content of (1) is only concerned with such inequalities conditionally, when they violate the equal moral status of persons. Of course, if social exclusion, caste discrimination, and unequal rights are prohibited in light of the fact that (1) entails some level of commitment to (2), this will influence the distribution of those metrics. This is not the same as a direct obligation to pursue distributive equality of one of those metrics.

(1) is descriptive in content but has normative implications. Egalitarianism is essentially prescriptive and normative. (3) directly states what ought to be done with regard to the inequalities among persons. It is an imperative to reduce distributive inequality along some dimension. The normative commitments that follow from (1) set minimal standards: states must not violate the equal standing of persons. The normative commitments of (3) are stronger and more aspirational: we should continually pursue equality by reducing inequality. This is a pursuit of substantive distributive justice, that is, equality of some sort of condition or opportunities. It is not mere formal equality of rights, or of economic notions such as considering everyone equal as long as their income is determined by their marginal product.

Egalitarians are thus committed to distributive justice in a way that (1) need not be. (1) may entail a certain conception of distributive justice having to do with equality of opportunity and individual rights, especially property rights. For example, John Locke argued that all persons are equal and have the same rights. The equal standing and equal rights of all persons, even in the pre-civilized state of nature, is a crucial component of his theory of just government. This is a commitment to equality, but it is not egalitarian in the contemporary sense. It does deal with distributive justice, but only in terms of respecting property rights and the right to free exchange of property. A commitment to equality is not yet a commitment to substantive distributive justice (a commitment to have a fair and equitable distribution of goods), and is compatible with merely formal or historical distributive justice (defining a just distribution as one that respects standing property rights and the right of people to trade without theft or coercion).

What is an egalitarian commitment to substantive distributive justice? In the most literal sense, it requires equalizing the distribution of some quantifiable thing among persons, such as income or wealth. An egalitarian may see distributive justice as an end in itself. This would mean it is constitutive of a just society. It can also mean that we choose a metric of equality that is intrinsically good, such as welfare or well-being. Those things are desirable in themselves, not because they are instrumental in acquiring other goods. Alternatively, egalitarianism can be seen as merely instrumental. For example, distributive justice can be seen as a means to achieving some other social end, such as creating social relationships among citizens that are equal and non-oppressive, and allowing them to flourish and function as citizens. An example of an instrumental metric of equality is resources, because resources can be used to generate welfare.

Strictly speaking, all non-equalizing views of substantive distributive justice are alternatives to egalitarianism. This would exclude Rawls’ difference principle, which allows for inequalities when they are required to raise the absolute condition of the worst off. It would also exclude views that prioritize aid to the worst off or argue in favor of redistribution to guarantee a sufficient minimum for all. The contemporary usage of the term is not restricted to equalizing views. While there are contemporary debates between egalitarianism narrowly defined and non-equalizing views such as Rawls’, the most illuminating contemporary definition of the term is that it is a commitment to substantive distributive justice as opposed to merely formal or historical distributive justice.

Egalitarianism therefore comprises divergent views about equality that go beyond the merely descriptive thesis and affirm at least one of the following theses: first, some important type of thing should be distributed equally among persons; second, distributive inequality (along some relevant dimension) is prima facie unjust and should be reduced.

Both principles further specify the normativity contained in (3), yet still give little concrete guidance. Consider a different normative principle with similar form: the current level of infant mortality is unjust and should be reduced. While this thesis does not tell us how to achieve our end, it clearly specifies the end. We know what counts as success because we know what infant mortality is and how to measure it. These two distributive principles, while clearly egalitarian, do not articulate any specific end. They give no guidance on what quantifiable thing matters to distributive justice. What form must a just distribution take? Is it about wealth? Income? Well-being? Preference satisfaction? Something else?

The remainder of this article focuses on the following topics:

  1. What is the proper egalitarian metric? Well-being? Resources? Income? Capabilities?
  2. Once we settle on a metric, are we then concerned with ex ante or outcome equality? In other words, is egalitarianism concerned with a fair allocation of holdings among persons at the starting gate of each life, so that the ensuing competition is fair, or is it concerned with equal life outcomes? Do choice and responsibility matter to this question? What if a given inequality is due to informed and avoidable choices made by the relevant persons? Can such inequalities be just? Should our shares be determined by our choices and actions? If so, then what is genuine equality: a pattern of distribution in which each person is maximally responsible for their holdings, with the role of luck minimized?
  3. Anti-egalitarianism. Many deny the fundamental equality of persons. Some think men are superior to women, certain races are superior to others, and certain castes should dominate others. If so, there is no general moral imperative to lessen inequality among persons. Anti-egalitarianism of this sort rejects both (1) and (2). This article will not address such views. The more philosophically compelling anti-egalitarianism stems not from a rejection of (1) but rather from one of the following readings of it:
    1. (3) does not follow from (1).
    2. Pursuit of (3) is counterproductive or has bad consequences. This includes political objections about incentives and productivity, an objection that if equality is desirable then it is desirable to lower the condition of those who have more even when this does not objectively aid those who have less, and objections that egalitarianism is motivated by envy.
    3. Engaging in redistribution to pursue the aim of (3) is incompatible with (1). For example, pursuing (3) violates rights that follow from (1).
  4. The relationship between egalitarianism and global justice. Does egalitarianism apply to the global community of humanity, or only within particular states? If it does not apply globally, is this a justified deference to the moral value of specific political attachments, a temporary compromise on the way to a more defensible form of egalitarianism, or is it simply unjustifiable favoritism?

2. Equality of What?

Egalitarianism requires a commitment to equalizing our holdings or at least reducing distributive inequality. Neither of these aims can solely be about equal standing or equal moral worth, if equal moral worth can be respected in a society that exhibits inequality along one of the specified dimensions. Respect for (1) puts some constraints on either inequality or the acceptable material minimum (say, by respect for equal rights entailing the minimum holdings to make those rights effective). That has to do with distributive justice, but in an attenuated sense that falls short of egalitarianism. Similarly, a society with radical inequality may make a rational calculus that some minimal redistribution is required for social stability, but this is prudential and conditional, not genuinely egalitarian.

What, other than equal standing or moral worth, is egalitarianism about? We examine five of the most influential candidates: welfare, resources, capabilities, democratic/social equality, and primary goods.

a. Welfare

Welfare is well-being or one’s quality of life. There are two main variants of welfare. The first is hedonic: welfare is pleasure or happiness. Your welfare increases as you experience more pleasures and fewer pains. The second is desire or preference satisfaction. Your welfare increases the more your desires, goals, and preferences are satisfied.

According to hedonic welfare egalitarianism, the feeling of pleasure or happiness is what fundamentally matters in life. Welfare is the purpose of our actions. This view is common in ethics generally and is not restricted to political egalitarianism. Jeremy Bentham argued that humans seek pleasure and avoid pain, and that this is both a descriptive truth about human psychology and a normative truth about what we morally ought to do. Welfare is an intrinsic good. Other goods are useful in an instrumental sense. They can be used to obtain welfare.

If the use of material resources generates welfare, then equalizing welfare will attain substantive outcome equality even among people who exhibit different levels of efficiency in welfare generation. An able-bodied person may require fewer resources than a disabled person to achieve a given level of well-being. Suppose a disabled person needs a wheelchair. If she holds the same amount of resources as an able-bodied person, then the able-bodied person is better off than she is. The disabled person must exchange resources for a wheelchair. So either she is not mobile or she is mobile but has fewer remaining resources than the able-bodied person, and in either case she is worse off. Welfare equality accounts for variation in talents, abilities, and opportunities. Equality of welfare attempts to neutralize the impact of these variations on the distribution of welfare.

From a welfare egalitarian perspective, a just distribution of material resources is merely instrumental to achieving what really matters. We cannot redistribute welfare directly; we can only redistribute the resources that persons can use to generate welfare. Since equality of welfare accounts for variations in how efficiently a person can convert resources into welfare, it is markedly different from equality of resources. An egalitarian welfare distribution will not distribute resources equally.
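A small worked example may make the contrast with equality of resources concrete. It is only a sketch under a strong simplifying assumption that the article itself does not make: each person’s welfare is taken to be a fixed conversion efficiency multiplied by the resources that person holds. The function name and all the numbers are hypothetical.

```python
def equal_welfare_shares(total_resources, efficiencies):
    """Split resources so that everyone ends up with the same welfare.

    Assumes welfare = efficiency * resources, a deliberate simplification.
    """
    # Equal welfare W requires r_i = W / e_i, and the shares must sum to the total.
    welfare_level = total_resources / sum(1.0 / e for e in efficiencies)
    return [welfare_level / e for e in efficiencies]

# Two people: the second converts resources into welfare half as efficiently.
efficiencies = [1.0, 0.5]
shares = equal_welfare_shares(90.0, efficiencies)
welfare = [e * r for e, r in zip(efficiencies, shares)]

print(shares)   # resource shares under equal welfare
print(welfare)  # resulting welfare levels
```

Under these assumptions the less efficient converter receives twice the resources of the other person (60 versus 30 out of 90), and both reach the same welfare level. An equal split of resources would instead leave the second person with half the welfare of the first.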

A problem facing this approach is that preferences adapt to one’s living conditions. Therefore, if preferences help determine one’s level of welfare, unjust inequalities in living conditions might not be rectified by welfare egalitarianism. Nussbaum gives examples of women deprived of resources and opportunities adapting their preferences. This leads to them reporting similar satisfaction levels to women who are objectively less deprived. The adaptive preferences worry is that when there are unjust inequalities, those at the bottom will adapt their preferences to this injustice. A preference can adapt such that you no longer desire that which you are denied. Someone for whom college is an impossible goal may adapt their preferences so that they do not desire to attend college. “Sour grapes” is an even stronger negative preference or aversion to the thing denied. Empirical studies support the thesis that preferences adapt to environmental factors and expectations. Thus someone with fewer opportunities than another may eventually report equivalent welfare levels to those with more opportunities, merely because their preferences, expectations, and standards have lowered. Welfare egalitarianism might therefore convert inequality to equality via subjugated persons internalizing and accepting their inferior status, thereby increasing their satisfaction and reported welfare. (For more on preferences see Harsanyi 1982; and Nussbaum 1999 Ch.5, 2001a.)

However, adaptive preferences are also a benefit for welfare egalitarianism. If persons did not adapt their preferences and ends in response to what they can reasonably expect to attain, aggregate life outcomes would be worse. If goals and preferences were completely non-adaptive, our collective welfare levels would suffer. Adapting one’s ends and preferences is part of forming a rational plan of life. Consider someone who pursues a goal of being a professional athlete at the expense of other professional and personal options. If that person lacks the relevant physical ability, this goal is harmful to their welfare.

Another question facing welfare egalitarianism is whether we should adopt an objective or subjective conception of welfare. Thus far, the description of welfare has been subjective. But what if someone derives high levels of welfare from objects or activities that have low or negative social worth? What if the person experiences a higher level of welfare in pursuit of an idiosyncratic end rather than in securing the objective necessities for survival? What if there are higher and lower forms of welfare?

Scanlon (1975) gives an example of someone who prefers to have resources to build a temple rather than to provide for his own health and physical well-being.  If he would experience greater subjective welfare under the former scenario, is that the ideal outcome? Or should we take an objective view, specify welfare in terms of the most objectively urgent needs, and guarantee that those are met? Suppose a person will have a below average level of subjective welfare if they have their basic necessities but not the temple, and a very high level of subjective welfare if they have the temple but not the basic necessities. What would welfare egalitarianism have us do? This is a dispute over whether any objective welfare standards are sovereign over individual preferences.

Two other problems for welfare egalitarianism deal with psychological variations among persons. Consider variation in disposition. The cheerful and the gloomy will differ in welfare levels even when their shares of resources are held constant. Do the gloomy deserve compensation? If resources are the raw material for generating welfare, this would lead to subsidizing the gloomy merely for being gloomy. The opposing view is that the gloomy should adapt rather than be subsidized, and if they do not adapt this is a personal matter, not an unjust inequality.

Expensive and inexpensive tastes are further problems for equality of welfare. Someone might have tastes and preferences that require a large quantity of resources, or particularly scarce resources, to satisfy. Those with expensive tastes require more resources to achieve a given level of welfare than those with less expensive tastes. While both disabilities and expensive tastes are inefficiencies in the conversion of resources to welfare, it seems a mistake to lump them together. Tastes can change over time. They are subject to their bearer’s agency in ways disabilities are not. People can cultivate, modify, and abandon their tastes and preferences. Also, being deprived of the goods made possible by, say, being ambulatory is not clearly equivalent to the deprivation suffered by someone with an unsatisfied preference for exotic food and wine. There seems to be a difference between using society’s resources to subsidize those with disabilities and using them to subsidize those with expensive tastes. Proponents of resource egalitarianism find welfare egalitarianism inadequately sensitive to this difference.

Some of the objections to welfare egalitarianism just outlined can be answered by moving to equality of opportunity for welfare. Equality of opportunity for welfare accommodates the luck egalitarian principle that it is bad for someone to be worse off than others through no fault of their own. Equality of opportunity for welfare does not commit itself to subsidizing the imprudent or those who cultivate expensive tastes. For an example of equality of opportunity for welfare, see Arneson (1989, 1990). For an equality of opportunity view with a wider metric that includes aspects of both welfare and resources, see Cohen (1989). Equality of opportunity is addressed in greater detail in Section 3.

b. Resources

Resources are things one can possess or use. Think of the various things you can use to generate welfare: wealth, income, land, food, consumer goods. Wider conceptions of resources include one’s own talents and abilities. Resources can also be social: social capital, respect, and opportunities.

Welfare is an intrinsic good; resources are instrumental goods. Resources are good because they can be used to generate welfare, or to guarantee that people are fully capable of functioning and thriving, or able to pursue some specific conception of the good life. Why focus on an instrumental good rather than the intrinsic good? Recall that different persons may require different amounts of resources to achieve equivalent levels of welfare. For example, we can understand disability as inefficiency in welfare generation. Equality of welfare counteracts disabilities, variations in talent and ability, and so on. From the welfare egalitarian perspective, focusing on resources misses the point.

On the other hand, equality of resources gives an attractive answer to other forms of resource-to-welfare inefficiencies that are not obviously matters of justice. What if my tastes are simply more expensive than yours? If you can achieve a specific welfare level with low-grade hamburger, but I need wagyu beef to reach the same level, then equality of welfare, at least in principle, requires subsidizing my share of resources above yours. You get fewer resources than I do only because your tastes are less expensive. Is this just? Many find it to be implausible in principle and inapplicable in practice. Consider the problems of implementing a scheme of distributive justice that would subsidize expensive tastes. This would generate resentment and reduce the commitment to distributive justice in society. There is also a problem of knowledge and trust—how do I know you have expensive tastes? Everyone has an incentive to report having expensive tastes when they are subsidized.
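The arithmetic behind the expensive-tastes worry can be made concrete with a small sketch. It assumes, purely for illustration, that welfare is a linear function of resources and that each person has a fixed rate at which resources convert into welfare. The function name, the labels, and the numbers are hypothetical and are not drawn from the literature.

```python
def equal_welfare_shares(total_resources, conversion_rates):
    """Split resources so that everyone reaches the same welfare level.

    Assumption (hypothetical): welfare = rate * resources for each person.
    """
    # To reach a common welfare level w, a person needs w / rate resources.
    # Summing over persons gives total = w * sum(1 / rate); solve for w.
    inverse_sum = sum(1.0 / rate for rate in conversion_rates.values())
    welfare_level = total_resources / inverse_sum
    return {name: welfare_level / rate for name, rate in conversion_rates.items()}


# Illustrative rates: hamburger converts to welfare four times as
# efficiently as wagyu beef does for the person who needs it.
rates = {"hamburger_eater": 4.0, "wagyu_eater": 1.0}
shares = equal_welfare_shares(100.0, rates)
print(shares)
```

Under these assumed rates, equalizing welfare gives the person with expensive tastes four times the resources of the other, which is the result many find implausible. Equality of resources would instead give each person 50 units and leave the resulting welfare gap as a private concern.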

The bad sort of adaptive preferences amplifies this problem. Suppose your tastes are less expensive than mine because you were raised in a less privileged environment with fewer resources and opportunities. Equalizing welfare would then give me the larger share, which institutionalizes prior inequalities and further subsidizes those who were already better off. If that seems unjust, then it is attractive to shift focus from the intrinsic good to the instrumental good. If we equalize resources, we can give everyone a fair opportunity to generate welfare and leave variations in tastes as a private concern.

One welfare-egalitarian response to these problems is to distinguish between tastes that are under the control of the person and those for which the person is not responsible. If my taste is out of my control, then its impact on my welfare levels is a matter of justice. If I intentionally cultivated the taste, or refuse to expend effort attempting to revise it, then it is a private concern. But this distinction raises perplexing empirical questions. How could we ascertain whether or not a taste is under one’s control? This is a counterfactual claim about what would happen if the person tried to change it, or a historical question about what happened when in fact they tried to change it.

Equality of resources provides a compelling answer to these problems. If we all have an equivalent bundle of resources, and have control over how we expend them, then whatever tastes an individual has are a private concern. They are not a matter of justice. But the advantage gained in terms of expensive tastes generates a cost: we may no longer have a sufficiently egalitarian response to unjust inefficiencies such as disabilities. Even if expensive tastes and gloominess should not be concerns of distributive justice, inefficiencies involving disability should be. If you and I have the same bundle of resources, but you need a wheelchair to be mobile and I do not, then you are disadvantaged. Our positions are not equal. If equal shares of resources define distributive justice, the disabled are at a disadvantage.

Dworkin (1981b, 2002) takes this as one reason to treat some features of the self as resources. This allows resource egalitarianism to differentiate expensive tastes and disabilities. Dworkin sees both as inefficiencies in welfare generation, but only disability is also a resource deprivation. Someone who can walk has more bodily resources than someone who cannot. This wide conception of resource egalitarianism sees disability as a resource deprivation and therefore a matter of distributive justice. Equal shares of resources now account for disabilities. In the example from the previous paragraph, you will receive the same bundle as me plus a wheelchair. Our total bundles are comparable, because mine includes an ambulatory body while yours includes a non-ambulatory body plus a wheelchair. This approach also applies to innate talents. Someone with abilities or talents that are in high demand already has more resources than someone without such innate talents.

Dworkin’s strategy immediately raises the question of how to determine the value of specific traits and abilities. If we want to implement such a scheme of redistributive justice, how would we specify the value of all these resources? It is a trivial matter to specify equality of wealth or income, but not to quantify the resource variation among persons with various abilities, disabilities, and talents. Dworkin attempts to solve such problems by abstracting away from particular cases and looking at decisions that rational people would make in a hypothetical insurance market. Rational agents, unaware of their own actual talents, abilities, and disabilities, purchase coverage against having disabilities or a lack of valued skills. For example, one considers what sort of policy would be attractive to insure against blindness, lack of in-demand talents, and so on. Then the actual redistributive scheme in society should redistribute resources to actual persons in accord with the insurance coverage that it would have been rational to purchase. Think of it along the lines of medical insurance or unemployment insurance. The hypothetical insurance market provides a rough guide for determining the value of specific resources, giving a baseline of compensation for those who lack such resources.

Resource egalitarianism aims to secure for everyone an equal set of resources and an equal opportunity to convert those resources into welfare. How well people do this, and resulting inequalities stemming from their choices, are not core concerns of this conception of distributive justice.

c. Capabilities

Capabilities are potential functionings, such as walking to work, reading a book, travelling, or being safe and secure in one’s home. If you have the capability to do a specific thing then you have both the abilities and resources required to do it, whether or not you actually choose to do it. A person has the capability to participate in a town hall discussion when they have the physical ability to move into that space (their body, or lack of assistive devices, or the infrastructure does not prevent this motion), the safety to do so without being assaulted, the ability to become informed about the issues (literacy, access to information), and so on. Whatever material and social conditions are required for a specific functioning are possessed by whoever has the relevant capability.

Capabilities-based approaches to distributive justice are sufficientarian rather than equalizing. What is unjust is not the number of capabilities possessed by those on the top compared to others, but the objective inadequacy of the capabilities of those on the bottom. While not equalizing, this is egalitarian. It is concerned with substantive distributive justice. These theories are meant to provide a minimal component of justice that can be combined with further normative principles. When coupled with egalitarian principles, the view is no longer sufficientarian. In terms of its minimal core, though, just as with resource egalitarianism, its commitment to distributive justice is instrumental: a more egalitarian distribution of resources can bring more persons up to the threshold capability level.
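The difference between a sufficientarian and an equalizing standard can be illustrated with a short sketch. It assumes, as a deliberate simplification, that each person's capability set can be summarized by a single number and that the threshold is given. Both assumptions are hypothetical, as are the function names and figures.

```python
def is_sufficient(capability_levels, threshold):
    """Sufficientarian test: nobody falls below the threshold."""
    return all(level >= threshold for level in capability_levels)


def is_equal(capability_levels):
    """Equalizing test: everybody has the same level."""
    return len(set(capability_levels)) <= 1


THRESHOLD = 5  # hypothetical minimum capability level

unequal_but_sufficient = [5, 8, 12]
equal_but_insufficient = [3, 3, 3]

print(is_sufficient(unequal_but_sufficient, THRESHOLD), is_equal(unequal_but_sufficient))
print(is_sufficient(equal_but_insufficient, THRESHOLD), is_equal(equal_but_insufficient))
```

The first distribution passes the sufficiency test despite its inequality, while the second is perfectly equal yet fails it. On the sufficientarian reading, only the second involves an injustice.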

The capabilities-based approach’s distinction between capability and function accounts for responsibility and autonomy. What the theory attempts to secure is a sufficient level of capabilities for all. Whether an individual functions is up to his or her own choice. The capabilities approach is therefore not subject to the adaptive preferences objection. No matter how much one adapts their tastes, preferences, and expectations downward, it is unjust whenever they lack the essential capabilities. They may, through free choice or conditioning, choose not to function in certain ways, but they must have the relevant capabilities. In this case, the agent is not making a judgment that something is not worth doing when they currently cannot do it; they are making a judgment that they do not want to do something that they are capable of doing. They have the abilities and resources required to do so. Thus, adaptive preferences can still lead to inequalities in functioning, but this does not impact distributive justice. A sufficient level of capabilities for all requires a certain pattern of the distribution of resources. That pattern is not impacted by the choice of some persons not to function in certain ways.

This approach raises an obvious and crucial question: which capabilities matter to distributive justice? Not every capability should matter; the capability to pollute the environment, for example, should not. It also seems that capabilities must be specified in a coarse rather than fine-grained way. The theory would be intractable if every discrete form of functioning were correlated with a discrete capability. For the theory to be illuminating and useful the list must be manageable.

Some capabilities theorists, such as Sen, avoid enumerating an official list. Nussbaum argues that the following list enables one to live a full life with dignity. She does not treat it as timeless or the final word:

  1. Life—capable of living a normal lifespan.
  2. Bodily Health—health, nutrition, shelter.
  3. Bodily Integrity—movement, security against violence, choice in reproduction, sexual satisfaction.
  4. Senses, imagination, thought—the exercise of these capacities in a fully human sense, facilitated by education and protected by rights (of expression, religion, and so forth).
  5. Emotions—emotional development allowing one to form attachments.
  6. Practical reason—development, critical reflection upon, and pursuit of a conception of a good human life.
  7. Affiliation—social interaction, the social bases of self-respect.
  8. Other species—living with and showing concern for the natural world.
  9. Play—recreation.
  10. Control over one’s environment—political activity, political guarantees of security and noninterference, property holdings, full participation in the economic and civic spheres.

Nussbaum’s capabilities list gives a general picture of human flourishing. It reaches every domain of human life. (For more on the capabilities approach, see “Sen’s Capability Approach.”)

d. Democratic/Social Equality

Democratic or social equality is a narrower-scope form of the capabilities approach. Elizabeth Anderson (1999, 2010) developed the most prominent version. Her theory stems from a critique of the individualistic nature of both resource and welfare egalitarianism. Those theories of distributive justice address equality among the holdings of different individuals. Anderson objects to the focus on individual holdings of resources or welfare levels. The point of egalitarianism is social, dealing with relations among persons, not atomistic, dealing with individual allocations of some metric. Anderson rejects the individual compensation model entirely. We cannot do away with unjust inequality by allocating more resources or welfare to those at the bottom. Anderson focuses on the capabilities of citizens and the social relationships between them. Unjust inequalities are caused by oppression, which is social.

Let us again consider disability. Anderson argues that disability is as much a social as a biological fact. The impact on one’s life of having a particular disability varies according to the way social space and infrastructure are constituted and according to the social practices of fellow citizens. For example, someone in a wheelchair has less of a handicap when social spaces are physically accessible to them. Equality and inequality are essentially social: the impact of many disabilities depends on social attitudes and political policies. What accommodations do the majority enact through democratic policy? Do the non-disabled treat the disabled as equal and fully capable? The proper response to disability cannot be individual compensation. The resources and redistribution that should be used to counteract such handicaps must deal with social practices and infrastructure. Individualistic models could account for why a disabled person requires extra medical resources, but they do not reach the level of infrastructure and social practice. Wheelchair accessible social spaces are not part of any individual’s holdings of resources. They are not her property. Yet they are fundamental to understanding disability and inequality.

Unjust inequalities are not mere individual deprivations of welfare or resources compared to others, but socially imposed oppression and exploitation. The paradigm of unjust distribution is not one in which some have much more than others, but in which some oppress and exploit others. Inequality is constituted by certain sorts of social relations. The ideal distribution is not one in which everyone is equalized in terms of resources or welfare, but in which everyone can fully function as a citizen. This is a narrow-scope capabilities approach in two ways: first, the capabilities list is not all-encompassing; second, this is all within a particular political state. Indeed, Anderson’s conception is specifically democratic equality.

This approach is committed to substantive distributive justice as instrumental in guaranteeing that all citizens have a sufficient set of capabilities. Whether a citizen possesses a given capability is jointly determined by the individual, their resources, their environment (natural and built), and the social practices and attitudes of their fellow citizens. Hence the focus is more on institutional changes to make the infrastructure navigable with disabilities, and changes to social norms and behavior, rather than seeing disabilities as an inefficiency for which the individual has a claim to a greater resource share.

The list of capabilities is narrowed to those required to function as a citizen, but nonetheless must be rather coarse and general. The capabilities list must include what is needed to fully function as a citizen and to avoid oppressive social relationships. However, fully functioning as a citizen includes more than political life. It also includes the ability to function in the civil and economic spheres. The point of egalitarianism is not to impose a pattern of distribution but to eradicate oppression, which is socially imposed.

Not only is this theory narrower than the theories of Sen and Nussbaum, it is more constrained than any other option we have considered. Welfare, resources, preferences, primary goods, Nussbaum’s capabilities: each of these reaches into every domain of human life. This conception of equality only touches our lives as citizens. Now, to be sure, since capabilities must be specified in a rather coarse-grained way, the capabilities relevant to citizenship can be put to use in other domains of life. Nonetheless, the scope is relatively narrow.

One objection facing this approach is that it may be possible to guarantee that everyone can fully function as citizens and avoid oppression while at the same time having radical inequality of resources or welfare. If so, perhaps this view is unacceptably narrow because guaranteeing the threshold capability level is compatible with unjust inequalities in life outcomes. Another worry is that this view might be less able to address global justice than other alternatives. That is a disadvantage if one thinks that a unified theory should cover both domestic and global justice.

e. Primary Goods

We now turn to an influential variation on resource egalitarianism. It is not strictly equalizing, and it employs a wide and diverse conception of resources. John Rawls argued that primary goods are what citizens have reason to care about, regardless of whatever else they care about. Primary goods include health, physical and mental abilities, income, wealth, rights, liberties, opportunities, and the social bases of self-respect. No matter what particular conception of the good a citizen may have, and whatever their life plans, goals, and deepest commitments are, they have reason to want more rather than fewer primary goods. Primary goods are what must be expended or employed in pursuit of your conception of the good. (This could mean recreation, education, artistic output, religious missionary work, and so on.) Non-material goods such as liberties and opportunities are what make one’s freedom effective. The social bases of self-respect make for a rewarding life.

All of these primary goods are valuable to you regardless of your religion, values, and life goals. No matter what comprehensive conception of the good you affirm, it is rational to want more rather than fewer primary goods. However, given our differing conceptions of the good, we will not all agree on the best way to use the additional goods created by our social cooperation. Principles of justice are required to fairly allocate resources. For Rawls, the right is prior to the good. Just principles for allocating primary goods trump the pursuit of our various individual conceptions of the good. This is one thing meant by the title of his book Justice as Fairness.

Rawls’ theory is egalitarian but not necessarily equalizing. It focuses on substantive distributive justice but does not always aim for an equal distribution of all primary goods. Basic rights and liberties must be distributed equally. Fair equality of opportunity requires that opportunities are distributed equally across persons of equal talent and motivation. However, considering all the various primary goods including wealth and income, equality is merely the baseline from which other distributions are judged. Other distributions can be preferable to equality. Inequalities can be justified instrumentally when they are necessary to raise the absolute condition of the worst off. This is accomplished when inequality is a necessary causal mechanism for increasing total productivity. Greater incentives may be required to motivate the talented to be more productive.  The worst off would prefer to live in a society in which they get a larger slice of a larger economic pie than to live in a purely equal society in which they get a smaller slice of a smaller pie. Rawls’ strategy is to answer the problem of distributive justice via a social contract. We consider an idealized choice scenario in which free and equal persons come to an agreement about the nature of the society they wish to enter. If our society matches principles that those persons would have chosen, our society is just. A society meeting this standard is as close as we can get to a voluntary agreement to be bound by a particular state.

Rawls argued that the distribution of benefits and burdens in society should not be fundamentally determined by that which is arbitrary from the moral point of view. This rejection of the morally arbitrary explains Rawls’ choice of the veil of ignorance as part of the preferred choice scenario for picking principles of justice. Rawls argued that we should choose principles of justice by imagining persons behind a veil of ignorance that prevents them from basing their choice on what is morally arbitrary. It is not possible to choose principles tailored to serve one’s own peculiar self-interest. The choice of principles is still made out of self-interest, but it is the interest of an abstract model of the person, not of a specific person who is aware of their particular, contingent situation in the actual world. The veil occludes knowledge of much that is due to chance, but also much that is due to choice, including the choosers’ various conceptions of the good. This scenario attempts to value choice by creating the conditions under which people can all pursue their own conceptions of the good. They reason about how to secure primary goods, which can be expended in pursuit of any conception of the good. The original position creates a model of the Kantian notion of the self, and the veil of ignorance forces the choosers to make decisions that are categorical. They lack the knowledge required to make hypothetical choices based in their own particular conception of the good and their peculiar desires.

Rawls argues that under these conditions, rational actors would choose a maximin strategy. Each individual’s goal is to make the worst possible outcome for themselves as good as it can be. They would not take an avoidable gamble on entering into a society with persons suffering at the bottom of the socioeconomic ladder because they would not want to risk living their entire lives under such conditions. Nor would they object to inequality when it raises the absolute level of the worst off, since they are more concerned with the objective quality of their own lives than with envy of those with more primary goods. Rational, self-interested persons situated in a fair procedure for making decisions about their society will affirm the difference principle. According to the difference principle, if incentives that generate inequality are required to increase productivity, then the resulting inequality can be just. If such incentives are required to motivate higher productivity, then they should be allowed as long as they can be harnessed to assist the worst off. By using inequality to motivate productivity, the economic pie grows, and redistribution can improve the lives of the worst off.  Note, however, that the difference principle cannot justify violations of the descriptive thesis affirming the equal worth of all persons. A liberty principle takes priority over the difference principle. We may not create a system of unequal rights and liberties even if doing so would allow us to raise the absolute condition of the worst off.
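The maximin reasoning described above can be sketched as a simple decision rule. The example assumes, for illustration only, that each candidate social arrangement can be summarized as a list of primary-goods levels, one per social position. The arrangement names and numbers are hypothetical, and the sketch does not model the priority of the liberty principle.

```python
def maximin_choice(distributions):
    """Return the name of the arrangement whose worst-off position is best."""
    return max(distributions, key=lambda name: min(distributions[name]))


# Hypothetical primary-goods levels for three social positions.
candidates = {
    "strict_equality": [10, 10, 10],
    "incentives_with_redistribution": [12, 20, 40],
    "unregulated_inequality": [6, 25, 90],
}

chosen = maximin_choice(candidates)
print(chosen)
```

With these figures the rule selects the arrangement with incentives and redistribution. It is unequal, but its worst-off position (12) is better than under strict equality (10), and it avoids the gamble of the third arrangement, where the worst-off position falls to 6.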

Gerald Cohen objects to the demand for greater incentives that the difference principle allows. The people who require greater incentives to work productively are blameworthy. Why, knowing that if they work to their full ability this will benefit the worst off, do they not do so without demanding a greater share of primary goods? Cohen argues that this demand for incentives is exploitative. If the talented changed their outlook, we would have greater equality and improvement of the lives of the worst off. Rawls’ theory deals with principles governing political institutions and the basic structure of society, not with private actions and motivations. Cohen thinks egalitarianism should be internalized. In Rawls’ theory, persons in the original position are conceived of as self-interested, and a fair procedure for choosing principles of justice ensures a commitment to distributive justice. But that is a product of the fairness of the choice scenario and the self-interest of the participants. Cohen thinks that egalitarianism as a moral and political imperative should motivate individual choices and actions, not only shape the basic structure of society and its institutions. Still, as a matter of public policy, Cohen deems Rawls’ view a radical improvement on contemporary society. His objection is that the difference principle is subordinate to unjust motivations and attitudes. Justice requires that we have egalitarian motivations, and therefore the talented should never demand the incentives allowed by the difference principle. Egalitarianism is a normative ideal, and talented persons ought to work productively and support redistributive policies to pursue equality without demanding a greater share of primary goods. Rawls thinks that in actual societies people will have a variety of motivations. The problem for Cohen is that the original position models persons as self-interested rather than egalitarian. He concludes that the difference principle is not just.

f. Luck Egalitarianism

We now turn to a view that combines egalitarianism, Rawls’ rejection of the influence of morally arbitrary factors, and an emphasis on the values of choice and responsibility. Rawls’ social contract view holds that the morally arbitrary should not fundamentally determine the distribution of primary goods or people’s life prospects. So one’s family, one’s innate talents, and one’s starting place in society should not shape one’s life prospects or distributive share unless this benefits the worst off. These factors are undeserved and should not alone determine the distribution of benefits and burdens in society. Luck egalitarianism distills this thought into a complete theory of distributive justice. The ideal distribution is sensitive to people’s choices and informed gambles, but not to brute luck in the distribution of talents and opportunities. For the luck egalitarians, our capacities for free deliberation, choice, and action are pre-institutional. Therefore, they should inform and determine the principles of distributive justice, and the institutional expectations for entitlement and deservingness. (Hurley argues that this is a crucial feature of luck egalitarianism.) These features of the self are not ignored in Rawls’ view, but they do not fundamentally shape the institutions.

Luck egalitarianism is a responsibility-sensitive conception of equality and a system for distributing goods and aid under conditions of scarcity. It prioritizes aid to those who suffer through no fault of their own. It is a non-equalizing commitment to substantive distributive justice. Equality provides a baseline, though in a quite different way from Rawls. The role of equality here is what we can call ex ante equality. At the starting gate of life, we should be equal in some sense. Depending on the favored metric, we should begin with an equal amount of resources or opportunity for welfare. The luck egalitarian ideal is that we start on an equal footing, and then the outcomes of our life choices and freely taken gambles should determine our future holdings. Inequality therefore can be just. It is not just because it brings about some further social good, as the difference principle allowed for inequalities that improve the objective condition of the worst off. Rather, inequalities are justified by being brought about in the right way, by having the right sort of causal origin.

Indeed, luck egalitarianism is an alternative way to develop the emphasis on choice, responsibility, and individual sovereignty that leads some to reject egalitarianism entirely. Cohen argues that the view co-opts these values from the anti-egalitarians. Luck egalitarianism is not opposed to inequality per se; it is opposed to inequalities that have the wrong sort of origins. Brute luck, that is, the type of morally arbitrary factors cited by Rawls (innate talents, parentage, starting place in society), generates unjust inequalities. But option luck, that is, luck in the outcomes of freely taken risks or gambles, leads to just inequalities. As with the capabilities approach, luck egalitarianism may be combined with other principles of justice. (See Cohen on community.)

One objection to luck egalitarianism is based in skepticism about free will and moral responsibility. The theory hinges on the moral importance of choice and responsibility. If there is no robust conception of free will and moral responsibility, why think that inequalities caused by our choices are just?

Another worry about the theory is abandonment. Does luck egalitarianism offer no aid to those who suffer because of choices with poor outcomes? If inequalities are just whenever they are caused by choice, then is there no minimum level of well-being guaranteed for all? One sort of response to this worry is combining luck egalitarianism with other political values. Cohen argues that a commitment to community prohibits inequalities that would be allowed in a purely luck egalitarian system. Kymlicka argues that luck egalitarianism can be combined with social egalitarian views that likewise prohibit some inequalities that might be allowed by luck egalitarianism.

Anderson develops a social egalitarian view and is a strong critic of luck egalitarianism. Her conception of democratic equality is not only a development of the capabilities theory but also an explicit rejection of luck egalitarianism. She thinks that the luck egalitarian focus on brute luck means the theory completely misses the social nature of inequality. She objects that luck egalitarians end up trying to correct the “cosmic injustice” of brute luck in an attempt to ensure that people get what they deserve, and that this blinds them to the social oppression and exploitation that constitute inequality. Unjust inequality has to do with social relationships.

Another question facing those who support luck egalitarianism is how to define equal starting places. This leads us into the larger issue of what constitutes equality of opportunity.

3. Equality of Opportunity

What if there are dramatic inequalities in the opportunities for choice, education, and careers? This is a problem for luck egalitarians, because they need to specify a starting gate conception of equality. It is also a pressing issue for the other conceptions of equality.

Dworkin argues that inequalities can be historically justified when persons made their choices from an equivalent set of options. This commits luck egalitarianism to robust equality of opportunity. However, his standard is difficult to interpret, since citizens can never have a strictly equivalent set of options, unless that set is so restricted that the society is dystopian. There must be some standard to define when their options are fungible or equivalent enough. However, this is a massive problem for egalitarian theory, and it seems luck egalitarianism’s values of choice and responsibility alone cannot solve it. Answering that problem requires some other standard of value. When do persons have equal opportunities?

Equality of opportunity is a natural extension of the descriptive thesis that affirmed the equality of all persons. The descriptive thesis is incompatible with forms of oppression that rule out classes of people from competing for certain positions within society. A denial of the descriptive thesis entails a denial of a commitment to equality of opportunity. But what exactly does equality of opportunity require? It can be understood as ranging from merely formal equality of opportunity to substantive equality of opportunity. The more one approaches the latter, the more one becomes committed to substantive distributive justice.

Formal equality of opportunity requires that desirable positions and resources in society be allocated by open and meritocratic competition. Firms, government agencies, and universities are appropriate candidates for such equality of opportunity. This requires little or no substantive distributive justice. It does require that all citizens can participate in the competition, and that the winners are chosen on the basis of purely meritocratic concerns. Meritocracy requires that the traits that determine who wins the competition actually predict success in the position. Formal equality of opportunity prohibits allocating positions on the basis of gender, ethnicity, and so on. This deals only with opportunities, not outcomes. It does not address systemic inequalities in who wins the meritocratic competitions.

Substantive equality of opportunity addresses both the procedures for allocating positions and the preparation of the candidates that determines their chances of success. It deals with both fair procedures and the actual outcomes of those procedures. For example, if positions are open on the basis of purely meritocratic competition, but the advantages conferred by wealthy parentage are so overwhelming that only the children of the wealthy win the desirable positions, this is merely formal equality of opportunity. Those who support substantive equality of opportunity argue that the merely formal is morally inadequate.

Consider Bernard Williams’ example of a hypothetical warrior society. In the past, this was a caste society in which warriors had high prestige and the majority of wealth. The society transitions to a system of formal equality of opportunity. Under the old order, only the sons of wealthy families were eligible to be chosen as warriors. All others were consigned to poverty and subjugation. Now, warrior positions are allocated under a system that exhibits formal equality of opportunity. Under the new order, there is a meritocratic allocation of the desirable warrior positions. These desirable positions are distributed according to the results of an open, meritocratic, and fair tryout. Rich and poor alike may enter the competition. There is no bias in judging the winners and losers. Stipulate that women may now obtain these positions. Success in the examination is predictive of success as a warrior, so the system is meritocratic.

However, this is all compatible with only the offspring of warriors having adequate nutrition and training to succeed in the competition. Although careers are open to talents, the poor have no chance to cultivate the relevant talents. Even those with the luck to be born with innate ability have their prospects defined by their parentage. Those who were not born to a warrior family cannot succeed. Therefore, the old social hierarchy will persist, even though a strict caste system has been replaced by open, meritocratic procedures that satisfy formal equality of opportunity.
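The structure of Williams’ example can be made concrete with a small sketch. Everything in it is a hypothetical illustration rather than part of Williams’ own presentation: the candidate names, the 0 to 10 ability scale, and the fixed `TRAINING_BONUS` standing in for adequate nutrition and training are all assumptions. The selection procedure looks only at tryout scores, so it satisfies formal equality of opportunity, yet family background still settles who wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    warrior_family: bool
    innate_ability: int  # hypothetical 0-10 scale


# Assumption for illustration: adequate nutrition and training, available
# only to warrior families, add a fixed amount to tryout performance.
TRAINING_BONUS = 8


def tryout_score(c: Candidate) -> int:
    """Performance observed at the tryout (ability plus preparation)."""
    return c.innate_ability + (TRAINING_BONUS if c.warrior_family else 0)


def select_warriors(candidates, places):
    """Formally fair selection: rank by tryout score alone.

    The procedure never inspects family background directly.
    """
    ranked = sorted(candidates, key=tryout_score, reverse=True)
    return ranked[:places]


candidates = [
    Candidate("A", True, 3),
    Candidate("B", True, 5),
    Candidate("C", False, 9),
    Candidate("D", False, 10),
    Candidate("E", True, 4),
    Candidate("F", False, 6),
]

winners = select_warriors(candidates, places=3)
for w in winners:
    print(w.name, w.warrior_family, tryout_score(w))
```

Under these assumed numbers, candidate D has the highest innate ability but still loses to every warrior-family candidate, which is the sense in which the old hierarchy persists under open, meritocratic procedures.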

A formal equality of opportunity defender might point out that the long-term outlook for this social hierarchy is made much more tenuous by the implementation of formal equality of opportunity. Other changes to the society could impact the levels of inequality. The dominant positions in society are subject to change over time in a way that they were not under the original caste system.

Still, from the egalitarian perspective, this meritocratic society is unjust. That destabilizing forces can change things under formal equality of opportunity does not redeem the status quo. The current situation is unjust, and destabilizing change would not entail that the next distribution will be just, only that the individuals occupying the dominant and subordinate positions will change. The transition might be to one in which different non-meritocratic attributes correlate with having any chance for success; say, from warrior families to merchant families, or that the offspring of a small set of occupations will be the only ones with a genuine opportunity to succeed.

A perfectionist, someone who thinks that society should maximize the pursuit of some particular conception of the good, could argue that formal equality of opportunity is adequate because the concentration of wealth, which in turn prepares people to flourish as warriors, creates the best set of warriors overall. One can object to this on perfectionist terms (that generating the best warriors is not the proper overriding good, or that this system does not generate the best set of warriors) or on Rawlsian terms of liberal justice (no one conception of the good should be made sovereign in a free society, and no one would agree to this arrangement in the original position).

Suppose the example is shifted slightly. Rather than only the sons of wealthy high caste families having any opportunity to succeed, there is a small amount of social mobility. Some not born into a privileged position win the meritocratic competition. There is not substantive equality of opportunity, but there is both formal equality of opportunity and actual mobility. A supporter of substantive equality of opportunity will still object that the strength of the correlation between family background, the resources provided by that background, and obtaining a warrior position is itself adequate evidence of the inadequacy of formal equality of opportunity. These concerns push one to rely on another metric, such as resources, to attain a substantive, material form of equality of opportunity.

Of course, examples need not be so rigid as Williams’ caste society. A collection of informal social attitudes and practices may also violate equality of opportunity. If women are not seen as capable of being good pilots, then hiring and promotion procedures will lack genuine formal equality of opportunity, even if this bias is not inscribed in company policy, in law, or in a caste system. These impediments to equality of opportunity are endemic in contemporary society. There are more strategies for answering these problems than can possibly be described in this brief article, so we will mention only two that expand upon views already covered. Rawls (2009) developed a conception of fair equality of opportunity that undermines the role of class, race, gender, or caste in determining life prospects. Fair equality of opportunity requires that persons of equivalent talent who expend equivalent effort have equivalent prospects of success. Roemer (2009) provides a sophisticated luck egalitarian account of equality of opportunity that separates people into different types. The competitions that allocate desirable resources and positions should be designed so that effort is rewarded. The details of this scheme are beyond the scope of this article, but these two views are good starting places for readers who want to research the issue in greater depth.

4. Anti-Egalitarianism

An obvious form of anti-egalitarianism rejects the descriptive thesis. If persons are not equal, then there is no moral imperative to pursue substantive distributive justice. Sexism, racism, caste discrimination, and so on are obviously not views that lead into egalitarianism. These objections are beyond the scope of this article.

A common political objection to egalitarianism is that it is based in envy. None of the theories canvassed in this article are explicitly based in envy, so this objection has more to do with the alleged psychological motivations for becoming an egalitarian than with criticism of egalitarian arguments themselves. Of course, Rawls’ theory explicitly rejects envy. Persons in the original position want to secure the greatest number of primary goods for themselves. Their choice is not impacted by envy of those who may end up with an even greater share of primary goods.

A second political objection is that egalitarianism undermines productivity. If the state redistributes income or other resources, then there is less incentive to be productive. Egalitarians can deny this on empirical grounds, object that total productivity is not the most important criterion, or attempt to harness the way that incentives motivate productivity (as with Rawls’ difference principle).

A practical objection is that a commitment to distributive equality would lead us to “level down” the allocations of those who have more for no real benefit. Suppose all the members of a population have x units of your preferred metric of distributive justice, except for one person who has 2x. Now consider whether it is desirable to transition from that distribution to one in which everyone holds x units. This makes one person worse off and no person better off. The distribution is now equal, but is it preferable? Is it more just? A strict egalitarian can respond that if equality is intrinsically valuable then the distribution is improved in that respect. They are not strictly committed to concluding that this makes the new distribution preferable overall. That only follows if equality is the overriding or sole value. If equality must be balanced against other values, then egalitarians have an answer to the leveling down objection. A strict egalitarian who thinks equality is instrumental already accepts other values, so they can argue that in these cases equality is not instrumental in bringing about the desired consequences.
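The leveling down case can be checked with a short sketch. The value `x = 10`, the population of four, and the helper names `gap` and `pareto_worse` are assumptions chosen for illustration; the source specifies only that everyone holds x units except one person who holds 2x. The sketch shows the two facts the objection turns on: the new distribution is more equal, and it is worse for someone while being better for no one.

```python
def gap(dist):
    """Crude inequality measure: distance between best and worst off."""
    return max(dist) - min(dist)


def pareto_worse(new, old):
    """True if nobody gains and at least one person loses moving old -> new."""
    nobody_gains = all(n <= o for n, o in zip(new, old))
    someone_loses = any(n < o for n, o in zip(new, old))
    return nobody_gains and someone_loses


x = 10  # hypothetical number of units of the preferred metric

before = [x, x, x, 2 * x]  # one person holds 2x
after = [x, x, x, x]       # that person is leveled down to x

print("gap before:", gap(before))
print("gap after:", gap(after))
print("leveling down is Pareto-worse:", pareto_worse(after, before))
```

An egalitarian who holds that equality is one value among others can read this as follows: the distribution improves on the `gap` measure and worsens on the Pareto measure, and which consideration wins overall depends on how the values are weighed.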

The leveling down objection is a threat to views that pursue strict equality. Non-equalizing conceptions of substantive distributive justice avoid the problem. What most theories aim to do is improve the condition of the worst off and thereby lessen inequality, not pursue strict equality unconditionally. Views that prioritize aid to the worst off or support a sufficient minimum floor are not obviously subject to this objection. Even if one thinks it is morally obligatory to redistribute resources to improve the condition of those who are worse off than others, it does not follow that it is obligatory to destroy resources when that is the only way to achieve distributive equality.

Perhaps the most philosophically interesting objections to egalitarianism are themselves based in the descriptive thesis that all persons are in fact equal. One objection is that egalitarian distributive justice is insufficiently sensitive to both deservingness and human agency. A second is that there is no just way to implement a redistributive scheme that aims towards equality, because doing so violates freedoms and rights that follow from our equality.

Welfare egalitarianism, resource egalitarianism, the capabilities approach, and Rawls’ difference principle are patterned conceptions of distributive justice as opposed to historical conceptions. Strict egalitarianism defines a pattern of equal shares, the various capabilities approaches define patterns involving a sufficient minimum below which persons cannot fall, and the difference principle states that the level of permissible deviation from the baseline of equality is defined by what is necessary to raise the absolute condition of the worst off.

Nozick (1974) argues against all patterned conceptions of distributive justice. He claims that according to patterned conceptions of justice, if a given pattern is just, it makes no difference which persons occupy which places in the distribution. Justice is defined in terms of structural features of the pattern, not the identity of those occupying specific places in the pattern. Yet that seems counterintuitive. Those at the top might deserve their place on the basis of working hard. Inequalities might be generated by the voluntary transfer of goods that took place in a distribution that was already just. Nozick concludes that, rather than favoring a patterned conception of distributive justice, we ought to understand distributive justice in terms of historical entitlements and voluntary transactions. He agrees with Rawls that the distribution of natural talents is not a basis for deservingness, but denies that this means the distribution of those talents (and the varying wealth and income derivable from them) is arbitrary from the moral point of view. It is not arbitrary because natural talents are implicated in the normative relationship of self-ownership. Persons own themselves. That includes their native abilities. This means that, by extension, they hold strong entitlements to the property they can obtain by exercising those (undeserved) talents.

Since Nozick was primarily responding to Rawls’ Theory of Justice, it is worth looking at this objection and the extent to which it threatens patterned conceptions in general and Rawls’ conception in particular. Rawls’ view can be defended against Nozick’s objection that according to patterned conceptions of distributive justice, it should not matter which individual occupies which place in the pattern. Consider the role given to institutional expectations and institutional desert. Rawls’ theory allows for people to deserve property so long as the state’s institutions have created the reasonable expectation of such property rights. In other words, entitlement to property is generated by the basic structure of the state. Institutional expectations ground such entitlements. Therefore, his view is compatible with a conception of private property that is not indifferent to which persons occupy which positions in the distribution. Of course, Nozick, following Locke, thinks individuals can have preinstitutional entitlements, so his view of property rights is much stronger.

Still, in Rawls’ patterned view of distributive justice, it must matter which particular individuals occupy which places in the patterned distribution, because the point of the difference principle is that scarce talents are harnessed for the benefit of all. There is a causal relationship between which persons occupy which positions and the pattern of the total distribution. The size of the economic pie is defined by which people occupy which places. Switching places would change total productivity and harm the absolute condition of the worst off. Rawls argues that a given society’s distribution of goods is just if it matches the difference principle. The specific pattern depends on myriad factors, and those factors cannot be held constant while you switch the persons occupying the different positions in the pattern. For example, if in a given state greater incentives are required to motivate some of the highly talented to be more productive, you cannot switch their place in the pattern without changing the productivity level. In such cases, Nozick’s discussion of switching persons within the pattern would necessarily modify the pattern itself. The hypothetical place switching across identical patterns cannot be implemented. So what Nozick means by “patterned” does not capture everything that matters in substantive distributive justice. This response also applies to luck egalitarian accounts of distributive justice. Luck egalitarianism is committed to having shares allocated in accordance with the individual’s choices and option luck. (For a much stronger desert-based alternative, see Kagan 2012.)

Nozick’s second objection has to do with individual liberty to make voluntary transactions. Suppose an actual distribution meets your definition of a just pattern, whatever that may be. So long as persons can make voluntary transactions (purchases, gifts, trades, bequests), the original pattern will be lost. This all happens without exploitation or coercion. The only way to regain the pattern is for the state to interfere with these voluntary transactions and coercively redistribute the resources. But that is objectionable for two reasons. First, since the deviation from the initial pattern was entirely voluntary, nobody has a valid objection to the second pattern. It wrongs no one, since every transaction that changed the pattern was consensual. Second, coercive redistribution to retain the original pattern must violate property rights. In the initial distribution, which we stipulate was just, each had a right to their holdings. Through voluntary transfers, the new pattern was generated. But if the transactions were voluntary, the new owners of these resources are as entitled to them as the original owners were. The original pattern was just and therefore it is neither required nor permissible for the state to redistribute anything. Egalitarian redistribution enforced by the state must violate property rights. No program can pursue substantive distributive justice through redistribution, because such redistribution is unjust.
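The mechanics of this objection can be sketched in a few lines. The names, the starting holdings of 100, the payment of 10, and the choice of strict equality as the favored pattern are all assumptions made for illustration; Nozick’s argument is meant to apply to whatever pattern one prefers. The point shown is only that consensual transfers conserve the total while breaking the pattern.

```python
def transfer(holdings, giver, receiver, amount):
    """Return new holdings after a voluntary transfer from giver to receiver."""
    if amount < 0 or holdings[giver] < amount:
        raise ValueError("giver cannot transfer more than they hold")
    updated = dict(holdings)  # leave the original distribution untouched
    updated[giver] -= amount
    updated[receiver] += amount
    return updated


def matches_pattern(holdings):
    """Assumed pattern for this sketch: strictly equal shares."""
    return len(set(holdings.values())) == 1


# Stipulated just starting point under the assumed pattern.
start = {"Ann": 100, "Ben": 100, "Cal": 100, "Dee": 100}

# Three people each freely pay Dee 10 units for something they value.
current = start
for payer in ["Ann", "Ben", "Cal"]:
    current = transfer(current, payer, "Dee", 10)

print("start matches pattern:", matches_pattern(start))
print("after transfers:", current)
print("still matches pattern:", matches_pattern(current))
```

Restoring the starting pattern from `current` would require taking 30 units from Dee, which is the coercive step the objection targets.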

This anti-egalitarianism is crucial for Nozick’s understanding of the descriptive thesis: individual rights, including the right to own and transfer property, constitute our equality. Those rights preclude systems of imposing, retaining, or regaining a specific distributive pattern. His understanding of equality is incompatible with egalitarianism. Nozick concludes that we should understand distributive justice in formal and historical terms, not in terms of patterning. He then argues for a set of historical principles governing the original acquisition and subsequent transfer of property. Nozick affirms that persons are equal, but this means that each person has equally strong property rights. The descriptive thesis on this view entails a denial of egalitarianism. Egalitarianism can only be pursued by violating the property rights that follow from our equality.

a. Sufficiency vs. Equality

There is also a sufficiency objection to strictly equalizing views. Frankfurt objects that

The mistaken belief that economic equality is important in itself leads people to detach the problem of formulating their economic ambitions from the problem of understanding what is most fundamentally significant to them. It influences them to take too seriously, as though it were a matter of great moral concern, a question that is inherently rather insignificant and not directly to the point, namely, how their economic status compares with the economic status of others. In this way the doctrine of equality contributes to the moral disorientation and shallowness of our time. (Frankfurt 1987)

A person focused on strict egalitarianism evaluates their own life and holdings based on something impersonal and independent of the particular features of their own life and their own personal needs. On Frankfurt’s view, egalitarianism is therefore harmful.

However, the egalitarian impulse is really based in something that is of moral importance—the principle that all persons should have a sufficient level of well-being. On Frankfurt’s view, people become egalitarians on the basis of compelling reasons, but those reasons have to do solely with sufficiency, not equality.

It seems clear that egalitarianism and the doctrine of sufficiency are logically independent: considerations that support the one cannot be presumed to provide support also for the other. Yet proponents of egalitarianism frequently suppose that they have offered grounds for their position when in fact what they have offered is pertinent as support only for the doctrine of sufficiency. Thus they often, in attempting to gain acceptance for egalitarianism, call attention to disparities between the conditions of life characteristic of the rich and those characteristic of the poor. (Frankfurt 1987)

The case for egalitarianism is usually only a case against poverty.

The fundamental error of egalitarianism lies in supposing that it is morally important whether one person has less than another regardless of how much either of them has. […] The economic comparison implies nothing concerning whether either of the people compared has any morally important unsatisfied needs at all nor concerning whether either is content with what he has. (Frankfurt 1987)

Defenders of equality must show that substantive distributive justice is not captured by concerns over sufficiency alone. We will use Scanlon as a representative example. (See also Parfit and O’Neill for discussions of equality as opposed to sufficiency.) Scanlon offers five sorts of reasons to be concerned with equality and not merely sufficiency. 1. Some inequalities create humiliating differences in status. One could object that sufficiency is whatever level is required to avoid humiliation and shame. However, the level is sensitive to differences between the better off and worse off rather than being determined by objective or unchanging standards. This means that, contra Frankfurt, we are intrinsically concerned with differences between people, not just that everyone meets some sufficient benchmark. 2. Inequalities can give those who have more an unjust amount of power over others. 3. Social institutions are only fair if there is equality of starting places in society. Inequality can undermine procedural fairness. We can see this in economic competition, inequality of opportunity, and political influence. 4. Inequalities can be objectionable when they involve failure to treat equally those who have a claim to equal benefit. Just because everyone has a sufficient level of some service or resource provided by the state does not mean that unequal allocation is just. 5. Inequality can violate the claims of citizens to benefit from the fruits of social cooperation. This is how Scanlon reads Rawls as egalitarian. The participants in the original position are equal participants. The presumption is that they have an equal claim to the benefits of social cooperation. This is why equality is the benchmark from which inequalities are judged, and only those that benefit everyone are permissible. The primary goods are produced by social cooperation and, contra Nozick, the baseline or benchmark is that every equal citizen has an equal claim to those benefits.
(For more on the debate between equality, sufficiency, and giving priority to the worst off, see the references for Nagel, Parfit, and Scanlon. For elucidating commentary on Scanlon, see Wolff 2013.)

5. Domestic or Global?

Many egalitarians hold a stronger domestic than global view. Redistributive priority is given to fellow citizens over persons in other nations. This, on its face, seems inconsistent or unwarranted. If one is committed to equality, what difference could national borders make? Is it just for a state to prioritize domestic distributive justice over global distributive justice? As a pure matter of luck egalitarianism, the state into which one is born is a paradigm example of brute luck. Having one’s life prospects be determined by nation of origin seems as morally arbitrary as having one’s life prospects determined by parentage. The arbitrariness of nationality combined with the universality of the descriptive thesis (all persons are equal) creates tension with domestic prioritization. On the other hand, if redistributive justice deals with the allocation of goods produced by the cooperation of citizens, then perhaps there is a justification for prioritizing domestic over international redistribution. The amount of redistribution required to address global inequality may depend on the nature of the goods to be allocated as well as the degree of entanglement among the world’s various states.

Consider an efficiency argument against global egalitarianism. One may be an egalitarian yet argue for domestic priority based on increased costs of sending aid to distant locations, difficulty with managing the efficient distribution on the other end, or epistemic advantages of dealing with local rather than remote issues. Peter Singer argues against the efficiency rationale. Changes in modern transportation, financial systems, and information technology have lessened most of the inefficiencies in aiding far away persons. Singer’s argument is not about egalitarianism per se, but about preventing what all reasonable people can agree are objectively bad states of affairs: famine, starvation, epidemics, and so forth. So his view is not egalitarian in the sense of requiring an equal distribution of some metric, but it is egalitarian in the sense of doing away with suffering at the bottom rungs of the global society.

Singer’s view is a useful example of moral obligations being global. If moral obligations can be global, then perhaps so too can egalitarianism. Proximity is arbitrary in his analysis: someone suffering nearby is no more morally relevant than someone suffering far away. Given the magnitude of global suffering, there is an egalitarian element to his utilitarian calculus. So long as these objectively bad states of affairs are occurring, first world people are obligated to work to prevent them. This will flatten global inequality. His view can be taken in two ways: the strict reading requires sacrifice to the point of marginal utility with the globe’s worst off, while a weaker (though still radical) reading requires significant sacrifice. However, Singer constrains both readings with a utilitarian productivity argument: the first world may need some excess consumer culture (that in the short term contradicts our obligations to the worst off) to keep the economy at a level where it can make the maximum contribution to the plight of the globe’s worst off.

Singer therefore takes the descriptive thesis to require radical, obligatory sacrifice on the part of citizens in first world countries. Given the amount of objectively bad states of affairs in the world, those who are comparatively well off are obligated to reallocate resources to the worst off.

Onora O’Neill gives another example of global moral obligation. She argues for a right not to be killed unjustly. Global resource inequalities amount to de facto killings. They are unjust, since they can be avoided at reasonable cost. There is no obligation to equalize anything globally, but there is an obligation to avoid violating the global poor’s right to not be killed unjustly. She gives an argument by analogy that highlights the tension between property entitlements and distributive justice. In a lifeboat scenario, one who has excess water and food but withholds it from others, who will die without it, violates their right not to be killed unjustly. Property entitlements vary in strength in different contexts. She then argues that the planet is no different from a lifeboat, so that those dying from poverty and famine have their right not to be killed unjustly violated. This argument hinges on the contextual variability of property rights and the relative strength of the right not to be killed over property rights. The right not to be killed trumps property rights, so the redistribution required to avoid these killings is obligatory. Unlike Singer’s view, this does not generalize to an obligation to prevent all objectively bad happenings globally. First world citizens are only obligated to do what is required to secure everyone’s right not to be killed unjustly. Yet this is radical, too: her conception of agency means that those complicit in first world economies are killing the globe’s worst off. Redistribution is the means to avoid these killings.

Given these types of arguments for global moral obligations, what can be said in favor of domestic priority in egalitarian redistribution? If distributive equality is a matter of justice, should redistribution be global? As in the discussion of anti-egalitarianism, one obvious objection is to deny that the descriptive thesis holds globally. Denying the equality of all the globe’s people is not philosophically interesting. A stronger argument is that the demands of egalitarian justice are tied up with institutions and practices that are not global. If matters of distributive justice have to do with coercive redistribution, then perhaps only persons living within the same state fall under egalitarian requirements. If so, global distributive justice would only apply if there were genuinely powerful and coercive global institutions. On this view, egalitarian obligations arise only within a coercive political structure. That the state holds coercive power over the citizens means that they should each be treated equally and, perhaps, that the state should engage in redistribution to pursue equality of holdings. Various forms of this view appeal to different features of the state. A similar argument is that redistributive justice has to do with allocating the resources made possible through social cooperation. If so, then the bonds of citizenship matter to distributive justice, and we should treat domestic and international inequality differently.

Another domestic-priority view is that egalitarian norms arise among people who share political bonds and obligations, and those attachments are local rather than global. These sorts of objections are not unconditionally opposed to global egalitarianism; they rather object that egalitarianism is tied to certain relationships and institutions that currently are not global. Some egalitarians counter that the amount of global engagement, cooperation, and institutional entanglement does generate global egalitarian obligations. (For example, see Pogge 1989.)

Richard Miller gives a consequentialist argument for domestic prioritization. Too much redistribution directed outside of a particular state can have a destabilizing impact. Even if that state is well off compared to others, as long as it has inequality in its own economy, those on the internal bottom rungs may become alienated if resources are taken out of their economic system and sent to another country whose most deprived citizens are even worse off. The worst-off citizens within the relatively wealthier state are participating in a scheme of social cooperation that benefits the well off, and their state engages in egalitarian redistribution, but the redistributive scheme prioritizes the needs of the worst off in other countries. It seems as though this scheme provides benefits to all but the domestic worst off. This can undermine their commitment to both productive labor and respect for the rule of law. This in turn harms the state, makes it less stable and productive, and therefore makes it less able to generate external aid.

Miller also attempts to transcend the patriotism-cosmopolitanism dispute by universalizing patriotic priority. For the vast majority of people, certain universal human goods are only satisfied in local political communities. (The exceptions are a minuscule number of global elites.) Our need for social interaction and political community is satisfied locally, as we do not share rich attachments with persons across the globe. This changes the inherently arbitrary nature of the state into which one was born into something morally relevant. This is not a rejection of all global redistribution, but an attempt to break from the view that patriotic priority and helping the globe’s worst off are polar opposites. Combined with the previous consequentialist argument, this means that in order to secure these universal human goods, we need individual states, and within each state we need patriotic priority in redistributive justice. Each person needs these goods categorically, they can only be provided locally, and they are threatened when the redistributive scheme within a given state does not exhibit patriotic priority. This is all compatible with the descriptive thesis applying globally. On this view the descriptive thesis only requires that we are not insensitive to the suffering of others. We do have global obligations to assist others, but this does not mean that the demands of distributive justice are all global.

A commitment to global equality requires radical, perhaps unrealistic sacrifice. That can be taken as reason to reject global egalitarianism: persons cannot reasonably be expected to bring about global equality. However, normative principles specify what we ought to do, not what we are comfortable doing. What we ought to do might require a complete change to our way of life.

6. References and Further Reading

  • Anderson, Elizabeth S. 1999. “What Is the Point of Equality?” Ethics 109 (2): 287–337.
    • (An attack on contemporary egalitarian theory in general and luck egalitarianism in particular. Provides a defense of democratic equality.)
  • Anderson, Elizabeth S. 2010. “The Fundamental Disagreement between Luck Egalitarians and Relational Egalitarians.” Canadian Journal of Philosophy 40, no. sup1: 1–23.
  • Arneson, Richard J. 1989. “Equality and Equal Opportunity for Welfare.” Philosophical Studies 56 (1): 77–93.
  • Arneson, Richard J. 1990. “Liberalism, Distributive Subjectivism, and Equal Opportunity for Welfare.” Philosophy & Public Affairs 19: 158–94.
  • Arneson, Richard J. 2000. “Luck Egalitarianism and Prioritarianism.” Ethics 110 (2): 339–49.
  • Arneson, Richard J. 2004. “Luck Egalitarianism Interpreted and Defended.” Philosophical Topics 32: 1–20.
    • (Important defense of luck egalitarianism.)
  • Barry, Nicholas. 2006. “Defending Luck Egalitarianism.” Journal of Applied Philosophy 23 (1): 89–107.
  • Blake, Michael. 2001. “Distributive Justice, State Coercion, and Autonomy.” Philosophy & Public Affairs 30 (3): 257–96.
    • (Egalitarian obligations hold within particular states, not globally.)
  • Cavanagh, Matt. 2002. Against Equality of Opportunity. Oxford: Clarendon Press.
  • Cohen, Gerald A. 1989. “On the Currency of Egalitarian Justice.” Ethics 99 (4): 906–44.
  • Cohen, Gerald A. 2009. Rescuing Justice and Equality. Cambridge: Harvard University Press.
  • Cohen, Joshua. 1989. “Democratic Equality.” Ethics 99 (4): 727–51.
  • Dworkin, Ronald. 1981a. “What is Equality? Part 1: Equality of Welfare.” Philosophy & Public Affairs 10 (3): 185–246.
  • Dworkin, Ronald. 1981b. “What is Equality? Part 2: Equality of Resources.” Philosophy & Public Affairs 10 (4): 283–345.
    • (Useful discussion of different metrics of equality.)
  • Dworkin, Ronald. 2002. Sovereign Virtue: The Theory and Practice of Equality. Cambridge: Harvard University Press.
  • Elster, Jon. 1985. Sour Grapes: Studies in the Subversion of Rationality. Cambridge: Cambridge University Press.
  • Feinberg, Joel. 1974. “Non-Comparative Justice.” Philosophical Review 83 (3): 297–358.
  • Fleurbaey, Marc. 1995. “Equal Opportunity or Equal Social Outcome?” Economics and Philosophy 11 (1): 25–55.
  • Frankfurt, Harry. 1987. “Equality As a Moral Ideal.” Ethics 98 (1): 21–43.
  • Freeman, Samuel. 2006. “Distributive Justice and the Law of Peoples.” In Rawls’s Law of Peoples: A Realistic Utopia?, edited by Rex Martin and David Reidy, 243–60. Oxford: Blackwell Publishing.
  • Harsanyi, John C. 1982. “Morality and the Theory of Rational Behavior.” In Utilitarianism and Beyond, edited by Amartya Sen and Bernard Williams, 39–62. Cambridge: Cambridge University Press.
  • Hurley, Susan. 2003. Justice, Luck, and Knowledge. Oxford: Oxford University Press.
  • Kagan, Shelly. 1999. “Equality and Desert.” In What Do We Deserve: A Reader on Justice and Desert, edited by Louis P. Pojman and Owen McLeod, 298–314. New York.
  • Kagan, Shelly. 2012. The Geometry of Desert. New York: Oxford University Press.
    • (Desert-based conception of distributive justice.)
  • Knight, Carl. 2009. Luck Egalitarianism: Equality, Responsibility, and Justice. Edinburgh: Edinburgh University Press.
  • Knight, Carl, and Zofia Stemplowska, eds. 2011. Responsibility and Distributive Justice. New York: Oxford University Press.
  • Knight, Carl. 2013. “Luck Egalitarianism.” Philosophy Compass 8 (10): 924–34.
  • Kymlicka, Will. 1990. Contemporary Political Philosophy. Oxford: Clarendon Press.
  • Lake, Christopher. 2001. Equality and Responsibility. New York: Oxford University Press.
  • Miller, Richard W. 1998. “Cosmopolitan Respect and Patriotic Concern.” Philosophy & Public Affairs 27 (3): 202–24.
    • (A defense of domestic prioritization in redistribution.)
  • Nagel, Thomas. 1991. Equality and Partiality. Oxford: Oxford University Press.
  • Nagel, Thomas. 2005. “The Problem of Global Justice.” Philosophy & Public Affairs 33 (2): 113–47.
    • (Nagel argues that obligations of egalitarian justice only extend as far as a scheme of enforcement, which typically extends only throughout a particular state.)
  • Nagel, Thomas. 2012. “Equality.” In Mortal Questions, 106–27. New York: Cambridge University Press.
  • Nozick, Robert. 1974. Anarchy, State, and Utopia. New York: Basic Books.
    • (A libertarian affirmation of the equality of all persons and rejection of redistribution aiming at greater equality.)
  • Nussbaum, Martha C. 1999. Sex and Social Justice. New York: Oxford University Press.
  • Nussbaum, Martha C. 2001a. “Symposium on Amartya Sen’s Philosophy: 5 Adaptive Preferences and Women’s Options.” Economics and Philosophy 17 (1): 67–88.
  • Nussbaum, Martha C. 2001b. Women and Human Development: The Capabilities Approach. New York: Cambridge University Press.
    • (An influential development of the capabilities approach.)
  • O’Neill, Onora. 1975. “Lifeboat Earth.” Philosophy & Public Affairs 4 (3): 273–92.
  • Otsuka, Michael. 1998. “Self‐Ownership and Equality: A Lockean Reconciliation.” Philosophy & Public Affairs 27 (1): 65–92.
  • Otsuka, Michael. 2003. Libertarianism Without Inequality. Oxford: Oxford University Press.
    • (A reconciliation of libertarianism and substantive distributive justice.)
  • Parfit, Derek. 1995. Equality or Priority. First presented at the Lindley Lectures, November 21 1991. Lawrence: University of Kansas.
  • Parfit, Derek. 1997. “Equality and Priority.” Ratio 10 (3): 202–21.
  • Piketty, Thomas. 2014. Capital in the Twenty-First Century. Translated by Arthur Goldhammer. New York: Belknap Press.
  • Pogge, Thomas W. 1989. Realizing Rawls. Ithaca, New York: Cornell University Press.
  • Pogge, Thomas W. 1994. “An Egalitarian Law of Peoples.” Philosophy & Public Affairs 23 (3): 195–224.
  • Rakowski, Eric. 1991. Equal Justice. New York: Oxford University Press.
  • Rawls, John. 2009. A Theory of Justice. Cambridge: Harvard University Press.
  • Raz, Joseph. 1986. The Morality of Freedom. Oxford: Oxford University Press.
  • Roemer, John E. 2009. Equality of Opportunity. Cambridge: Harvard University Press.
    • (Sophisticated account of luck egalitarian equality of opportunity that focuses on effort.)
  • Sangiovanni, Andrea. 2007. “Global Justice, Reciprocity, and the State.” Philosophy & Public Affairs 35 (1): 3–39.
    • (Argues that egalitarian obligations only arise within particular political communities.)
  • Scanlon, Thomas M. 1975. “Preference and Urgency.” The Journal of Philosophy 72 (19): 655–69.
  • Scanlon, Thomas. 1996. “The Diversity of Objections to Inequality.” In The Difficulty of Tolerance: Essays in Political Philosophy, 202–18. Cambridge: Cambridge University Press.
  • Scanlon, Thomas. 1998. What We Owe to Each Other. Cambridge: Harvard University Press.
  • Scheffler, Samuel. 2003. “What is Egalitarianism?” Philosophy & Public Affairs 31 (1): 5–39.
  • Scheffler, Samuel. 2005. “Choice, Circumstance, and the Value of Equality.” Politics, Philosophy & Economics 4 (1): 5–28.
  • Segall, Shlomi. 2007. “In Solidarity with the Imprudent: A Defense of Luck Egalitarianism.” Social Theory and Practice 33 (2): 177–98.
  • Sen, Amartya, and Bernard Williams, eds. 1982. Utilitarianism and Beyond. Cambridge: Cambridge University Press.
  • Sen, Amartya. 1980. “Equality of What?” In The Tanner Lectures on Human Values, vol. 1, edited by S. McMurrin, 195–220. Salt Lake City: University of Utah Press.
  • Sen, Amartya. 1992. Inequality Reexamined. Oxford: Oxford University Press.
  • Sher, George. 1979. “Effort, Ability, and Personal Desert.” Philosophy & Public Affairs 8 (4): 361–76.
  • Tan, Kok-Chor. 2008. “A Defense of Luck Egalitarianism.” The Journal of Philosophy 105 (11): 665–90.
  • Temkin, Larry S. 1993. Inequality. New York: Oxford University Press.
  • Van Parijs, Philippe. 1995. Real Freedom for All: What (If Anything) Can Justify Capitalism? Oxford: Oxford University Press.
  • Walzer, Michael. 1983. Spheres of Justice: A Defense of Pluralism and Equality. New York: Basic Books.
  • Williams, Bernard. 1962. “The Idea of Equality.” In Philosophy, Politics and Society: Second Series, edited by P. Laslett and W.G. Runciman. Oxford: Blackwell.
  • Wolff, Jonathan. 1998. “Fairness, Respect, and the Egalitarian Ethos.” Philosophy & Public Affairs 27 (2): 97–122.
  • Wolff, Jonathan. 2013. “Scanlon on Social and Material Inequality.” Journal of Moral Philosophy 10 (4): 406–25.

 

Author Information

Ryan Long
Email: longr@philau.edu
Philadelphia University
U. S. A.

Jürgen Habermas (1929—)

Jürgen Habermas produced a large body of work over more than five decades. His early work was devoted to the public sphere, to modernization, and to critiques of trends in philosophy and politics. He then slowly began to articulate theories of rationality, meaning, and truth. His two-volume Theory of Communicative Action in 1981 revised and systematized many of these ideas, and inaugurated his mature thought. Afterward, he turned his attention to ethics and democratic theory. He linked theory and practice by engaging work in other disciplines and speaking as a public intellectual. Given the wide scope of his work, it is useful to identify a few enduring themes.

Habermas represents the second generation of Frankfurt School Critical Theory. His mature work started a “communicative turn” in Critical Theory. This turn contrasted with the approaches of his mentors, Max Horkheimer and Theodor W. Adorno, who were among the founders of Critical Theory. Habermas sees this turn as a paradigm shift away from many assumptions within traditional ontological approaches of ancient philosophy as well as what he calls the “philosophy of the subject” that characterized the early modern period. He has instead tried to build a “post-metaphysical” and linguistically oriented approach to philosophical research.

Another contrast with early Critical Theory is that Habermas defends the “unfinished” emancipatory project of the Enlightenment against various critiques. One such critique arose when the moral catastrophe of WWII shattered hopes that modernity’s increasing rationalization and technological innovation would yield human emancipation. Habermas argued that a picture of Enlightenment rationality wedded to domination only arises if we conflate instrumental rationality with rationality as such—if technical control is mistaken for the entirety of communication. He subsequently developed an account of “communicative rationality” oriented around achieving mutual understandings rather than simply success or authenticity.

Another enduring theme in Habermas’ work is his defense of “post-national” structures of political self-determination and transnational governance against more traditional models of the nation-state. He sees traditional notions of national identity as declining in importance, and he sees the world as faced with problems stemming from interdependency that can no longer be addressed at the national level. Instead of national identity centered on shared historical traditions, ethnic belonging, or national culture, he advocates a “constitutional patriotism” where political commitment, collective identity, and allegiance coalesce around the shared principles and procedures of a liberal democratic constitutionalism facilitating public discourse and self-determination. Habermas also claims that emerging structures of international law and transnational governance represent generally positive achievements moving the global political order in a cosmopolitan direction that better protects human rights and fosters the spread of democratic norms. He sees the emergence of the European Union as paradigmatic in this regard. However, his cosmopolitanism should not be overstated. He does not advocate global democracy in any strong sense, and he is committed to the idea that democratic self-determination requires a measure of localized mutual identification in the form of civic solidarity: a legally mediated solidarity built around shared history and institutions and rooted in some shared “ethical” pattern of life (see Sittlichkeit discussion below) that fosters mutual understandings.

Table of Contents

  1. Biography: Early Life to Structural Transformation
  2. Enduring Themes in Formative and Transitional Work
    1. Public Deliberation Over Positivist Decisionism and Technocracy
    2. From Philosophical Anthropology to a Theory of Social Evolution
  3. The Linguistic Turn into the Theory of Communicative Action
  4. Discourse Ethics
  5. Political and Legal Theory
  6. References and Further Reading
    1. General Introductions to Habermas
    2. Introductory Books and Articles on Specific Themes
      1. Biography
      2. Linguistic Turn
      3. Discourse Ethics
      4. Political Theory
    3. Works Cited
    4. Secondary Scholarship Beyond the Subject-Specific Recommendations Cited Above

1. Biography: Early Life to Structural Transformation

Habermas was born in 1929 in Düsseldorf, Germany. He has noted that early corrective surgeries for a cleft palate sensitized him to human vulnerability and interdependence, and that subsequent childhood struggles with fluid verbal communication may partly explain his theoretical interest in communication and mutual recognition. He has also cited the end of WWII and frustrations over postwar Germany’s uneven willingness to fully break with its past as key personal experiences that inform his political theory.

Habermas belongs to what historians call the “Flakhelfer generation” or the “forty-fivers.” Flakhelfer means antiaircraft-assistant. At the end of the war, people born between 1926 and 1929 were drafted and sent to help man antiaircraft artillery defenses. Over a million youth served as such personnel. The second “forty-fiver” label captures how this generation came of age with the 1945 Nazi defeat. These experiences fostered a political skepticism and vigilance born out of having been exploited, and an affinity for the nascent liberal democratic principles of postwar Germany. Both labels capture formative features of Habermas’ biography (Specter 2010, Matustik 2001).

Reflecting on his upbringing during the war, Habermas describes his family as having passively adapted to the Nazi regime, neither identifying with nor opposing it. He was recruited into the Hitler Youth in 1944 and sent to man defenses on the western front shortly before the war ended. Soon thereafter he learned of the Nazi atrocities through radio broadcasts of the Nuremberg trials and concentration camp documentaries at local theaters. Such experiences left a deep impact: “all at once we saw that we had been living in a politically criminal system” (AS 77, 43, 231).

After the war, he studied philosophy at the universities of Göttingen (1949-50), Zurich (1950-51), and Bonn (1951-54). He wrote his thesis on Schelling under the direction of Erich Rothacker and Oskar Becker. He was increasingly frustrated with the unwillingness of German politicians and academics to own up to their role in the war. He was disappointed in the postwar government’s failure to make a fresh political start and distressed by continuities with the past. In interviews, he has recalled leaving a campaign rally in 1949 after being disgusted by the far-right connotations of the flags and songs used. He was similarly disappointed by German academics. At university he studied the work of Arnold Gehlen and Martin Heidegger extensively, but their prior Nazi ties were not discussed openly. In 1953 Heidegger reissued his 1935 Lectures on Metaphysics in a largely unedited form that included reference to the “inner truth and greatness of the Nazi movement.” Habermas published an op-ed challenging Heidegger, and the lack of response seemed to confirm his suspicions (NC, 140-172). He wrote a piece critiquing Gehlen a few years later (1956). Around the same time he was distressed to learn that Rothacker and Becker had also been active Nazi party members.

Near the end of his studies Habermas worked as a freelance journalist and published essays in the intellectual journal Merkur. He took an interest in the interdisciplinary Institute for Social Research affiliated with the University of Frankfurt. The Institute had returned from wartime exile in 1950, and Adorno became director in 1955. Adorno was familiar with Habermas’ essays and took him on as a research assistant. While at the Institute Habermas studied philosophy and sociology, worked on research projects, and continued to publish op-ed pieces. One such piece, Marx and Marxism, struck Horkheimer as too radical. Horkheimer wrote to Adorno suggesting he dismiss Habermas from the Institute. The following year Horkheimer rejected Habermas’ Habilitationsschrift proposal on the public sphere. Habermas did not want to alter his project, so he completed his Habilitation at the University of Marburg under the Marxist political scientist Wolfgang Abendroth.

His Habilitationsschrift, The Structural Transformation of the Public Sphere (German 1962, English 1989), was well received in Germany. It chronicled the rise of the bourgeois public sphere in 18th and 19th century Europe, as well as its decline amidst the mass consumer capitalism of the 20th century. Habermas gave an account of the way in which newspapers, coffee shops, literary journals, pubs, public meetings, parliament and other public forums facilitated the emergence of powerful new social norms of discourse and debate that mediated between private interests and the public good. These forums functioned as mechanisms to disseminate information and help freely form the public political will needed for collective self-determination. These norms also partly embodied important principles like equality, solidarity, and liberty. By the late 19th century, however, capitalism was increasingly monopolistic. Large corporations easily influenced the state and society. Economic elites could use ownership of the media and other (previously public) forums to manipulate or manufacture public opinion and buy off politicians. Citizens deliberating about the common good were transformed into atomized consumers pursuing private interests. Habermas describes this as the “re-feudalization” of the public sphere. While his narrative was pessimistic, the end of Structural Transformation seems to hold out hope that the truncated normative potential of the public sphere may yet be revived. The work solidified Habermas’ place in the German academy. After a short stint in Heidelberg, he returned to the University of Frankfurt in 1964 as a professor of philosophy and sociology, taking over the chair vacated by Horkheimer’s retirement.

In the spirit of his early call for renewed public sphere debate, Habermas has consistently engaged political movements as a public intellectual and taken part in various scholarly debates. This has not always been easy. After returning to Frankfurt he had been a mentor for the German student movement, but had a falling out with student radicals in 1967. In June of that year a variety of simmering protests (over the restructuring of German universities, proposed “emergency laws,” the Vietnam War, and other issues) boiled over. The breaking point came when a student at a protest against the Shah of Iran was shot and fatally beaten by plainclothes police, who then tried to cover up the incident. This stoked the flames of student protests. Sit-ins and protests crippled everyday life. Under the leadership of Rudi Dutschke, students occupied the Free University of Berlin.

Habermas worried that protest leaders seemed to be advocating an unsophisticated and extra-legal opposition to any and all authority that could easily lead to violence. At a conference in Hannover shortly after the shooting he publicly reproached Dutschke by calling his model of extra-legal direct-action “left fascism.” That charge alienated Habermas from the leftist student movement and inspired an essay collection Die Linke antwortet Jürgen Habermas (The Left Answers Habermas, German 1969). Rapprochement would only come a decade later when, in the aftermath of a series of killings by the radical left-wing Red Army Faction, politicians on the right tried to garner political capital by suggesting that such terrorism was rooted in the ideas of Frankfurt School Critical Theory. Habermas and Dutschke published pieces repudiating the accusation. A decade later, the editor of the essay collection apologized for how the book made it seem as though Habermas’ falling out with the student movement marked a conservative turn that meant he was no longer part of the left.

As a public intellectual, Habermas has engaged a variety of topics: the anti-nuclear movement of the late fifties, the “Euromissile” debate of the early eighties and, in the early two-thousands, both the terrorism of 9/11 and the second Iraq War. In the second half of the eighties he was also a key voice in the Historikerstreit debate between historians, philosophers, and other academics about the proper way for Germany to situate and remember the Holocaust amidst the history of other atrocities. In 1989 he made important contributions to public debate about the reunification of Germany. While Habermas was not against reunification, he was critical of the speed and manner in which reunification was carried out. More recently, he has approached public debate on the European Union along broadly similar lines: with a cautious optimism that remains on guard against a forced, rushed, or false unity that would lack legitimacy and stability over the long term.

In a more academic vein, he has had numerous exchanges with thinkers like Jacques Derrida, Richard Rorty, Hans-Georg Gadamer, Niklas Luhmann, John Rawls, Robert Brandom, Hilary Putnam, and Cardinal Joseph Ratzinger (before he was Pope Benedict XVI). His ongoing debate with postmodernism is arguably the most enduring of these exchanges. Broadly speaking, thinkers like Michel Foucault, Jacques Derrida, and Richard Rorty have leveled criticisms to the effect that reason is little more than a historically and culturally contingent social form, that notions of universally valid morality and truth are ethnocentric projections of power, that interests shaped by radically different ways of life are irreconcilable, and that our belief in the emancipatory moral progress of humankind is a myth. Habermas has tried to meet such challenges in much the same way as he responded to Horkheimer and Adorno’s Dialectic of Enlightenment: by relying on his account of communicative rationality in Theory of Communicative Action. However, before turning to that more mature theory, we must survey a few major phases of his formative and transitional work.

2. Enduring Themes in Formative and Transitional Work

a. Public Deliberation Over Positivist Decisionism and Technocracy

The essays in Toward a Rational Society (German 1968 and 1969, English 1970) and Theory and Practice (German 1971, English 1973b) were written on the heels of Structural Transformation, amidst the “positivism dispute” in Germany about the relation between the natural and social sciences. The (somewhat inaccurately labeled) “positivist” side of this debate took scientific inquiry as the sole paradigm of knowledge and generally thought of the social sciences as analogous to the natural sciences. Following Adorno, Habermas argued against a positivistic understanding of the social sciences.

For Habermas, positivism comprises three claims: (1) knowledge consists of causal explanations cast in terms of basic laws or principles (for example, laws of nature), (2) knowledge passively reflects or mirrors independently existing natural facts, (3) knowledge is about what is, not what ought to be. He calls these claims scientism, objectivism, and value-neutrality. He argues that each can be pernicious, especially in the social scientific realm. Scientism fosters the view that only causal and empirically verifiable hypotheses can count as true knowledge. Objectivism seems to falsely naturalize the world by ignoring how lived experiences, human subjectivity, and interests can structure the object domain that gets identified as relevant or worthy of study. Lastly, value-neutrality misleads us into thinking that the role of knowledge is purely descriptive and technical. Values or preferences are seen as separate from knowledge and, as such, wholly subjective “givens” lying beyond rational justification. In turn, knowledge is seen as a tool for efficiently controlling the environment so as to realize whatever values an agent happens to hold. Ironically, this fails to see the tacit value commitments already inscribed in this general paradigm of knowledge.

Habermas’ critique makes sense given his place in Frankfurt Critical Theory. Despite differences with the first generation, he shares the decidedly non-neutral commitments to human emancipation, interdisciplinarity, and self-reflexive theory. Like Horkheimer and Adorno, Habermas worried the prior ascendancy of positivism had left influences on our conceptualizations of knowledge and social inquiry that were hard for even reflective positivists to leave behind. Indeed, he critiques Karl R. Popper’s account of inquiry and knowledge even though it rejects what Habermas calls objectivism. In opposition to a positivist picture of knowledge merely mirroring the world, Habermas holds the Frankfurt School’s Hegelian-Marxist-inspired conception of a dialectical relation between knowledge and world. Finally, like his Frankfurt School contemporaries, Habermas was concerned that positivism had left subtle yet pernicious impacts on politics.

In early writings Habermas is especially critical of two related trends, decisionism and technocracy, that stem from a positivistic understanding of political science and practice. Decisionism starts from the assumption that there is no such thing as the public interest, but rather a clash of inherently subjective values that do not (even in principle) admit of rational persuasion or agreement. It follows that political elites must either simply decide between competing values or base policy on their aggregation. Either way, political value preferences are taken as brute or static facts; there is no sense in which reasoned argumentation and persuasion could genuinely transform such preferences or lead people to a new understanding of their values. Technocracy builds from this point by emphasizing the “objective necessities” (Sachzwänge) supposedly involved in a political system—economic growth, social stability, national security—and highlighting the increasing ability of policy experts to advise political leaders about strategies for optimally realizing these goals. The worry with this approach is that questions about what specific type of growth, stability, and security we seek (and why) are removed from debate by definitional fiat. In decisionism, political legitimacy flows from periodic expressions of acclamation or disapproval at the way leaders have manifested predefined values. In technocracy, legitimacy supposedly flows from the ability of politicians to find and follow expert advice so as to attain fixed outcomes pre-defined by “objective necessities.” Both models render the potentially transformative effects of public deliberation superfluous. Legitimacy is seen as flowing from either certain outcomes or periodic expressions of aggregate preference.

Habermas thinks both models are extremely problematic accounts of democratic political practice and legitimacy. While Structural Transformation only gestured at how the normative potential of the public sphere could be reinvigorated in contemporary circumstances, this theme received increasing attention in works such as Legitimation Crisis (German 1973, English 1975), Theory of Communicative Action, and Between Facts and Norms (German 1992, English 1996). An account of democratic legitimacy that combats decisionism and technocracy is an enduring concern. Indeed, despite championing the European Union, he has continued to critique technocracy by criticizing the way in which the Union has arisen and is currently structured (2008, 2009, 2012, 2014).

b. From Philosophical Anthropology to a Theory of Social Evolution

Knowledge and Human Interests (German 1968, English 1971) and Communication and the Evolution of Society (German 1976, English 1979) are two early attempts at a new systematic framework for Critical Theory. The approaches he uses are akin to the tradition of “philosophical anthropology” in the German social theory of the early 1900s that grew out of phenomenology—a tradition that is quite different from contemporary anthropology. Knowledge and Human Interests sought to overcome positivist epistemology that saw knowledge as simply discerning static facts, and to give a plausible account of the dialectical relation between knowledge (theory) and world (practice). Habermas’ main claim was that the knowledge of scientific and social progress is tacitly guided by three types of “knowledge constitutive interests”—technical, practical, and emancipatory—that are “anthropologically deep-seated” in the human species.

Knowledge and Human Interests tries to recover and develop alternative models of the relation between theory and practice. The approach is historical and reconstructive in that it interprets the attempts of prior theorists as part of a trajectory that Habermas wants to extend. He reviews prior reformulations of Kant’s “transcendental synthesis” (the form-legislating activity making objective experience possible) and his “transcendental unity of apperception” (the unity of the subject having such experience). He also tries to articulate the way in which Hegel relocated such synthesis in the historical development of human subjectivity (absolute spirit) and how Marx relocated it in the material use of tools and techniques (embodied labor). Habermas wants to add to such a trajectory by rehabilitating their shared insight that the constitution of experience is not generated by transcendental operations but by the worldly natural activities of the human species. Yet he wants to do this in a way that avoids the mistakes of Marx and Hegel as well. He tries to do this by building on his interpretation of Hegel, which was already concisely captured by his essay Science and Technology as Ideology (German 1968, included in English 1970).

In that essay he responded to Herbert Marcuse’s claim that the technical reason of science inherently embodies domination. According to Marcuse, under late capitalism the technical reason of science functions ideologically to collapse intersubjective practical questions about how we want to live together into technical questions about how to control the world to get what we want. Habermas shares Marcuse’s concerns, as his criticism of technocracy makes clear. Yet he thinks this dynamic is contingent because, taken as an emergent collective project, humankind constitutes how the world shows up in experience through its worldly activity. More specifically, Habermas identifies two irreducibly distinct and dialectically related modes of human self-formation, “labor” and “interaction.” Whereas labor is an action type that aims at technical control to achieve success, interaction is an action type that aims at mutual understandings embodied in consensual norms. Marcuse’s claim (and his remedy of a “new science”) would only stand if the “interaction” of intersubjective collective political choice—including the question of how we use technology—was somehow subsumed or rendered superfluous by the “labor” of technological progress in controlling the external world. But, given Habermas’ views in this period, this is impossible. Interaction and labor seem to be pitched as irreducible and invariant categories of human experience. Neither can be dropped nor can one be subsumed in the other—even if their relation becomes unbalanced.

In Knowledge and Human Interests, this division between labor and interaction is recast as the technical and practical interests of humankind. The technical interest is in the material reproduction of the species through labor on nature. Humans use tools and technologies to manage nature for material accommodation. The practical interest is in the social reproduction of human communities through intersubjective norms of culture and communication. Human social life requires members who can understand each other, share expectations, and achieve cooperation. In a sense, these interests are the “most fundamental.” Moreover, the knowledge that flows from them is supposed to slowly accrue over time in the enduring institutions of society: theoretical knowledge driven by the technical interest in controlling nature accrues in the “empirical-analytic” sciences, and normative knowledge driven by the practical interest in mutual understandings accrues in the interpretive “historical-hermeneutic” sciences.

But, going beyond Science and Technology as Ideology, in Knowledge and Human Interests Habermas adds a third “emancipatory” human interest in freedom and autonomy. The labor of material reproduction and the interaction norms of social reproduction require, in a weak sense, psychosocial mechanisms to repress or deny basic drives and impulses that would destroy material and social reproduction. For instance, labor requires delayed gratification and social interaction requires internalized notions of obligation, reciprocity, shame, guilt, and so forth. Unfortunately, psychosocial mechanisms of control are often used far more than they need to be to secure material and social reproduction. Indeed, perverse incentives to rely on such mechanisms may even arise: if the burdens and benefits of material and social reproduction processes become unfairly distributed across groups and solidified over time, then those in power may find psychosocial mechanisms useful. If women are falsely taught there are natural laws of gender relations such that the dominant patterns of marriage and domestic work that consistently disadvantage them are the best they can hope for, this is an ideological mechanism of social control. It is the limitation of freedom and autonomy for no purpose other than domination, and it “functions” through systematically distorted communication.

Habermas posits a human interest in using self-reflection and insight to combat ideologically veiled, superfluous social domination so as to realize freedom and autonomy. While there is no clearly institutionalized set of sciences where the knowledge spurred on by such an interest would accrue, Habermas points to Marx’s critique of ideology and Freud’s psychoanalytic dissolution of repression as demonstrating a cognitive viewpoint that focuses on neither (efficient) work nor (legitimate) interaction but (free) identity formation liberated from internalized systematically distorted communication. Here Habermas takes his lead from Kant’s idea that reason aims to emancipate itself from “self-incurred tutelage,” and tries to forge a link between theory (reason) and practice (in the sense of self-realization) through using critical reflection on self and society to unveil and dissolve internalized oppressive power structures that betray one’s own true interests.

Knowledge and Human Interests was envisioned as a preface for two other books that would jointly challenge the separation of theory and practice. However, the project was never finished. On the one hand, Habermas felt that vibrant critiques of positivism in the philosophy of science made the rest of the project superfluous. On the other, the work encountered heavy criticism. For starters, Habermas seems to pitch work and interaction as real action types. But, if we account for how work is communicatively structured, how interaction is teleologically ordered, and how historical notions of work and interaction structure one’s sense of freedom, then it is clear these can be at best idealizations. Moreover, as even sympathetic interpreters noted, his account of an emancipatory interest seemed to blur together reflection on “general presuppositions and conditions of valid knowledge and action” with “reflection on the specific formative history of a particular individual or group” (Giddens, McCarthy, 95). Lastly, his stipulation of knowledge-constitutive interests seemed to reproduce the sort of foundationalism he wished to avoid.

Given such criticism, it may seem surprising that Communication and the Evolution of Society reconstructs Marx’s historical materialism as a theory of social evolution. This sounds foundationalist and deterministically teleological. These impressions are misleading. Around this time Habermas began presenting his work as a “research program” with tentative and fallible claims evaluable by theoretical discourses. Moreover, while he speaks of evolution, he uses the term differently than 19th century philosophies of history (Hegel, Marx, Spencer) or later Darwinian accounts. His “social evolution” is neither a merely path-dependent accumulative directionality nor a progressive, strongly teleological realization of an ideal goal. Instead, he envisions a society’s latent potentials as tending to unfold according to an immanent developmental logic similar to the developmental logic cognitive-developmental psychologists claim maturing people normally follow. Lastly, Habermas’ theory of social evolution avoids worries about determinism by distinguishing between the logic and the mechanisms of development such that evolution is neither inevitable, linear, irreversible, nor continuous. A brief sketch of his theory follows.

Habermas characterizes human society as a system that integrates material production (work) and normative socialization (interaction) processes through linguistically coordinated action. This is qualitatively different from the static and transitive status hierarchy systems of even other “social” animals. In various human epochs the linguistic coordination of these processes crystalizes around different “organizational principles” that are the “institutional nucleus” of social integration. In the most basic societies kinship structures play this role by (to take just one possible configuration) dividing labor and specifying socialization responsibilities through sex-based roles and norms. Habermas claims this organizational principle was replaced by political order in traditional societies and the economy in liberal capitalist societies. Social evolution in general and the particular movements from one “nucleus” to the next stem from learning in material and social reproduction.

Understood as ideal types, work and interaction mark out different ways of relating to the world. On the one hand, in material production one mainly adopts an instrumental perspective that tries to control an object in conformity to one’s will. In this orientation, learning is gauged by success in controlling the world and the resultant knowledge is cognitive-technical. On the other, in social reproduction one mainly adopts a communicative perspective that tries to coordinate actions and expectations through consensually agreed upon normative standards. In this orientation learning is gauged by mutual understanding and the resultant knowledge is moral-practical. Each learning process follows its own logic. But, since the processes are integrated in the same social system, advances in either type of knowledge can yield internal tensions or incongruities. These cannot be suppressed by force or ideology for long, and eventually need to be solved by more learning or innovation. If these internal tensions are too great, they induce a crisis requiring an entirely new “institutional nucleus.”

For Habermas, the slow social learning in history is the sedimentation of iterated processes of individual learning that accumulates in social institutions. While there is no unified macro-subject that learns, social evolution is also not mere happenstance plus inertia. It is the indirect outcome of individual learning processes, and such processes unfold with a developmental logic or deep structure of learning: “the fundamental mechanism for social evolution in general is to be found in an automatic inability not to learn. Not learning but not-learning is the phenomenon that calls for explanation” (LC, 15; also see Rapic 2014, 68). Habermas posits a universal developmental logic that tends to guide individual learning and maturation in technical-instrumental and moral-practical knowledge. He discerns this logic in the complementary research of Jean Piaget in cognitive development and Lawrence Kohlberg in the development of moral judgment. As social and individual learning are linked, such underlying logic has slowly created homologies—similarities in sequence and form—between: (i.) individual ego-development and group identity, (ii.) individual ego-development and world-perspectives, and (iii.) the individual ego-development of moral judgment and the structures of law and morality (Owen 2002, 132). Habermas pays more attention to the last homology and later writings focus on Kohlberg, so it is instructive to focus there (1990b).

Kohlberg’s research on how children typically develop moral judgment yielded a schema of three levels (pre-conventional, conventional, and post-conventional) and six stages (punishment-obedience, instrumental-hedonism/relativism, “good-boy-nice-girl,” law-and-order, legalistic social-contract, and universal ethical principles). Two stages correspond to each level. Habermas follows Kohlberg’s three levels in claiming we can retrospectively discern pre-conventional, conventional, and post-conventional phases through which societies have historically developed. Just as normal individuals who progress from child to adult pass through levels where different types of reasons are taken to be acceptable for action and judgment, so too we can retrospectively look at the development of social integration mechanisms in societies as having been achieved in progressive phases where legal and moral institutions were structured by underlying organizational principles.

Habermas slightly diverges from the six stages of Kohlberg’s schema by proposing a schema of neolithic societies, archaic civilizations, developed civilizations, and early modern societies. Neolithic societies organized interaction via kinship and mythical worldviews. They also resolved conflicts via feuds appealing to an authority to mediate disputes in a pre-conventional way to restore the status quo. Archaic civilizations organized interaction via hierarchies beyond kinship and tailored mythical worldviews backing such hierarchies. Conflicts started to be resolved via mediation appealing to an authority relying on more abstract ideas of justice—punishment instead of retaliation, assessment of intentions, and so forth. Developed civilizations still organized interaction conventionally, but adopted a rationalized worldview with post-conventional moral elements. This allowed conflicts to be mediated by a type of law that, while rooted in a community’s (conventional) moral framework, was separable from the authority administering it. Finally, with early modern societies, we find certain domains of interaction are post-conventionally structured. Moreover, a sharper divide between morality and legality emerges such that conflicts can be legally regulated without presupposing shared morality or needing to rely on the cohering force of mythical worldviews backing hierarchies (McCarthy 1978, 252).

Obviously, this sketch is rather vague and needs further elaboration. This is especially true in light of the ways a superficial reading (that takes social evolution as strictly parallel rather than homologous to individual development) lends itself to unsavory developmentalist narratives. Yet, apart from a few later writings, Habermas has not returned to his theory of social evolution in a systematic way. Several secondary authors have tried to fill in the details (Rockmore 1989, Owen 2002, Brunkhorst 2014, Rapic 2014). Nevertheless, Habermas still endorses the contours of his theory of social evolution: these ideas show up in Theory of Communicative Action and his later writings on the nature and development of legality and democratic legitimacy bear a loose connection to this early work (especially the final homology above) insofar as they are tailored for specifically post-conventional societies. Yet, before turning to his democratic theory, we must tackle the hugely important intervening body of work concerning his communicative turn and its articulation in his Theory of Communicative Action.

3. The Linguistic Turn into the Theory of Communicative Action

Habermas’ engagement with speech act theory and hermeneutics in the late 1960s and 70s started a linguistic turn that came to full fruition in Theory of Communicative Action. This turn makes sense after both Knowledge and Human Interests and Communication and the Evolution of Society. He came to see the knowledge-constitutive interests of the former as illicitly relying on assumptions in the philosophy of consciousness and Kantian transcendentalism, while the reconstructed phases of social learning and evolution in the latter can seem far too naturalistic or foundationalist. In contrast, a focus on communicative structures let him form his own pragmatic theory of meaning, rationality, and social integration based in reconstructions of the competencies and normative presuppositions underlying communication. This approach is transcendental and naturalistic but only weakly so. Far from an account of ultimate foundations, his approach takes itself to be a post-metaphysical methodology for philosophical and social scientific research into practical reason. From the start of his linguistic turn until well after Theory of Communicative Action this approach underwent revisions. In what follows, only a broad outline of this trajectory is given.

Habermas has cited his 1971 Gauss lectures at Princeton (German publication 1984b, English publication 2001) as the first clear expression of the linguistic turn, but it was also evident in On the Logic of the Social Sciences (German 1967, English 1988a). His first truly systematic foray in Anglo-American philosophy of language came with What is Universal Pragmatics? (German 1976b, included in English 1979). His ideas were then revised further in Theory of Communicative Action. While the development of his ideas throughout this period is an important exegetical task, for present purposes the broad way he takes up speech act theory is what is important: he accepts the division in linguistics between syntax, semantics, and pragmatics. He considers each division to be reconstructing the tacit system of rules used by competent speakers to recognize the well-formed-ness (syntax), meaningfulness (semantics), and success (pragmatics) of speech. His main interpretive twist is that the theories of truth-conditional propositional meaning often associated with philosophical projects regarding language only locate part of the meaning of speech. Thus, he moves away from meaning based on the correspondence theory of truth and gives an account of the unique pragmatic validity behind the meaning of speech.

While his linguistic turn is sometimes cast as a break with prior theory, his interpretive approach actually coheres quite well with his early critique of positivism. He has always rejected the idea that language simply states things about the world. Instead of merely analyzing propositions that either do (true) or do not (false) obtain in the world, he is interested in the full range of ways people use language. He claims that, instead of focusing on sentences, a complete theory of language would focus on contextual utterances as the most basic unit of meaning. Thus, he developed a formal pragmatics (called “universal pragmatics” in early work). Building on the work of Karl Bühler, he conceives of the pragmatic use of language in context as embedding sentences in relations between speaker, hearer, and the world. This embedding helps to intersubjectively stabilize such relations. Habermas claims that, in uttering a speech act, speakers mean something (express subjective intentions), do something (interact with or appeal to a hearer), and say something (cognitively represent the world). While truth-conditional theories of meaning focus on cognitive representations of the world, Habermas prioritizes the pragmatics of speech acts over the semantic or syntactical analysis of sentences. What is done through speech is taken to be what is most basic for meaning.

During his linguistic turn Habermas appropriated several ideas from John Searle. Even though Searle has not always fully agreed with such appropriations, two of them are useful points of orientation (Searle 2010, 62). Habermas adopts Searle’s idea of the constitutive rules underlying language: just as the rules of a game define what counts as a legitimate move or status, so too there is an implicit rule-governed structure to the use of language by competent speakers. He also adopts Searle’s view, built on J. L. Austin’s work, that speech has a double structure of both propositional content and illocutionary force. For instance, the propositional content of “it is snowy in Chicago” is a representation of the world. But the same content can be used in different illocutionary modes: as a warning to drive carefully, as a plea to delay travel, as a question or answer in a larger conversation, and so on. Moreover, beyond such illocutionary force, all speech acts also have derivative perlocutionary effects that, unlike illocution, are not internally connected to the meaning of what is said. A warning about snow may elicit annoyance or gratitude, but such responses are contextually inferred and not necessarily connected to either the propositional content or the warning itself.

These ideas about the structure of speech highlight a few key points. First, Habermas takes perlocutionary success (for example, eliciting gratitude) to be parasitic on illocutionary force (for example, the speech is perceived as a warning, not a plea). Attaining success with others by realizing one’s intention in the world is secondary to achieving an understanding with them. For example, even when lying, the lie only works by first coming to a false understanding that what is being said is true. Second, he identifies three modes of communication—cognitive, interactive, and expressive—that depend on whether a speaker’s main illocutionary intention is to raise a truth claim of propositional content, a claim of rightness for an act, or a claim of sincerity about psychological states. Third, he identifies corresponding speech act types—constatives, regulatives, and expressives—that, seen from the perspective of a competent language user, contain immanent obligations to redeem the aforementioned claims by respectively providing grounds, articulating justifications, or proving sincerity and trustworthiness.

In short, Habermas thinks there are general presuppositions of communicative competence and possible understanding that underlie speech and which require speakers to take responsibility for the “fit” between an utterance and inner, outer, and social worlds. For any speech act oriented towards mutual understanding, there is a presumed fit of sincerity to the speaker’s inner world, truth to the outer world, and rightness to what is inter-subjectively done in the social world. Naturally, these presumptions are defeasible. Yet, the point is that speakers who want to reach an agreement have to presuppose sincerity, truth and rightness so as to be able to mutually accept something as a fact, valid norm, or subjectively held experience.

For Habermas these elements form the “validity basis of speech.” He claims that, by uttering a speech act, a speaker is seen as also potentially raising three “validity claims”: sincerity for what is expressed, rightness for what is done, and truth for what is said or presupposed. Depending on the speech act type, one claim often predominates (for example, constatives raise a validity claim of truth) and, more often than not, speech rests on undisturbed background agreements about facts, norms, and experiences. Moreover, minor disagreement can be quickly resolved through clarifying meaning, reminding others of facts, asking about preexisting commitments, highlighting situational features, and so on. Habermas sometimes refers to such minor communicative repairs as “everyday speech.” But when disagreement persists we may need to transition to what Habermas calls “discourse”: a particular mode of communication in which a hearer asks for reasons that would back up a speaker’s validity claim. In discourse the validity claims that are always immanent within speech become explicit.

Clearly, Habermas uses “validity” in an odd way. The notion of validity is most often used in formal logic where it refers to the preservation of truth when inferentially moving from one proposition to another in an argument. This is not how Habermas uses the term. What then does he mean by validity? It is instructive to look at the assumptions behind his theory of meaning. When his model of meaning emphasizes what language does over what it merely says or means, the operative assumption is that the primary function of speech is to arrive at mutual understandings enabling conflict-free interaction. Moreover, at least with respect to claims of truth and rightness, he assumes genuine and stable understandings arise out of the give and take of reasons. Claims of truth and rightness are paradigmatically cognitive in that they admit of justification through reasons offered in discourse. What Habermas means by validity then is a close structural relationship between the give and take of reasons and either achieving an understanding or (more strongly) a consensus that allows for conflict-free interaction. This yields an “acceptability theory” of meaning where the acceptance of norms is always open to further debate and refinement through better reasons.

As we cannot know in advance what reasons will bear on a given issue, only robust and open discourses license us to take the (provisional) consensuses we do achieve as valid. Habermas therefore formulates formal and counterfactual conditions—the “pragmatic presuppositions” of speech and the “ideal speech situation”—that describe and set standards for the type of reason-giving that mutual understandings must pass through before we can regard them as valid (on these formal conditions and how understanding and consensus may differ see below and section 4). At the same time, we never start this give-and-take of reasons from scratch. People are born into cultures operating on background understandings that are embodied in inherited norms of action. Borrowing from Husserl and others, Habermas calls this stock of understandings the “lifeworld.”

The lifeworld is an important if somewhat slippery idea in Habermas’ work. One way to understand his particular interpretation of it is through the lens of his debate with Gadamer. Broadly speaking, Habermas agrees with the view of language held by Gadamer and hermeneutics generally: language is not simply a tool to convey information, its most basic form is dialogic use in context, and it has an inbuilt aim of understanding. On such a view, objectivity is not just correspondence to an independent world but instead something that is ascribed to mutual understandings (about the world, relations to others, and oneself) intersubjectively achieved in communication. Moreover, communication has an underlying structure that makes understandings possible in the first place. Meaning is therefore in some sense parasitic on this background structure.

On this much Gadamer and Habermas agree. But Gadamer takes all this to mean that explicit understanding and misunderstanding are only possible due to a taken-for-granted understanding of cultural belonging and socialization into a natural language. Habermas agrees that culture and socialization are important, but is worried that Gadamer’s take on the background structures that form the “conditions of possibility” for meaning yields a relativistic “absolutization of tradition.” On Habermas’ interpretation the lifeworld encompasses the sort of belonging and socialization referred to by Gadamer, but it works with and is underpinned by certain deep structures of communication itself. For Habermas, the complementarity between the lifeworld and a particular manifestation of these deep structures in discourse and “communicative action” (below) is what lets one interrogate and progressively revise parts of the background stock of inherited understandings and validity claims, thereby avoiding either relativism or the dogmatic veneration of tradition.

For Habermas the lifeworld is a reservoir of taken-for-granted practices, roles, social meanings, and norms that constitutes a shared horizon of understanding and possible interactions. The lifeworld is a largely implicit “know-how” that is holistically structured and unavailable (in its entirety) to conscious reflective control. We pick it up by being socialized into the shared meaning patterns and personality structures made available by the social institutions of our culture: kinship, education, religion, civil society, and so on. The lifeworld sets out norms that structure our daily interactions. We don’t usually talk about the norms we use to regulate our behavior. We simply assume they stand on good reasons and deploy them intuitively.

But what if someone willfully breaks or explicitly rejects a norm? This calls for discourse to explain and repair the breach or alter the norm. As a micro-level example: if someone breaks a promise then they will be asked to justify their behavior with good reasons or apologize. Such communication is also called for when norms suffer more serious breakdowns: one may question the reasons behind norms and whether they remain valid, or run into a new and complex situation where it is unclear whether any norms apply and, if so, which ones, how, and to what extent. Regardless of how serious the norm breach or breakdown is, we need to engage in discourse to repair, refine, and replenish shared norms that let us avoid conflict, stabilize expectations, and harmonize interests. Discourse is the legitimate modern mechanism to repair the lifeworld; it embodies what Habermas calls “communicative action.”

Communicative action can be seen as a practical attitude or way of engaging others that is highly consensual and that fully embodies the inbuilt aim of speech: reaching a mutual understanding. In later writings Habermas distinguishes weak and strong communicative action. The weak form is an exchange of reasons aimed at mutual understanding. The strong form is a practical attitude of engagement seeking fairly robust cooperation based in consensus about the substantive content of a shared enterprise. This allows solidarity to flourish. In either form, communicative action is distinct from “strategic action,” wherein socially interacting people aim to realize their own individual goals by using others like tools or instruments (indeed, he calls this type of action “instrumental” when it is solitary or non-social). A key difference between strategic and communicative action is that strategic actors have a fixed, non-negotiable objective in mind when entering dialogue. The point of their engagement is to appeal, induce, cajole, or compel others into complying with what they think it takes to bring their objective about. In contrast, communicatively acting parties seek a mutual understanding that can serve as the basis for cooperation. In principle, this involves openness to an altered understanding of one’s interests and aims in the face of better reasons and arguments.

The contrast between communicative and strategic action is tightly linked to the distinction between communicative and purposive rationality. Purposive rationality is when an actor adopts an orientation to the world focused on cognitive knowledge about it, and uses that knowledge to realize goals in the world. As noted, it has social (strategic) and non-social (instrumental) variants. Communicative rationality is when actors also account for their relation to one another within the norm-guided social world they inhabit, and try to coordinate action in a conflict-free manner. On this model of rationality, actors care not only about pursuing their own goals or following the relevant norms that others follow, but also about challenging and revising those goals and norms on the basis of new and better reasons.

Approaching rationality after action orientations is not merely stylistic. Habermas notes that while many theorists start with rationality and then analyze action, the view of action that such an order of analysis primes us to accept can tacitly smuggle in quasi-ontological connotations about the possible relations actors can have amongst themselves and to the world. Indeed, this mistake figures into Habermas’ critique of Weber’s account of the progressive social rationalization ushered in by modernity. Weber framed Western rationalism in terms of “mastery of the world” and then naturally assumed the rationalization of society simply meant increased purposive rationality. As is apparent from Habermas’ account of social learning, this is not the only way to understand the “evolution” of societies or the species as a whole throughout history. By expanding rationality beyond purposive rationality Habermas is able to resist the Weberian conclusion that had been attractive to Horkheimer and Adorno: that modernity’s increasing “rationalization” yielded a world devoid of meaning, that it left people focused on control for their own individual ends, and that the spread of enlightenment rationality went conceptually hand in glove with domination. Habermas feels the notion of rationality in his Theory of Communicative Action resists such critiques.

The contrast between communicative and strategic action mainly concerns how an action is pursued. Indeed, while these action orientations are mutually exclusive when seen from an actor’s perspective, the same goal can often be approached in either communicative or strategic ways. For instance, in my rural town I may have a discussion with neighbors whereby we determine we share an interest in having snow cleared from our road, and that the best way to do this is by taking turns clearing it. This could count as an instance of communicative action. But imagine a wealthy and powerful recluse who is indifferent to his neighbors. He could just pay a snowplow to clear the road up until his driveway. He could also use his power to manipulate or threaten others to clear the snow for him (for example, he could call the mayor and hint he may withhold a campaign donation if the snow is not cleared). Strategic action is about eliciting, inducing, or compelling behavior by others to realize one’s individual goals. This differs from communicative action, which is rooted in the give-and-take of reasons and the “unforced force” of the best argument justifying an action norm.

Strategic action and purposive rationality are not always undesirable. There are many social domains where they are useful and expected. Indeed, they are often needed because communicative action is very demanding and modern societies are so complex that meeting these demands all the time is impossible. Speakers engaged in communicative action must offer justifications to achieve a sincerely held agreement that their goals and the cooperation to achieve them are seen as good, right, and true (see section 4). But, in complex and pluralistic modern societies, such demands are often unrealistic. Modern social contexts often lack opportunities for highly consensual discussion. This is why Habermas thinks weak communicative action is likely sufficient for low stakes domains where not all three types of validity claims predominate, and why strategic interaction is well-suited for other domains. For Habermas, modern societies require systematically structured social domains that relax communicative demands yet still achieve a modicum of societal integration.

Habermas takes the institutional apparatus of the administrative state and the capitalist market to be paradigmatic examples of social integration via “systems” rather than through the lifeworld. For example, if a state bureaucracy administers a benefit or service it takes itself to be enacting prior decisions of the political realm. As such, open-ended dialogue with a claimant makes no sense: someone either does or does not qualify; a law either does or does not apply. Similarly, in a clearly defined and regulated market actors know where market boundaries lie and that everyone within the market is strategically engaged. Each market actor seeks individual benefit. It makes little sense to attempt an open-ended dialogue in a context where one supposes all others are acting strategically for profit. Both domains coordinate action, but not through robustly cooperative and consensual communication that yields solidarity. Certainly, not all large-scale and institutionalized interaction is strategic. Some social domains like scientific collaboration or democratic politics institutionalize reflexive processes of communicative action (see section 5 on democratic theory). In such fora cooperation may yield solidarity across the enterprise. Even so, the systems integration like that found in bureaucracies or markets sharply differs from integration through communicative action.

It should be stressed that these are simply paradigmatic examples, and that the same social domain can be institutionalized differently across societies. It is therefore more useful to look at the coordinative media that are typically used to interact with and steer any given institutionalized system rather than positing a fictive typology of clear social domains wherein it is assumed that either strategic or communicative action takes place. Habermas identifies three such media: speech, money, and power. Speech is the medium by which understanding is achieved in communicative action, while money and power are non-communicative media that coordinate action in realms like state bureaucracies or markets. A medium may largely be used in one social domain but that doesn’t mean it has no role in others. While speech is certainly the main medium of healthy democratic politics, this doesn’t mean money and power never play a role.

This all might seem to imply that there is no single correct way for system and lifeworld to jointly achieve social integration. Indeed, the complementarity between system and lifeworld laid out in Theory of Communicative Action is broad enough to accommodate a wide range of institutional pluralism with respect to the structure of markets, bureaucracies, politics, scientific collaboration, and so on. But the claim that there is no “one size fits all template” for social integration should not be taken as the claim that system and lifeworld have no proper relationship. Socialization into a lifeworld precedes social integration via systems. This is true historically and at the individual level.

Moreover, Habermas claims the lifeworld has conceptual priority with respect to systems integration. His thinking runs as follows: the lifeworld is the codified (yet revisable) stock of mutual normative understandings available to any person for consensually regulating social interaction; it is the reservoir of communicative action. Systems integration represents carefully circumscribed realms of instrumental and strategic action wherein we are released from the full demands of communicative action. Yet the very definition and limitation of these realms always depends on communicative action regarding, for example, the types of markets or state administration a community wants to have and why. Without being rooted in the mutual understandings of the lifeworld, we would get untrammeled systems of money and power disconnected from the intersubjectively vouchsafed practical reason that Habermas thinks underpins all meaning. The organizing principles of systems themselves would stop being coherent. For instance, market competition makes sense against a backdrop of normative principles like fairness, equal opportunity to compete, rules against capitalizing on secret information, and so on. But if markets were so “no-holds-barred” that these principles no longer applied, then engaging in market activity would cease to make sense. Similarly, if markets were so regulated that there was no genuine risk or opportunity they would also start to lose coherence as an enterprise. In both these skewed hypothetical scenarios the system is rigged and thus, if there are functional alternatives, it is not worth participating in. This is a variant of his early anti-technocracy argument. Positing “objective necessities” like economic growth, social stability, and national security and then circumventing communicative action veils disagreement on what type of growth, stability, and security is important for a given community and why. As such, systems designed to achieve these ends are primed to lose the coherence and legitimacy that are based in widely accepted structuring principles.

Habermas thinks the lifeworld self-replenishes through communicative action: if we come to reject inherited mutual understandings embedded in our normative practices, we can use communicative action to revise those norms or make new ones. Mechanisms of systems integration depend on this lifeworld backdrop for their coherence as enterprises achieving a modicum of social integration. The trouble is that systems have their own self-perpetuating logic that, if unchecked, will “colonize” and destroy the lifeworld. This is a main thesis in Theory of Communicative Action: strategic action embodied in domains of systems integration must be balanced by communicative action embodied in reflexive institutions of communicative action such as democratic politics. If a society fails to strike this balance, then systems integration will slowly encroach on the lifeworld, absorb its functions, and paint itself as necessary, immutable, and beyond human control. Current market and state structures will take on a veneer of being natural or inevitable, and those they govern will no longer have the shared normative resources with which they could arrive at mutual understandings about what they collectively want their institutions to look like. According to Habermas, this will lead to a variety of “social pathologies” at the micro level: anomie, alienation, lack of social bonds, an inability to take responsibility, and social instability.

In Theory of Communicative Action Habermas pins his hopes for resisting the colonization of the lifeworld on appeals to invigorate and support new social movements at the grassroots level, as they can directly draw upon the normative resources of the lifeworld. This model of democratic politics essentially urges groups of engaged democratic citizens to shore up the boundaries of the public sphere and civil society against encroaching domains of systems integration such as the market and administrative state. This is why his early political theory is often called a “siege model” of democratic politics. As section 5 will show, this model was heavily revised in Between Facts and Norms. Before turning to that work, we must flesh out discourse ethics, an idea that figured into Theory of Communicative Action but which was only fully developed later.

4. Discourse Ethics

Habermas’ moral theory is called discourse ethics. It is designed for contemporary societies where moral agents encounter pluralistic notions of the good and try to act on the basis of publicly justifiable principles. This theory first received explicit and independent articulation in Moral Consciousness and Communicative Action (German 1983, English 1990a) and Justification and Application (German 1991a, English 1993), but it was anticipated by and depends on ideas in Theory of Communicative Action. The overview that follows draws upon these works. Much like the prior section, it only traces the broad outline of discourse ethics.

Discourse ethics applies the framework of a pragmatic theory of meaning and communicative rationality to the moral realm in order to show how moral norms are justified in contemporary societies. It could be seen as a theory that uncovers what we pragmatically do when we make and defend the moral validity claims underlying and manifested in our norms. Yet, we need to be careful with this characterization. Because of its cognitive commitments to moral learning and knowledge discourse ethics cannot simply be a reconstructive description of how it is we practically avoid conflicts and stabilize expectations in post-conventional social contexts. It is also an attempt to provide a formal procedure for determining which norms are in fact morally right, wrong, and permissible. Discourse ethics is squarely situated in the tradition of Neo-Kantian deontology in that it takes the rightness and wrongness of obligations and actions to be universal and absolute. On such a view, the same moral norms apply to all agents equally. They strictly bind one to performing certain actions, prohibit others, and define the boundaries of permissibility. There is no “relative” validity of genuinely moral norms even though, as we shall see, they can be embedded in social contexts that have consequences for their application. As long as these caveats are kept in mind we can understand discourse ethics by analyzing the practice of making and defending validity claims and how there are certain conditions of possibility tacitly underpinning and enabling this practice.

What are the conditions that enable this practice? As touched on above, Habermas posits certain unavoidable pragmatic presuppositions of speech which, when realized in discourse, can approximate a counterfactual ideal speech situation to greater or lesser degrees (1971; MCCA, 86). Discourse participants need to presuppose these conditions in order for the practice of discursive justification to make sense and for arguments to be truly persuasive. Four of these presuppositions are identified as the most important: (i.) no one who could make a relevant contribution is excluded, (ii.) participants have equal chances to make a contribution, (iii.) participants sincerely mean what they say, and (iv.) assent or dissent is motivated by the strength of reasons and their ability to persuade through discursive argumentation rather than through coercion, inducement, and so on (BNR, 82; TIO, 44). The point is not that actual discourses ever realize these conditions—this is why the ideal speech situation is best understood as a counterfactual regulative ideal. Rather, the point is that the outcomes of any discourses are only reasonably taken to be “valid” (empirically true, morally right, and so forth) under the presumption that these conditions have been sufficiently met. As soon as a violation is discovered this casts doubt upon the validity of the discursive outcome.

In addition to these pragmatic presuppositions Habermas proposes his discourse principle (D). This principle is supposed to capture the type of impartial, discursive justification of practical norms required in post-conventional societies: “only those action norms are valid to which all possibly affected persons could agree as participants in rational discourse” (BFN 107; TIO 41). While (D) was initially framed as a principle for moral discourses it was soon revised to the more general form above, as there are many practical norms concerning interpersonal interaction that are not directly moral even if they must be compatible with morality. Yet even in its broadened form it is crucial to note that (D) only applies to discourses concerning practical norms about interpersonal behavioral expectations, not all discourses about theoretical, aesthetic, or therapeutic concerns (which may or may not involve interpersonal social interaction). The guiding thought is that if discourses about an action norm are carried out in a sufficiently ideal manner and they yield consensus then this is a good indication the norm is valid. The principle does not hold that consensus reached through discourse constitutes validity, nor that whatever norm people coalesce around after discourse that looks sufficiently ideal is assured to be valid. Rather, (D) simply holds that consensus about a norm can be a good test of validity if it has been achieved in the right type of discursive way. It is important to note that, because of its very broad scope, (D) mainly functions by pointing out invalid norms. By itself the discourse principle cannot tell us which norms are valid. It can only help us identify norms that are good candidates for validity.

Moreover, before the validity of an action norm can be assessed, we need more details on the types of discourse and validity claims at issue (TIO 42). Within his project of discourse ethics Habermas identifies moral, ethical, and pragmatic discourses (JA 1-17; MCCA 98). Each type deploys practical reason differently, framing and analyzing questions under the rubrics of the purposive (pragmatic), the good (ethical), or the just (moral). The language of differing discourse “types” should not be taken to mean that norms come prepackaged in distinct kinds. Instead, any norm can be discursively thematized in any of these ways and should not be arbitrarily limited to a given type. With that caution in mind, we can begin to understand discourse types and the norms they produce.

Ethical discourses are a good place to start. For, while they are constrained by the outcomes of moral discourses and therefore not foundational, our prior discussion of the lifeworld provides an apt segue. Ethical discourses are paradigmatically about clarifying, consciously appropriating, and realizing the identity, history, and self-understanding of a group or individual. They make validity claims to authenticity rather than truth or rightness. They also involve value judgments about a particular social form or practice concerning the good life in a community. This is one reason why the outcomes of ethical discourses will have relative validity: they are meant to redeem validity claims for actors in some community or another. Another reason is that values differ from the types of generalizable or universalizable interests embodied in moral norms. While moral norms are supposed to strictly oblige agents to either do or not do some action, values admit of degree. While moral norms express principles backed by reasons, values are affective components of meaning acquired in virtue of living in a given social context. They are connected to reasons but not reducible to them. Values can orient us to goals, aid motivation, and help successfully navigate the lifeworld but cannot ground moral obligations by themselves. Values attract or repel but do not persuade; they can provide motivation to “do the right thing”—to have the will to follow a moral insight—but they do not constitute or even always help us discern what “the right thing” is (BFN 255).

Ethical discourses are rooted in ethicality (Sittlichkeit), which is distinct from morality (Moralität). Like many philosophers, Habermas separates the realm of the right from the realm of the good. Following a loosely Hegelian terminology, he parses this as the difference between morality and ethicality. Ethicality is a way of life composed of both cognitive and affective elements as well as more structural elements that reproduce this way of life: laws, institutions, conventions, social roles, and so forth. It is particularistic in that it defines goals in terms of what is good for a group as a whole and its members. As Habermas believes in George Herbert Mead’s model of “individuation through socialization,” ethicality is deeply engrained and connected to the lifeworld. No one can simply drop their internalized ethical perspective just as no one can simply step out of the lifeworld they have inherited. Individuals are always in some sense bound up with the identity, practices, and values of their upbringing and traditions even if they come to largely reject them. But, as was clear from Habermas’ critique of Gadamer, ethical perspectives do not determine us. Ethical discourses explain how this is by mediating between inheritance and transcendence. While we inherit and internalize an ethical perspective as individuals, we can always question parts of it that we wish to challenge, refashion, or reject for lack of sufficient reasons underwriting certain norms.

This dialectic between the ethicality we internalize through socialization and the way in which we wish to consciously reappropriate and (dis)own portions of such ethicality helps to explain why, in contrast to other discourse types, Habermas pays a great deal of attention to ethical discourses at both the individual and group levels. Ethical discourses at the individual level are called ethical-existential while ethical discourses at the group level are referred to as ethical-political discourse. For example, an individual considering a certain profession would engage in an ethical-existential discourse (for example, is this profession right for me given my character and goals?), while a polity considering whether certain policies express their collective interest, identity, and values would engage in an ethical-political discourse (for example, does this policy align with the collective identity and commitments we have had and how we want to appropriate them moving forward?).

There are two key points about these levels. First, the outcomes of such discourses are constrained by morality irrespective of what would be authentic at individual or group levels: an individual cannot simply decide to become a serial killer just as a country cannot simply enact a policy that has patently immoral consequences (for example, for those outside it). While Habermas thinks it is important to account for the way in which morality is embedded in social contexts through ethical discourses, he is staunchly opposed to postmodern or communitarian takes on morality and justice. Second, there will often be a reflexive interplay between these two levels of ethical discourse. Discourses about what it means to genuinely inhabit a collective identity can impact the ordering and strength of the values held by individuals, and discourses about who one fundamentally is and wishes to be can, through resistance to dominant interpretations of traditions and highlighting unacknowledged injustices, impact how others in a collectivity appropriate their identity and normative practices moving forward. This interplay is bookended by broader moral discourses at both levels, thereby helping the outcomes of such discourses stay in the realm of permissibility.

Pragmatic discourses are similar to ethical discourses in that they start from the teleological perspective of an agent who already has a goal. But in contrast to the reflexive, clarifying, and potentially transformative self-realization and collective self-determination of ethical discourses, pragmatic discourses simply start with a goal of presumed value and set about realizing it. This goal may involve identity and values but it could also refer to more pedestrian concerns and interests. Because the goal is presumed to be worthwhile, the values, interests, or goals at issue show up as relatively static. Pragmatic discourses simply focus on the most efficient way to realize or bring about a goal, and their claim to validity concerns whether or not certain strategies or interventions in the world are likely to produce a desired result. As Habermas puts it, pragmatic discourses correlate “causes to effects in accordance with value preferences and prior goal determinations” so as to generate a “relative ought” that expresses “what one ‘ought’ or ‘must’ do when faced with a particular problem if one wants to realize certain values or goals” (JA 3). The “ought” is relative because it is something akin to a rule of prudence that depends on whether an agent happens to have a certain interest or find a goal worth pursuing.

Finally, we turn to what might be seen as the most important type of discourse: moral discourses. Moral discourses are broader in scope and establish stronger validity claims than either ethical or pragmatic discourses. They seek to discern and justify norms that bind universally rather than simply in the confines of a specific community or because an agent happens to find a goal valuable. These norms have binarily coded, unconditional validity instead of the gradated, relative validity of the outcomes produced by pragmatic and ethical discourses.

In order to discursively discern this non-relative sense of moral validity Habermas proposes a separate principle, his principle of universalization (U), for discourses about moral norms: “A norm is valid when the foreseeable consequences and side effects of its general observance for the interests and value orientations of each individual could be jointly accepted by all concerned without coercion” (TIO 42). While (U) has gone through several different formulations, the basic idea is that for whatever valid moral norms there are, such norms can be accepted by all affected persons in a sufficiently ideal discourse wherein they assert their own interests and values. (U) checks if the norms we take to be moral actually are in virtue of whether or not they are universalizable. If they are not universalizable, they cannot be moral norms. Beyond this basic characterization there are some interpretive issues with (U). Three are worth brief focus: its apparent reference to consequences, where (U) comes from, and the role of interests.

First, in the version of (U) above, it is easy to mistake the “foreseeable consequences and side effects” clause with the addition of a mild consequentialist constraint. Given Habermas’ deontological commitments this would be odd. Instead, the clause builds in a “time and knowledge index” so that it does not make impossible demands on moral agents. Fully satisfying (U) would require discourse participants who had unlimited time, complete knowledge, and no illusions about their own interests and values; it would require participants who transcended their human condition. As (U) must be usable in the real world it can only ask that moral discourse participants attempt to account for the “anticipated typical situations” to which a norm would apply when they attempt to justify any moral norm (JA 37). The circumscribed task of (U) is key: it is only supposed to justify moral norms in the abstract. While this justification may point towards “typical” cases of application, it does not predetermine all applications. What about novel, atypical, or completely unforeseen situations to which the norm might unexpectedly apply?

Following Klaus Günther, Habermas claims that moral (and legal) decisions in specific cases require a logic of appropriateness found in discourses of application (Günther 1993; JA 35-37). Discourses of application look at a concrete case and survey all potentially applicable norms, relevant facts, and circumstances. They try to offer exhaustive or “complete” descriptions of a situation so as to decide among multiple, sometimes competing or only partly applicable norms that might regulate a situation. There is a division of labor between the two types of recursively related discourse: whereas discourses of justification lay out the reasons why we should endorse a norm as a general rule with reference to typical situations, discourses of application seek to apply norms to concrete cases which may be wholly new or defy expectations. As fallible agents we can make a variety of different errors in our discursive justification of a norm, or we can fail to anticipate new situations or altered understandings of facts, values, and interests, a failure that would be revealed in application. Habermas calls this the “dual fallibilist proviso,” and it instills an awareness that moral justification is an ongoing project (TJ 259). The recursive interplay of justification and application is supposed to progressively address prior errors and oversights. New insights gleaned from application discourses or novel situations can lead us to revisit norms whose justification was taken for granted, and this refinement of our understanding regarding how and why norms are justified will help us apply them better. If we had providential foreknowledge we would not need application discourses. But since we are fallible the “foreseeable consequences and side effects” should be seen as referring to an in-built “time and knowledge” index for the outcomes of justificatory discourses, which are then supplemented by application discourses that may impact the formulation of the initial norm.

The second interpretive issue is where (U) comes from. Habermas initially claimed that (U) could be formally deduced from a combination of the pragmatic presuppositions of discourse and (D), but weakened this claim shortly thereafter (JA 32 n17). Instead of deriving (U) from a formal deduction or informal inferences he now claims—using a term coined by Peirce—we arrive at (U) “abductively” (TIO 42). To arrive at something abductively is to suggest that we first observe a phenomenon (moral norms) and adopt a “best guess” hypothesis to explain it (the moral principle), which can then be subjected to further inductive testing (Ingram 2010, 47; Finlayson 2000a, 19). In short, (U) is now proposed as the best candidate principle for helping to explain moral normativity. To buttress the plausibility of this claim Habermas has also fallen back on his theory of social evolution and the “weak…notion of normative justification” in post-conventional contexts (TIO 45). Indeed, he now often speaks about (U) as following from the type of impartial justificatory procedure appropriate to a post-conventional condition that seeks to discern norms that are “equally in everyone’s interest,” “generalizable,” or “universalizable” (RPT 367; BFN 108, 460; TJ 265). The reference to interests leads us to the third interpretive issue with (U).

Early formulations of (U) only refer to interests (MCCA 65, 120). The inclusion of value orientations is potentially confusing. As noted above values are not necessarily cognitively grounded. As Habermas has always presented his moral theory as cognitivist it would be odd to give values such a central role. It seemed to make sense that initial formulations of (U) only included interests, as Habermas has defined interests in a cognitive fashion (on interests as “reasons to want” see Finlayson, 2000b). Bolstering an interpretation of (U) that puts priority on (cognitive) interests he has stated that “(U) works like a rule that eliminates as non-generalizable content all those concrete value orientations with which particular biographies or forms of life are permeated” (MCCA 121), and that the specific part of (U) referring to “uncoerced joint acceptance” means that any reasons put forth in moral discourse must “cast off their agent-relative meaning and take on an epistemic meaning from the standpoint of symmetrical considerations” (TIO 43). Moreover, the interpretive secondary literature has often emphasized the centrality of interests over values and focused on how Habermas often talks about “generalizable” or “universalizable” interests as the distinctive feature that moral norms secure (Heath 2003; Finlayson 2000b; Lafont 1999). How then should the inclusion of value orientations be understood?

Habermas has said he included value orientations in (U) so as to “prevent the marginalization of the self-understanding and worldviews of particular individuals and groups” (TIO 42). This does not mean that values are on a par with interests. Instead, his point is that interests and values are always bound together. Value orientations exert at least some indirect influence on moral discourses insofar as they subtly influence the very interpretation of our own interests (JA 90). Proceeding as if value orientations can be expunged from moral discourses may in fact introduce discursive blind spots. Indeed, candor about one’s own value-orientations may be crucial since the impartiality of (U) involves “generalized reciprocal perspective-taking” that cuts both ways: it orients participants towards “empathy for the self-understandings” of others as well as towards “interpretive interventions into the self-understanding of participants who must be willing to revise their descriptions of themselves and others” (TIO 43). The essential point is that even though “some of our needs are deeply rooted in our anthropology” and can be seen as basic generalizable interests shared by all, we must nevertheless avoid “ontologizing generalizable interests” into “some kind of given” because even “the interpretation of needs and wants must take place in terms of a public language” wherein our own self-understandings are open to revision (TJ 268; JA 90).

A final interpretive issue that merits attention is the precise status of moral rightness. Habermas has always held that morality and truth are analogous in that both are cognitive, binarily coded, and subject to learning processes. Moreover, he has always been sharply critical of approaches that would reduce morality to a purely subjective or relativized affair. Yet, given that rightness is not reducible to truth and that Habermas has repeatedly disclaimed a moral realist reading of his theory, it is unclear precisely how far this analogy is supposed to extend. This is not only because there are a variety of differences between empirical and moral knowledge but also because Habermas has changed his theory of truth over the years—moving from a consensus theory that identified truth with ideal warranted assertability to a “pragmatic epistemological realism that follows in the path of linguistic Kantianism” (TJ 7). Early articulations of discourse ethics seemed to admit of interpretations wherein rightness was a justification-transcendent concept that couldn’t be captured by ideal warranted assertability. This led some interpreters to interpret Habermas’ moral theory as at least tacitly committed to some variant of internal moral realism (Davis 1994, Kitchen 1997, Lafont 1999 and 2012, Smith 2006, Peterson 2010 ms.). But, in the course of resisting this reading, Habermas has explicitly claimed that, “ideally warranted assertability is what we mean by moral validity” (TJ 258, 248). He now wishes to articulate a notion of moral rightness that can be cashed out in terms of a pragmatist constructivism that also avoids the perils of relativism and skepticism—that is, which maintains an anti-realist account of moral rightness that still resists collapsing into a form of moral consensus theory. Whether he succeeds in this endeavor is a hotly debated topic.

5. Political and Legal Theory

In post-conventional, pluralistic societies ever fewer norms can be underwritten by a shared ethos embodied in a community’s ethicality or collective identity. Moral norms cannot pick up the slack to achieve social integration and cohesion by themselves. Because moral discourse is demanding and aims at what is equally in everyone’s interest, few moral norms will be seen as justified across the world or even in a given society (JA 91, TJ 265). And, as Habermas noted in Theory of Communicative Action, while systems like the bureaucratic state and economy can achieve stability and coordinate expectations through money and power, this can erode mutual understandings and social solidarity; markets and bureaucracies tend to displace and colonize the lifeworld. Indeed, his political essays from this period cast democratically created law as holding the line against system encroachments in a siege mentality (BFN 486-89, Habermas 1992b 444). This may leave us asking: What other resources exist for legitimate social integration?

In Habermas’ clearest statement of political theory, Between Facts and Norms, modern law shows up as precisely the resource we are looking for. If law is linked to democratic political structures in the right way it confers legitimacy on legal norms, thereby fostering social integration and stability. Broadly speaking, the relation between legal legitimacy, procedural-democratic popular sovereignty, and public discourse is nested and reflexive: legitimate law must be rooted in democracy, which itself depends upon a robust public sphere. A vibrant democratic public sphere is what allows for the revision and questioning of prior law. Conceived of in this way modern law is a “transformer” that preserves the normative achievements and mutual understandings that issue from the collective self-determination of the public sphere by translating them into legitimate, binding decisions that can “counter-steer” against the logics of the state and market. As long as legal decisions are arrived at in the right type of procedural, discursive fashion there is a presumption in favor of their rationality and legitimacy. And, as long as the public sphere continues to be a robust and open forum of contestation, any prior decisions are revisable such that there is a circulation between the informal public sphere and more formal institutions of the state. This focus on the transformative, mediating nature of law revises the prior “siege” model of democratic law into a procedural “sluice” model (Habermas 2002, 243). While the prior model saw democratically generated law as a defensive dam or shield against the demands of systems, the new model sees a certain type of lawmaking as mediating the circulation between lifeworld and system in a way that produces legitimate and binding legal norms. Modern law works with systems and alongside post-conventional morality to stabilize social expectations and resolve conflicts.

We can start to understand the relation between law, democracy, and the public sphere by focusing on legal legitimacy and democracy. Between Facts and Norms posits a tension within law itself, as well as an internal relation between modern law and democracy. To function, all law must demand compliance, threaten coercion, and (however tacitly) appeal to an underlying normative justification. Law is therefore characterized by a tension between “facticity” and “validity” insofar as it must be recognized as factually efficacious and normatively justified. This tension helps explain the relation between law and democracy in contemporary contexts. Pre-modern law appealed to God, nature, human reason, or shared culture for its justificatory backing. In post-conventional societies the fact that law is coercible and changeable yet merely rooted in fallible humans is laid bare. For Habermas, the underlying normative justification can now only be understood as “a mode of lawmaking that engenders legitimacy” (IO 254). The thought is that democracy is the only mode of lawmaking that is up to this legitimacy-engendering task. In light of these connections it is fruitful for present purposes to focus on “deliberations that end in legislative decision making” rather than treating political and legal legitimacy separately (BFN 171; Bohman and Rehg 1999, 36).

The democracy Habermas has in mind differs from overly populist varieties. He is clear that the legitimacy underwriting lawmaking must be twofold: law must not only express the democratic will of the community but must also be non-subordinately “harmonized” with morality (BFN 99, 106). This non-subordinate concordance of legality and discourse-theoretic morality is the hardest sense of legitimacy to explain and the easiest to overlook, so it is fruitful to start there. For Habermas, “legal and moral rules…appear side by side as two different but mutually complementary kinds of action norms” in post-conventional societies. In order to also account for “the idea of self-legislation by citizens” we must avoid a “subordination of law to morality” along the lines of classical natural law theory (BFN 105-6, 120; IO 257). Yet it seems puzzling to hold that democratically determined law should be compatible with but not subordinate to discourse-theoretic morality. What about cases where law and morality seem to conflict? There are a few answers that highlight unique features in Habermas’ theory. At a general level these answers take the same shape: while there are many ways that legal systems can square with moral permissibility, there are nevertheless structural and conceptual features endogenous to processes of modern procedural-democratic popular sovereignty that, at least at an abstract level, tend to harmonize legal norms with moral permissibility. This avoids concerns with morality trumping legality in an exogenous manner.

One reason to expect that democratically legitimate law and moral permissibility will be at least in principle commensurable is that they are both rooted in (D). We saw above how the moral principle (U) expresses the way (D) is specified for moral discourses. Habermas also proposes a principle of democratic legitimacy (L) that expresses the way (D) is specified for political discourses producing law. This principle is rooted in (D) in virtue of what Habermas calls the “legal form.” When (D) is deployed in discourses aimed at producing legal norms for regulating common life together it is understood these norms will be cloaked in the legal form: the set of formal and functional features characterizing modern positive law. Modern positive law is enacted and conventional, enforceable and coercive, rooted in institutions with some reflexivity, tailored to protect individuals through rights, and limited in scope (BFN 111-118, IO 256). If law is to function as a tool for the consensual regulation of social conflicts and the integration of society, then it needs to take on this form.

The principle of democratic legitimacy (L) is part of the normative backing that is supposed to emerge, albeit in nuce and very abstractly, from the historical interpenetration of (D) and the legal form that has culminated in the structures of the modern democratic state. It claims, “only those statutes may claim legitimacy that can meet with the assent of all citizens in a discursive process of legislation that in turn has been legally constituted” (BFN 110; constituted is sometimes translated as organized). This principle captures how (D) is specified for political discourses so that democratic procedures underwrite the legitimacy of legal norms. Legitimacy does not arise out of formal legality alone; it needs the added normative backing of democracy. The idea of (L) is that compliance with the law must be rational and rooted in the law’s perceived legitimacy. To achieve this, political discourses must be structured so that formal legislative institutions accurately represent and address deliberations going on in the informal public sphere, and so that institutionalized procedural mechanisms help screen out weak arguments (BFN 340). The details of this structuring will be clarified below, particularly in relation to the process model and the relationship between democracy and the public sphere.

However, the mere fact that (U) and (L) are rooted in (D) does little to ensure the commensurability of law and discourse-theoretic morality. Fortunately, there are additional reasons why we might expect such a harmonization. Habermas thinks the combination of (D) and the legal form in (L) also supplies us with the resources to discern the conceptual kernels of an abstract “system of rights” that will be inscribed in the core structures of any legitimate self-determining political community. The basic argument is that in order for (L) to be realized it must make reference to a concrete community engaged in self-determination through modern law. In such communities equal legal personhood takes on the role of a “protective mask,” a formal identity mainly defined by rights instead of duties, that crystalizes around individual moral persons (BFN 531, 112). This legal identity is constituted by a core of rights that secure the status and private autonomy of individuals such that they can not only live their individual lives but also genuinely deliberate (on equal footing, free from coercion, and so forth) about the terms of shared life together. Yet, these individual rights cannot be effective unless they presuppose other rights to participation and basic material provision—rights that secure public autonomy. The claim is that the legal manifestations of private and public autonomy, often expressed in the idioms of human rights and popular sovereignty, mutually presuppose one another. What results is an abstract system of rights made up of five core types. What are these right types?

First, in order to discursively engage one another people need to be reasonably secure. Therefore, rights that guarantee the status of individual persons are required. Three types of rights jointly achieve such protection: (i.) the right to equal liberties compatible with those of others, (ii.) rights of membership that determine the extent of the community, and (iii.) rights of due process that assure each person is treated the same and equally protected under the law (BFN, 133-134). These rights secure the individual private autonomy prioritized by classical liberalism. But any community engaged in specifically democratic self-determination must also safeguard the ability to actively use the freedom afforded by this secure status to deliberate, disagree, and come to mutual understandings in concert with others. If individual rights are to be effectively used (iv.) rights of communication and political participation that formally secure equal opportunity and access to the political process are required. These rights secure the collective public autonomy prioritized by classical republicanism. They enable discourses in the public sphere as well as equal access to channels of political say and influence; they enable democratic popular sovereignty by making sure everyone can participate on fair and equal terms, and that information, innovative ideas and arguments about how to structure common life are kept freely circulating and scrutinized. Lastly, these four right types are insufficient if basic needs are threatened or go unmet. Formal guarantees of freedom and participation mean little if they amount to the freedom to starve. So, as a final step, Habermas proposes some measure of (v.) social, technological, and ecological rights securing the basic conditions of a minimally decent life. 
Democratic states have often done a poor job fully realizing these rights, but the claim is simply that these general right types are conceptually required if self-determination through law is to achieve the dual sense of legitimacy noted above. In this same spirit of clarification, it is also important to note that the abstract system only identifies certain right types, not some list of concrete rights. Communities have incredibly wide interpretive latitude when it comes to how these rights show up. Habermas often refers to rights as “unsaturated placeholders”; it is largely up to communities to “fill in” their content.

The expectation of a non-hierarchical harmonization of morality and legality may now seem less puzzling. Ideally, lawmaking discourses approximate (L) against the backdrop of an abstract system of rights inscribed in the political structures of a democratic community. This places some broad constraints on how deliberations unfold and the type of norms they can produce. Moreover, apart from these structural background constraints, political discourses are also themselves unique. In contrast to moral discourses focused on “the interest of all” or ethical discourses focused on authentic self-realization, political discourses aimed at self-determination through law reference a plethora of different concerns, and do so in an internally structured way aimed at carving out a space (defined by rights) where moral personhood and ethical authenticity can flourish (BFN 531).

While deliberations about “political questions are normally so complex that they require the simultaneous treatment of pragmatic, ethical, and moral aspects” of issues, they ideally unfold along a “process model” where there is a structured interplay between pragmatic, ethical, and moral concerns as well as procedurally regulated bargaining (BFN 565, 168). The basic idea is that for any provisional policy conclusion there is an obligation to respond to objections stemming from more abstract aspects of an issue or levels of discourse; discursive processes cannot be arbitrarily limited. For instance, participants in a discourse on immigration policy cannot simply consult ethical concerns regarding their community’s authentic identity yet refuse to listen to moral discourses that bear on such policies. Any moral aspects need to be explicitly discussed, and they filter or check more particularistic issue-aspects and discourses (Cf. BFN 169 and the emendation at 565 on whether to refer to the structured interplay as between discourses or aspects of a case). The abstract system of rights and the process model mean that, within political deliberations about how to structure common life together, it will in principle always be possible for more abstract moral discourses to weakly check pragmatic and ethical-political discourses. And this checking will be endogenous to structures of democratic self-determination.

So far the focus has been on the relation between law and democracy without much reference to the public sphere. However, it is hard to overstate the importance Habermas places on democratic deliberation rooted in the public sphere. None of the formal or structural mechanisms mentioned so far guarantee that public political discourses or laws will be specified in a given way. There is assurance neither that the abstract system of rights or (L) will be meaningfully realized, nor that the interplay of various types of concerns in political discourses will unfold along the process model. Everything hangs on the quality and institutional structuring of deliberation in the public sphere. Indeed, the primary reason why democracy confers legitimacy upon legislative outcomes is that it is rooted in a model of distinctly procedural popular sovereignty that simultaneously expresses the will of the community and that leads to more rational outcomes. An analysis of the specific way in which democracy and the public sphere are related on Habermas’ model is the best way to understand how the democratic mode of lawmaking underwrites the legitimacy of legal norms.

In Between Facts and Norms Habermas proposes a “two-track” model of democratic politics outlining a circulation of political power engendering legitimacy. He divides the political public sphere into informal and formal parts. The informal public sphere includes all the various voluntary associations of civil society: religious and charitable organizations, political associations, the media, and public interest advocacy groups of all varieties (BFN 355). In this sphere public political deliberation is free and unorganized. Through this open clash of views and arguments individuals and collectivities can both persuade and be persuaded, thereby contributing to the emergence of considered public opinions. In contrast, the formal public sphere includes institutionalized forums of discourse and deliberation like congress, parliament, and the judiciary as well as more peripheral administrative and bureaucratic agencies associated with state structures. This sphere is supposed to be organized in such a way that it renders decisions reflecting the considered public opinions of the informal public sphere. Formal institutionalized decision making bodies must be porous to results of the informal public sphere.

The informal public sphere is the key forum for generating a type of normative power that can integrate society through mutual understandings and solidarity rather than through money or administrative-bureaucratic power. When discourse participants in the informal public sphere freely reach mutual understandings about how to regulate the terms of shared life together, “communicative power” emerges (Flynn 2004 discusses communicative power’s precise locus). Communicative power arises from jointly authored norm expectations that are cognitively grounded in the force of better reasons and motivationally grounded (albeit weakly) in mutual recognition and collective ethical discourses. Cognitively speaking, free communication in the public sphere can foster “rational opinion and will-formation” because “the free processing of information and reasons, of relevant topics and contributions is meant to ground the presumption that results reached in accordance with correct [discursive] procedure are rational” (BFN 147). This acceptance also provides weak motivation: in accepting a norm’s validity claim one accepts the background understandings and reasoning underlying it, which can motivate action in relevant circumstances. Moreover, because this mutual understanding was presumably reached through persuasive discourse where reasoned dissent was (and remains) a real possibility, norm acceptance can also motivate in a spirit of anti-paternalistic empowerment: parties recognize each other as accountable and responsible for their actions in accord with a norm until new counter-reasons are discovered. While they may be aware of counter-inclinations and motives that are not backed by good reasons, they take one another to be competent, responsible agents who can choose to act on rationally backed norms (Günther 1998).
Yet, because the motivation accompanying cognitive insight is fragile and weak, communicative power must also be rooted in a community with a shared ethical-political identity and legitimate law so that motivational deficits can be met with supplemental resources of a shared life and law.

Communicative power can only arise if the informal public sphere has certain characteristics. First and foremost, it must be relatively free of distortions, coercion, and silencing social pressures so that communication can work as a filter for fostering more rational individual and collective will formation (BFN 360). The public sphere also needs to accurately function as a “context of discovery” wherein problems that affect large segments of the public are identified and taken up for discussion and resolution in discourse. Moreover, civil society must be animated by a political culture so that members actively participate in voluntary associations and public discourse about the terms of common life together (BFN 371). Normative power potentials cannot be generated if members largely retreat into private concerns or a society is internally segmented and riven with special interests (Flynn (2004) 439-444; Bohman and Rehg (1999) 41-42). Clearly, if the public sphere is to remain healthy then the media’s role in fostering accurate information and timely mass communication will also be crucial (EFP, 138-183).

The political institutions of the formal public sphere are arranged so as to be porous to the inputs of the informal public sphere, to further refine and focus public opinion, and to make decisions. Building on the work of Bernhard Peters, Habermas maintains that modern constitutional democracies are set up so communication and decision-making flow from the “periphery” of the informal public sphere into the “center” constituted by those formal political institutions that create, enforce, clarify, or implement the law (BFN 354). In a well-functioning democratic regime there will be structural “sluices” or “floodgates” embedded in the institutions of the administrative state (legislature, judiciary, and so forth) so that the circulatory flow of power proceeds in the right direction, from the periphery to the center.

The thought is that the political community should “program” and direct the institutions of the administrative complex, not the other way around (BFN 356). If the state or other powerful actors reverse this flow by simply positing new laws or rules and either demanding compliance or inducing it in some other way, then this exercise of non-communicative administrative-bureaucratic power would be neither legitimate nor stable. Habermas claims the “integrative capacity of democratic citizenship” erodes to the extent that the circulation of political power is interrupted or reversed. Only communicative power has the legitimating force needed so that a community can both author and rationally abide by the law. Democratic lawmaking is the key institution that “represents…the medium for transforming communicative power into administrative power” while preserving its normative potential (BFN 169, 81, 299). Democratically generated law ensures normative power potentials flow in the right direction and that they are maintained when implemented by institutions of the administrative state.

This account of procedural-democratic collective self-determination should not be confused with traditional national self-determination. Habermas rejects models of sovereign collective self-determination that presuppose a nation or people with a homogeneous identity and interests, as well as models where “a network of associations” stands in for this (imaginary) collective self (BFN 185, 486). Instead, in modern constitutional democracies the “idea of popular sovereignty is… desubstantialized [and]…not even embodied in the heads of the associated members.” Popular sovereignty “is found in those subjectless forms of communication that regulate the flow of discursive opinion- and will-formation in such a way that their fallible outcomes have the presumption of practical reason on their side” (BFN 486). Insofar as we can speak about the will of a community it is an anonymous and subjectless public opinion emerging out of the discursive structures of communication themselves (BFN 136, 171, 184-186, 299, 301). This unique interpretation of popular sovereignty helps explain some final aspects of Habermas’ political theory: his views on religion and the public sphere, his constitutional patriotism, and his vision of politics beyond the nation-state.

In his early writings Habermas claimed that as the rationality and pluralism of Enlightenment ideals slowly took hold in modern societies, the mythic explanations of religion would become less important. But he gradually came to revise his view on religion in modern societies. What matters for present purposes is how he now sees religion fitting into the public sphere of a liberal democracy. In liberal democracies, untrammeled populism is held in check by not only individual rights but also the very nature of public debate: citizens collectively self-determine through persuasion and rational argumentation. To do this amidst the pluralism of modernity, the laws they make must be grounded in public reasons accessible to all. The question is what this means for religious citizens.

There have been a variety of answers. For instance, in Political Liberalism John Rawls held that liberal democratic citizens should ultimately only endorse policies that they can support on the basis of secular reasons. While these citizens may have religious reasons that favor a law or policy, when engaged in political debate they must eventually “translate” these reasons into terms that non-believers could accept. Habermas is sympathetic to the vision of liberal democracy animating this view of how religious citizens should act. Indeed, he criticizes thinkers like Wolterstorff who insist that religious citizens ought to be allowed to try to base coercive law on their own particularistic values and conception of the good. Nevertheless, he feels that placing the burden of “translation” onto religious citizens alone is somewhat misguided. Such an approach underestimates the ethical-existential importance of religion in some people’s lives—especially if it is bound up with the structure of their lifeworld and identity. As an alternative, Habermas proposes both religious and non-religious citizens be allowed to invoke any reasons for or against policies at the level of the informal public sphere, provided they take one another’s claims seriously and do not dismiss them from the outset. But when it comes to the institutions of the formal public sphere concerning coercive lawmaking, justifications should only be based in reasons that all can accept.

This view is somewhat unsatisfying for several reasons: it simply moves the asymmetrical burden of translation “up a level,” it may run into concerns of a metaphorical split in identity, and it could even saddle non-religious citizens with undue burdens (Yates 2007, Lafont 2009). For present purposes, the most charitable reading is that Habermas assumes all democratic citizens have an obligation to adopt a thoroughly self-reflective attitude. Religious citizens must “self-modernize” insofar as they are expected to be open to things like the authority of science, the need for non-religious reasons backing coercive law, and the possible validity of claims made by other religions. But, this also means non-religious citizens must move beyond a dogmatic secularist understanding wherein it is impossible for religious claims to have any cognitive value whatsoever. Indeed, given that some fundamental moral notions—such as equal human dignity—have been inextricably tied to the history of world religions, he claims it is not always clear where the boundaries of the religious and secular are. Determining these boundaries (and what can count as publicly acceptable) may at times be a cooperative task wherein each side takes the claims of the other with some degree of seriousness (2006b, 45 and 2003b, 109).

Habermas’ reinterpretation of popular sovereignty also explains why he has adopted the theory of constitutional patriotism pioneered by Dolf Sternberger. Constitutional patriotism maintains that, in contrast to national identities of the past, modern political communities can base their collective identities around the unique ways they appropriate and embed the abstract, universalistic principles of democratic self-determination within their unique histories and traditions. On such a model, political allegiance can coalesce around “a particularist anchoring of…the universalist meaning of [principles such as] popular sovereignty and human rights” (BFN 500; L’i 308; BNR 106). This particularist anchoring would presumably include the way in which a community takes up the abstract system of rights, the process model, and (L). The claim is that the specific way a political community instantiates the “abstract procedures and principles” of the modern democratic state fosters the development of a “liberal political culture” that “crystalizes” around that country’s constitutional traditions, structures, and discursive fora (IO 118; DW 78). The integrative force that emerges against this backdrop is called civic solidarity, which Habermas characterizes as “an abstract, legally mediated solidarity among citizens… a political form of solidarity among strangers” (DW 79; BNR 22). This is essentially the integrative potential of democratic citizenship when it is actively used.

One assumption here is that “culture and national politics have become…differentiated” from one another; citizens can see themselves as part of a shared political culture precisely because they no longer see the state as a vehicle for realizing a homogeneous, pre-political nation. While this is a far cry from empirical realities in many parts of the world, Habermas sees the European Union as illustrative in this regard. Even in a context that was once characterized by strong national identities (where the chances for such an identity might seem slimmer than in more multicultural contexts) we can start to see how “a common political culture could differentiate itself off from the various national cultures” and how “identifications with one’s own forms of life and traditions [could be] overlaid with a patriotism that has become more abstract, that now relates… to abstract procedures and principles” (NC 261; BFN 507, 465; IO 118; BNR 327; DW 78).

Finally, Habermas sees constitutional patriotism as a normative resource that could help to expand civic solidarity across political borders and uncouple legal structures from the nation-state so they could be scaled-up into new institutions of international law. Such developments would allow new forms of democratic self-governance above the nation-state at regional and global levels (DW 79). These post-national implications are naturally produced by Habermas’ core theoretical commitments. Deliberative democracy is committed to institutionalized discourse that in some way makes it possible for law to be justified to the persons who are affected by or subjected to it. Given increasing global interdependence this obviously pushes in cosmopolitan directions. However, at the same time, it is important to remember that communicative power must be rooted in a community with a shared ethical-political identity, and that constitutional patriotism is parasitic upon a particular political culture. This rootedness means that civic solidarity and new forms of self-governance can stretch, but only so far.

This anchored cosmopolitanism yields a multi-level constitutionalization of international law that aims at some measure of global governance without government. While Habermas’ account of such a multi-level system is only a sketch and many details need filling in, the broad outline is clear. He proposes a system composed of “supranational” (global), “transnational” (regional), and national-level political institutions with different roles. A supranational organization akin to a reformed United Nations is envisioned as securing international peace, security, and core human rights. At the mid-level, transnational authorities like the EU would tackle technical issues through coordinative efforts and political issues through negotiated bargaining among sufficiently representative regional regimes of commensurate stature. Finally, nation-states would retain their status as the locus of democratic legitimation. This would require the spread of democratic structures to each nation-state so that laws can reflect the will of the community and so that they could be reliably in line with the basic human rights secured by a supranational organization.

This vision of a multi-level political system for the constitutionalization of international law can be criticized as demanding both too much and too little. Habermas’ version of cosmopolitan deliberative democracy locates the touchstone of legitimacy in the fact that “citizens are subject only to those laws which they have given themselves in accordance with democratic procedure” (CEU 14). From this perspective of democratically legitimate law, the proposed system may demand too little. Despite Habermas’ insistence that negotiation between regional regimes could take place in a way that would not “impair deliberation and inclusion,” it is hard to see how such bargaining could really constitute a process where citizens give themselves the law through democratic procedures (CEU 19). From the perspective of rootedness in political culture, the multi-level system may also demand too much with the extension of civic solidarity to transnational regimes. Habermas clearly thinks there are limits to such an extension, as “the transnational extension of civic solidarity…comes to nothing…when it is supposed to assume a global format.” However, apart from the fact that neighboring countries might be supposed to have some minimal level of shared history and culture born out of territorial proximity and an interdependency of interests, it is unclear why this extension of solidarity would reach the levels needed to underwrite the democratic legitimacy of laws within transnational units of regional governance (CEU 62).

While Habermas is certainly aware of these criticisms, he is largely focused on defending his political theory in broad, systematic terms. If the broad normative outlines are correct then the overall theory will stand regardless of how the empirical details are filled in. Indeed, Habermas is rather unique among contemporary philosophers both in his systematic approach to large areas of theory and in his willingness to allow others to fill in the details of how particular claims might work. He has always insisted that philosophers do not speak from a privileged place of knowledge. The best that they can hope for is to articulate a theory that can be convincingly and rigorously tested and debated in the public sphere. We can perhaps understand not only his political theory, but several other theoretical projects in this spirit of a public intellectual putting forth a theory for testing and debate that requires further articulation by those who come after.

6. References and Further Reading

a. General Introductions to Habermas

The article presented a general and reasonably complete introduction to Habermas. However, given the breadth of his work and space constraints, the following should also be consulted:

  • 1978. McCarthy, Thomas. The Critical Theory of Jürgen Habermas. The MIT Press.
  • 1988. White, Stephen K. The Recent Work of Jürgen Habermas. Cambridge University Press.
  • 2005. Finlayson, James Gordon. Habermas: A Very Short Introduction. Oxford University Press.
  • 2011. Fultner, Barbara (ed.) Jürgen Habermas: Key Concepts. Acumen Press.
  • 2014. Bohman, James and Rehg, William. Jürgen Habermas. Stanford Encyclopedia of Philosophy.
  • Thomas Gregersen maintains an online bibliography at the Habermas Forum.

The following printed bibliographies are also useful:

  • 2013. Corchia, Luca. Jürgen Habermas. A Bibliography: Works and Studies (1952-2013). Pisa (IT): Arnus University Books.
  • 2014. Müller-Doohm, Stefan. Jürgen Habermas – Eine Biographie. Berlin: Suhrkamp.

b. Introductory Books and Articles on Specific Themes

i. Biography

  • 2001. Matustik, Martin Beck. Jürgen Habermas: A Philosophical-Political Profile. Rowman and Littlefield.
  • 2010. Specter, Matthew G. Habermas: An Intellectual Biography. Cambridge University Press.
  • 2004. Wiggershaus, Rolf. Jürgen Habermas. Reinbek Bei Hamburg: Rowohlt.

ii. Linguistic Turn

  • 1994. Cooke, Maeve. Language and Reason. The MIT Press.
  • 1999. Lafont, Cristina. The Linguistic Turn in Hermeneutic Philosophy. Jose Medina (trans.). Cambridge University Press.
  • 2016. Lafont, Cristina. Jürgen Habermas in The Blackwell Companion to Hermeneutics, 440-445.

iii. Discourse Ethics

  • 1994. Davis, Felmon John. Discourse Ethics and Ethical Realism: A Realist Realignment of Discourse Ethics. European Journal of Philosophy, 125-142.
  • 1997. Rehg, William. Insight and Solidarity: The Discourse Ethics of Jürgen Habermas. University of California Press.
  • 2000a. Finlayson, James Gordon. Modernity and Morality in Habermas’s Discourse Ethics. Inquiry. 43, 319-40.

iv. Political Theory

  • 1994. Bohman, James. Review: Complexity, Pluralism, and the Constitutional State: On Habermas’s Faktizität und Geltung. Law and Society Review, 897-930.
  • 2002. Discourse and Democracy: Essays on Habermas’s Between Facts and Norms, ed. Rene von Schomberg and Kenneth Baynes. SUNY Press.
  • 2010. Hedrick, Todd. Rawls and Habermas: Reason, Pluralism, and the Claims of Political Philosophy. Stanford University Press.

c. Works Cited

Most of Habermas’s work can be found in German and English. After the original year of publication and title, see the square brackets for the English translation. For some texts a translation does not exist, only exists in part, or is divided between texts. This is denoted by an asterisk (*).

  • 1953. Mit Heidegger gegen Heidegger denken. Zur Veröffentlichung von Vorlesungen aus dem Jahre 1935. Frankfurter Allgemeine Zeitung. July 25, 1953. [English: 1977]
  • 1956. Der Zerfall der Institutionen (Arnold Gehlen). Frankfurter Allgemeine Zeitung. July 4, 1956.*
  • 1958. Philosophische Anthropologie: Ein Lexikonartikel. In: A. Diemer, I. Frenzel (eds.) Fischer-Lexikon Philosophie. Frankfurt am Main: Fischer. Pp. 18-35.*
  • 1962. Strukturwandel der Öffentlichkeit. Darmstadt: Luchterhand. [English: 1989]
  • 1967. Probleme einer philosophischen Anthropologie. Lecture transcript from the 1966/7 Winter semester at the University of Frankfurt (unauthorized edition).*
  • 1967. Zur Logik der Sozialwissenschaften. Tübingen: J.C.B. Mohr. [English: 1988a]
  • 1968a. Technik und Wissenschaft als Ideologie. Frankfurt am Main: Suhrkamp. [English: 1970, 1973b*]
  • 1968b. Erkenntnis und Interesse. Frankfurt am Main: Suhrkamp. [English: 1971b]
  • 1969. Protestbewegung und Hochschulreform. Frankfurt am Main: Suhrkamp. [English: 1970]
  • 1970. Toward a Rational Society, J.J. Shapiro (trans.) Boston: Beacon.
  • 1971a. Theorie und Praxis. Frankfurt am Main: Suhrkamp. [English: 1973b]
  • 1971b. Knowledge and Human Interests. J. J. Shapiro (trans.). Boston: Beacon.
  • 1971c. The Christian Gauss Lectures: Reflections on the linguistic foundations of sociology [For published versions, see chapter 1 of German 1984b and pp.1-103 of English 2001* original translation for lecture purposes by Jeremy Shapiro; re-translated for publication by Barbara Fultner]
  • 1973a. Wahrheitstheorien. In H. Fahrenbach (ed.), Wirklichkeit und Reflexion. Pfullingen: Neske. 211-265. Reprinted as chapter 2 in 1984b.*
  • 1973b. Theory and Practice, J. Viertel (trans.). Boston: Beacon.
  • 1973c. Nachwort / Postscript to Knowledge and Human Interests: Philosophy of the Social Sciences [included in all subsequent printings of Knowledge and Human Interests].
  • 1973d. Legitimationsprobleme im Spätkapitalismus. Frankfurt am Main: Suhrkamp. [English: 1975]
  • 1975. Legitimation Crisis, T. McCarthy (trans.). Boston: Beacon.
  • 1976a. Zur Rekonstruktion des Historischen Materialismus. Frankfurt am Main: Suhrkamp. [English: 1979*]
  • 1976b. Was heißt Universalpragmatik? In K.-O. Apel (ed.), Sprachpragmatik und Philosophie. Frankfurt am Main: Suhrkamp. 174-272. [English: 1979, chap. 1]
  • 1977. Martin Heidegger, on the publication of lectures from the year 1935. Graduate Faculty Philosophy Journal 6, no. 2: 155-180.
  • 1979. Communication and the Evolution of Society, T. McCarthy (trans.) Boston: Beacon.
  • 1981. Theorie des kommunikativen Handelns. Band I: Handlungsrationalität und gesellschaftliche Rationalisierung. Band II: Zur Kritik der funktionalistischen Vernunft. Frankfurt am Main: Suhrkamp. [English: 1984a and 1987]
  • 1983. Moralbewußtsein und kommunikatives Handeln. Frankfurt am Main: Suhrkamp. [English: 1990a]
  • 1984a. The Theory of Communicative Action, Volume I: Reason and the Rationalization of Society, T. McCarthy (trans.). Boston: Beacon.
  • 1984b. Vorstudien und Ergänzungen zur Theorie des kommunikativen Handelns. Frankfurt am Main: Suhrkamp. [English: 2001* does not include the Wahrheitstheorien essay]
  • 1985. Die Neue Unübersichtlichkeit: Kleine Politische Schriften V. Frankfurt am Main: Suhrkamp. [English: 1991]
  • 1986a. Gerechtigkeit und Solidarität: Eine Stellungnahme zur Diskussion über Stufe 6. In W. Edelstein and G. Nunner-Winkler (eds), Zur Bestimmung der Moral. Frankfurt am Main: Suhrkamp. 291-318. [English: 1990b]
  • 1986b. Entgegnung. In A. Honneth and H. Joas (eds), Kommunikatives Handeln. Frankfurt am Main: Suhrkamp. 327-405. [English: 1991b]
  • 1987. The Theory of Communicative Action. Vol. II: Lifeworld and System, T. McCarthy (trans.). Boston: Beacon.
  • 1988a. On the Logic of the Social Sciences, S. W. Nicholsen and J. A. Stark (trans.). Cambridge, MA: MIT Press.
  • 1988b. Nachmetaphysisches Denken. Frankfurt am Main: Suhrkamp. [English: 1992a]
  • 1989. The Structural Transformation of the Public Sphere, T. Burger and F. Lawrence (trans). Cambridge, MA: MIT Press.
  • 1990a. Moral Consciousness and Communicative Action, C. Lenhardt and S. W. Nicholsen (trans). Cambridge, MA: MIT Press.
  • 1990b. Justice and solidarity: On the discussion concerning stage 6. In: T. E. Wren (ed.), The Moral Domain. Cambridge, MA: MIT Press. 224-251, S. W. Nicholsen (trans.).
  • 1991a. Erläuterungen zur Diskursethik. Frankfurt am Main: Suhrkamp. [English: 1993]
  • 1991b. The New Conservatism: Cultural Criticism and the Historians’ Debate. S. W. Nicholsen (trans.).
  • 1992a. Postmetaphysical Thinking, W. M. Hohengarten (trans.). Cambridge, MA: MIT Press.
  • 1992b. Faktizität und Geltung. Beiträge zur Diskurstheorie des Rechts und des demokratischen Rechtsstaats. Frankfurt am Main: Suhrkamp. [English: 1996b]
  • 1993. Justification and Application, C. P. Cronin (trans.). Cambridge, MA: MIT Press.
  • 1994. The Past as Future. M. Pensky and P. Hohendahl (trans.). University of Nebraska Press.
  • 1996a. Die Einbeziehung des Anderen. Studien zur politischen Theorie. Frankfurt am Main: Suhrkamp. [English: 1998a]
  • 1996b. Between Facts and Norms: Contributions to a Discourse Theory of Law and Democracy, W. Rehg (trans.). Cambridge, MA: MIT Press. [German, 1992]
  • 1998a. Inclusion of the Other: Studies in Political Theory, C. Cronin and P. DeGreiff (eds). Cambridge, MA: MIT Press.
  • 1998b. Die postnationale Konstellation. Frankfurt am Main: Suhrkamp. [English: 2001a]
  • 1999a. Wahrheit und Rechtfertigung. Frankfurt am Main: Suhrkamp. [English: 2003a]
  • 2001a. The Postnational Constellation, M. Pensky (trans., ed.). Cambridge, MA: MIT Press.
  • 2001b. Die Zukunft der menschlichen Natur. Auf dem Weg zu einer liberalen Eugenik? Frankfurt am Main: Suhrkamp. [English: 2003b]
  • 2001c. Zeit der Übergänge. Kleine politische Schriften IX. Frankfurt am Main: Suhrkamp. [English: 2004b]
  • 2003a. Truth and Justification, B. Fultner (trans.). Cambridge, MA: MIT Press.
  • 2003b. The Future of Human Nature, W. Rehg, M. Pensky, and H. Beister (trans.). Cambridge: Polity.
  • 2004a. Der gespaltene Westen. Kleine politische Schriften X. Frankfurt am Main: Suhrkamp. [English: 2006a]
  • 2004b. Time of Transitions. C. Cronin (trans.). Cambridge: Polity Press.
  • 2006a. The Divided West. C. Cronin (trans.). Cambridge: Polity Press.
  • 2006b. Pre-political foundations of the democratic constitutional state. In J. Habermas and J. Ratzinger The Dialectics of Secularization: On Reasons and Religion. B. McNeil (trans.). San Francisco: Ignatius. 19-52.
  • 2007. Kommunikative Vernunft und grenzüberschreitende Politik. Eine Replik. In: Anarchie der kommunikativen Freiheit Jürgen Habermas und die Theorie der internationalen Politik. Peter Niesen and Benjamin Herborth (eds.). Frankfurt am Main: Suhrkamp Press.
  • 2008. Between Naturalism and Religion. C. Cronin (trans.). Cambridge: Polity Press.
  • 2008. Ach Europa. Kleine politische Schriften XI. Frankfurt am Main: Suhrkamp. [English: 2009]
  • 2009. Europe: the Faltering Project. C. Cronin (trans.). Cambridge: Polity Press.
  • 2012. The Crisis of the European Union: A Response. C. Cronin (trans.). Cambridge: Polity Press.
  • 2014. The Lure of Technocracy. C. Cronin (trans.). Cambridge: Polity Press.
  • 2014. Entgegnung (x13) and Schlusswort. In: Habermas und der Historische Materialismus. Smail Rapic (ed.). München: Karl Alber Press.

d. Secondary Scholarship Beyond the Subject-Specific Recommendations Cited Above

  • 1990. Apel, Karl-Otto. Diskurs und Verantwortung. Frankfurt am Main: Suhrkamp.
  • 2002. Apel, Karl-Otto. Regarding the relationship of morality, law and democracy: on Habermas’s Philosophy of Law (1992) from a transcendental-pragmatic point of view. In: Habermas and Pragmatism. Aboulafia, Mitchell, Myra Bookman and Catherine Kemp (eds.). 17-30.
  • 2014. Brunkhorst, Hauke. Critical Theory of Legal Revolutions: Evolutionary Perspectives. Bloomsbury.
  • 2009. Brunkhorst, Hauke, Regina Kreide and Cristina Lafont (eds.) Habermas-Handbuch. Stuttgart: JB Metzler.
  • 2000b. Finlayson, James Gordon. What Are ‘Universalizable Interests’? Journal of Political Philosophy. vol. 8, no. 4, 456-469.
  • 2004. Flynn, Jeffrey. Communicative Power in Habermas’s Theory of Democracy. European Journal of Political Theory, 433-454.
  • 1993. Günther, Klaus. The Sense of Appropriateness. J. Farrell (trans.). Albany: SUNY Press.
  • 1991. Honneth, Axel and Hans Joas. Communicative Action: Essays on Jürgen Habermas’s Theory of Communicative Action. Gaines, Jeremy and Doris L. Jones (trans.). Cambridge, MA: MIT Press.
  • 2003. Horkheimer, Max and Theodor Adorno. Dialectic of Enlightenment. G. Schmid Noerr (ed.), E. Jephcott (trans.). Stanford: Stanford University Press.
  • 2001. Joas, Hans. Values versus Norms: a pragmatic account of moral objectivity. In: The Hedgehog Review 3, 42-56.
  • 2012. Lafont, Cristina. Agreement and Consent in Kant and Habermas: Can Kantian Constructivism be fruitful for democratic theory? In: The Philosophical Forum. 43/3, 277-95.
  • 2009. Lafont, Cristina. Religion and the Public Sphere: What are the deliberative obligations of democratic citizenship? Philosophy and Social Criticism 35, 127-150.
  • 1991. McCarthy, Thomas. Ideals and Illusions, Cambridge, MA: MIT Press.
  • 2007. Müller, Jan-Werner. Constitutional Patriotism. Princeton University Press.
  • 2000. Müller-Doohm, Stefan (ed.). Das Interesse der Vernunft: Rückblicke auf das Werk von Jürgen Habermas seit “Erkenntnis und Interesse”. Frankfurt am Main: Suhrkamp.
  • 2007. Niesen, Peter and Benjamin Herborth (eds.). Anarchie der kommunikativen Freiheit-Jürgen Habermas und die Theorie der internationalen Politik. Frankfurt am Main: Suhrkamp Press.
  • 2002. Owen, David. Between Reason and History: Habermas and the Idea of Progress. Albany: SUNY Press.
  • 2014. Rapic, Smail (ed.). Habermas und der Historische Materialismus. München: Karl Alber Press.
  • 1989. Rockmore, Tom. Habermas and Historical Materialism. Bloomington and Indianapolis: Indiana University Press.
  • 1998. Rosenfeld, Michel, and Andrew Arato (eds). Habermas on Law and Democracy, Berkeley: University of California Press.
  • 1982. Thompson, John B. and David Held. Habermas: Critical Debates. Cambridge, MA: MIT Press.
  • 1991. Wellmer, Albrecht. Ethics and dialogue: Elements of moral judgment in Kant and discourse ethics. In: A. Wellmer, The Persistence of Modernity, D. Midgley (trans.). Cambridge, MA: MIT Press. 113-231.
  • 1995. White, Stephen K. (ed.). The Cambridge Companion to Habermas. Cambridge: Cambridge University Press.
  • 2007. Yates, Melissa. Rawls and Habermas on religion in the public sphere. Philosophy and Social Criticism. 33, 880-891.

 

Author Information

Max Cherem
Email: Max.Cherem@kzoo.edu
Kalamazoo College
U. S. A.

Ancient Aesthetics

It could be argued that ‘ancient aesthetics’ is an anachronistic term, since aesthetics as a discipline originated in 18th-century Germany. Nevertheless, there is considerable evidence that ancient Greek and Roman philosophers discussed and theorised about the nature and value of aesthetic properties. They also undoubtedly contributed to the development of the later tradition because many classical theories were inspired by ancient thought; and, therefore, ancient philosophers’ contributions to the discussions on art and beauty are part of the traditions of aesthetics.

The ancient Greek philosophical tradition starts with the pre-Socratic philosophers. In most cases, there is little evidence of their engagement with art and beauty, with the one notable exception of the Pythagoreans. In the Classical period, two prominent philosophers, Plato and Aristotle, emerged. They represent an important stage in the history of aesthetics. The problems they raised and the concepts they introduced are well known and discussed even today.

The three major philosophical schools in the Hellenistic period (the Epicureans, the Stoics and the Sceptics) inherited a certain philosophical agenda from Plato and Aristotle while at the same time presenting counterarguments and developing distinct stances. Their contributions to aesthetics are not as famous and, in some cases, are significantly smaller than those of their predecessors, yet in certain respects, they are just as important. In late antiquity, the emergence of Neoplatonism marks another prominent point in the aesthetic tradition. Neoplatonists were self-proclaimed followers of Plato, yet starting with the founder of the school, Plotinus, Neoplatonists advocated many distinctly original views, some of them in aesthetics, that proved to be enduringly influential.

The history of ancient Greek aesthetics covers centuries, and during this time numerous nuanced arguments and positions were developed. In terms of theories of beauty, however, it is possible to classify the theories into three distinct groups: those that attribute the origin of beauty to proportion, those that attribute it to functionality and those that identify the Form as the cause of beauty. This classification ought not to be understood as a hard-and-fast distinction among philosophical schools, but as a way of pinpointing some major theoretical trends. Oftentimes, philosophers use a combination of these positions, and many original innovations are due to the convergence and interaction among them.

Ancient philosophers were also the authors of some of the more notable concepts in the philosophy of art. The notions of catharsis, sublimity and mimesis originated in antiquity and have played a role in aesthetics ever since then.

Table of Contents

  1. Ancient Aesthetics: Methodological Issues
    1. Aesthetics in Antiquity
    2. To Kalon
  2. Three Types of Theories about the Origin of Beauty
    1. Proportion
      1. Pythagoreans
      2. Plato and Aristotle
      3. The Stoics
    2. Functionality
      1. Xenophon
      2. Hippias Major
      3. Aristotle
      4. The Stoics
    3. Form
      1. Plato
      2. Plotinus
  3. Philosophy of Art
    1. Mimesis
      1. Plato
      2. Aristotle
    2. Criticism of Arts
      1. Plato
      2. Epicureans
    3. Catharsis
    4. Sublime
  4. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Ancient Aesthetics: Methodological Issues

a. Aesthetics in Antiquity

One of the most important foundational issues about ancient aesthetics is the question of whether the very concept of ‘ancient aesthetics’ is possible. It is generally considered that aesthetics as a discipline emerged in the 18th century. To speak of ancient Greek and Roman aesthetics, therefore, would be an anachronism. Furthermore, there are certain differences between ancient and modern approaches to the philosophical study of beauty and art that make them distinct projects. These differences were outlined and discussed by Paul Oskar Kristeller, an influential critic of ancient aesthetics, who suggested that the ancients’ interest in moral, religious, and practical aspects of works of art, combined with their lack of grouping the fine arts into a single category and presenting philosophical interpretations on that basis, means that aesthetics was not a philosophical discipline in antiquity (Kristeller 1951: 506).

Kristeller’s critique is still often quoted and discussed in works that deal with the ancients’ ideas on arts and beauty. The question of how compatible ancient and modern methodologies are remains a relevant issue. At the same time, Kristeller’s view has been challenged by a number of compelling arguments in 20th and early-21st century scholarship.

These arguments also pinpoint some of the central concepts that ancient philosophers used. Stephen Halliwell criticised Kristeller’s argument by pointing out that, first, the notion of mimesis was a much more unified concept of art than Kristeller allows (see below for a more detailed explanation of mimesis). Second, the 18th-century category of fine art, established in such works as Batteux’s Les beaux arts réduits à un même principe (1746), relied on the mimetic tradition, although later the focus shifted towards different conceptions of art (Halliwell 2002: 7–8). Peponi later refuted Kristeller’s claims by pointing out that ancient Greek thinkers grouped activities we call fine arts and, moreover, were interested in the effects produced by the beautiful properties of, for instance, poetry (Peponi 2012: 2–6).

James Porter has also criticised Kristeller’s premises and conclusions on three different grounds: Kristeller’s historical account is not the only one possible; “the modern system of arts” is not as clear-cut a category as Kristeller makes it out to be; and it does not follow that the existence of the concept of fine arts indicates the emergence of aesthetic theory (Porter 2009). In addition to this, it has been argued that the ideas of Plato and Aristotle are not only relevant to the preoccupations of modern philosophers but also address the foundational questions of aesthetics and philosophy of art (Halliwell 1991).

b. To Kalon

Another methodological issue concerning ancient aesthetics is a linguistic one, namely the translation and conceptualisation of the term to kalon (honestum in Latin), whose meaning is ambiguous. The issue at stake is the question of when this term can and cannot be read and translated as an aesthetic one. The Greek language has a rich vocabulary of terms that are uncontroversially aesthetic, but to kalon, a fairly popular term in philosophical texts, has a range of meanings from ‘beauty’ to ‘being appropriate.’ The problem arises especially in ethical discussions, when the context does not make it clear whether the usage of the term to kalon ought to be understood as aesthetic or not.

It has been customary to translate to kalon in ethical contexts as ‘fine’ or something similar. Early 21st-century thinkers have argued, however, that to kalon and similar Greek and Latin terms (to prepon in Greek; honestum and decorum in Latin) ought to be read as aesthetic concepts. The translations that ignore the aesthetic aspect of these terms may not capture their meaning accurately (Bychkov 2010: 176). Or, more specifically, the use of to kalon in Aristotle’s works often has aesthetic meaning and, therefore, can be translated as ‘beautiful’ (Kraut 2013). At the same time, some studies of Aristotle’s use of to kalon have argued that the conceptualisation and translation of the term depend on the context in which it is found. In the context of ethical discussions, more neutral or ethical translations ought to be preferred over aesthetic ones (Irwin 2010: 389–396).

2. Three Types of Theories about the Origin of Beauty

a. Proportion

The idea that beauty in any given object originates from the proportion of the parts of that object is one of the most straightforward ways of accounting for beauty. The most standard term for denoting this theory is summetria, meaning not bilateral symmetry, but good, appropriate or fitting proportionality.

The idea that beauty derives from summetria is usually attributed to the sculptor Polycleitus (5th century B.C.E.), who wrote a treatise entitled Canon containing a discussion of the exact proportions that generate beauty and then made a statue, also entitled Canon, exemplifying his theory. Little is known of Polycleitus’ work and ideas, but when the famous Roman architect Vitruvius used this notion in his De Architectura, he explained it in terms of specific numerical ratios. For instance, in the human body, the distance from the chin to the crown of the head is an eighth part of the whole height; the length of the foot is a sixth part of the height of the body, while the forearm is a fourth part. Then Vitruvius adds that ancient painters and sculptors achieved their renown by following these principles (Book 3.1.2). It is likely that Polycleitus’ treatise had similar contents, such as a discussion of specific ratios that produce beauty in a human body, and was therefore useful for making sculptures of idealised human forms.

i. Pythagoreans

Equally, if not more, significant for the philosophical tradition are Pythagorean ideas about the fundamentality of numbers. Of course, Pythagoreanism was far from a unified school of thought; diverse philosophers were given that name during antiquity. The Pythagoreans referred to here are the philosophers active during the 5th and 4th centuries B.C.E., such as Philolaus and Archytas.

Numbers, according to this strand of Pythagoreanism, underlie the basic ontological and epistemological structure of the world and, as a result, everything in the world can be explained in terms of numbers and the relationship between them, namely, proportion. Beauty is one of the properties that the Pythagorean philosophers use to support their doctrine, because they claimed its presence can be fully explained in terms of numbers or, to be more precise, the proportion and harmony that is expressed in numerical relationships.

Sextus Empiricus recorded the Pythagorean argument that sculpture and painting achieve their ends by means of numbers, and thus art cannot exist without proportion and number. Art, the argument continues, is a system of perceptions and the system is reducible to a number (Sextus Empiricus Against the Logicians Book 1.108–9).

The Pythagoreans had a well-known interest in music. The evidence on this topic is wide-ranging: from the reputation of Pythagoras as the first one to pinpoint the mathematics underlying the Greek music scale to Socrates’ remark in the Republic attributing to Pythagoras the claim that music and astronomy were sister sciences (Rep. 530D). Music is also said to have a positive influence on a person’s soul. According to a testimonial from Aristoxenus, music had an effect on a person’s soul comparable to the effect that medicine has on a person’s body (Diels, II. 283, 44). Arguably this role was attributed to music due to its being an expression of the harmonizing influence of numbers.

ii. Plato and Aristotle

Although generally speaking, Plato is best classified as a Form Theorist, a small number of passages in the Platonic corpus suggest a viewpoint derived from summetria, that is, a good proportion or ratio of parts.

In the Timaeus, lacking summetria is associated with lacking beauty (87D). Similarly, both in the Republic and the Sophist, beauty is said to derive from arrangements (Rep. 529D–530B and Sph. 235D–236A respectively). Plato’s use of summetria raises the question of how this theory was supposed to function alongside the idea that beauty derives from the form of beauty. Most likely, however, there was no contradiction for Plato. Summetria is one of the properties that beautiful things have, rather than the cause of beauty, which is its form. Summetria, as well as such properties as colour and shape, is one of the aspects that an object gains by partaking in the form.

The case is similar in the Aristotelian corpus. Aristotle named summetria one of the chief forms of beauty, alongside order and definiteness (M 3.1078a30–b6). The context for this definition is the refutation of the view put forth by the sophist Aristippus, who argued that mathematics has nothing to say about the good and the beautiful (M 3.996a). Since the causes of aesthetic properties are describable in mathematical terms, mathematics does, in fact, have something to say about these things. Similarly, in the Physics, bodily beauty (kallos) is named as one of the excellences that depend on particular relations (Ph. 246b3–246b19), and in the Topics, it is said to be a kind of summetria of limbs (Topics 116b21). The beautiful (to kalon) is also identified with being well arranged in On the Universe (397a6).

At the same time, Aristotle did not think that summetria was a sufficient condition for beauty. He claimed that size was also necessary for beauty. In Nicomachean Ethics 4.3, beauty is said to imply a good-sized body, so that little people might be well-proportioned, but not beautiful. The city as well is required to be of a certain size before it can be called beautiful (Politics 7.4).

iii. The Stoics

Summetria assumed a much more significant role in Stoicism. The Stoics defined beauty as originating from the summetria of parts with each other and with the whole. Galen (On the Doctrines of Hippocrates and Plato 5.3.17) attributes this definition to Chrysippus, the third head of the school, but all other testimonials describe it simply as the Stoic definition. This definition is meant to apply to both the beauty of the body and the beauty of the soul (Arius Didymus Epitome of Stoic Ethics 5b4–5b5 (Pomeroy); Stobaeus Ecl. 2.62, 15). Some sources suggest that there are additional conditions: for the former, colour, and for the latter, the stability or consistency of beliefs (Plotinus Ennead 1.6.1; Cicero Tusculan Disputations 4.13.30). In many respects, the Stoics inherit this understanding of beauty from their predecessors, but it is worth noting that they also often invoked the notion of functional beauty. Stoic aesthetics, therefore, was likely a combination of functional and proportion theories.

b. Functionality

The theory of functional beauty is the idea that beauty originates in an object when that object performs its functions, achieves its end or fits its purpose, especially when it is done particularly well, that is, excelling at the task of achieving that end. In an ancient philosophical context, this idea is also often associated with the notion of dependent beauty, which means an object is beautiful if it excels at functioning as the kind of object it is. It is also noteworthy that the Greek term to kalon, often—but not always—used as an aesthetic term, can be used to denote being fitting or well-executed. The functionalist theory of beauty might have been more linguistically intuitive to ancient Greeks than it is possible to convey in English.

i. Xenophon

It is hard to attribute this theory to one particular philosopher, since functionalist arguments are fairly common in ancient philosophy texts. An example of functional theory can be found in Xenophon’s Memorabilia. Socrates first makes a point about dependent beauty by saying “a beautiful wrestler is unlike a beautiful runner, a shield beautiful for defence is utterly unlike a javelin beautiful for swift and powerful hurling” (3.8.4). Then he further develops this point by adding that “it is in relation to the same things that men’s bodies look beautiful and good and that all other things men use are thought beautiful and good, namely, in relation to those things for which they are useful” (3.8.5).

It is not obvious that the term to kalon employed here is used in an aesthetic sense, but a few lines down, it is said that “the house in which the owner can find a pleasant retreat at all seasons and can store his belongings safely is presumably at once the pleasantest and the most beautiful. As for paintings and decorations, they rob one of more delights than they give” (3.8.10). This remark highlights that the issue at stake is aesthetic phenomena, and that a much greater pleasure is to be gained from perceiving functionality rather than perceiving pleasing, yet artificial, colours (paintings) and structures.

ii. Hippias Major

A functional definition of beauty is also found in Plato’s dialogue Hippias Major. In this dialogue, Socrates engages in a discussion with Hippias, a sophist, in order to discover the definition of beauty. They each give a number of possible options, and one of them, proposed by Socrates, was a functional definition.

It is argued that stone, rather than ivory, is more beautiful as material for eye pupils in Pheidias’ statue and that a fig wood ladle is much better suited, and so more beautiful, than a gold one for making soup. Socrates proposes these two cases as objections to Hippias’ proposal that beauty is gold. By presenting two cases in which a beauty-making property is not some inherent property of an object, but that object’s functionality, Socrates rejects Hippias’ suggestion. This move also leads to examining the possibility that all beauty is to be defined as deriving from functionality, but this option is ultimately rejected as well on the grounds that it appears to rely on a kind of deception, because it prioritizes how things appear over how things truly are (290D–294E).

iii. Aristotle

In Aristotle’s work, there are many instances of excellence in functionality described by the term to kalon. In fact, Aristotle states outright that fitting a function and to kalon are the same (Top. 135a12–14). Since this term can be used both aesthetically and non-aesthetically, it is a matter of contention whether in some specific cases the reference for this term is meant to be an aesthetic phenomenon or not.

If to kalon is read aesthetically, some of the most pertinent passages for the functionalist understanding of aesthetic properties would come from Aristotle’s descriptions of natural phenomena. For instance, according to Generation of Animals, the generation of bees reveals a kalon arrangement of nature; the generations succeed one another even though drones do not reproduce (760a30–b3). In Nicomachean Ethics, Aristotle states that dogs do not enjoy the scent of rabbits as such, but the prospect of eating them; similarly, the lion appears to delight in the lowing of an ox, but only because it perceives a sign of potential food (1118a18–23).

iv. The Stoics

A kind of functionality, expressed in aesthetic language, also appears in some Stoic arguments, most notably in the works of Panaetius, who used the term to prepon (‘fitting’, ‘becoming’, ‘appropriate’ in English) in his ethical arguments. Probably the most elaborate discussion of to prepon (or decorum in Latin) is recorded in Cicero’s On Duties, which represents Panaetius’ views.

Here, an analogy between poetry and human behaviour is drawn as follows. The poets “observe propriety, when every word or action is in accord with each individual character.” The poets depict each character in a way which is appropriate regardless of the moral value of the character’s actions, so that a poet would be applauded even when he skilfully depicts an immoral person saying immoral things. To human beings, meanwhile, nature also assigned a kind of role, namely that of manifesting virtues like steadfastness, temperance, self-control, and so forth. This claim reflects one of the essential tenets of Stoic ethics, eudaimonia, which is living in accordance with nature and pursuing virtue (Diogenes Laertius 7.87–9; Long and Sedley 63C). Human beings, therefore, are functional entities as well in the sense that they have a certain function and end. The idea that achieving that end produces beauty is made clear when it is said that just as physical beauty consisting of the harmonious proportion of limbs delights the eye, so too does to prepon in behaviour earn the approval of fellow humans through the order, consistency and self-control imposed on speech and acts (1.97–98).

c. Form

i. Plato

Plato’s best-known argument, the theory of forms, bears on his aesthetics in a number of ways. The theory posits that incorporeal, unchanging, ideal paradigms, the forms, are universals and play an important causal role in the generation of the world. Arguably the most important way in which the theory of forms bears on aesthetics is the account of the origin of aesthetic properties. Beauty, just like many other properties, is generated by its respective form. An object becomes beautiful by partaking in the form of Beauty. The form of Beauty is mentioned as the cause of beauty throughout the Platonic corpus; see, for instance, Cratylus 439C–440B; Phaedrus 254B; Phaedo 65D–66A and 100B–E; Parmenides 130B; Republic 476B–C, 493E, 507B. In this respect, the form of Beauty is just like all the other forms. Plato does, however, say that the form of Beauty has a special connection with the form of Good, even if they are not, ultimately, identical (Hippias Major 296D–297D).

The form of Beauty is shown as having a pedagogical aspect in the Symposium. In Diotima’s speech, the acquisition of knowledge (that is, the knowledge of the forms) is represented as the so-called Ladder of Love. A lover is said first to fall in love with an individual body, and then to notice that there are commonalities among all beautiful bodies, thus becoming an admirer of the human form in general. Then the lover starts appreciating the beauty of the mind, followed by the beauty of institutions and laws. The love of sciences is the next step on the ladder, until the lover finally perceives the form of Beauty. The form is said to be everlasting, not increasing or diminishing, not beautiful at one point and ugly at another, not beautiful only in relation to any specific condition, not in the shape of any specific thing, such as a limb, a piece of knowledge or an animal. Instead, it is absolute, everlasting, unchanging beauty itself (210A–211D).

ii. Plotinus

Plotinus, a self-proclaimed follower of Plato, was also committed to the view that beauty originated from the form of Beauty, adding some further elaborations of his own. Plotinus presents this account as a rival to the summetria theory. His treatise On Beauty (Ennead 1.6) starts with an elaborate critique of the rival theory. Plotinus claims that accounting for beauty by means of summetria has a number of drawbacks. For instance, it cannot explain the beauty in unified objects that do not have parts, such as a piece of gold.

According to Plotinus’ own theory, an object becomes beautiful by virtue of its participating in the form. He also adds that the Intellect (nous) is the cause of beauty. To be precise, it is the Intellect that imposes the forms onto passive matter thus producing beauty. Those entities that do not participate in the form, and thus reason, are ugly (1.6.2). The form is therefore capable of producing beauty by virtue of its being an instrument of the Intellect that creates order and structure out of chaotic matter in the universe, and beauty is an expression of its designing powers.

Apart from being expressions of the Intellect, forms have another aspect that makes them the cause of beauty; namely, they unify disarrayed and chaotic elements into harmony. When form approaches formless matter, it introduces a certain intrinsic agreement, so that many parts are brought into unity and harmony with each other. The form has intrinsic unity and is one, and therefore, it turns the matter it shapes into one as well, as far as it is possible. The unity produces beauty which ‘communicates itself’ to both the parts and the whole (1.6.2).

Plotinian metaphysics and aesthetics converge in the analogy between Intellect shaping the universe and a sculptor shaping a piece of stone into a statue. At the beginning of his Ennead 5.8 (On the Intelligible Beauty), Plotinus asks his readers to envision two pieces of stone placed next to each other, one plain and the other sculpted into the shape of an especially beautiful human or some god. Then he argues that the latter will appear immediately beautiful, not because of the material it is made out of but because it possesses the form. The beauty is caused by the intellect of the sculptor, which transmits the form onto the stone. The visible form that the sculptor imposes onto the stone is an inferior version of the actual form that can only be contemplated. The actual forms are purely intellectual, ‘seen’ with the mind’s eye. The intellectual beauty of reason, argues Plotinus, is a much greater and also truer beauty (En. 5.8.1).

Plotinus follows Plato in arguing that visible beauty is inferior as it is only a copy of the true beauty of forms. There is, however, a significant difference between them in terms of their attitude towards the value of artistic beauty. Plotinus warns against devaluing artistic activities and, in an argument very much unlike those found in Plato, states that (i) nature itself imitates some things; (ii) the arts do not simply imitate what is seen by the eye but refer back to the principles of nature; (iii) the arts produce many things not by means of copying, but from themselves, adding what is lacking in order to create a perfect whole, because the arts contain beauty themselves; and (iv) Pheidias (one of the most famous Greek sculptors) designed a statue of Jupiter not by imitation, but by conceiving a form that a god would take if he were willing to show himself to humans (5.8.1).

3. Philosophy of Art

a. Mimesis

In older scholarship, it is common to find the claim that the Greek term for art was techne and that, as this is a much narrower term than the contemporary concept of fine art, the ancient Greeks did not have a concept of fine art. This interpretation, however, has been challenged. It has been argued that, if there were a concept of fine art in Greek thought, it would be mimesis. In the most literal meaning of the term, mimesis refers to imitation in a very broad sense, including such acts as following an example of someone’s behaviour or adopting a certain custom. This word is widely used when discussing art and artistic activities, and it can be roughly defined as an imitative representation, where ‘representation’ is understood as involving not just copy-making, but also creative interpretation. Aristotle grouped poetry with “the other mimetic arts” (8.1451a30) in the Poetics, in a remark that suggests the conceptualisation of a distinct group of artistic activities resembling the notion of fine arts. A similar grouping of “imitators” (mimetai), including poets, rhapsodists, actors, and chorus-dancers, can be found in Plato’s Republic as well (2.373B).

i. Plato

Books 2 and 3 of Plato’s Republic contain an extensive analysis of mimesis in the context of the education of the guardian class in the ideal city-state. In Book 2, Socrates starts developing his account of the ideal city-state. The class of guardians plays an especially important role in its maintenance, and therefore, the question of how the guardians ought to be educated is raised. Apart from physical education, education based on storytelling is quite important, as it starts early in childhood and precedes physical education (2.376E).

First, Socrates and the interlocutors agree to ban from the guardians’ education, and from the ideal city-state more generally, certain stories based on their content, particularly stories depicting the gods committing evil deeds (2.377D–E). At the start of Book 3, there is a longer list of the kinds of stories that are undesirable in the ideal city, including ones with negative portrayals of the afterlife, lamentations, gods committing unseemly acts and portrayals of bad people as happy (3.386A–392C).

Then there follows a discussion of the style (Gr. lexis) of narration. Socrates distinguishes direct speech, when a poet speaks in his own voice, from imitative speech, when a poet imitates the speech of the characters in the story, and suggests that if a poem is written in the former style, it contains no mimesis (3.393D). Poetry can be of three kinds: dithyrambs (in the poet’s own voice, no mimesis), tragedy and comedy (pure mimesis) and epic poetry (a combination of the two) (3.394C).

The discussion turns towards the question of whether mimetic poets ought to be allowed into the city-state and whether guardians themselves could be mimetai. The answer to this question turns out to be negative. The main argument against mimesis in the ideal city goes as follows. The guardians preserve the well-being of the city, and thus the only things they ought to imitate are the properties of virtue, not shameful or slavish acts. The reason for this is that enjoying the imitation of these things might lead them to actually pursue them, as imitation is habit-forming (3.395D–E). It is ultimately concluded that only a pure imitator of a good person ought to be allowed into the city-state (3.397D–398B).

ii. Aristotle

Aristotle argues that poetry originates from two causes. Both of these causes are grounded in human nature, particularly the natural proneness of human beings to mimesis. Mimesis is said to be (i) the natural method of learning from childhood and (ii) a source of delight for human beings.

In order to support the latter point, Aristotle notes that although such objects as dead bodies and low animals might be painful to see in real life, we delight in artistic depictions of them, and the reason for this is the pleasure humans derive from learning. People delight in seeing a picture, either because they recognise the person depicted and ‘gather the meaning of things’ or—if they do not recognise the subject—they admire the execution, colour, and so on. (The distinction between mimesis and colour/composition is reiterated in Politics, where colour and figures are said to be not imitations but signs with little connection to morality, and therefore, young men ought to be taught to look at those paintings which depict character (1340a32–39).) This principle applies not only to visual arts. The natural inclination to mimesis combined with the sense of harmony and rhythm is the reason why humans are drawn to poetry as well (Poetics 1448b5–1448b24).

Aristotle’s conceptual analysis of poetry contains a revealing discussion of the differences between poetry and history. Poets differ from historians by virtue of describing not what happened, but what might happen, either because it is probable or necessary. They do not, however, differ because one is set in prose and another one in verse, as the works of Herodotus could be set in verse and remain history. The fundamental difference between history and poetry lies in the fact that the former is concerned with statements about particulars, whereas the latter is concerned with universal statements. Some tragedies do use historical characters, but this, according to Aristotle, is because “what is possible is credible,” which presumably means that plots involving historical characters are more moving because they might have actually happened. Another notable conclusion is that the poet is a poet because of the plot rather than the verse, as the defining characteristic of such activity is the imitation of action (1451a37–b31).

b. Criticism of Arts

i. Plato

Unlike Aristotle, Plato saw potential dangers associated with mimetic activities. In Republic 5, “lovers of beautiful sights and sounds,” people addicted to music, drama and so on, are contrasted with true philosophers. The lovers of sights and sounds pursue only opinions, whereas philosophers are the pursuers of knowledge and, ultimately, beauty in itself (5.475D–480A).

But perhaps the best-known argument criticising art comes from Book 10 of the Republic. Here, the products of artistic activities are criticised for being twice removed from what is actually the case. Socrates uses the example of a symposium couch to argue that the painting of a couch is just a copy of reality, the actual couch. Yet the actual couch made by the craftsman is also just a copy of the true reality, the forms. The painters, according to this argument, portray only a small portion of what is actually the case. For the most part, they are concerned with appearances. There are, thus, three kinds of couches: one produced by god, another produced by a carpenter and the third by a painter. God and the carpenter are called makers or producers of their kinds of couches, but the painter is only an imitator, the producer of a product that stands third from nature (that is, twice removed from it). This category is also said to include tragedians and all the other imitators (596D–597E).

These and other passages have earned Plato a reputation of being hostile to art. Plato’s theory of art, however, is much more complex, and criticism is only one aspect of his treatment of artistic mimesis. An example of a more constructive understanding of artistic imitation can be found in the same work where he famously criticises it, the Republic.

For instance, Socrates suggests that there is an analogy between the ideal political state they are discussing and an idealised portrait, arguing that no one would think the latter is flawed because the painter cannot produce an ideal person in reality and, therefore, there is no need to worry that their ideal state does not actually exist (5.472D–E). Socrates’ remark indicates that there is much more to painting than the copying of appearances. Ideas like these can be found throughout the Republic (see also 6.500E–501C; 3.400E–401A).

In fact, after banishing poetry from the ideal city earlier, Socrates praises Homer, who is said to be the best of the tragedians, and a concession is made for hymns to the gods and eulogies to good people. Socrates also adds that even imitative poetry could be welcomed in the city, provided there is an argument showing it ought to belong to such well-governed places (10.606E–607C). The ancient quarrel between poets and philosophers, as Plato called it, was neither unambiguous nor a settled matter.

ii. Epicureans

The Epicureans, members of the Hellenistic philosophical school notorious for its atomist physics and hedonist ethics, were also critics of poetry. The Epicurean ethical views, especially the claim that death is not evil, played a major role in shaping their perspective on poetry. The extant works of the founder of the school, Epicurus, show him criticising muthos, stories told by poets. Epicurus was concerned with the dangerous influence that these stories could have on those who hear them. The stories of poets are based on beliefs that produce the feeling of anxiety in listeners (for instance, the belief that life is full of pain and it is best not to be born at all). The opposite of these are beliefs gained by studying nature and engaging in philosophical investigations. Such studies lead to the discovery that the greatest pleasure in life is ataraxia (the state of tranquillity) and to abolishing the fear of pain and death (Letter to Menoeceus 126–7; Principal Doctrines 12). Epicurus also notoriously argued against receiving the traditional education (paideia) that includes an education in poetry (Letter to Pythocles 10.6; Plutarch 1087A).

It is noteworthy, however, that Epicurus was not unequivocally opposed to poetry and arts. Some evidence suggests that he maintained that only an Epicurean would discuss music and poetry in the right way, although the Epicureans would not take up writing poetry themselves (Diogenes Laertius, Lives 10.120; Plutarch 1095C). It appears that for Epicurus, like Plato, arts were problematic because of their power to impart incorrect beliefs and emotions that pose risks to one’s ataraxia.

Lucretius, the author of the Epicurean epic poem De Rerum Natura, espouses a somewhat different attitude toward poetry. Written in the 1st century B.C.E. in Latin, the poem is an exposition of Epicurean views including atomism, hedonistic ethics and epistemic dogmatism (especially against attacks from the Sceptics). As a whole, the poem engages very little with aesthetic issues, with the exception of the often-quoted passage from Book 1, in which Lucretius talks about the effects of poetry. He compares himself to a physician who, administering unpleasant-tasting wormwood, covers the brim of the glass with honey, not to deceive his patients, but to help them take the medicine and become better. In the same way, Lucretius himself sweetens doctrines that otherwise might seem woeful to those who are new to Epicureanism (1.931–50).

c. Catharsis

Catharsis is a psychological phenomenon, often associated with the effects of art on humans, famously described by Aristotle. There is, however, no explicit definition of catharsis in the extant Aristotelian corpus. Instead, we have a number of references to such a phenomenon. The one most pertinent to aesthetics is found in Poetics, where one of the defining features of tragedy is a catharsis of such emotions as fear and pity (1449b22–28). Another reference to catharsis can be found in Politics. Here Aristotle writes that music ought to be used for education, catharsis and other benefits (1341b37–1342a1). The lack of Aristotle’s own definition combined with the long and rich history of later interpretations of catharsis (see Halliwell 1998: app. 5) makes it hard to reconstruct a precise Aristotelian account of this term. It is arguably related to the influence that arts have on a person’s emotions and judgements that derive from those emotions (Politics 1340a1–1340b18).

It has been argued that the concept of catharsis has both religious and medical connotations, although more recent interpretations favour the view that it is primarily a psychological phenomenon that has certain ethical aspects (though it is not a means to learn ethics per se).

d. Sublime

Another aesthetic term that originated in antiquity, but was made famous by subsequent adaptations, especially by Kant and Burke, is that of the sublime. The main source for the theory of the sublime is the handbook on oratory titled Peri Hupsous (De Sublimitate in Latin), although it is also noteworthy that a notion of the sublime was known and used much more widely in antiquity (Porter 2016). The authorship of Peri Hupsous is disputed. The work has been attributed both to Cassius Longinus, a Greek rhetorician in the 3rd century C.E., and to an anonymous author in the 1st century C.E. referred to as pseudo-Longinus.

Fundamentally, the sublime as described by Longinus is a property of style, a “certain loftiness and excellence of language.” It does have some more striking aspects, however. For instance, Longinus states that:

A lofty passage does not convince the reason of the reader, but takes him out of himself . . . Skill in invention, lucid arrangement and disposition of facts, are appreciated not by one passage, or by two, but gradually manifest themselves in the general structure of a work; but a sublime thought, if happily timed, illumines an entire subject with the vividness of a lightning-flash, and exhibits the whole power of the orator in a moment of time (1).

Longinus suggests that sublimity originates from five different sources: (i) the greatness of thought; (ii) a vigorous treatment of passions; (iii) skill in employing figures of thought and figures of speech; (iv) dignified expressions, including the appropriate choice of words and metaphors; and (v) majesty and elevation of structure. The last cause of sublimity is said to embrace all the preceding ones as well (8.1).

4. References and Further Reading

a. Primary Sources

  • Armstrong, A. 1966–88. Plotinus: Enneads. 7 vols. Cambridge, MA: Harvard University Press.
  • Arnim, H. F. A. von. 1903–1924. Stoicorum Veterum Fragmenta. 3 vols. Leipzig: Teubner.
  • Bychkov, O. and A. Sheppard, eds. 2010. Greek and Roman Aesthetics. Cambridge: Cambridge University Press.
  • Cooper, J. and D. Hutchinson, eds. 1997. Plato: Complete Works. Indianapolis; Cambridge: Hackett.
  • Diels, H. and W. Kranz, eds. 1951–1952. Die Fragmente der Vorsokratiker, griechisch und deutsch. 3 vols. Berlin: Weidmannsche Buchhandlung.
  • Dyck, A. R. 1996. A Commentary on Cicero De Officiis. Ann Arbor: University of Michigan Press.
  • Goodwin, W. 1874. Plutarch’s Morals. Cambridge: John Wilson and Son.
  • Hicks, R. D. 1925. Diogenes Laertius: Lives of Eminent Philosophers. London: W. Heinemann; New York: G.P. Putnam’s Sons.
  • King, J. 1945. Cicero: Tusculan Disputations. Cambridge, MA: Harvard University Press.
  • Long, A. and D. Sedley, eds. 1987. The Hellenistic Philosophers. 2 vols. Cambridge: Cambridge University Press.
  • Leonard, W. E. 1916. Lucretius: De Rerum Natura. London: Dent; New York: Dutton.
  • O’Connor, E. M. 1993. The essential Epicurus: letters, principal doctrines, Vatican sayings, and fragments. Buffalo, N.Y.: Prometheus Books.
  • Roberts, W. R. 2011. Longinus on the Sublime: The Greek Text Edited after the Paris Manuscript. 2nd edn. Cambridge: Cambridge University Press.

b. Secondary Sources

  • Asmis, E. 1991. “Epicurean Poetics.” Proceedings of the Boston Area Colloquium in Ancient Philosophy 7, pp. 63–93. Reprinted in Philodemus and Poetry: Poetic Theory and Practice in Lucretius, Philodemus and Horace, ed. by D. Obbink, Oxford University Press 1995, pp. 15–34; and in Ancient Literary Criticism, ed. Andrew Laird, Oxford University Press 2006, pp. 238–66.
    • (A discussion of the evidence concerning the views on poetry found in the works of Epicurus, Lucretius and Philodemus.)
  • Barney, R. 2010. “Notes on Plato on The Kalon and The Good.” Classical Philology 105(4): 363–377.
    • (A discussion of functionality and its relationship to beauty in Plato’s works.)
  • Beardsley, Monroe C. 1966. Aesthetics from Classical Greece to the Present. New York: Macmillan.
    • (Relevant sections of this book contain a classic interpretation of ancient aesthetics.)
  • Bernays, J. 1979. “Aristotle on the Effect of Tragedy.” In Articles on Aristotle, edited by J. Barnes, M. Schofield, and R. Sorabji. Vol. 4: Psychology and Aesthetics, 154–165. London. (Originally in Abhandlungen der historisch‐philosophischen Gesellschaft in Breslau, vol. 1, 1857: 135–202; and Sonderausgabe, Breslau 1857.)
    • (A seminal paper for the study of Aristotle’s concept of catharsis; it argues that catharsis is the ‘purgation’ of emotions.)
  • Bett, R. 2010. “Beauty and its Relation to Goodness in Stoicism.” In Ancient Models of Mind, ed. A. Nightingale and D. Sedley, 130–152. Cambridge: Cambridge University Press.
    • (In this paper, the evidence for the Stoic definition of beauty as summetria is collected and interpreted.)
  • Boudouris, K. ed. 2000. Greek Philosophy and the Fine Arts, Volume 2. Athens: International Centre for Greek Philosophy and Culture.
    • (A large collection of papers on various aspects of ancient Greek aesthetics.)
  • Bychkov, O. 2010. Aesthetic Revelation: Reading Ancient and Medieval Texts after Hans Urs von Balthasar. Washington, D.C.: Catholic University of America Press.
    • (A wide-scope monograph; the central argument concerns the notion of the revelatory aesthetics and its presence in ancient (and later) philosophical texts.)
  • Close, A. J. 1971. “Philosophical Theories of Art and Nature in Classical Antiquity.” Journal of the History of Ideas 32(2): 163–184.
    • (A study of the notion of creator/designer in antiquity.)
  • Demand, N. 1975. “Plato and the Painters.” Phoenix 29(1): 1–20.
    • (An article discussing Plato’s attitude to painting and the relationship between his views and contemporary painting traditions.)
  • Denham, A. ed. 2012. Plato on Art and Beauty. New York: Palgrave Macmillan.
    • (A collection of papers on Plato’s philosophy of art.)
  • Destrée, P. and P. Murray, eds. 2015. A Companion to Ancient Aesthetics. Hoboken, NJ: Wiley-Blackwell.
    • (A wide-ranging collection of extended entries, including such topics as mimesis, beauty, sublime, art and morality, tragic emotions and others.)
  • Ford, A. 1995. “Katharsis: The Ancient Problem.” In Performativity and Performance, edited by A. Parker and E. K. Sedgwick, 109–32. New York and London.
    • (An interpretation of Aristotle’s concept of catharsis with an argument that the relevant passages from Politics help to shed light on the sparse description in Poetics.)
  • Gál, O. 2011. “Unitas Multiplex as the Basis of Plotinus’ Conception of Beauty: An Interpretation of Ennead V.8.” Estetika: The Central European Journal of Aesthetics 48(2): 172–198.
    • (A paper arguing that, for Plotinus, beauty derives from Intellect and unity in diversity.)
  • Golden, L. 1973. “The Purgation Theory of Catharsis.” The Journal of Aesthetics and Art Criticism 31(4): 473–479.
    • (An in-depth argument against Bernays’ interpretation of catharsis as purgation; it contains a suggestion that catharsis is better understood as intellectual clarification.)
  • Halliwell, S. 1991. “The Importance of Plato and Aristotle for Aesthetics.” Proceedings of the Boston Area Colloquium in Ancient Philosophy, vol. 7, pp. 321–48. New York: Routledge.
    • (A paper arguing that Plato and Aristotle address issues that are pertinent to contemporary aesthetics.)
  • Halliwell, Stephen. 1998. Aristotle’s Poetics. 2nd edn. London: Duckworth.
    • (An extensive study of Poetics, including a number of concepts central to Aristotle’s aesthetics; also includes appendices on the history of interpreting catharsis after Aristotle, dating of Poetics and others.)
  • Halliwell, Stephen. 2002. The Aesthetics of Mimesis: Ancient Texts and Modern Problems. Princeton: Princeton University Press.
    • (A seminal study of the concept of mimesis in Greek philosophy and literature.)
  • Horn, H.-J. 1989. “Stoische Symmetrie und Theorie des Schönen in der Kaiserzeit.” Aufstieg und Niedergang der römischen Welt 36.3: 454–472.
    • (A study of the Stoic definition of beauty as summetria.)
  • Hyland, D. 2008. Plato and the Question of Beauty. Bloomington & Indianapolis: Indiana University Press.
    • (An interpretation of Plato’s notion of beauty in Symposium, Hippias Major and Phaedrus influenced by continental philosophy.)
  • Irwin, T. 2010. “The Sense and Reference of Kalon in Aristotle.” Classical Philology 105(4): 381–396.
    • (An argument for avoiding an aesthetic translation of the term to kalon in Aristotle’s works on ethics.)
  • Kraut, R. 2013. “An aesthetic reading of Aristotle’s Ethics.” In Politeia in Greek and Roman Philosophy, ed. M. Lane and V. Harte, pp. 231–250. Cambridge: Cambridge University Press.
    • (An argument for translating to kalon in Aristotle’s work as an aesthetic term.)
  • Kristeller, P. O. 1951. “The Modern System of the Arts: A Study in the History of Aesthetics Part I.” Journal of the History of Ideas 12(4): 496–527.
    • (An article containing arguably the most significant critique of the notion of ancient aesthetics.)
  • Konstan, D. 2015. Beauty: The Fortunes of an Ancient Greek Idea. Oxford: Oxford University Press.
    • (A wide-ranging study of the ancient Greek conception of beauty; includes a discussion of translating problematic aesthetic terms.)
  • Laird, A. ed. 2006. Ancient Literary Criticism. Oxford: Oxford University Press.
    • (A collection of papers covering a wide range of topics including Aristotle’s catharsis, the views of the Hellenistic schools on poetry and Plato’s treatment of tragedy.)
  • Lear, J. 1988. “Katharsis.” Phronesis 33: 297–326.
    • (An argument against the interpretation of catharsis as ‘purgation’ of emotions, and a suggestion that it is, instead, a psychological phenomenon with certain ethical connotations.)
  • Lear, G. R. 2006. “Aristotle on Moral Virtue and the Fine.” In The Blackwell Guide to Aristotle’s Nicomachean Ethics, ed. R. Kraut, pp. 116–136. Malden, MA; Oxford: Blackwell.
    • (A study of Aristotle’s use of to kalon with the argument that Aristotle used this term (with its aesthetic undertones) to put an emphasis on certain properties of goodness, namely, intelligibility and pleasantness to contemplate.)
  • Lobsien, V. and C. Olk, eds. 2007. Neuplatonismus und Ästhetik: zur Transformationsgeschichte des Schönen. Berlin/New York: De Gruyter.
    • (A collection of papers on Neoplatonist aesthetics.)
  • Lombardo, G. 2002. L’Estetica Antica. Bologna: Il Mulino.
    • (A short monograph in Italian containing a discussion of views on aesthetics espoused by both major and lesser-known philosophical figures in antiquity.)
  • Nehamas, A. 2007. “‘Only in the Contemplation of Beauty is Human Life Worth Living’ Plato, Symposium 211d.” European Journal of Philosophy 15 (1): 1–18.
    • (A discussion of the role that beauty plays in Plato’s Symposium.)
  • Nussbaum, M. 1990. Love’s Knowledge: Essays on Philosophy and Literature. Oxford: Oxford University Press.
    • (The relevant sections of this book analyze the complex relationship between philosophy and literature in Plato’s works.)
  • Pappas, N. 2012. “Plato on Poetry: Imitation or Inspiration?” Philosophy Compass 7 (10): 669–678.
    • (An argument that in Republic and Sophist, poetry is treated as imitation, whereas in Ion and Phaedrus, it is treated as inspiration. The relationship between the two views is explained by employing Plato’s concept of drama in Laws.)
  • Peponi, A. -E. 2012. Frontiers of Pleasure: Models of Aesthetic Response in Archaic and Classical Greek Thought. Oxford: Oxford University Press.
    • (A study of the representations of aesthetic properties of artworks and other objects in ancient Greek texts, including philosophical ones.)
  • Pollitt, J. J. 1974. The Ancient View of Greek Art: Criticism, History, and Terminology. New Haven, CT: Yale University Press.
    • (A seminal work on ancient Greek philosophy of art, it deals with not only philosophical but also literary, rhetorical and other kinds of texts.)
  • Porter, J. 2009. “Is Art Modern? Kristeller’s ‘Modern System of the Arts’ Reconsidered.” British Journal of Aesthetics 49: 1–24.
    • (An article containing a critique of Kristeller’s dismissal of the possibility of ancient aesthetics.)
  • Porter, J. 2010. The Origins of Aesthetic Thought in Ancient Greece: Matter, Sensation and Experience. Cambridge: Cambridge University Press.
    • (The central argument claims that Plato and Aristotle established formalist aesthetics, which dominated the tradition and silenced alternative, materialist aesthetics.)
  • Porter, J. 2016. The Sublime in Antiquity. Cambridge: Cambridge University Press.
    • (A study of the notion of sublime outside Longinus’ treatise.)
  • Rogers, K. 1993. “Aristotle’s Conception of τὸ καλόν.” Ancient Philosophy 13: 355–71. Reprinted in L. P. Gerson (ed.) 1999. Aristotle: Critical Assessments, iv. London: Routledge: 337–55.
    • (An analysis and interpretation of Aristotle’s use of the term to kalon, especially his claim that virtues are undertaken for the sake of to kalon.)
  • Sheffield, F. 2006. Plato’s ‘Symposium’: The Ethics of Desire. Oxford: Oxford University Press.
    • (A monograph on Plato’s Symposium; the central argument interprets the dialogue as concerned with moral education, but in a distinct way, that is, by means of the analysis of desire.)
  • Tatarkiewicz, W. 1974. The History of Aesthetics. Vol. 1. The Hague: Mouton.
    • (A collection of ancient Greek philosophical texts on various topics in aesthetics accompanied by a commentary.)
  • Zagdoun, M. -A. 2000. La Philosophie Stoïcienne de l’art. Paris: CNRS Editions.
    • (An extensive study of the notions of beauty and art in Stoic philosophy.)

 

Author Information

Aiste Celkyte
Email: aiste.celkyte@googlemail.com
Yonsei University
South Korea

Metaepistemology

Metaepistemology is, roughly, the branch of epistemology that asks questions about first-order epistemological questions. It inquires into fundamental aspects of epistemic theorizing like metaphysics, epistemology, semantics, agency, psychology, responsibility, reasons for belief, and beyond. So if, as traditionally conceived, epistemology is the theory of knowledge, metaepistemology is the theory of the theory of knowledge. It is an emerging and quickly developing branch of epistemology, partly because of the success of the more advanced “twin” metanormative subject of metaethics. The success of metaethics and the structural similarities between metaethics and metaepistemology have inspired parallel conceptual forays in metaepistemology with far-reaching implications for both subjects.

The current article offers a concise survey of basic themes and problems in metaepistemology. The survey, of course, aims neither at being exhaustive nor at presenting these basic themes and problems in their full sophistication and complexity. Rather, given the very broad span of themes and problems that fall under the label of metaepistemology, the aim is to introduce basic themes and problems and to survey some of the cutting-edge research currently being undertaken in metaepistemological debates.

In what follows, “(meta)”epistemology contains brackets to indicate the epistemology of epistemology. This is to be distinguished from non-bracketed “metaepistemology,” which is meant to refer to the whole domain of metaepistemological theorizing (metaphysics, epistemology, semantics, agency and so forth).

Table of Contents

  1. Situating Metaepistemology within Epistemology and Metanormativity
  2. Normativity
  3. Metaphysics
  4. Semantics
  5. (Meta)Epistemology
  6. Reasons for Belief and Epistemic Psychology
  7. Agency and Responsibility
  8. New Directions in Metaepistemology
  9. References and Further Reading

1. Situating Metaepistemology within Epistemology and Metanormativity

Following the example of ethics (for example, Fisher 2011; see also Fumerton 1995), we can distinguish three basic branches of epistemology: normative epistemology, applied epistemology, and metaepistemology. Normative epistemology mostly deals with first-order theorizing about how we should form justified beliefs and gain understanding, truth, and knowledge, and with accounts of the basic sources of knowledge (like memory, perception, testimony) and so forth, but it does not pursue higher-order questions about these matters or pressing applied epistemic matters. To the extent that it does, it embroils itself, respectively, in metaepistemology and applied epistemology. Applied epistemology draws from normative epistemological theorizing in order to respond to pressing epistemic matters of practical value, like climate change skepticism, jury decision-making, gender or race issues in epistemology, and so forth.

The following is an example to illustrate how the trichotomy of the epistemic domain is meant to divide epistemological labor. As is well-known, epistemologists are intrigued by the perennial question “What is knowledge?” and, accordingly, try to come up with plausible reductive analyses. This much is first-order normative epistemological theorizing at its best. If we conceptually dig deeper, however, move a level down and ask whether there is any “real” (or robust) knowledge or whether the project of reductive analysis of knowledge is at all plausible, then we ask second-order, metaepistemological questions. That is, we ask questions about first-order epistemological questions, like the question “What is knowledge?” Moreover, if we ask epistemic questions of pressing practical value, like whether gender, race, and ethnic origin factors affect ordinary knowledge attributions, then we are pressing applied questions (for example, Fricker 2010) and have swiftly moved into the field of applied epistemology.

Opinions diverge about the exact interrelation of the three branches of epistemology and the exact interrelation of metaepistemology and its twin metanormative subject of metaethics. In regard to the former issue, there are two broad, possible positions about the relation among the three branches. The first position is one we may call the autonomy thesis. According to the autonomy thesis, also sometimes propounded in ethical theory (compare Enoch 2013 for discussion), metaepistemology is an independent branch of epistemological inquiry that does not depend on the results of the other two branches of epistemology. Conversely, neither applied nor normative epistemology depends on the results of metaepistemology. The autonomy thesis bears some prima facie plausibility because it seems intuitive that one may be, let us say, a coherentist, foundationalist, or reliabilist about normative epistemology but an expressivist, error theorist, or relativist about metaepistemology.

The other position on the matter is what we may call the interdependency thesis. It suggests that there are important theoretical interdependencies among the three branches (pace some prima facie appearances of autonomy). If, for example, we could reductively analyze epistemic justification in informative necessary and sufficient conditions, it seems that we would have a theory to invoke in normative justificatory matters and apply to pressing questions of epistemic justification like, say, climate change skepticism. However, the fact that such analyses do not seem readily available indicates that nothing is very obvious in metaepistemological matters.

In regard to the latter issue, namely, how to situate metaepistemology not merely within epistemology but within the broadly metanormative domain, there are again two broad, divergent positions. First, many metanormativists hold “the parity thesis” (sometimes called “the unity thesis”), according to which the epistemic and the moral/practical are intertwined normative subjects that are theoretically on a par and should therefore share the same metanormative fate, whatever that may be (realist, antirealist, Kantian constructivist, or even other) (compare Kim 1988; Cuneo 2007). Other metanormativists deny this and argue that there are important discontinuities between metaepistemology and metaethics and, hence, that we should instead hold “the disparity thesis” (compare Lenman 2008; Heathwood 2009).

For example, Cuneo (2007) has argued that the moral and the epistemic domain share core structural similarities (reasons, supervenience, motivation, and so forth) and that this bolsters the parity thesis. In response to Cuneo’s (2007) arguments for the parity thesis, it has been suggested by Lenman (2008) and Heathwood (2009) that while moral facts and truths may be irreducible, epistemic facts and truths may be reducible to facts and truths about evidence and probability (where these are ultimately to be understood in descriptive terms) and, therefore, there is a fundamental disparity between the two metanormative subjects. Again, Cuneo and Kyriacou (2017) have come up with a rejoinder to the Heathwood/Lenman case for the moral/epistemic disparity and argued that the parity seems to go through in the end. Of course, the dialectic is currently developing and the jury is still out.

So far, we have said a few basic things about the possible positions in situating metaepistemology within epistemology proper and within metanormativity. We now turn to the basic question of what it is that makes epistemology a distinctively normative subject and how from epistemic normativity we arrive at perplexing metaepistemological questions. The next section unpacks the various aspects of the metaepistemological domain that will be presented as we proceed.

2. Normativity

One of the most remarkable characteristics of human primates is their evolved, often linguistically mediated, capacity for cognizing and, moreover, the intrinsic normativity of this cognizing. The normativity is intrinsic because our wide array of cognitive endeavors seems to be inherently “fraught with ought” and evaluable in terms of (in)correctness. Intuitively, to the extent that we are rational and responsible agents, there are propositions we ought to believe and propositions we ought not to believe, and there are cognitive practices, methods, processes, habits, and so forth that are epistemically correct to employ and others that are epistemically incorrect to employ, that is, (in)correct from the epistemic point of view.

Indeed, generations of epistemologists from the early moderns like Descartes (1641), Locke (1690) and Hume (1739), to Clifford (1877), Chisholm (1966), Alston (1988), Fumerton (1995), Feldman (2002) and beyond have attested to the normativity of cognizing and have talked about the corresponding epistemic duties, oughts, obligations, requirements, and so forth (terms that for current purposes are used interchangeably) that rational agents have.

For example, intuitively, we ought to believe on the basis of the relevant evidence or the relevant reliable cognitive process and ought not to believe what is merely bequeathed by tradition, dictated by fiat of authority or simply feels good. It is also epistemically correct to collect evidence meticulously and open-mindedly, and it is epistemically incorrect to cook up your lab research to reach the conclusions that a generous research sponsor would favor (for example, that extensive consumption of red meat has no adverse effects on health and the environment).

It is precisely this intrinsic normativity of our cognitive endeavors (practices, methods, processes, habits, beliefs, theories) that gives rise to metaepistemological questions because as rational, responsible agents we seem bound by epistemic duties and obligations that are rationally non-optional and inescapable. To the extent that we are rational agents, we seem constrained by epistemic oughts and duties regardless of whether we like it or not, or whether we submit to these or not. The fact is reflected in ordinary locutions like “p is the right thing to believe,” “You should trust what Paul says because he is an expert on the matter,” or “They should have known this much; there is no excuse,” and so forth. Call this fundamental appearance of ordinary epistemic discourse the deontic appearance.

Of course, the deontic appearance is the prima facie appearance of ordinary epistemic discourse, and appearances, even deeply entrenched appearances, may, as we know very well, be deceptive. Secunda facie, we may have no epistemic duties or obligations and epistemic normativity may not be explainable in deontological terms. But at least prima facie we often talk and think in terms of propositions that one should or should not believe and in terms of practices, processes, methods, habits, and so forth that one should and should not employ. This much of epistemic appearance seems unequivocal and whether we should debunk the deontic appearance or not is a further question down the road.

It should also be underscored that the deontic appearance of ordinary epistemic discourse seems to have a distinctively categorical flavor; that is, the phenomenology of our everyday talk and thought about duties, obligations, and oughts seems to imply the existence of categorical duties and obligations, namely, duties that are in some sense unconditional, that is, independent of our psychology (desires, dispositions, beliefs), and that constrain what we ought to believe insofar as we are rational. For example, if a speaker utters, “You should believe that p” in an ordinary conversational context her statement would, typically, conversationally implicate that it is an (epistemic) fact of sorts that “You should believe that p.” A fortiori, the conversational implication is that anyone epistemically rational would be obliged to believe that p because it constitutes a categorical epistemic obligation (derivative of a corresponding epistemic fact).

In line with the deontic appearance, the broadly internalist view that takes it that we are bound by reflectively accessible epistemic duties is called epistemic deontologism (compare Clifford 1877; Alston 1988; Feldman 2002). It asserts that we have reflectively accessible, epistemic duties and that they should regulate rational doxastic behavior, namely, endorsing, maintaining, and revising a belief. Epistemic deontologism can be construed in a number of ways depending on how we understand epistemic goals of inquiry. Accordingly, we can have different proposals about how to construe epistemic duties.

However, the standard way to understand epistemic deontologism has been in terms of epistemic justification (for discussion see Feldman 2002). Roughly, an epistemic duty for S to believe p exists iff S has sufficient justification for p. Sufficient justification may in turn be understood in various ways, perhaps, along broadly evidentialist lines, that is, in terms of a relatively high ratio of evidential probability (for example, Heathwood 2009) or even along reliabilist lines, that is, in terms of a high ratio of truth output by a process (or ability) in an externalist framework (for example, Goldman 1979).

This, of course, is not the only way epistemic deontologism may be construed because it can also be construed in terms of alternative epistemic goals/values like truth, knowledge, and even understanding or wisdom (compare for the latter two goals Kyriacou 2016). That would mean that, roughly, an epistemic duty for S to believe p exists iff p is true or an instance of knowledge or even promotes understanding or wisdom. However, the best construal of epistemic deontologism is a question we need not further dwell on here. The important thing for present expository purposes is that no matter how epistemic duties are to be construed, the deontic appearance stirs up a whole host of perplexing and far-reaching metaepistemological questions, like the following:

Metaphysical: Are there epistemic properties/goals, norms, and facts in virtue of which categorical epistemic oughts, duties, and obligations for rational agents follow? If yes, what is their exact nature? If no, where does this leave us in terms of the intrinsic normativity of our cognitive practices? May the nonexistence of epistemic properties/goals, norms, and facts cripple the normative dimension of our epistemic lives? How is the constraint of epistemic supervenience to be understood and explained?

Semantic: What is the meaning of epistemic statements? Is it descriptive, expressivist, or even other? Are epistemic statements truth-apt? If yes, can truth-aptness be rescued in an expressivist metasemantic framework? Can deflationism do the trick? If there are robust epistemic facts, how do they ground truth, if at all? If there are no robust epistemic facts, then what does ground epistemic truth, if at all? Is the meaning of epistemic statements invariant or context-sensitive? Do practical interests, stakes, and so forth have a semantic contribution to the meaning of epistemic statements?

(Meta)epistemological: If there are categorical epistemic oughts and duties, how do we get to know them, if at all? Do we merely construct such duties and obligations or do we somehow discover them? If we discover them, how can this happen with minimal reliability, given that such properties and duties do not seem at first instance natural? How is this cognitive reliability to be accounted for, given our evolutionary history and the fact that the evolutionary process has been a blind, nonintentional process largely pushing towards adaptation, survival and reproduction by means of natural selection? Are intuitions credible evidence, especially in view of the evolutionary-cultural origins of cognition? Is talk of epistemic duties and oughts misconceived in light of epistemic externalism? Does epistemic externalism comport with the intrinsic normativity of cognitive endeavors?

Reasons for Belief and Epistemic Psychology: Are there categorical reasons for belief or are all reasons for belief hypothetical and dependent on our contingent, subjective desires and dispositions? Is epistemic rationality merely instrumental or categorical? Is epistemic judgment motivating? If yes, motivating in what way?

Agency and Responsibility: Can we directly choose what to believe and, if not, what about the fundamental and deontic notion of epistemic responsibility? Is there such a thing as character or is it merely fictional? If there is, can it play an integral part in our epistemic lives? If there is not, where does this leave our epistemic lives?

The following sections concisely introduce and discuss, in some depth, many of these metaepistemological questions.

3. Metaphysics

A core component of epistemic metaphysics concerns ontology. Epistemic ontology explores questions about the existence and nature of epistemic properties like epistemic justification, warrant, rationality, entitlement, understanding, truth, wisdom, knowledge, epistemic duties, and norms like, “You ought to trust your senses, unless you have reason to doubt their overall reliability” and particular epistemic facts like, “The theory of evolution is well-justified, given the abundant empirical evidence.”

Here, focus is restricted to the epistemic properties of justification and knowledge and the respective justificatory and knowledge duties/norms/facts for at least three basic reasons: first, because of the more prominent position they have traditionally held in the history of epistemology; second, because research on them is relatively more advanced; and third, because considerations of simplicity and economy inevitably constrain the thematic boundaries of the article.

Justification and knowledge are treated in turn and not jointly, in spite of the fact that some positions about the two properties are strictly analogous, for two reasons: first, because this analogy can easily come apart (for example, in principle someone could be an antirealist about knowledge but not about justification); second, because the debates about justification and knowledge often develop independently of one another, and it would perhaps oversimplify the state of the debates to run the two together.

Now, beginning with epistemic justification, a traditional distinction that helps map the theoretical landscape of justification debates is that between epistemic realism and antirealism (though to see how hard it can be to distinguish the two see Dreier 2004). On the one hand, realists take epistemic justification to be a real, mind-independent property whose existence does not depend in any way on human cognizing. Thus, if a belief is justified, then this should be the object of discovery and not of invention (or construction). Accordingly, we should be able to understand that a justified belief instantiates the property of justification and that it is in virtue of this property that it is justified.

On the other hand, epistemic antirealists deny that epistemic justification is anything like a real and mind-independent property. Epistemic justification is considered a property (if at all) that is constructed out of the workings of evolved human cognizing and nothing over and above this. This is not to imply the rather naïve view that justification is made up “out of thin air” by the antirealist. Justification is still constrained by certain epistemic norms, facts, or framework, although these are mind-dependent and are of mere local validity. For the antirealist, if a belief is justified, then it is justified in virtue of certain epistemic norms, facts, or framework, but this should not be overstated. Epistemic norms, facts, or framework are invented by cognizers and therefore epistemic justification is also invented. It is not the case that if a belief is justified, then we can somehow understand that it is justified because it instantiates a real property of justification.

As in metaethics, realists can be divided into reductionists and antireductionists. Reductionists can further be divided into analytic and synthetic reductionists. Analytic reductionists believe in the capacity of traditional a priori conceptual analysis to deliver illuminating, descriptive analyses of philosophically interesting concepts. Accordingly, they would take epistemic justification to be, in principle, reductively analyzable to a more basic property like coherence, reliability, foundations, virtues, responsibility, evidence, probability, and so forth (compare Bonjour 1985, 1998; Goldman 1979; Zagzebski 1996; Conee and Feldman 2004; Vahid 2005; Sosa 2007; Heathwood 2009).

Synthetic reductionists are not so sanguine about traditional a priori conceptual analysis and its purported capacity to deliver illuminating, descriptive analyses of philosophically interesting concepts. Accordingly, they would deny that epistemic justification is, in principle, reductively analyzable to a more basic and informative property, but they would still cling to realism about epistemic justification because they would take it to be a natural kind property, somehow discoverable only by the a posteriori means of empirical science and not by the a priori conceptual analysis of traditional philosophical methodology (compare Jenkins 2007). As, for instance, we can discover by empirical means that “water is H2O molecules” or that “gold is the element with atomic number 79,” we can presumably discover the natural kind property that constitutes the essence of epistemic justification, or so the thought goes.

So far, we have seen analytic and synthetic reductionism. Both typically adhere to methodological naturalism, roughly, the view that naturalistic scruples constrain the right kind of philosophical methodology (compare Pollock and Cruz 1999). In other words, philosophical methodology should be empirically informed and cohere with our best naturalistic picture about ontology, epistemology, and so forth. Be that as it may, analytic and synthetic reductionists disagree about the exact content of methodological naturalism. Analytic reductionists are still optimistic about the method of a priori conceptual analysis while synthetic reductionists counter that conceptual analysis is rendered obsolete by the progress of the a posteriori methods of empirical science and that philosophy should be sensitive to this progress.

However, there are also antireductionist realists who are usually reluctant to embrace methodological naturalism, at least any form of chauvinistic methodological naturalism that would exclude the possibility of antireductionism from the outset. Antireductionists usually disagree with their fellow reductionist realists about the capacity of methodological naturalism to deliver illuminating philosophical results and, in particular, results about justification and, in principle, other normative properties (compare Moore 1903; Boghossian 2007; Cuneo 2007). They take epistemic justification to be a property that is real and mind-independent but not reducible to any more basic, natural property, that is, to a property that can be the object of study of the natural sciences and of empirical psychology, neuroscience, anthropology, sociology, and so forth.

Behind their suspicion of methodological naturalism may hover the intuition that normative properties do not seem natural. This suspicious attitude also helps explain their pessimism about the employment of methodological naturalism in metanormativity puzzles, for if normative properties do not seem in any profound way to be natural, then, perhaps, we should not insist on the employment of the restrictive philosophical methodology of methodological naturalism. The thought is that if normative properties are, indeed, non-natural in any profound sense then by insisting on a naturalistic methodology we will not be making any progress. We will only engage in a subtle begging of the question, or so the thought goes.

On the opposite theoretical side stand epistemic antirealists about justification. Antirealists deny that the property of epistemic justification is anything over and above what human cognizers construct and thereby invent. This is not to deny that there are, in some sense, justified beliefs in virtue of certain epistemic norms and facts or agents that justifiably believe that p or even corresponding epistemic oughts and duties. It is only to reject the distinctively realist idea that epistemic justification is a robust property somehow “out there” and our beliefs are justified to the extent that they instantiate that “out there” property. Rather, justification is something that emerges out of our evolved cognitive attitudes that are mental states and out of our culturally evolved epistemic practices and interactions that are social activities. The same goes for corresponding epistemic oughts and duties. They may exist, in some sense, but definitely not in the Archimedean “out there” sense that realists like to envisage.

Like realists, epistemic antirealists are a heterogeneous lot. Some may be subjectivists, others expressivists, error theorists, or relativists. Let us very briefly preview the rudiments of these families of theories. Subjectivists typically hold that judgments of justification report the agent’s noncognitive attitudes, valuations, and pro-attitudes. For subjectivism, justification assertions like “p is justified, given my evidence” or attributions like “S justifiably believes that p” report the speaker’s attitudes of approval, endorsement, recommendation, trust, and so forth, for the belief that p or for S’s believing that p. Worthy of notice is that the speaker’s attitudes are reported and not expressed. The speaker is supposed, so to speak, to step back from his own attitudes, introspect and simply report these attitudes but not directly express them. So, if I say, “S justifiably believes that p,” according to a simple subjectivist theory I may be reporting my approval for the belief that p, but not directly expressing that approval.

The fine-grained distinction between reporting and expressing the speaker’s attitudes might seem like an insignificant detail but it is a distinctive feature of the theory that helps distinguish it from expressivist theories (compare Schroeder 2008a). Besides, subjectivism is usually understood to be a cognitivist theory while expressivism is usually understood to be a noncognitivist theory. To stipulate, a cognitivist theory is a theory that takes normative judgment to express descriptive mental states like beliefs while a noncognitivist theory is a theory that takes normative judgment (or at least some species of it) to express nondescriptive mental states like desires, pro-attitudes, and sentiments. So although both subjectivism and expressivism may be labeled as broadly sentimentalist theories because they involve noncognitive attitudes, they fall on opposite sides of the cognitivist/noncognitivist divide.

Subjectivism is usually considered an implausible theory (see Schroeder 2008a) and, in fact, although it has some metaethical proponents (compare Wiggins 1987), to the best of my knowledge there are no obvious subjectivists in metaepistemology. But things are very different with regard to expressivism, which has had quite a few proponents recently. Expressivists take justification judgments to express (and not report) the speaker’s noncognitive attitudes like approval, endorsement, recommendation, assurance, reliance, plans, trust, desires and intentions. Justification assertions like “p is justified” or attributions like “S justifiably believes that p” express the speaker’s attitudes of approval, endorsement, recommendation, trust, and so forth for p or for S’s believing that p (compare Kyriacou 2012). They express (or “voice”) directly the speaker’s states of mind.

The third antirealist theory of epistemic justification is that of error theory (or, sometimes, fictionalism). Unlike subjectivism and expressivism, error theory does not invoke, one way or another, noncognitive attitudes. Error theorists take justification judgments to purport to describe respective justification facts but deny that such facts really exist (compare Olson 2011a, 2011b). Given the absence of such facts (typically considered to be truthmakers), we end up with an error theory, namely, a theory that suggests that justification judgments are uniformly false, or at least that all first-order justification judgments are false.

The fourth antirealist theory of justification is that of relativism. Relativism denies the existence of “real” justificatory epistemic norms and facts and stipulates that justificatory norms and facts are only relative to some indexical factor of mere local validity, usually the agent, or his society, culture, and so forth (compare MacFarlane 2005). Often, relativists are cultural relativists who think that there are no mind-independent norms and facts and that the only norms and facts that really exist are some culturally constructed and embedded norms and facts (compare Stich 1990; for a recent defense of cultural moral relativism see Velleman 2013; and for criticism see Kyriacou 2015). These culturally constructed norms and facts allow for justified beliefs but the justification is of only local validity.

There are many subdivisions under the banner of each of these families of theories that we cannot really dwell on here. There are, for example, many different expressivist theories, and it is really doubtful whether any two of these theories are identical. For instance, Allan Gibbard, one of the most prominent and influential expressivists and one of the first to extend expressivism from metaethics to metaepistemology, has held at least two expressivist theories, the early norm-expressivism (1990) and the later plan-expressivism (2003). There are other versions of expressivism in the literature too—habits-expressivism (compare Kyriacou 2012) and hybrid versions of expressivism such as Ridge’s (2007) ecumenical expressivism (for some discussion of epistemic expressivism see Chrisman 2012).

We have now introduced the rudiments of the major realist and antirealist theories of justification, but there are also other theoretical options that are somewhat harder to classify as realist or antirealist that deserve at least a short mention. For example, as in metaethics (for example, Korsgaard 1996), a Kantian constructivist might claim that norms and facts of justification are constructed out of a priori constitutive norms of rationality (for example, the universalizability or autonomy formulas) and obviate the distinction between realism/antirealism. She could claim that her theory cannot be properly classified as realist or antirealist but only as deontological. Categorical epistemic duties follow from the application of these constitutive norms of rationality but these duties are, ontologically speaking, neither realist nor antirealist.

Be that as it may, so far we have discussed the question of the ontology of justification and the various sorts of approaches to it, but there are also two important metaphysical challenges for a plausible metaepistemological theory that deserve some attention: the evolutionary challenge and the supervenience challenge.

The evolutionary challenge is more of a challenge to normative realism and realist understandings of justification/knowledge (moral and epistemic). As Sharon Street (2006, 2009) has argued, our evolutionary history prima facie conflicts with normative realism because it is very implausible to think that we evolved and our moral and epistemic attitudes were somehow mysteriously and finely attuned to track corresponding moral and epistemic facts. Such a realist “tracking account” seems implausible on a number of counts (ontological parsimony, clarity, mysterious causal connections, and so forth), especially if we think of a competing metaphysically lighter, mere Darwinian account that explains our normative attitudes and their content as largely shaped by the main mechanism of evolutionary change, namely, natural selection. There is no need to postulate moral and epistemic facts, Street argued, in order to have the best Darwinian explanation of how we came to have the normative attitudes we tend to have.

Thus, the evolutionary challenge for realists is to explain, in the best theoretical way, how our normative attitudes have largely been shaped by natural selection in consonance with robust normative facts. The problem for the realist now is that it seems that Ockham’s razor should apply and redundant robust normative facts (and realism) should drop off the picture of that theoretical explanation because a mere Darwinian, antirealist account can do all the explaining we need.

Street’s evolutionary challenge has stirred some fascinating discussion and some interesting realist rejoinders that we unfortunately have to skip here—for example, Setiya (2012), Enoch (2013), FitzPatrick (2014), Vavova (2014). Nevertheless, it is widely acknowledged to be an important challenge for metanormative realism. If realism is to be plausible, it has to explain in a plausible way how it can comport with our evolutionary history.

The supervenience challenge asks us to explain how epistemic properties (if any) relate to the natural world. This challenge is usually interpreted in terms of the widely accepted metaphysical constraint of the epistemic supervenience thesis (compare Kim 1988; Conee and Feldman 2004; Vahid 2005; Cuneo 2007). It is a metaphysical constraint because it suggests that any theory needs to explain how epistemic properties like justification supervene on more basic, natural properties in such a way that if two situations are naturalistically identical (and thereby indistinguishable) and the first realizes, say, epistemic justification then the second situation must also realize epistemic justification.

Metaphysically speaking, it cannot be the case that two naturalistically identical situations (at least in the epistemically relevant aspects) realize inconsistent epistemic properties. There must be some naturalistic difference at the base level that grounds some difference at the normative-supervening level, otherwise there is no good reason why the supervenient-normative properties should be inconsistent. To illustrate, suppose that there are two naturalistically indistinguishable cases where a dead body has been found. It would be unreasonable to think that the one case justifies the belief in a homicide while the second justifies the belief in a suicide—unless of course there is at least some relevant naturalistic difference in the two cases.
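The constraint can be put schematically. What follows is only a rough sketch under stated assumptions, not a formulation taken from the authors cited above: the variables x and y range over epistemic situations, N(x) stands for the totality of the epistemically relevant natural properties of x, and J(x) says that x realizes epistemic justification for a target belief. All three symbols are introduced here purely for illustration.

```latex
% Epistemic supervenience (illustrative sketch; N and J are assumed notation):
% no difference in justification without a difference in the natural base.
\forall x \,\forall y \,\bigl[\, N(x) = N(y) \;\rightarrow\; \bigl( J(x) \leftrightarrow J(y) \bigr) \,\bigr]
```

In the dead body example, the two cases share the same natural base, so the schema rules out the one justifying a belief that the other does not.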

The epistemic supervenience thesis is a rather technical way to formalize the strong mundane intuition that ‘no double standards’ should be allowed in normative matters (epistemic, practical/moral, aesthetic, or other). Some theories can deal with the constraint rather easily while others seem to have difficulties with it. For example, reductionist realists have an easy explanation of the constraint. Intuitively, if two naturalistic situations are identical (at least in the epistemically relevant respects) and the first situation realizes justification as, say, coherence, then there should be no surprise that the other situation realizes coherence and thereby justificatory status.

On the other hand, antireductionists cannot offer the same account of supervenience as the reductionist because, crucially, they deny that epistemic justification is a reducible property. To see this, conceive for a moment of epistemic justification as a non-natural property, an irreducible property such that, whatever the natural facts are, they do not entail that a certain belief p is justified for S. Conceive now of two naturalistically identical situations and suppose that one of the two situations does realize justification. It seems that it remains at least an open question whether the second situation also realizes epistemic justification, exactly because the property is not reducible to a more basic property. No doubt, this is not to deny that antireductionists can somehow explain epistemic supervenience. It is only to note a distinctive challenge they face.

Expressivists usually attempt to explain epistemic supervenience as a merely conceptual and not as a metaphysical constraint (for discussion compare Hare 1952; Blackburn 1993; and Ridge 2014). For antirealists such as expressivists, epistemic supervenience is only an a priori norm of rationality that constrains the appropriate application of the concept of epistemic justification. If we have two naturalistically identical situations (at least in the epistemically relevant respects), and we judge that the one justifies a target belief p, then on pain of irrationality, we should also judge that the next situation justifies the belief that p.

One worry for antirealist explications of supervenience is that they reduce the constraint from a metaphysical principle to a conceptual one and this seems to change the subject; in turn, expressivists might object that to insist that the constraint should be addressed in its metaphysical guise is to beg the question in favor of realism. At any rate, supervenience is a tricky but valuable philosophical concept, and there are questions about how to best interpret it (local or global, for instance) and how to best account for its intuitive character, but the discussion will end here. Enough has been said to showcase why epistemic supervenience seems to be a challenge for realists and antirealists and a desideratum for a plausible metaepistemic theory of justification.

Let us now turn to the ontology of knowledge. The metaepistemology of knowledge seems even less well-defined than the metaepistemology of epistemic justification. This is reflected in the fact that knowledge theorists do not usually speak in the metaphysically-loaded terms of realism/antirealism about knowledge. They often set up their discussion in terms of challenges and problems for a theory of knowledge like radical skepticism, the Gettier problem, the lottery problem, the dogmatism paradox, linguistic evidence, and so forth. To keep the distinctively metaepistemological tone of the article, let us follow Michael Williams (2001) and speak of realism/antirealism about knowledge. According to Williams (2001), and this is the understanding of knowledge realism/antirealism that we endorse in the ensuing discussion, realists think of knowledge as something real, invariant, and mind-independent. Antirealists simply deny that there is such knowledge.

So, for realists, whether “S knows that p” is a question to be decided by the corresponding, independent knowledge facts, whatever these may be (evidential, reliabilist, virtue-theoretic, and so forth). Knowledge status is not a construct or invention of human cognition of sorts. In contrast, antirealists deny that there is such a thing as robust knowledge and corresponding robust knowledge norms and facts. There is knowledge, of course, but not in the metaphysically-inflated sense that realists tend to envisage. Knowledge is true belief in accordance with certain knowledge norms and facts that are not mind-independent and not of universal validity. How to comment on antirealist knowledge norms and facts depends on the contours of the particular theory one favors (relativist, expressivist, error theory and so forth). At any rate, there is nothing over and above this sort of knowledge.

Realists divide again into reductionists and antireductionists, and reductionists further divide into analytic and synthetic reductionists. On the one hand, analytic reductionists think that the analytical project about knowledge is still viable in spite of the failures and pessimism that traditional conceptual analysis may occasionally inspire. The Gettier problem has, for example, led many epistemologists to pessimism about the prospects for an analysis of knowledge (compare Kirkham 1984; Fogelin 1994; Williamson 2000; Floridi 2004), but some others remain unmoved and argue for sophisticated analyses of knowledge, like Pritchard’s (2012) anti-luck virtue-theoretic account. On the other hand, synthetic reductionists again tend to treat knowledge as a natural kind property that in principle should be discoverable by means of empirical inquiry. In this spirit Kornblith (2002) and Neta (2008) have argued that knowledge is just another natural kind.

In their turn, antireductionists deny that knowledge is reducible, analytically or synthetically. They tend to think that knowledge is irreducible to anything more basic and should be taken to be a primitive and sui generis concept, “the unexplained explainer” that is the most fundamental building block for an epistemological theory. Other epistemic concepts/phenomena like evidence, justification, probability, assertion, and skepticism should be explained in the light of knowledge and not the other way round. The most notable proponent of this “knowledge first” approach to epistemological theorizing is, of course, Williamson (2000).

Worthy of note is that Williamson (2000) puts forth his antireductionist theory not as a metaphysically-loaded theory, so it would be a mistake to hasten to infer that he is a non-naturalist about knowledge just because he is an antireductionist about knowledge. He is more interested in showing that the analytical reductionist project about knowledge is a “degenerate research programme.” In consequence, the safe thing to say is that although he is an avowed antireductionist, he shows no particular interest in the metaphysics of knowledge as such.

Antirealists about knowledge include expressivists and relativists as well as what we could call skeptics-as-error theorists. Expressivists like Gibbard (2003) and Chrisman (2007) claim that there are no “real” knowledge facts in virtue of which knowledge truths obtain. Building on Gibbard’s (1990) norm-expressivism about rationality, Chrisman (2007) suggests that knowledge is a normative concept we use to evaluate epistemic positions and, accordingly, to express approval for the norms in virtue of which true belief is formed. There are also relativists about knowledge, like Stich (1990) and MacFarlane (2005), who suggest that there are no independent or absolute knowledge facts but only constructed knowledge facts of mere local validity.

Skeptics about knowledge are typically also antirealists and one natural way to integrate their position in metaepistemological classification would be as error theorists (compare Unger 1971; Fogelin 1994; Kyriacou 2017). Skeptics deny that there are any real knowledge facts (at least empirical knowledge facts) and therefore imply that at least most of knowledge discourse is implicitly in a state of constant error (for discussion compare Hawthorne 2004). Knowledge assertions and attributions are almost uniformly false. Of course, we speak of knowing such-and-such but we do not know in reality because there is no such thing as knowledge. As a result, ordinary speakers are afflicted by semantic blindness about the concept of knowledge. They unwittingly speak as if they know, but philosophical reflection can eventually indicate that there is not much of real knowledge.

Thus far we have introduced realist and antirealist approaches to knowledge. Interestingly, however, some epistemologists argue that there is “real” knowledge but not of the demanding invariant sort that traditional realism (as stipulated above) presupposes. There is proper, “real” knowledge that fully merits the name but is context-sensitive. That is, it is knowledge where the demandingness of the standards of justification (in virtue of which we arrive at true belief) varies with context because of factors like the intentions, needs, stakes, goals, and so forth of the attributor. In essence, the standards of knowledge may shift from context to context.

Contextualists, though, suggest that at least some important portion of our knowledge talk comes out true because the standards of knowledge need not be so high that we could never satisfy them. Of note is that contextualism is a semantic view and not a metaphysical view and as such it could be easily wed to an antirealist theory. For example, some cultural relativists might be understood as contextualists of a kind about knowledge. On such a view, the concept of knowledge picks out incommensurable culturally constructed knowledge facts from cultural context to cultural context (for critical discussion of such views see Boghossian 2007). At any rate, discussion of contextualism about knowledge takes us into the field of semantics, which is discussed in section four.

A final note on how the supervenience constraint applies to knowledge. As the epistemic supervenience thesis applies to epistemic justification, it seems to apply to knowledge as well. That is, if two epistemic situations are naturalistically indistinguishable (at least in the epistemically relevant respects) and the first situation realizes knowledge, then in the absence of at least some relevant difference between the two situations, it would be irrational to deny that knowledge is realized in the second situation. No double standards are allowed in epistemology. Again, the supervenience constraint on knowledge has stirred some fascinating discussion but space restrictions oblige us to leave the topic here (for critical discussion of the mentalist supervenience thesis of Conee and Feldman 2004, see Greco 2010).

We have now talked a bit about the metaphysics of epistemic justification and knowledge. As has probably become obvious from the discussion, metaphysics is inextricably linked with semantics (and there is also a good methodological question of explanatory priority) and, therefore, semantics makes for a natural follow-up section.

4. Semantics

Epistemic semantics typically deals with the meaning of epistemic declarative statements and often focuses on the more traditional concepts of justification and knowledge. Let us first present a certain famous semantic challenge for the meaning of normative predicates that has been lately applied to the subset of epistemic predicates as well (Jenkins 2007; Heathwood 2009; Greco 2015; Cuneo and Kyriacou 2017). This is Moore’s (1903) famous open question argument, which he applied to “goodness” and which led him to the conclusion that goodness is an indefinable predicate that picks out a simple and sui generis non-natural property. It can be formulated in this style of question: “Is (super)natural property N (for example, pleasure, desire, divine will, and so forth) goodness?” Or, in terms of more colloquial discursive contexts, “I can see that this is N (pleasurable, desirable, socially accepted, and so forth), but can’t see why this is good.”

Moore thought that competent speakers of English (and competent users of the concept of goodness) will find questions of this style wide open and that this intuition of semantic openness is evidence that goodness is irreducible. He thought so because he assumed that property identities can be discovered solely by a priori conceptual analysis and that they should be directly transparent to competent speakers of the target language. As applied to epistemic predicates like justification, the open question argument would go like this: “Is (super)natural property N epistemic justification?”; or “I can see that this belief (or system of beliefs) is coherent, intuitive, socially approved, reliably produced and so forth but is this really justified? I don’t see that!”; or “I see that this belief is coherent, self-presenting, and so forth, but so what? It does not make it justified.”

The open question argument has been transposed and applied to epistemic justification and epistemic rationality with the same antireductionist verdict as in the case of goodness (and other moral predicates). Interestingly, there are also echoes of the Moorean semantic openness idea in Williamson’s (2000) famous antireductionist argument about knowledge. Williamson (2000:31) says, for example, that “even if some sufficiently complex analysis never succumbed to counterexamples, that would not entail the identity of the analyzing concept with the concept knows. Indeed, the equation of the concepts might well lead to more puzzlement rather than less.” The Moorean, semantic intuition that lies behind Williamson’s words is that it just seems that any purported reduction of knowledge, even if it avoids counterexamples, will remain semantically open and this indicates that the predicate is irreducible to a more basic property. It is a conceptual primitive that is not derivative to anything conceptually more basic.

As quickly became obvious, however, the open question argument is dialectically vulnerable because it relies on unwarranted assumptions about meaning and analysis. First, intuitions of semantic openness are inconclusive evidence for property non-identity. We might not have discovered the correct analysis yet and, besides, intuitions are often a bad counselor (compare Frankena 1939).

Second, not all property identities should be immediately transparent to a competent speaker (compare Smith 1994). Some may of course be almost trivial like “vixen is a female fox,” but some others may require reflection and practice even for the relatively simple “circle is the figure with a circumference that is equidistant from the center.” Arguably, it might take some reflection (and drawing) for a competent speaker to grasp the property identity. Third, some property identities may not be discoverable by a priori conceptual analysis no matter how hard or ingeniously we try (compare Brink 1989; Kornblith 2002; Jenkins 2007; and Neta 2008). Some seem discoverable by the a posteriori means of empirical science like the natural kinds of water, gold, silver, salt, and so forth.

But in spite of its dialectical weaknesses, the open question argument has exerted significant influence on metaethics (compare Darwall and others 1992) and now seems to extend this influence to metaepistemology. Many think that there is something strongly intuitive to this argument that is hard to shake off. Perhaps what the argument captures is the strong “just-too-different intuition,” in Enoch’s (2013) words, namely, the intuition that normative properties are just too different to be reduced to anything more basic, like natural properties. Others, no doubt, think that the argument can be blunted and we can still argue for reductionist accounts of goodness, justification, rationality, knowledge, and so forth. To be sure, all parties to the discussion seem to consider it a serious semantic challenge that should be addressed, one way or another.

Those who accept the antireductionist result of the open question argument may interpret it (and actually have) in ways amenable to their overall views, realist or antirealist. Synthetic reductionists have seen it as evidence that an a priori conceptual reduction of epistemic properties is not possible but this constitutes no reason for thinking that an a posteriori reduction could not apply. So they proposed that perhaps epistemic properties are irreducible by conceptual analysis and, hence, this is why we have “open feel” semantic intuitions in Moore-style open question arguments.

This optimism, though, seems to be premature as there are good reasons to question whether sophisticated synthetic reductionism is all that promising. In the case of moral properties, Timmons and Horgan (1991) have devised a sophisticated Moore-style open question argument, the so-called “moral twin earth argument,” with the intent to thwart the optimism of synthetic reductionists. Inspired by Putnam’s (1975) seminal “twin earth argument” for semantic externalism, Timmons and Horgan have devised a thought experiment that allows us to test our semantic intuitions for the prospects of such a synthetic reductionism.

Timmons and Horgan contrasted their moral version of the twin earth argument with Putnam’s natural kinds version. They argued that, while in Putnam’s original thought experiment our intuitions suggest that the meaning of natural kind terms is not merely fixed internally (by other “meanings in the head,” so to speak) but externally by the natural world, in the moral version of the story our intuitions differ significantly. We tend to think that differences in the extension of, say, “right” merely reflect differences in internal normative theory, not external natural facts about rightness. This, obviously, contrasts with Putnam’s natural kinds version because these intuitive differences in extension in the twin earths experiments seem to reflect differences in what external natural facts we tend to think there are. We tend to think that there are no moral natural facts or kinds.

The exact details need not detain us here but what is important is that, as in the case of Moore’s classic open question argument, in the moral twin earth argument our semantic intuitions seem to suggest that such a synthetic reduction is not in the offing. Even if there were a synthetic reduction of normative properties, we would tend to find such a reduction semantically open. To illustrate this, suppose for the sake of argument that somehow epistemic justification is reduced to some externalist property X (reliabilist, subjunctive tracking property, and so forth). Would this close the question “Is justification the externalist property X?” It seems that the question would still strike us as wide open.

Some others have seen the open question argument as evidence for an antireductionist but realist theory while others, more sympathetic to both antireductionism and methodological naturalism, have seen it as evidence for antireductionist antirealist theories like error theory or expressivism (compare Ayer 1936). They suggested that we have entrenched “open feel” semantic intuitions because no reduction (analytic or synthetic) of normative properties (moral or epistemic) is forthcoming. Since there are no such properties, the quest for reduction is therefore quixotic.

In particular, expressivism is a very interesting approach to moral and epistemic discourse because it breaks completely with the traditional and mainstream truth-conditional metasemantic framework. Unlike error theorists, relativists, and so forth, expressivists question and reject both factualism and cognitivism. While antirealists by definition reject factualism—namely, the idea that propositions are rendered true by corresponding robust facts regardless of the kind of discourse (descriptive or normative)—with the sole exception of audacious expressivists, they respect cognitivism. That is, they respect the thesis that normative propositions express descriptive mental states like beliefs that purport to pick out corresponding normative properties and facts.

The result of the joint rejection of factualism and cognitivism is a novel metasemantic framework that does not understand meaning on the basis of truth and reference (and truth-conditions) but on the basis of the notion of expression of states of mind (though, for a recent non-content-centric exposition of expressivism see Charlow 2014). Truth is not built in the rudiments of the semantic theory at first instance. Sophisticated, non-classical expressivists (compare Blackburn 1993, 1998) like to appeal to a deflationary account of normative truth, but even then truth is not a primitive component of their metasemantic theory, but only a derivative of the mental states expressed.

This allows expressivists to reap some important explanatory fruit but at the same time also incurs important theoretical costs: expressivists can explain the operation of the open question argument; they are in accord with a naturalistic picture of ontology and epistemology consonant with our evolutionary history (compare Gibbard 1990, 2003); they can explain epistemic motivation (compare Kappel and Moeller 2014); they can explain normative disagreement as a mere conflict in noncognitive attitudes (compare Stevenson 1963); they are in line with linguistic evidence for the expression of noncognitive attitudes in normative discourse; and even more.

However, the theoretical price they are called to pay seems also pretty high. For a start, at first sight truth and objectivity fall out of the picture and the appeal to deflationism about truth seems only to push the problem a step back (compare Cuneo 2007). Normative disagreement also becomes a psychological matter of conflicting noncognitive attitudes and not a logical matter of truth/falsity. Consonance with empirical linguistics may also quickly become a problem rather than an attraction because sometimes we may express normative thoughts without expressing noncognitive states like approval as the theory suggests that we should; at least this much of empirical linguistics cannot and should not be denied a priori by expressivism (for similar points compare Huemer 2008 and Yalcin 2012).

There are even more problems for expressivism, but a very serious problem that deserves at least a brief note is widely considered to be “the Frege-Geach problem” (compare Schroeder 2008a, 2008b; Charlow 2014). Drawing from Frege’s (1918/1997) discussion of negation, Geach (1960, 1965) first broached the problem for the early expressivist theory of emotivism, and ever since the problem has been at the forefront of expressivism debates. The problem consists in the fact that while expressivism works reasonably well in the context of asserted atomic normative sentences, in non-asserted logically complex contexts, the noncognitive content of the sentence seems absent.

This seems to have serious repercussions because in the case of deductive inference contexts it implies a fallacy of equivocation in regard to the normative predicate. This is so because the meaning of the predicate seems to differ from premise to premise and, as sameness of meaning is a prerequisite for truth-preservation and validity, an obviously valid argument is left invalid. This means that expressivism does not account for a key semantic fact, namely, logical validity (for a thorough discussion of the problem see Schroeder 2008b).
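The structure of the problem can be displayed with a simple modus ponens. The following is only an illustrative sketch: the predicate J (“is justified”) and the belief constants b and b′ are notation assumed here for the example and are not drawn from Geach’s own texts.

```latex
% Frege-Geach structure (illustrative; J, b, b' are assumed notation)
\begin{array}{lll}
(1) & J(b)                   & \text{asserted: expresses approval of } b \\
(2) & J(b) \rightarrow J(b') & \text{conditional: } J(b) \text{ is embedded and unasserted} \\
\hline
(3) & J(b')                  & \text{follows by modus ponens only if } J \text{ means the same in (1) and (2)}
\end{array}
```

In (2) the speaker need not approve of b at all, so if the meaning of J is exhausted by the expression of approval, J(b) appears to mean one thing in (1) and another in (2), and the inference to (3) equivocates.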

The Frege-Geach problem has provoked heated discussions about the plausibility of expressivism, and expressivists keep trying to address the challenge. Some have tried to build a so-called “logic of attitudes” while others have tried to build structure into the involved noncognitive attitudes that would help explain the syntactical features of logic and address the problem (compare Blackburn 1993, 1998; Gibbard 1990, 2003; and Schroeder 2008a). More recently, some have developed novel non-content-centric understandings of expressivism (compare Charlow 2014). Whether expressivism can be developed into a plausible, full-blown metasemantic framework is currently an ongoing research project for many philosophers (some pessimists, some optimists).

Perhaps unsurprisingly, expressivists have an additional problem with truth. For an expressivist, nondescriptive theory of meaning the notion of truth need not come up in the explication of the meaning of sentences. What matters are the states of mind expressed and not truth-conditions, but truth is so valuable a notion that we cannot just give it up so lightly. For instance, we want to hold onto such truths as that “the theory of general relativity is well-justified, given the evidence” or that “murder is wrong.” But of course, expressivists are baffled about such truth-talk. Early emotivists like Ayer (1936) were happy to concede that moral statements are not truth-apt at all, but misgivings about giving up truth never subsided.

More sanguine and sophisticated non-classical expressivists observed that expressivism is a metasemantic framework that need not directly involve truth-conditions, but that this need not imply that we cannot wed such a framework with a derivative notion of truth. Perhaps it is incoherent to assume that expressivism can be wed to certain traditional theories of truth like a correspondence theory, given the antirealism of expressivism, but we could still appeal to metaphysically light theories of truth like deflationism/minimalism. This is what Blackburn (1993, 1998) has proposed, for example. Blackburn suggested that we could be expressivists and antirealists but appeal to a deflationary theory of quasi-truth and quasi-facts in order to rescue normative quasi-truth. He has dubbed this project quasi-realism for obvious reasons: you can be an antirealist but mimic all the realist appearances, like the truth appearance (though, for the so-called “problem of creeping minimalism” about how to distinguish realism from quasi-realism see Dreier 2004).

Deflationism about truth understands truth-talk in a merely disquotational, ontologically deflated way. No commitment to a truth-property is involved and typically truth-talk is understood as a linguistic device that serves certain conversational and social functions. For instance, to say that “p is true” is semantically equivalent to saying that “p” (and vice versa). The predicate “…is true” (and cognates) may recommend that p, show approval of p, confidence about p, and so forth, and facilitate other conversational and social functions, but does not thereby commit to a robust truth property. Similarly, saying that “It is true that p is justified” merely shows approval, recommendation, and so forth of the claim that p is justified, but without any ontological commitment to a truth property (see Dowden & Swartz’s Truth for more discussion of deflationism).
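The equivalence just described is often displayed as a schema. The rendering below is a minimal sketch; the angle-bracket notation for “the proposition that p” is a convention assumed here, and nothing in the schema itself settles which conversational functions the truth predicate serves.

```latex
% Deflationary equivalence schema (illustrative notation)
\langle p \rangle \text{ is true} \;\leftrightarrow\; p
% Normative instance, with J(p) for "p is justified" (assumed notation):
\langle J(p) \rangle \text{ is true} \;\leftrightarrow\; J(p)
```

On the deflationary reading, the instances of this schema exhaust what there is to say about truth, which is why the right-hand side can always replace the left without loss.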

Of course, deflationism as a theory of truth faces a number of important independent objections that carry over to the expressivist project. Here are a couple of objections. First, it is very unclear what grounds normative truth in the absence of normative facts (compare Cuneo 2007). Truth seems to presuppose a grounding relation that confers truthmaking but it is not clear what grounding and truthmaking antirealist expressivism can offer (for discussion of grounding see Schaffer 2013). So the question remains: If a normative sentence is true, it is true in virtue of what? Second, we often ascribe truth in second-order sentences such as “It is true for expressivism that realism is false and antirealism true” and there is a question about how the expressivist can explain such truth-talk if there is no corresponding robust truth about the matter.

To conclude our discussion of expressivism and return to mainstream truth-conditional semantics, there are various truth-conditional theories of justification/knowledge with different takes on semantics. There is of course traditional invariantism and contextualism about justification and knowledge. Traditional invariantism takes justification/knowledge to be absolute/univocal concepts whose meaning remains invariant from context to context (compare Unger 1971; Fogelin 1994; and Kyriacou 2017). Invariantism can then be glossed either in more moderate terms or more demanding terms. Moderate invariantism about knowledge can, for instance, be explicated in terms of a safety principle, that is, a principle that roughly suggests that the true belief involved in knowledge could not have easily been false. Such safety-based, neo-Moorean approaches can be found in Sosa (1999), Williamson (2000), and Pritchard (2007, 2012).

More demanding invariantism about knowledge can be explicated in terms of a sensitivity principle, that is, a principle that roughly suggests that the true belief involved in knowledge is sensitive to falsity: if the proposition believed were false, it would not be believed. Interestingly, one strand of demanding invariantism may lead to skeptical invariantism because it seems to indicate that all logical (relevant or irrelevant) possibilities of error should be taken into consideration and ruled out. Given that almost always we cannot rule out all logical possibilities of error, we inevitably embrace skepticism about knowledge. Of course, sensitivity theorists need not be, and some have not been, skeptics. Nozick (1981), for example, famously accepted a sensitivity condition on knowledge but rejected the intuitive condition of closure under known entailment and escaped the embrace of skepticism; at least this much he thought (for critical discussion compare Fogelin 1994 and Hawthorne 2004).
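
The safety and sensitivity principles glossed above are commonly rendered as subjunctive conditionals. The following is a rough sketch only. The symbols are assumptions introduced here for illustration: $B_S\,p$ stands for “S believes that p,” and the boxed arrow stands for the counterfactual conditional “if it were the case that …, it would be the case that ….” Individual authors add qualifications (for example, about methods of belief formation) that are omitted.

```latex
% Requires amsmath and amssymb (for \text and \Box).
% \cf is a hypothetical macro name for the counterfactual conditional.
\newcommand{\cf}{\mathrel{\Box\!\!\rightarrow}}

% Sensitivity: if p were false, S would not believe that p.
\[
\text{Sensitivity:}\qquad \neg p \;\cf\; \neg B_S\,p
\]

% Safety: if S were to believe that p, p would be true
% (the belief could not easily have been false).
\[
\text{Safety:}\qquad B_S\,p \;\cf\; p
\]
```

Because counterfactual conditionals do not contrapose, the two conditions are not equivalent, which is one reason why they can yield different verdicts about skeptical scenarios.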

Some other philosophers simply repudiate the semantic invariance assumption in favor of semantic contextualism. Attributor contextualism takes justification/knowledge to be context-sensitive concepts, that is, concepts whose meaning varies due to contextual factors (stakes, interests, needs, goals, and so forth of the attributor). Such contextual factors induce a conversational shift of epistemic standards between high and low standards depending on the discursive context. Philosophers like Annis (1978), DeRose (1995), Lewis (1996), Cohen (1998), Williams (2001), and Wedgwood (2008) have proposed contextualist accounts of justification/knowledge that analyze meaning as context-sensitive.

More recently, novel understandings of invariantism and contextualism have been proposed. These novel understandings are at least partly motivated by empirical linguistic evidence of how we occasionally use the concepts of justification/knowledge. Subject-sensitive invariantism proposes that the meaning of justification/knowledge is sensitive to the invariant subject’s practical interests, stakes, goals, and so forth (and not the attributor’s) and makes much of the importance of knowledge for assertion and practical reasoning (compare Hawthorne 2004). Contrastivism takes into consideration the contrastive and comparative element in justification/knowledge discourse. For example, if I say “S knows that p,” this might conversationally imply that “S knows that p rather than q” (compare Schaffer 2004).

Questions of epistemic meaning go far beyond what we presented, but we have to pause here. In the next section we turn to (meta)epistemology, namely, the epistemology of epistemology and, in particular, the (meta)epistemology of epistemic justification and knowledge.

5. (Meta)Epistemology

Let us pause for a moment to take stock. We have explained how metaepistemological questions and puzzles arise out of the deontic appearance of our cognizing and its concomitant intrinsic normativity. Accordingly, we explained that human cognizing seems to implicate categorical epistemic duties and obligations; that is, there are propositions we ought to (dis)believe and practices that are (in)correct to employ from the epistemic point of view. Finally, we explained that justification and/or knowledge and corresponding epistemic oughts may be understood realistically or anti-realistically, or even otherwise (for example, in Kantian constructivist style) and delved a bit into the semantic aspect of things.

The obvious (meta)epistemological question now is how we get to know, or at least have justified belief, of such epistemic oughts and obligations (realist, antirealist, or other). That is, how we get to know, or at least have justified belief, about what we ought to believe or what doxastic practices, habits, and methods we should employ. To pursue this metaepistemic question we need first to stipulate a fundamental and very contentious distinction, namely, the distinction between epistemic internalism/externalism. The distinction is fundamental (so much so that we have not managed to avoid it entirely so far) because the epistemological theory we end up with depends on how we construe it and which side we take. Given that so much hinges on the distinction, it should come as no surprise that it is so contentious that it is even debatable how best to construe it, and there are a number of ways of doing so (see Poston’s Internalism and Externalism in Epistemology for discussion).

One standard way is to construe it in terms of cognitive accessibility, namely, the reality or not of cognitive accessibility to facts/evidence/reasons that support the target belief p. According to the accessibility reading, epistemic internalism suggests that S justifiably believes that p iff S has cognitive access to epistemic reasons in support of p (compare Chisholm 1966). Epistemic externalism simply consists in the denial of epistemic internalism. S can justifiably believe that p even if S has no cognitive access to epistemic reasons in support of p. If, for example, the belief is produced by a reliable belief-forming process, then S can justifiably believe that p (even without access to supporting reasons). There may of course be cases of justified believing that enjoy cognitive access to reasons in support of p, but for externalists this is not a prerequisite for justification (compare Goldman 1979).

A second reading of the distinction is laid out in terms of mental states rather than accessibility (compare Conee and Feldman 2004). According to the mentalist reading, epistemic internalism suggests that S justifiably believes that p iff S has mental states that count as evidence in support of p. Epistemic externalism again denies this. S can justifiably believe that p even if S has no mental states that count as evidence in support of p (compare Greco 2010). The mere satisfaction of some external conditions (reliability, counterfactual tracking, and so forth) suffices for justification or knowledge.

Of note is that mentalism implies no accessibility to reasons for justification. All that is required for justification are evidential mental states, but the agent need not have access to, or conscious awareness of, such mental states or reasons. In this sense, mentalist internalism concedes to access externalism that cognitive access is not required for justification but still upholds that justification supervenes on and is fixed by mental states. At any rate, for the purposes of this article we could stick to the more standard accessibility reading of the distinction of epistemic internalism/externalism.

With this partial clarification of the epistemic internalism/externalism distinction in hand, we can now revert to the question of how we know, or at least have justified belief, about epistemic duties and obligations. Recall that ordinary epistemic discourse may give some prima facie support to epistemic deontologism and that the traditional (and more natural) interpretation of this has been in internalist contours. We have epistemic duties and obligations, like the general and a priori “You ought to abide by the relevant evidence and pursue the truth” or the particular and a posteriori “You should believe what Mary says because of her relevant expertise,” and these are reflectively accessible. Careful reflection can, in principle, indicate the epistemic duties and obligations an agent may have.

Of importance for the epistemology of epistemic oughts and duties is also the ontological distinction between epistemic realism and antirealism. Realists would claim that there are real, mind-independent epistemic duties and obligations that vindicate the deontic appearance and would offer different epistemological stories of how we get to know them, or at least have justified belief about them (foundationalist, coherentist, foundherentist, reliabilist, virtue-theoretic, and so forth).

To use a toy theory as an example, a foundationalist analytic reductionist would suggest that categorical epistemic duties are either non-inferentially justified or inferentially justified. Non-inferentially justified beliefs are beliefs that are justified but not in virtue of being based on other beliefs. There are various ways in which beliefs could be justified without recourse to other beliefs: by means of non-doxastic states (for example, Fumerton 1995; Bonjour 2003); or self-presenting doxastic states (compare Chisholm 1966; McDowell 1994); or by means of belief-independent belief-forming processes (compare Goldman 1979); or in the case of a priori basic beliefs, in virtue of conceptual content (compare Bonjour 1998). To take the latter case, for instance, it could be suggested that “the duty to proportion belief to the relevant evidence” seems a priori prima facie justified in virtue of conceptual content for anyone rational with proper understanding of the meaning of the proposition.

Instead, inferentially justified beliefs are beliefs that are justified in virtue of other beliefs/reasons. If beliefs/reasons for p are sufficiently strong then we have an obligation to believe that p. If for example I say that “The thing to believe is what Mr Poirot has concluded,” this should be further supported by epistemic reasons, that is, reasons that involve relevant evidence, like, say, Poirot’s character traits of integrity, attentiveness and investigating dexterity. Such a traditional foundationalist model of epistemic duties and obligations would be broadly internalist and intuitionist. On the one hand, internalist because we need access to epistemic reasons for justification and, on the other hand, intuitionist because intuitions should play a distinctive role in identifying at least non-inferentially justified duties to believe.

Unfortunately, both features of such a theory are very controversial. To begin with, externalists would deny that internalism is a plausible constraint on epistemic justification. First, they would suggest that it over-intellectualizes our everyday cognitive practices and processes and thereby distorts them beyond repair. Our cognitive practices and processes need not be conceived as so reflective and intellectual (compare Goldman 1979; Plantinga 1993; Sosa 2003, 2007; and Greco 2010). Second, what seems to really matter in our cognitive endeavors is cognitive success (of sorts) out of cognitive ability, whether the goal is truth, knowledge, or something else. So accessibility to facts/reasons does not really matter for justificatory matters. What matters is reliability through cognitive ability for truth and knowledge.

The intuitionist component does not fare any better. First, intuitions are often quite unreliable, and this psychological fact is unsurprising if we consider the evolutionary and cultural origins of our intuitions (compare Haidt 2012, and Kahneman 2011). So there is a clear question why trust our intuitions, especially when we have conflicting intuitions with other epistemic peers we esteem and trust as cognizers (for a similar point compare Setiya 2012). Second, it is not clear what sort of epistemic facts about duties we intuit when we invoke our intuitions. Surely such facts do not seem natural in the sense that they do not seem part of the natural world and the corresponding object of study of empirical sciences. Equally, non-natural (epistemic, moral or other) facts sound mysterious, or queer if we are to recall Mackie (1971), not to mention how we can reliably (and causally) track such facts if they exist (compare Street 2006, 2009; Olson 2011b).

Third, suppose our intuitions are at least somewhat reliable and sometimes, somehow, do track corresponding epistemic facts about duties. This much we can suppose with some justification because if our intuitions are completely and universally unreliable then it would seem that we have no rational ground for the starting point of an inquiry. So, it seems that too much skepticism about intuitions might be self-defeating for any epistemic endeavor and lead to global skepticism, something that almost all epistemologists find unpalatable (pace Unger 1971; compare Plantinga 1993; De Cruz and others 2011; Vavova 2014). In addition, if we also assume so-called factualism, namely, the ontological idea that it is robust facts (of sorts) that stand as truthmakers for propositions, then there must be epistemic facts that render propositions about epistemic duties true.

Of course, internalist deontologism is far from being the only game in town. Externalists typically reject both epistemic deontologism and its internalist underpinnings as implausible. Externalists contend that the idea of internally construed epistemic duties and obligations is rendered obsolete by the advent of the more sophisticated scheme of externalism and, therefore, we should not take the deontic appearance of ordinary epistemic discourse at face value; at least not in the traditional internalist mode of its interpretation (for example, Greco 2010).

Instead, externalists focus on the reliability of cognitive processes (perception, memory, induction, deduction, introspection, and so forth), where reliability is typically understood in terms of a relatively high ratio of truth output. The basic insight is that if a cognitive process systematically and reliably delivers truth, then the belief-output of such a reliable process can be considered justified. No Cartesian-style reflective access to duties and supporting reasons is required for justification (and potentially knowledge). We can have justification and knowledge without further reflective justification or epistemic duties. It is not accidental that externalist theories like reliabilism are often understood as a form of epistemic consequentialism (see Dunn’s Epistemic Consequentialism), that is, as a theory that identifies epistemic value with the maximization and promotion of the goal of truth and the minimization and avoidance of error.
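
The “relatively high ratio of truth output” can be made concrete with a simple schematic formula. This is a sketch under stated assumptions: the symbols $\pi$ (a belief-forming process), $r(\pi)$ (its truth ratio), and $\theta$ (a threshold) are introduced here for illustration, and the text does not fix any particular value for the threshold.

```latex
% Requires amsmath (for \text).
% Truth ratio of a hypothetical belief-forming process \pi:
\[
r(\pi) \;=\;
\frac{\text{number of true beliefs produced by } \pi}
     {\text{total number of beliefs produced by } \pi}
\]

% Schematic reliabilist condition, with \theta an unspecified "relatively high" threshold:
\[
\text{a belief produced by } \pi \text{ is justified if } r(\pi) \ge \theta
\]
```

For instance, with purely illustrative numbers, a process that yields 90 true beliefs out of 100 has $r(\pi) = 0.9$; whether that counts as reliable depends on where $\theta$ is set, which the formula itself leaves open.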

There are a variety of externalist positions in the marketplace of epistemology: Goldman’s (1979, 1986) process reliabilism; Plantinga’s (1993) proper functionalism; Bergmann’s (2008) Reidian externalism; Sosa’s (1991, 2007) and Greco’s (2010) virtue-theoretic externalisms and more. It is interesting to note that some externalist positions make important concessions to internalism and in essence come up with hybrid theories of justification and knowledge that aspire to a reconciliation of sorts between the two camps.

A good example of this is Sosa (1991, 2007). Sosa distinguishes between animal knowledge and reflective knowledge. Animal knowledge is simply “apt” knowledge that is produced by reliable, virtuous faculties, but human knowledge is the distinctively reflective sort of knowledge where reasons in support of the belief are accessed. Reasons are supposed to spring out of a coherence testing of the belief. That is, for human-reflective knowledge, production by the relevant virtuous faculty is not sufficient. Reasons in support of the belief should be accessed and these should come via a coherence test.

Hybrid accounts like Sosa’s seem better poised to fend off the often heard charge against externalism that it seems to leave epistemic normativity out of the picture because it overlooks the question of what one ought to believe (compare Fumerton 1995; Brandom 2000). More precisely, the charge is that it permits only for mechanical, reliable belief-forming processes that mostly deliver true beliefs and therefore ignores the further requirement for reflection about epistemic oughts. Sosa (1991, 2007) can alleviate this worry because he allows for a reflective element that can monitor the operation of the processes and ponder even about which process one ought to employ and rely on. This seems to be an advantage of Sosa’s (1991, 2007) theory over other mere externalist theories.

We have now seen some of the basic issues surrounding the epistemology of epistemology. Let us now turn to reasons for belief and epistemic psychology.

6. Reasons for Belief and Epistemic Psychology

Mainstream externalists dislike the whole idea of talking about (at least reflectively accessible) epistemic duties and reasons. For one thing, they tend to think that it is overly intellectualistic and, therefore, distorts ordinary cognitive practice. We neither usually appeal nor need to appeal to reasons for belief and even when we do we cannot really choose what to believe. Belief is involuntary and usually comes swiftly and effortlessly, plausibly for evolutionary psychological reasons. So, externalists are often also skeptical about the dialectical import of the deontic appearance and sometimes even of its concomitant normativity. The concepts of deontology, internalism and sometimes normativity do not ring very well to their ears.

Admittedly, there is some insight in externalist misgivings about reasons-talk and the deontic appearance, but it is also hard to expel talk of reasons and duties from epistemological theorizing. Think for example of Locke’s (1690/1975: 687) pithy aphorism that “those who believe without reason are in love with their own fancies.” As we have seen, this conflict of externalist and deontological/internalist intuitions seems to leave us in a bind because it is not clear which intuition we should discard (or even how we should attempt to reconcile them). Hence, the resolution of the conflict of internalist/externalist intuitions constitutes one of the most serious challenges for any epistemological theory.

Be that as it may, in this section we set aside externalist misgivings about reasons and introduce basic issues surrounding reasons for belief and epistemic psychology. Besides, even externalists give some reasons in support of externalism. First, let us distinguish epistemic from non-epistemic reasons like moral, pragmatic, prudential, aesthetic, political, and even more. One way to delineate epistemic reasons from other kinds of reasons is to emphasize that they stand as reasons for belief from the epistemic point of view. That is, from the point of view of commitment to some epistemic goals/values, whatever these may be: justification, truth, knowledge, understanding, wisdom (or even some other). Suffice it to say that, very roughly, epistemic goals are goals that promote our goal for cognitive contact with reality.

Non-epistemic kinds of reasons are not oriented towards distinctively epistemic goals. Reasons that are oriented towards practical utility are often, somewhat loosely, called pragmatic, and they are akin to a consequentialist/instrumental understanding of practical reasons. So, if for instance you are cynical enough to believe in divine existence simply because it is consoling or because you find Pascal’s wager cogent, then you have pragmatic reasons for such belief but not epistemic reasons. Even if you should believe in God because that is what your relevant evidence rationally prescribes, if you in fact believe because it is consoling or because you find Pascal’s wager cogent, then you believe for the wrong kind of reasons.

Epistemic reasons are also closely connected to moral reasons. Indeed, it is sometimes said that epistemic reasons have a moral flavor because they are reasons that should regulate behavior and such reasons cannot fail to be in some sense moral/practical (compare Clifford 1877; Cuneo 2007). Even if this is right, which is a moot thesis, epistemic reasons would make for a very distinctive species of moral reasons, perhaps so distinctive that they merit a distinctive appellation too: epistemic instead of moral/practical. They are reasons that have to do with the proper regulation of doxastic behavior and not just any kind of practical behavior. Typical doxastic behavior seems different from typical practical behavior at least in terms of conscious and direct control of the behavior.

Much more needs to be said about the classification of reasons, but I have said enough for an intuitive grasp of what might make epistemic reasons different from other kinds of reasons. Let me close this topic with two concluding observations. First, epistemic reasons may be sensitive to moral/practical reasons (and other reasons as well). To see this, think of the case of a cosmologist who by trade has a duty to inquire about the origins of the universe but also has to attend to time-consuming filial duties towards her ageing parents. Surely, it is hard to combine both conflicting duties because the time dedicated to filial duties is taken away from the work and reflection that her epistemic duties require. So, there must be a tradeoff between epistemic and moral/practical reasons.

Moral/practical reasons may even affect epistemic reasons in a more dramatic way. You may have good epistemic reasons to believe that your business partner and childhood friend is cheating you, but you might have moral reasons to trust him. So, it seems that epistemic and moral reasons may pull in opposite directions and the interesting question is what sort of reasons may have the upper hand in such situations. In such cases, we have a moral-epistemic obligation dilemma.

Second, a short note on the so-called “wrong kind of reasons problem” is due (compare Rabinowicz and Ronnow-Rasmussen 2004; Olson 2004; Schroeder 2010; Heuer 2010; Kyriacou 2013). It is a problem that concerns all propositional attitudes (not just belief) and it first surfaced as a problem for so-called buckpassing accounts of value. Buckpassing accounts of value contend that something is valuable iff it has a property that elicits proattitudes like approval, admiration, desire and so forth (compare Scanlon 1998). The problem now is that something may have a property that elicits proattitudes for the wrong kind of reasons. It is not enough that something is valuable. This is beside the point.

The point is that what we judge as valuable should be so judged for the right kind of reasons. If, for example, I admire a truly remarkable painting just because this will please the painter, I have the right attitude concerning its aesthetic value for the wrong reasons. In the epistemic case, if you endorse a truly justified belief as justified because it is consoling for you, and not because it is evidentially grounded, reliably produced, or what have you as the correct justificatory story, then you have the right kind of doxastic attitude but for the wrong kind of reason. The wrong kind of reasons problem poses a challenge for any metanormative theory and it is also a challenge for a theory of epistemic justification. We need a theory of epistemic justification that would get round the problem and have agents believe justifiably for the right kind of reasons.

A further important distinction about epistemic reasons is the distinction between hypothetical/instrumental and categorical reasons (or rationality). One position holds that epistemic rationality is merely instrumental. We have certain (or opt for some) epistemic goals like truth and knowledge and then we try to satisfy them in the best possible way (to the extent that we are rational). According to instrumental epistemic rationality, there are no reasons for belief going over and above these conditional/hypothetical epistemic goals. For example, if we have the hypothetical goal of truth about p, then epistemic rationality is merely instrumental in the sense that it should find the best means to satisfy the goal. In other words, epistemic rationality is instrumental, and, as such, it should be understood in terms of hypothetical requirements like “If you aim for the truth of p, then do such-and-such, say, employ reliable belief-forming processes” (compare Kornblith 2002).

Opponents of this view disagree and reject instrumental epistemic rationality in favor of categorical rationality. They contend that we may have reasons for belief, even if we are lacking some conditional epistemic goal (compare Kelly 2003). For example, I may have no desire to find the truth about who killed my boss but I may still have a binding (external/categorical) reason to believe that it was John because there is abundant and accessible evidence at my disposal that confirms this. In a sense, whether I care about the truth of the matter or not, I have sufficient epistemic reason to believe that John is the murderer. Epistemic rationality constrains what I should believe unconditionally, that is, independently of my desires, interests, or adopted goals. In this sense, epistemic rationality should be understood in terms of non-optional, categorical requirements like “Given the evidence, you ought to believe that such-and-such (here, that John is the murderer).”

A related question about reasons for belief that may be inspired by a parallel discussion in metaethics concerns the nature of reasons for belief. As in metaethics there has been considerable discussion about the nature of reasons for action and corresponding internal/external reasons for action, the same fruitful discussion could be opened in metaepistemology about reasons for belief. Internal reasons are reasons that depend on the agent’s subjective motivational set (desires, intentions, dispositions, plans, goals) while external reasons are reasons that are independent of the agent’s subjective motivational set (see Williams 1990; also Turri 2009).

For example, I may have a reason to believe or inquire about p only if I have a desire or care about the truth about p. According to this internal reading of having a reason, my having a reason exclusively depends on my caring and desiring to know about p. Were I not to care, I wouldn’t have a reason to believe or inquire into p at all, which is to restate the instrumental conception of epistemic rationality, but there is also an externalist reading of the notion of having a reason. For example, I might say “Sophie has a reason to believe that John is from New York” and imply that she has a reason to believe this insofar as she is rational, independently of whether she cares or not about the truth of the matter. Even if she couldn’t care less, it remains an epistemic fact that she has a reason to believe that John is from New York, which is to restate the categorical conception of epistemic rationality. There are epistemic facts about what (categorical) reasons for belief we have (compare Boghossian 2007; Cuneo 2007).

A different issue that has recently drawn attention, and also parallels an analogous discussion in metaethics, concerns the relation between epistemic judgment and motivation (compare Kappel and Moeller 2014; Grajner 2015). That is, the so-called judgment internalism/externalism controversy. Like moral judgments, epistemic judgments like “p is justified” or “S’s belief that p is true” or “I know that p” seem to implicate certain kinds of motivations. They seem to have an internal, conceptual connection with certain motivations. Judgment internalists hold onto the intuition of an internal connection between judgment and motivation, while judgment externalists, though they accept the systematic character of the connection, deny the internalist intuition and suggest that the connection may be severed in certain circumstances. Hence, the connection may only be external and defeasible in certain circumstances.

For example, for judgment internalists sincerely saying that “p is justified” implicates that I have at least some reason or doxastic motivation to believe that p, rely on p, recommend to others that p, and approve of the norms, habits, and processes in virtue of which p was formed; or saying that “S’s belief that p is true” implicates some reason to also believe that p, approve the norms in virtue of which p was formed, give some credit to S and so forth; or asserting that “I know that p” implicates, all other things equal, termination of inquiry. The inquiry about p is, at least for the time being, settled (compare Kappel and Moeller 2014).

In contrast, externalists would question the conceptual connection and its modal strength (compare Kvanvig 2003; Grajner 2015). They might highlight the conceivability of Spock-like agents who make sincere normative judgments but remain entirely cold and unmotivated by those judgments. In the moral case, such personas are called amoralists, and the typical characters externalists have in mind are psychopaths, sociopaths, social delinquents, and so forth, who seem capable of sincere moral judgment without any corresponding motivation for action. (Perhaps we can call the epistemic counterparts of amoralists “acognitivists.”)

Like in the parallel case of moral motivation in metaethics, expressivists seem able to easily explain the systematic—if not strictly necessary—connection between epistemic judgment and motivation for belief, reliance, termination of inquiry, and so forth. This is the case because, as a noncognitivist theory about the nature of the mental state expressed in epistemic judgment, the expressivist can appeal to the motivating nature of noncognitive states to explain this systematic connection of motivation. Desires, intentions, and even epistemic sentiments may be called for explanatory psychological work.

For example, if I say that “p is justified” the expressivist will suggest that I express some sort of approval for p or the norms that license it or the doxastic habits in virtue of which it was reliably formed (for example, Kyriacou 2012). The bottom line of such a noncognitivist account is that the relevant desire or intention for belief might be expressible in such judgments of justification. Similar stories would be drawn for other sorts of epistemic judgments. Thus, epistemic motivation is easily explained as a psychological phenomenon in an expressivist framework.

Cognitivists about epistemic judgment, though, cannot directly appeal to the same style of psychological explanation as the expressivist because they suggest that epistemic judgments express beliefs and beliefs do not seem to be intrinsically motivating states. They are representational states and as such they do not seem to be engaging the will at all. Saying that “Paris is a city” or that “It’s a hot day today” may be representational, but it is also “motivationally inert.” As is often said, beliefs and desires have an opposite direction of fit with the world (compare Anscombe 1957; Smith 1994). Beliefs are representational states in the sense that they aspire to represent aspects of the world and get at the respective truth. They are mind-to-world directed states. Desires are nonrepresentational states in the sense that they urge for changing the world in order to be satisfied (compare Smith 1994). They are world-to-mind directed states.

This leaves cognitivists with an obvious puzzle about epistemic motivation. They could insist that epistemic beliefs can motivate but this seems entirely ad hoc because beliefs seem representational and motivationally inert states. Alternatively, they could suggest that beliefs may induce corresponding desires, but they would then have to explain how this systematic inducing works given that beliefs and desires are so-called Humean “distinct existences” (compare Korsgaard 1986; Smith 1994). That is, there is no necessary (logical or psychological) connection between belief and desire as the two states can come apart. There can be belief without any corresponding desire and vice versa.

They could even appeal to a sophisticated hybrid (or so-called “ecumenical”) picture of normative judgment and suggest that such judgments express both cognitive and noncognitive states (compare Copp 2001; Ridge 2007; Grajner 2015). In any event, cognitivists have some extra explaining to do that noncognitivists do not. This seems to give the noncognitivists the edge in regard to understanding the psychological relation between normative judgment and motivation, and it is considered one of the best arguments in favor of noncognitivism. Nevertheless, one should not underestimate the resourcefulness of the cognitivist tradition and its ability to account for moral and epistemic motivation (for ecumenical cognitivism see Grajner 2015).

With this much about reasons for belief and epistemic psychology, in the next section we briefly visit the notions of agency and responsibility.

7. Agency and Responsibility

Epistemic deontology has seemed especially implausible to many externalists because it does not comport with a plausible picture of epistemic agency. That is, epistemic deontologism assumes that as rational and responsible agents we have epistemic duties about what to believe and to the extent that we do not conform our doxastic behavior to these duties we are responsible and culpable. The problem now is that a plausible account of epistemic agency does not allow for the responsibility that deontologism requires and, since this is a prerequisite for deontologism, deontologism seems implausible.

This is arguably the case because doxastic behavior typically seems beyond the reach of our direct and conscious control. To illustrate this, let someone try to believe something that she considers an obvious falsehood, or at least something she considers implausible, like “1+1=3” or that “Paris is not the French capital.” Let us even offer a powerful motive for the fruition of this cognitive exercise, say, 10,000 pounds for sincerely acquiring the belief. But alas, it seems psychologically impossible because direct belief-fixation is not up to us in any profound manner. Unlike ordinary decisions about what to do (for example, dancing or standing up), we cannot directly decide what to believe.

In light of the fact that an intuitive understanding of responsibility is explicated in terms of minimal control of action, and control in terms of relative freedom to choose from open alternatives (for example, Pink 2004), one may consider epistemic deontologism a lost cause. It is a lost cause because it relies on the existence of epistemic responsibility, yet our doxastic behavior does not allow for the control and freedom of choice that are necessary prerequisites for such responsibility. This is “the doxastic involuntarism problem” for (internalist) epistemic deontologism (for the classic statement compare Alston 1988).

As one might expect, there have been a number of responses on behalf of deontologism. Some have flatly denied the involuntarism intuition and insisted that we can at least sometimes directly decide what to believe (for example, Ginet 2001). Others have denied that responsibility requires direct control (for example, Feldman 2001). We might have no direct control over the working of our heart but we might still be to some extent responsible for its proper working because we can take steps to indirectly enhance its functioning. After all, we can have a healthy lifestyle that includes a pro-heart diet, exercising, regular medical exams, and so forth.

Similarly, we can take steps to indirectly enhance our cognitive functioning. We could, for example, cultivate reliance on methods, habits, and processes that are reliable belief-producers, that is, ones that produce a high ratio of true beliefs. If I am myopic, for example, I could cultivate the habit of relying on my glasses for visual perception because I know they make for reliable perception, or I could rely on certain sources of information that are generally reliable, like a trustworthy newspaper columnist or a trustworthy friend. I could also cultivate epistemic virtues, namely, character traits that are generally truth-conducive, like conscientiousness, inquisitiveness, open-mindedness, perseverance, respect, courage, tolerance, fairness, love of truth, and humility.

In this vein Chrisman (2008), for example, has argued that doxastic involuntarism and epistemic deontology can be reconciled. He appeals to Wilfrid Sellars’ distinction between “rules of criticism” and “rules of action” and explains how doxastic oughts might be subject to rules of criticism that require no direct voluntary control but, nevertheless, do not compromise the categoricity of doxastic oughts (and responsibility). The doxastic involuntarism problem remains a live question for epistemic deontologism and agency (for more discussion see Vitz’s Doxastic Voluntarism).

A second topic of some recent debate that relates to (epistemic) agency is the so-called “situationist challenge about character.” Roughly, situationism about character is the thesis that character is a figment of folk psychology (and virtue theory) and does not really exist. Although it was first applied to moral character, and thereby virtue ethics, and then transposed to responsibilist virtue-theoretic epistemology that appeals to character (compare Alfano 2011), it is arguably not of merely local responsibilist virtue-theoretic concern. This is because if, as Baehr (2012) has cogently argued, even theories that by stipulation have nothing to do with character and virtuous or vicious traits (like evidentialism and reliabilism) inevitably involve such traits, then the debunking of character as a mere figment will have serious repercussions for these theories as well.

Harman (1999) is a classic statement of so-called situationism about moral character traits, namely, the idea that what drives our behavior is the current of the social situation and not fixed character traits, as virtue theories and folk psychology tend to think. In fact, according to Harman (1999), there is no positive empirical evidence in favor of character traits, and there is much negative empirical evidence against them and, therefore, in so thinking we commit the so-called “fundamental attribution error.” We excessively focus on the agent and downplay the importance of the social situation that the agent is found in.

The situationist challenge questions the psychological reality of the theoretical cornerstone of responsibilist virtue theories, namely, character. Drawing from various empirical psychological studies, like Milgram’s (1974) famous obedience experiments, it is suggested that the folk psychological and virtue-theoretical concept of character is a mere fiction. These studies are taken to show that in reality character does not exist because in these experiments the agents’ actions are guided not by their character, relevant virtues, or traits but by the situational context they are found in.

True enough, we speak about character and character traits (virtuous and vicious), but empirical psychological studies confirm that in a given situation our supposed character traits may fail to manifest themselves in our conduct, at least for a statistically significant portion of us. This indicates that character traits are not the reliable source of dispositions for action that they are widely assumed to be. If they were, most of us could not so easily behave out of character. This corollary would suffice to shake the foundations of folk psychology (which relies extensively on the notion of character) and responsibilist virtue theory as well as other theories that surreptitiously involve the notion of character (for some discussion of situationism, see Timpe’s Moral Character; for responses to moral situationism see Miller 2003; Kamtekar 2004; and Snow 2010).

More recently Alfano (2011) transposed the situationist challenge from virtue ethics to responsibilist virtue epistemology. He appeals to abundant empirical work from cognitive and social psychology to rest his case. He indicates that intellectual virtues like curiosity, flexibility, creativity, and courage are susceptible to the vagaries of non-intellectual factors of their situation like mood elevators, mood depressors, ambient sounds, ambient smells, and even the weather.

This suggests, according to Alfano (2011), a puzzle that can be framed as an inconsistent triad: (non-skepticism) Most people know quite a bit; (responsibilism) Knowledge is true belief acquired and retained through the exercise of intellectual virtue; and (epistemic situationism) Most people do not possess the epistemic virtues countenanced by responsibilism. Alfano suggests that a plausible way to resolve the puzzle is to give up responsibilism and with it the whole enterprise of responsibilist virtue epistemology. The safe prediction is that situationism about moral and intellectual character poses a sharp challenge to folk psychology, virtue theory (and beyond) and is here to stay as a topic for discussion.

With this much about agency and responsibility, let us now wrap up the overall discussion.

8. New Directions in Metaepistemology

We have covered a lot of ground in large strides and introduced various key aspects of metaepistemological theorizing, ranging from semantics to metaphysics and agency. There is, of course, much more going on in metaepistemological debates (both in terms of depth and breadth), especially as new and fascinating directions of inquiry are constantly emerging. New semantic theories have been suggested, such as hybrid semantic theories of epistemic concepts and inferentialist theories. More light is being shed on neglected epistemic properties and their value, such as understanding, entitlement, and wisdom, and new and interesting parallels with metaethics are constantly emerging, for instance concerning reasons and motivation. Finally, exciting experimental work on intuitions is emerging and formal work is being carried out that links metaepistemological themes with probability calculus and decision theory (for example, Pettigrew 2011, 2013). This indicates that metaepistemology is an emerging and promising field of epistemological inquiry that is anticipated to blossom in the years to come.

9. References and Further Reading

  • Alfano, Mark. (2011). ‘Expanding the Situationist Challenge to Responsibilist Virtue Epistemology’. Philosophical Quarterly 62(247): 223-249.
  • Alston, William. (1988). ‘The Deontological Conception of Epistemic Justification’, Philosophical Perspectives, 2, Epistemology, pp.115-152.
  • Alston, William. (2005). Beyond Justification. Ithaca, NY: Cornell University Press.
  • Annis, David. (1978). ‘A Contextualist Theory of Epistemic Justification’. American Philosophical Quarterly, Vol.15, No.3, pp.213-219.
  • Anscombe, G. E. M. (1957). Intention. Oxford, Blackwell.
  • Ayer, A. J. (1936). Language, Truth and Logic. London, Penguin Books.
  • Baehr, Jason. (2012). The Inquiring Mind. Oxford, Oxford University Press.
  • Bergmann, Michael. (2008). ‘Reidian Externalism’ in New Waves in Epistemology, (eds.) Vincent Hendricks and Duncan Pritchard. London, Palgrave MacMillan.
  • Blackburn, Simon. (1993). Essays in Quasi-Realism. Oxford, Oxford University Press.
  • Blackburn, Simon. (1998). Ruling Passions. Oxford, Oxford University Press.
  • Blackburn, Simon. (2006). Truth. London, Penguin Books.
  • Bonjour, Lawrence. (1985). The Structure of Empirical Knowledge. Cambridge MA: Harvard University Press.
  • Bonjour, Lawrence. (1998). In Defense of Pure Reason. London, Cambridge University Press.
  • Bonjour, Lawrence and Sosa, Ernest. (2003). Epistemic Justification. Oxford, Blackwell.
  • Boghossian, Paul. (2007). Fear of Knowledge. Oxford, Oxford University Press.
  • Brandom, Robert. (2000). Articulating Reasons. Cambridge MA, Harvard University Press.
  • Brink, David. (1989). Moral Realism and the Foundations of Ethics. New York, Cambridge University Press.
  • Charlow, Nate. (2014). ‘The Problem with the Frege-Geach Problem’. Philosophical Studies 167(3): 635-665.
  • Chisholm, Roderick. (1966). Theory of Knowledge. Englewood Cliffs, NJ, Prentice Hall.
  • Chrisman, Matthew. (2007). ‘From Epistemic Contextualism to Epistemic Expressivism’. Philosophical Studies 135(2):225-254.
  • Chrisman, Matthew. (2008). ‘Ought to Believe’. Journal of Philosophy 105 (7): 346-370.
  • Chrisman, Matthew. (2012). ‘Epistemic Expressivism’. Philosophy Compass 7(2):118-126.
  • Clifford, William (1877/2008). ‘The Ethics of Belief’ in Reason and Responsibility, (eds.) Joel Feinberg and Russ Shafer-Landau. Belmont, CA: Thomson, pp.101-5.
  • Cohen, Stewart. (1998). ‘Contextualist Solutions to Epistemological Problems: Scepticism, Gettier and the Lottery’. Australasian Journal of Philosophy 76(2): 289-306.
  • Conee, Earl and Feldman, Richard. (2004). Evidentialism. Oxford, Oxford University Press.
  • Copp, David. (2001). ‘Realist-Expressivism: A Neglected Option for Moral Realism’. Social Philosophy and Policy 18(02):1-43.
  • Cuneo, Terence. (2007). The Normative Web. Oxford, Oxford University Press.
  • Cuneo, Terence and Kyriacou, Christos. (2017) ‘Defending the Moral/Epistemic Parity’ in Metaepistemology. (eds.) C. McHugh, J. Way and D. Whiting.
  • Darwall, Stephen, Gibbard, Allan and Railton, Peter. (1992). ‘Toward Fin de Siecle Ethics: Some Trends’. Philosophical Review 101(1):115-189.
  • De Cruz, Helen, Boudry, Maarten, De Smedt, Johan, and Blancke, Stefaan. (2011). ‘Evolutionary Approaches to Epistemic Justification’. Dialectica 65(4):517-535.
  • DeRose, Keith. (1995). ‘Solving the Skeptical Problem’. The Philosophical Review 104, 1, pp. 1-52.
  • Descartes, Rene. (1641/2008). Meditations on First Philosophy. Oxford, Oxford University Press. Translated by Michael Moriarty.
  • Dowden, Bradley and Swartz, Norman. ‘Truth’ in the Internet Encyclopedia of Philosophy.
  • Dreier, James. (2004). ‘Meta-ethics and the Problem of Creeping Minimalism’. Philosophical Perspectives 18(1):23-44.
  • Dunn, Jeffrey. ‘Epistemic Consequentialism’. Internet Encyclopedia of Philosophy.
  • Elgin, Catherine. (2004). ‘True Enough’. Philosophical Issues 14, Epistemology, pp. 113-131.
  • Enoch, David. (2013). Taking Morality Seriously. Oxford, Oxford University Press.
  • Feldman, Richard. (2001). ‘Voluntary Belief and Epistemic Evaluation’ in Knowledge, Truth and Duty, (ed.) Matthias Steup. Oxford, Oxford University Press, pp.77-92.
  • Feldman, Richard. (2002). ‘Epistemological Duties’ in The Oxford Handbook of Epistemology, ed. Paul Moser. Oxford, Oxford University Press. pp.362-384.
  • Fisher, Andrew. (2011). Metaethics: An Introduction. Acumen.
  • Floridi, Luciano. (2004). ‘On the Logical Unsolvability of the Gettier Problem’. Synthese 142(1): 61-79.
  • Fogelin, Robert. (1994). Pyrrhonian Reflections on Knowledge and Justification. Oxford, Oxford University Press.
  • Frankena, William. (1939). ‘The Naturalistic Fallacy’. Mind XLVIII(192):464-477.
  • Frege, Gottlob. (1997). ‘On Negation’ in The Frege Reader
  • Fricker, Miranda. (2010). Epistemic Injustice. Oxford, Oxford University Press.
  • Fumerton, Richard. (1995). Metaepistemology and Skepticism. London, Rowman and Littlefield.
  • Geach, Peter. (1960). ‘Ascriptivism’. Philosophical Review 2:221-225.
  • Geach, Peter. (1965). ‘Assertion’. Philosophical Review 74(4):449-465.
  • Gibbard, Allan. (1990). Wise Choices, Apt Feelings. Oxford, Oxford University Press.
  • Ginet, Carl. (2001). ‘Deciding to Believe’ in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford, Oxford University Press. pp.63-76.
  • Goldman, Alvin. (1979). ‘What is Justified Belief?’ in Epistemology: An Anthology, eds. Ernest Sosa and Jaegwon Kim. Blackwell. pp. 340-353.
  • Grajner, Martin. (2015). ‘Hybrid Expressivism About Epistemic Justification’. Philosophical Studies 172(9):2349-2369.
  • Greco, Daniel. (2015). ‘Epistemological Open Question Arguments’. Australasian Journal of Philosophy 93(3):509-523.
  • Greco, John. (2010). Achieving Knowledge. Oxford, Oxford University Press.
  • Grimm, Stephen. (2011). ‘Understanding’ in The Routledge Companion to Epistemology, (eds.) Sven Bernecker and Duncan Pritchard. New York, Routledge. pp.
  • Haidt, Jonathan. (2011). The Righteous Mind. New York, Vintage Books.
  • Harman, Gilbert. (1986). Change in View. Cambridge, MA: MIT Press.
  • Harman, Gilbert. (1999). ‘Moral Philosophy Meets Social Psychology: Virtue Ethics and the Fundamental Attribution Error’. Proceedings of the Aristotelian Society 99, pp. 315-331.
  • Hare, Richard. (1952). The Language of Morals. Oxford, Clarendon Press.
  • Hawthorne, John. (2004). Knowledge and Lotteries. Oxford, Oxford University Press.
  • Heathwood, Chris. (2009). ‘Moral and Epistemic Open Question Arguments’, Philosophical Books 50: 83-98.
  • Heuer, Ulrike. (2010). ‘Beyond Wrong Reasons: The Buck-Passing Account of Value’ in New Waves in Metaethics, (ed.) Michael Brady. Palgrave Macmillan. pp. 166-184.
  • Hume, David. (1739/1985). A Treatise of Human Nature. London, Penguin.
  • Huemer, Michael. (2008). Ethical Intuitionism. London, Palgrave Macmillan.
  • James, William. (1896/2008). ‘The Will to Believe’ in Reason and Responsibility, (eds.) Joel Feinberg and Russ Shafer-Landau. Belmont, CA: Thomson, pp. 106-113.
  • Jenkins, Catherine. (2007). ‘Epistemic Norms and Natural Facts’. American Philosophical Quarterly 44 (3): 259-272.
  • Kahneman, Daniel. (2012). Thinking, Fast and Slow. New York, Farrar, Straus and Giroux.
  • Kamtekar, Rachana. (2004). ‘Situationism and Virtue Ethics on the Content of Our Character’. Ethics 114(3):458-491.
  • Kappel, Klemens and Moeller, Eric. (2014). ‘Epistemic Expressivism and the Argument from Motivation’. Synthese, pp.1-19.
  • Kelly, Thomas. (2003). ‘Epistemic Rationality as Instrumental Rationality: A Critique’. Philosophy and Phenomenological Research 66, 3, pp.612-40.
  • Kim, Jaegwon. (1988). ‘What is Naturalized Epistemology?’. Philosophical Perspectives 2: 381-405.
  • Kirkham, Richard. (1984). ‘Does the Gettier Problem Rest on A Mistake?’. Mind 93(372): 501-513.
  • Kyriacou, Christos. (2012). ‘Habits-Expressivism About Epistemic Justification’. Philosophical Papers 41(2):209-237.
  • Kyriacou, Christos. (2013). ‘How Not To Solve the Wrong Kind of Reasons Problem’. Journal of Value Inquiry 47(1-2):101-110.
  • Kyriacou, Christos. (2015). ‘Critical Discussion of David Velleman’s ‘Foundations for Moral Relativism’’ UK: Open Book Publishers. 2013. Ethical Theory and Moral Practice 18(1),pp. 209-214.
  • Kyriacou, Christos. (2016a). ‘Ought to Believe, Evidential Understanding and the Pursuit of Wisdom’ in Epistemic Reasons, Epistemic Norms, Epistemic Goals, eds. Martin Grajner and Pedro Schmechtig. Berlin, DeGruyter. Pre-print version.
  • Kyriacou, Christos. (2016b). ‘Metaepistemology’ in Oxford Bibliographies Online, ed. Duncan Pritchard. New York, Oxford University Press. Pre-print version.
  • Kyriacou, Christos. (2017). ‘Bifurcated Sceptical Invariantism’ in Journal of Philosophical Research. Pre-print version.
  • Kornblith, Hilary. (2002). Knowledge and Its Place in Nature. Oxford, Oxford University Press.
  • Korsgaard, Christine. (1986). ‘Skepticism About Practical Reason’, Journal of Philosophy 83 (1):5-25.
  • Kvanvig, Jonathan. (2003). The Value of Knowledge and the Pursuit of Understanding. Cambridge, Cambridge University Press.
  • Lenman, James. (2008). ‘Review of Terence Cuneo. The Normative Web.’ Notre Dame Philosophical Reviews 2008(6).
  • Lewis, David. (1996). ‘Elusive Knowledge’. Australasian Journal of Philosophy 74(4): 549-567.
  • Locke, John. (1690/1975). An Essay Concerning Human Understanding. Oxford, Oxford University Press. Edited with an Introduction by P.H. Nidditch.
  • Mackie, John. (1971). Ethics. Inventing Right and Wrong. London, Penguin.
  • MacFarlane, John. (2005). ‘The Assessment Sensitivity of Knowledge Attributions’. In T.S.Gendler and J.Hawthorne, eds. Oxford Studies in Epistemology, Oxford University Press. pp. 197-234.
  • Miller, Christian. (2003). ‘Social Psychology and Virtue Ethics’. Journal of Ethics 7(4): 365-392.
  • Milgram, Stanley. (1974). Obedience to Authority: An Experimental View. New York, Harper and Row.
  • Moore, G. E. (1903/2000). Principia Ethica. Cambridge, Cambridge University Press. Edited with an introduction by T. Baldwin.
  • Neta, Ram. (2008). ‘How to Naturalize Epistemology’ in New Waves in Epistemology, eds. Vincent Hendricks and Duncan Pritchard. New York, Palgrave Macmillan. pp. 324-353.
  • Nozick, Robert. (1981). Philosophical Explanations. Cambridge MA, Harvard University Press.
  • Olson, Erik. (2007). ‘The Place of Coherence in Epistemology’ in New Waves in Epistemology, (eds.) Vincent Hendricks and Duncan Pritchard. London, Palgrave Macmillan.
  • Olson, Jonas. (2004). ‘Buck-Passing and the Wrong Kind of Reasons’. Philosophical Quarterly 54(215):295-300.
  • Olson, Jonas. (2011a). ‘In Defense of Moral Error Theory’ in New Waves in Metaethics. New York, Palgrave Macmillan. pp. 62-84.
  • Olson, Jonas. (2011b). ‘Error Theory and Reasons for Belief’ in Reasons for Belief, eds. Andrew Reisner and Asbjorn Stegligh-Petersen. Cambridge, Cambridge University Press. pp. 75-93.
  • Papineau, David. (2003). ‘The Evolution of Knowledge’ in The Roots of Reason. Oxford, Oxford University Press, pp. 39-82.
  • Pettigrew, Richard. (2011). ‘Epistemic Utility Arguments for Probabilism’. Stanford Encyclopedia of Philosophy.
  • Pettigrew, Richard. (2013). ‘Epistemic Utility and Norms for Credences’. Philosophy Compass 8/10: 897-908.
  • Pink, Thomas. (2004). Free Will: A Very Short Introduction. Oxford, Oxford University Press.
  • Plantinga, Alvin. (1993). Warrant: The Current Debate. Oxford, Oxford University Press.
  • Plantinga, Alvin. (1993). Warrant and Proper Function. Oxford, Oxford University Press.
  • Plato. (2005). Euthyphro, Apology, Crito, Phaedo, Phaedrus. Cambridge MA : Harvard University Press.
  • Pollock, John and Cruz, Joseph. (1999). Contemporary Theories of Knowledge. Lanham, Rowman and Littlefield.
  • Poston, Ted. ‘Internalism and Externalism in Epistemology’ in the Internet Encyclopedia of Philosophy.
  • Pritchard, Duncan. (2010). The Nature and Value of Knowledge. Oxford, Oxford University Press. Coauthored with Allan Millar and Adrian Haddock.
  • Putnam, Hilary. (1975/1997). ‘The Meaning of ‘Meaning’’ in Philosophical Papers Vol.2 : Mind, Language and Reality. Cambridge, Cambridge University Press. pp. 215-271.
  • Quine, W. V. O. (1953). ‘Two Dogmas of Empiricism’ in From A Logical Point of View. Cambridge, MA: Harvard University Press. pp. 20-46.
  • Quine, W. V. O. (1992). Pursuit of Truth. Cambridge, MA: Harvard University Press.
  • Rabinowicz, Wlodek and Ronnow-Rasmussen, Toni. (2004). ‘The Strike of the Demon: On Fitting Pro-Attitudes and Value’. Ethics 114(3):391-423.
  • Ridge, Michael. (2007). ‘Ecumenical Expressivism: The Best of Both Worlds?’ Oxford Studies in Metaethics 2:51-76.
  • Ridge, Michael. (2014). ‘Moral Non-naturalism’ in Stanford Encyclopedia of Philosophy, (ed.) Edward Zalta.
  • Scanlon, T. M. (1998). What We Owe To Each Other. Cambridge MA, Harvard University Press.
  • Schaffer, Jonathan. (2004). ‘From Contextualism to Contrastivism’. Philosophical Studies 119(1-2): 73-104.
  • Schaffer, Jonathan. (2013). ‘On What Grounds What’ in Metametaphysics, eds. D.Chalmers, D.Manley and R.Wasserman. Oxford, Oxford University Press. pp.347-383.
  • Schroeder, Mark. (2008a). Being For. Oxford, Oxford University Press.
  • Schroeder, Mark. (2008b). ‘What is the Frege-Geach Problem?’. Philosophy Compass 3(4):703-720.
  • Schroeder, Mark. (2010). ‘Value and the Right Kind of Reason’. Oxford Studies in Metaethics 5:25-55.
  • Setiya, Kieran. (2012). Knowing Right from Wrong. Oxford, Oxford University Press.
  • Smith, Michael. (1994). The Moral Problem. Oxford, Oxford University Press.
  • Snow, Nancy. (2010). Virtue as Social Intelligence. New York, Routledge.
  • Sosa, Ernest. (1991). Knowledge in Perspective. Cambridge, Cambridge University Press.
  • Sosa, Ernest. (2003). ‘The Place of Truth in Epistemology’ in Intellectual Virtue, (eds.) M.DePaul and L.Zagzebski. Oxford, Oxford University Press, pp.155-79.
  • Sosa, Ernest. (2007). A Virtue Epistemology. Oxford, Oxford University Press.
  • Street, Sharon. (2006). ‘A Darwinian Dilemma for Realist Theories of Value’. Philosophical Studies 127 (1), pp. 109-166.
  • Street, Sharon. (2009). ‘Evolution and the Normativity of Epistemic Reasons’. Canadian Journal of Philosophy 39 (supplement 1): 213-248.
  • Stich, Stephen. (1990). The Fragmentation of Reason. Cambridge MA, MIT Press.
  • Stevenson, C. L. (1963). ‘The Nature of Ethical Disagreement’ in Facts and Values. New Haven: Yale University Press. pp.1-9.
  • Timmons, Mark and Horgan, Terence. (1991). ‘New Wave Moral Realism Meets Moral Twin Earth’. Journal of Philosophical Research 16:447-465.
  • Timpe, Kevin. ‘Moral Character’ in the Internet Encyclopedia of Philosophy.
  • Turri, John. (2009). ‘The Ontology of Epistemic Reasons’. Nous 43(3): 490-512.
  • Unger, Peter. (1971). ‘A Defense of Skepticism’. Philosophical Review, LXXX :198-219.
  • Vahid, Hamid. (2005). Epistemic Justification and the Skeptical Challenge. London, Palgrave Macmillan.
  • Vavova, Katia. (2014). ‘Debunking Evolutionary Debunking’. Oxford Studies in Metaethics 9:76-101.
  • Velleman, David. (2013). Foundations for Moral Relativism. UK: Open Book Publishers.
  • Vitz, Rico. ‘Doxastic Voluntarism’ in the Internet Encyclopedia of Philosophy.
  • Yalcin, Seth. (2012). ‘Bayesian Expressivism’, Proceedings of the Aristotelian Society 112, Vol.2:123-160.
  • Zagzebski, Linda. (1996). Virtues of the Mind. Cambridge, Cambridge University Press.
  • Zagzebski, Linda. (2003). ‘The Search for the Source of the Epistemic Good’. Metaphilosophy 34, pp.12-28.
  • Zagzebski, Linda. (2009). On Epistemology. Belmont, CA : Wadsworth.
  • Wedgwood, Ralph. (2008). ‘Contextualism About Justified Belief’. Philosopher’s Imprint 8, No.9, pp.1-20.
  • Wiggins, David. (1987). ‘A Sensible Subjectivism?’ in Needs, Values and Truth. Oxford, Oxford University Press. pp. 185-214.
  • Williams, Bernard. (1979). ‘Internal and External Reasons’ in Rational Action, ed. Ross Harrison. Cambridge, Cambridge University Press. pp. 101-113.
  • Williams, Michael. (2001). Problems of Knowledge. Oxford, Oxford University Press.
  • Williamson, Timothy. (2000). Knowledge and its Limits. Oxford, Oxford University Press.

Author Information

Christos Kyriacou
Email: ckiriakou@gmail.com
University of Cyprus
Cyprus

Interpretations of Quantum Mechanics

Quantum mechanics is a physical theory developed in the 1920s to account for the behavior of matter on the atomic scale. It has subsequently been developed into arguably the most empirically successful theory in the history of physics. However, it is hard to understand quantum mechanics as a description of the physical world, or to understand it as a physical explanation of the experimental outcomes we observe. Attempts to understand quantum mechanics as descriptive and explanatory, to modify it such that it can be so understood, or to argue that no such understanding is necessary, can all be taken as versions of the project of interpreting quantum mechanics.

The problematic nature of quantum mechanics stems from the fact that the theory often represents the state of a system using a sum of several terms, where each term apparently represents a distinct physical state of the system. What’s more, these terms interact with each other, and this interaction is crucial to the theory’s predictions. If one takes this representation literally, it looks as if the system exists in several incompatible physical states at once. And yet when the physicist makes a measurement on the system, only one of these incompatible states is manifest in the result of the measurement. What makes this especially puzzling is that there is nothing in the physical nature of a measurement that could privilege one of the terms over the others.

According to the Copenhagen interpretation of quantum mechanics, the solution to this puzzle is that the quantum state should not be taken as a description of the physical system. Rather, the role of the quantum state is to summarize what we can expect if we make measurements on the system. According to the many-worlds interpretation, the quantum state is to be taken as a description of the system, and the solution to the puzzle is that each term in that description produces a corresponding measurement outcome. That is, for any quantum measurement there are generally multiple measurement results occurring on distinct “branches” of reality. According to hidden variable theories, the quantum state is a partial description of the system, where the rest of the description is given by the values of one or more “hidden” variables. The solution to the puzzle in this case is that the hidden variables pick out one of the physical states described by the quantum state as the actual one. According to spontaneous collapse theories, the quantum state is a complete description of the system, but the dynamical laws of quantum mechanics are incomplete, and need to be supplemented with a “collapse” process that eliminates all but one of the terms in the state during the measurement process.

These interpretations and others present us with very different pictures of the nature of the physical world (or in the Copenhagen case, no picture at all), and they have different strengths and weaknesses. The question of how to decide between them is an open one.

Table of Contents

  1. The Development of Quantum Mechanics
  2. The Copenhagen Interpretation
  3. The Many-Worlds Interpretation
  4. Hidden Variable Theories
  5. Spontaneous Collapse Theories
  6. Other Interpretations
  7. Choosing an Interpretation
  8. References and Further Reading

1. The Development of Quantum Mechanics

Quantum mechanics was developed in the early twentieth century in response to several puzzles concerning the predictions of classical (pre-20th century) physics. Classical electrodynamics, while successful at describing a large number of phenomena, yields the absurd conclusion that the electromagnetic energy in a hollow cavity is infinite. It also predicts that the energy of electrons emitted from a metal via the photoelectric effect should be proportional to the intensity of the incident light, whereas in fact the energy of the electrons depends only on the frequency of the incident light. Taken together with the prevailing account of atoms as clouds of positive charge containing tiny negatively charged particles (electrons), classical mechanics entails that alpha particles fired at a thin gold foil should all pass straight through, whereas in fact a small proportion of them are reflected back towards the source.

In response to the first puzzle, Max Planck suggested in 1900 that light can only be emitted or absorbed in integral units of hν, where ν is the frequency of the light and h is a constant. This is the hypothesis that energy is quantized (that it is a discrete rather than continuous quantity), from which quantum mechanics takes its name. This hypothesis can be used to explain the finite quantity of electromagnetic energy in a hollow cavity. In 1905 Albert Einstein proposed that the quantization of energy can solve the second puzzle too; the minimum amount of energy that can be transferred to an electron from the incident light is hν, and hence the energy of the emitted electrons depends on the frequency of the light rather than on its intensity.
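
As a rough numerical illustration of the two quantization claims above, the following Python sketch computes the energy of a single quantum of light (Planck's constant times the frequency) and the maximum kinetic energy of an emitted photoelectron. The work function `phi`, the minimum energy needed to free an electron from the metal, is not mentioned in the text above; it, its sample value of 2.3 eV, the chosen frequencies, and the function names are all assumptions introduced only for illustration.

```python
# Planck's constant in joule-seconds and the electron-volt in joules (SI values).
H = 6.62607015e-34
EV = 1.602176634e-19


def photon_energy_ev(frequency_hz):
    """Energy of one quantum of light, E = h * nu, in electron-volts."""
    return H * frequency_hz / EV


def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Einstein's photoelectric relation: E_max = h * nu - phi.

    Returns 0.0 below the threshold frequency, where no electrons are emitted.
    Intensity is deliberately not a parameter: it affects how many electrons
    are emitted, not how much energy each one carries.
    """
    return max(0.0, photon_energy_ev(frequency_hz) - work_function_ev)


# Hypothetical metal with a work function of 2.3 eV (an illustrative value).
phi = 2.3
for nu in (4.0e14, 6.0e14, 8.0e14, 1.0e15):
    print(f"nu = {nu:.1e} Hz  quantum = {photon_energy_ev(nu):.2f} eV  "
          f"E_max = {max_kinetic_energy_ev(nu, phi):.2f} eV")
```

In this sketch, raising the frequency raises the energy of each emitted electron, while below the threshold frequency no electron is emitted however intense the light, which is the pattern that classical electrodynamics failed to predict.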

Ernest Rutherford’s solution to the third puzzle in 1911 was to posit that the positive charge in the atom is concentrated in a small nucleus with enough mass to reflect an alpha particle that collides with it. According to Niels Bohr’s 1913 elaboration of this model, the electrons orbit this nucleus, but only certain energies for these orbital electrons are allowed. Again, energy is quantized. The model has the additional benefit of explaining the spectrum of light emitted from excited atoms; since only certain energies are allowed, only certain wavelengths of light are possible when electrons jump between these levels, and this explains why the spectrum of the light consists of discrete wavelengths rather than a continuum of possible wavelengths.
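
The link between allowed energies and discrete spectral lines can be made concrete for hydrogen. The sketch below assumes the standard textbook form of the Bohr levels, with energy equal to minus 13.6 eV divided by the square of the level number; neither that formula nor the numerical constants appear in the text above, and the function names are hypothetical.

```python
RYDBERG_EV = 13.605693   # binding energy of the hydrogen ground state, in eV
HC_EV_NM = 1239.841984   # Planck's constant times the speed of light, in eV * nm


def level_energy_ev(n):
    """Energy of the n-th allowed level in the Bohr model of hydrogen."""
    if n < 1:
        raise ValueError("level number must be a positive integer")
    return -RYDBERG_EV / n ** 2


def emission_wavelength_nm(n_upper, n_lower):
    """Wavelength of the light emitted when an electron drops between levels."""
    if n_upper <= n_lower:
        raise ValueError("emission requires n_upper > n_lower")
    energy_released = level_energy_ev(n_upper) - level_energy_ev(n_lower)
    return HC_EV_NM / energy_released


# Jumps ending on the second level give the visible (Balmer) lines of hydrogen.
for n_upper in (3, 4, 5, 6):
    print(n_upper, "-> 2:", round(emission_wavelength_nm(n_upper, 2), 1), "nm")
```

Because only whole-number levels are allowed, only a discrete list of wavelengths comes out; the jump from the third level to the second, for instance, gives the red hydrogen line near 656 nm.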

But the quantization of energy raises as many questions as it answers. Among them: Why are only certain energies allowed? What prevents the electrons in an atom from losing energy continuously and spiraling in towards the nucleus, as classical physics predicts? In 1924 Louis de Broglie suggested that electrons are wave-like rather than particle-like, and that the reason only certain electron energies are allowed is that energy is a function of wavelength, and only certain wavelengths can fit without remainder in the electron orbit for a given energy. By 1926 Erwin Schrödinger had developed an equation governing the dynamical behavior of these matter waves, and quantum mechanics was born.
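De Broglie's fitting condition can be written out explicitly. What follows is a sketch in modern notation, assuming a circular orbit of radius r and the standard relation between wavelength and momentum p; the symbols r, p, and the integer k are introduced here for illustration.

```latex
k\lambda = 2\pi r, \qquad \lambda = \frac{h}{p}
\quad\Longrightarrow\quad
p\,r = k\,\frac{h}{2\pi}, \qquad k = 1, 2, 3, \ldots
```

For a given radius only a discrete set of momenta satisfies the condition, which is why only certain energies are allowed.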

This theory has been astonishingly successful. Within a year of Schrödinger’s formulation, Clinton Davisson and Lester Germer demonstrated that electrons exhibit interference effects just like light waves—that when electrons are bounced off the regularly-arranged atoms of a crystal, their waves reinforce each other in some directions and cancel out in others, leading to more electrons being detected in some directions than others. This success has continued. Quantum mechanics (in the form of quantum electrodynamics) correctly predicts the magnetic moment of the electron to an accuracy of about one part in a trillion, making it the most accurate theory in the history of science. And so far its predictive track record is perfect: no data contradicts it.

But on a descriptive and explanatory level, the theory of quantum mechanics is less than satisfactory. Typically when a new theory is introduced, its proponents are clear about the physical ontology presupposed—the kind of objects governed by the theory. Superficially, quantum mechanics is no different, since it governs the evolution of waves through space. But there are at least two reasons why taking these waves as genuine physical entities is problematic.

First, although in the case of electron interference the number of electrons arriving at a particular location can be explained in terms of the propagation of waves through the apparatus, each electron is detected as a particle with a precise location, not as a spread-out wave. As Max Born noticed in 1926, the intensity (squared amplitude) of the quantum wave at a location gives the probability that the particle is located there; this is the Born rule for assigning probabilities to measurement outcomes. The second reason to doubt the reality of quantum waves is that the quantum waves do not propagate through ordinary three-dimensional space, but through a space of 3n dimensions, where n is the number of particles in the system concerned. Hence it is not at all clear that the underlying ontology is genuinely of waves propagating through space. Indeed, the standard terminology is to call the quantum mechanical representation of the state of a system a wavefunction rather than a wave, perhaps indicating a lack of metaphysical commitment: the mathematical function that represents a system has the form of a wave, even if it does not actually represent a wave.
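The Born rule can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: space is reduced to a handful of discrete detector locations, and the function names and amplitudes are hypothetical.

```python
import random


def born_probabilities(amplitudes):
    """Convert wave amplitudes at each location into detection probabilities."""
    intensities = [abs(a) ** 2 for a in amplitudes]  # squared amplitude
    total = sum(intensities)
    return [i / total for i in intensities]


def simulate_detections(amplitudes, n_particles, seed=0):
    """Detect particles one at a time, each at a single definite location."""
    rng = random.Random(seed)
    probs = born_probabilities(amplitudes)
    counts = [0] * len(amplitudes)
    locations = range(len(amplitudes))
    for location in rng.choices(locations, weights=probs, k=n_particles):
        counts[location] += 1
    return counts


# Assumed amplitudes: the wave cancels out completely at the middle location.
amplitudes = [1.0, 0.0, 1.0j]
print(born_probabilities(amplitudes))
print(simulate_detections(amplitudes, 1000))
```

Each simulated particle arrives at exactly one location, yet the accumulated counts trace out the intensity pattern of the wave. That combination is what makes the status of the wave puzzling.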

So quantum mechanics is a phenomenally successful theory, but it is not at all clear what, if anything, it tells us about the underlying nature of the physical world. Quantum mechanics, perhaps uniquely among physical theories, stands in need of an interpretation to tell us what it means. Four kinds of interpretation are described in detail below (and some others more briefly). The first two—the Copenhagen interpretation and the many-worlds interpretation—take standard quantum mechanics as their starting point. The third and fourth—hidden variable theories and spontaneous collapse theories—start by modifying the theory of quantum mechanics, and hence are perhaps better described as proposals for replacing quantum mechanics with a closely related theory.

2. The Copenhagen Interpretation

The earliest consensus concerning the meaning of quantum mechanics formed around the work of Niels Bohr and Werner Heisenberg in Copenhagen during the 1920s, and hence became known as the Copenhagen interpretation. Bohr’s position is that our conception of the world is necessarily classical; we think of the world in terms of objects (for example, waves or particles) moving through three-dimensional space, and this is the only way we can think of it. Quantum mechanics doesn’t permit such a conceptualization, either in terms of waves or particles, and so the quantum world is in principle unknowable by us. Quantum mechanics shouldn’t be taken as a description of the quantum world, and neither should the evolution of the quantum state over time be taken as a causal explanation of the phenomena we observe. Rather, quantum mechanics is an extremely effective tool for predicting measurement results that takes the configuration of the measuring apparatus (described classically) as input, and produces probabilities for the possible measurement outcomes (described classically) as output.

It is sometimes claimed that the Copenhagen interpretation is a product of the logical positivism that flourished in Europe during the same period. The logical positivists held that the meaningful content of a scientific theory is exhausted by its empirical predictions; any further speculation into the nature of the world that produces these measurement outcomes is quite literally meaningless. This certainly has some resonances with the Copenhagen interpretation, particularly as described by Heisenberg. But Bohr’s views are importantly different from Heisenberg’s, and are more Kantian than positivist. Bohr is happy to say that the micro-world exists, and that it can’t be conceived of in causal terms, both of which would be meaningless claims according to positivist scruples. However, Bohr thinks we can say little else about the micro-world. Bohr, like Kant, thinks that we can only conceive of things in certain ways, and that the world as it is in itself is not amenable to such conceptualization. If this is correct, it is inevitable that our fundamental physical theories are unable to describe the world as it is, and the fact that we can make no sense of quantum mechanics as a description of the world should not concern us.

Unless one is convinced of Kant’s position concerning our conceptual access to the world, one may not find Bohr’s pronouncements concerning what we can conceive compelling. However, the motivation for adopting a Copenhagen-style interpretation can be made independent of any overarching philosophical position. Since the intensity of the wavefunction at a location gives the probability of the particle occupying that location, it is natural to regard the wavefunction as a reflection of our knowledge of the system rather than a description of the system itself. This view, held by Einstein, suggests that quantum mechanics is incomplete, since it gives us only an instrumental recipe for calculating the probabilities of outcomes, rather than a description of the underlying state of the system that gives rise to those probabilities. But it was later proved (as we shall see) that given certain plausible assumptions, it is impossible to construct such a description of the underlying state. Bohr did not know at the time that Einstein’s task was impossible, but its evident difficulty provides some motivation for regarding the quantum world as inscrutable.

However, the Copenhagen interpretation has at least two major drawbacks. First, a good deal of the early evidence for quantum mechanics comes from its ability to explain the results of interference experiments involving particles like electrons. Bohr’s insistence that quantum mechanics is not descriptive takes away this explanation (although, of course, viewing the wavefunction as descriptive only of our knowledge does no better). Second, Bohr’s position requires a “cut” between the macroscopic world described by classical concepts and the microscopic world subsumed under (but not described by) quantum mechanics. Since macroscopic objects are made out of microscopic components, it looks like macroscopic objects must obey the laws of quantum mechanics too; there can be no such “cut”, either sharp or vague, delimiting the realm of applicability of quantum mechanics.

3. The Many-Worlds Interpretation

In 1957 Hugh Everett proposed a radically new way of interpreting the quantum state. His proposal was to take quantum mechanics as descriptive and universal; the quantum state is a genuine description of the physical system concerned, and macroscopic systems are just as well described in this way as microscopic ones. This immediately solves both the above problems; there is no “cut” between the micro and macro worlds, and the explanation of particle interference in terms of waves is retained.

An immediate problem facing such a realist interpretation of the quantum state is the provenance of the outcomes of quantum measurements. Recall that in the case of electron interference, what is detected is not a spread-out wave, but a particle with a well-defined location, where the wavefunction intensity at a location gives the probability that the particle is located there.

How does Everett account for these facts? What he suggests is that we model the measurement process itself quantum mechanically. It is by no means uncontroversial that measuring devices and human observers admit of a quantum mechanical description, but given the assumption that quantum mechanics applies to all material objects, such a description ought to be available at least in principle. So consider for simplicity the situation in which the wavefunction intensity for the electron at the end of the experiment is non-zero in only two regions of space, A and B. The detectors at these locations can be modeled using a wavefunction too, with the result that the electron wavefunction component at A triggers a corresponding change in the wavefunction of the A-detector, and similarly at B. In the same way, we can model the experimenter who observes the detectors using a wavefunction, with the result that the change in the wavefunction of the A-detector causes a change in the wavefunction of the observer corresponding to seeing that the A-detector has fired, and the change in the wavefunction of the B-detector causes a change in the wavefunction of the observer corresponding to seeing that the B-detector has fired. The observer’s final state, then, is modeled by two distinct wave structures superposed, much in the way two images are superposed in a double-exposure photograph.

In sum, the wave structure of the electron-detector-observer system consists of two distinct branches, the A-outcome branch and the B-outcome branch. Since these two branches are relatively causally isolated from each other, we can describe them as two distinct worlds, in one of which the electron hits the detector at A and the observer sees the A-detector fire, and in the other of which the electron hits the detector at B and the observer sees the B-detector fire. This talk of worlds needs to be treated carefully, though; there is just one physical world, described by the quantum state, but because observers (along with all other physical objects) exhibit this branching structure, it is as if the world is constantly splitting into multiple copies. It is not clear whether Everett himself endorsed this talk of worlds, but this is the understanding of his work that has become canonical; call it the many-worlds interpretation.

According to the many-worlds interpretation, then, every physically possible outcome of a measurement actually occurs in some branch of the quantum state, but as an inhabitant of a particular branch of the state, a particular observer only sees one outcome. This explains why, in the electron interference experiment, the outcome looks like a discrete particle even though the object that passes through the interference device is a wave; each point in the wave generates its own branch of reality when it hits the detectors, so from within each of the resulting branches it looks like the incoming object was a particle.

The main advantage of the many-worlds interpretation is that it is a realist interpretation that takes the physics of standard quantum mechanics literally. It is often met with incredulity, since it entails that people (along with other objects) are constantly branching into innumerable copies, but this by itself is no argument against it. Still, the branching of people leads to philosophical difficulties concerning identity and probability, and these (particularly the latter) constitute genuine difficulties facing the approach.

The problem of identity is a philosophically familiar one: if a person splits into two copies, then the copies can’t be identical to (that is, the same person as) the original person, or else they would be identical to (the same person as) each other. Various solutions have been developed in the literature. One might follow Derek Parfit and bite the bullet here: what fission cases like this show is that strict identity is not a useful concept for describing the relationship between people and their successors. Or one might follow David Lewis and rescue strict identity by stipulating that a person is a four-dimensional history rather than a three-dimensional object. According to this picture, there are two people (two complete histories) present both before and after the fission event; they initially overlap but later diverge. Identity over time is preserved, since each of the pre-split people is identical with exactly one of the post-split people. Both of these positions have been proposed as potential solutions to the problem of personal identity in a many-worlds universe. A third solution that is sometimes mentioned is to stipulate that a person is the whole of the branching entity, so that the pre-split person is identical to both her successors, and (despite our initial intuition otherwise) the successors are identical to each other.

So the problem of identity admits of a number of possible solutions, and the only question is how one should try to decide between them. Indeed, one might argue that there is no need to decide between them, since the choice is a pragmatic one about the most useful language to use to describe branching persons.

The problem of probability, though, is potentially more serious. As noted above, quantum mechanics makes its predictions in the form of probabilities: the square of the wavefunction amplitude in a region tells us the probability of the particle being located there. The striking agreement of the observed distribution of outcomes with these probabilities is what underwrites our confidence in quantum mechanics. But according to the many-worlds interpretation, every outcome of a measurement actually occurs in some branch of reality, and the well-informed observer knows this. It is hard to see how to square this with the concept of probability; at first glance, it looks like every outcome has probability 1, both objectively and epistemically. In particular, if a measurement results in two branches, one with a large squared amplitude and one with a small squared amplitude, it is hard to see why we should regard the former as more probable than the latter. But unless we can do so, the empirical success of quantum mechanics evaporates.

It is worth noting, however, that the foundations of probability are poorly understood. When we roll two dice, the chance of rolling 7 is higher than the chance of rolling 12. But there is no consensus concerning the meaning of chance claims, or concerning why the higher chance of 7 should constrain our expectations or behavior. So perhaps a quantum branching world is in no worse shape than a classical linear world when it comes to understanding probability. We may not understand how squared wavefunction amplitude could function as chance in guiding our expectations, but perhaps that is no barrier to postulating that it does so function.

A more positive approach has been developed by David Deutsch and David Wallace, arguing that given some plausible constraints on rational behavior, rational individuals should behave as if squared wavefunction amplitudes are chances. If one combines this with a functionalist attitude towards chance—that whatever functions as chance in guiding behavior is chance—then this program promises to underwrite the contention that squared wave amplitudes are chances. However, the assumptions on which the Deutsch-Wallace argument is based can be challenged. In particular, they assume that it is irrational to care about branching per se: having two successors experiencing a given outcome is neither better nor worse than having one successor experiencing that outcome. But it is not clear that this is a matter of rationality any more than the question of whether having several happy children is better than having one happy child.

A further worry about the many-worlds theory that has been largely put to rest concerns the ontological status of the worlds. It has been argued that the postulation of many worlds is ontologically profligate. However, the current consensus is that worlds are emergent entities just like tables and chairs, and talk of worlds is just a convenient way of talking about the features of the quantum state. On this view, the many-worlds interpretation involves no entities over and above those represented by the quantum state, and as such is ontologically parsimonious. There remains the residual worry that the number of branches depends sensitively on mathematical choices about how to represent the quantum state. Wallace, however, embraces this indeterminacy, arguing that even though the many-worlds universe is a branching one, there is no well-defined number of branches that it has. If tenable, this goes some way towards resolving the above concern about the rationality of caring about branching per se: if there is no number of branches, then it is irrational to care about it.

4. Hidden Variable Theories

The many-worlds interpretation would have us believe that we are mistaken when we think that a quantum measurement results in a unique outcome; in fact such a measurement results in multiple outcomes occurring on multiple branches of reality. But perhaps that is too much to swallow, or perhaps the problems concerning identity and probability mentioned above are insuperable. In that case, one is led to the conclusion that quantum mechanics is incomplete, since there is nothing in the quantum state that picks out one of the many possible measurement results as the single actual measurement result. As mentioned above, this was Einstein’s view. If this view is correct, then quantum mechanics stands in need of completion via the addition of extra variables describing the actual state of the world. These additional variables are commonly known as hidden variables.

However, a theorem proved by John Bell in 1964 shows that, subject to certain plausible assumptions, no such hidden-variable completion of quantum mechanics is possible. One version of the proof concerns the properties of a pair of particles. Each particle has a property called spin: when the spin of the particle is measured in some direction, one either gets the result up or down. Suppose that the spin of each particle can be measured along one of three directions 120° apart. What quantum mechanics predicts is that if the spins of the particles are measured along the same direction, they always agree (both up or both down), but if they are measured along different directions they agree 25% of the time and disagree 75% of the time. According to the hidden variable approach, the particles have determinate spin values for each of the three measurement directions prior to measurement. The question is how to ascribe spin values to particles to reproduce the predictions of quantum mechanics. And what Bell proved is that there is no way to do this; the task is impossible.
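The impossibility can be checked by brute force. The sketch below assumes, as the argument does, that each particle carries a definite spin value for each of the three directions, and that the two particles carry the same values, since they always agree when measured along the same direction. It also assumes that the values do not depend on which directions are chosen or on the distant measurement. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product

DIRECTIONS = range(3)


def agreement_rate(assignment):
    """Fraction of different-direction measurement pairs on which the two
    particles agree, when both carry the same spin values."""
    pairs = [(i, j) for i in DIRECTIONS for j in DIRECTIONS if i != j]
    agreements = sum(assignment[i] == assignment[j] for i, j in pairs)
    return Fraction(agreements, len(pairs))


def minimum_agreement():
    """Lowest agreement rate over all eight possible value assignments."""
    assignments = product(("up", "down"), repeat=3)
    return min(agreement_rate(a) for a in assignments)


print(minimum_agreement())
```

Every assignment yields agreement on at least one third of the different-direction pairs, so any statistical mixture of assignments does too. The quantum mechanical prediction of 25% agreement lies below this bound, and so cannot be reproduced in this way.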

Many physicists concluded on the basis of Bell’s theorem that no hidden-variable completion of quantum mechanics is possible. However, this was not Bell’s conclusion. Bell concluded instead that one of the assumptions he relied on in his proof must be false. First, Bell assumed locality—that the result of a measurement performed on one particle cannot influence the properties of the other particle. This seems secure because the measurements on the two particles can be widely separated, so that a signal carrying such an influence would have to travel faster than light. Second, Bell assumed independence—that the properties of the particles are independent of which measurements will be performed on them. This assumption too seems secure, because the choice of measurement can be made using a randomizing device or the free will of the experimenter.

Despite the apparent security of his assumptions, Bell knew when he proved his theorem that a hidden-variable completion of quantum mechanics had been explicitly constructed by David Bohm in 1952. Bohm assumed that in addition to the wave described by the quantum state, there is also a set of particles whose positions are given by the hidden variables. The wave pushes the particles around according to a new dynamical law formulated by Bohm, and the law is such that if the particle positions are initially statistically distributed according to the squared amplitude of the wave, then they are always distributed in this way. In an electron interference experiment, then, the existence of the wave explains the interference effect, the existence of the particles explains why each electron is observed at a precise location, and the new Bohmian law explains why the probability of observing an electron at a given location is given by the squared amplitude of the wave. As Bell often pointed out, to call Bohm’s theory a hidden variable theory is something of a misnomer, since it is the values of the hidden variables—the positions of the particles—that are directly observed on measurement. Nevertheless, the name has stuck.

Bohm’s theory, then, provides a concrete example of a hidden variable theory of quantum mechanics. However, it is not a counterexample to Bell’s theorem, because it violates Bell’s locality assumption. The new law introduced by Bohm is explicitly non-local: the motion of each particle is determined in part by the positions of all the other particles at that instant. In the case of Bell’s spin experiment, a measurement on one particle instantaneously affects the motion of the other particle, even if the particles are widely separated. This is a prima facie violation of special relativity, since according to special relativity simultaneity is dependent on one’s choice of coordinates, making it impossible to define “instantaneous” in any objective way. However, this does not mean that Bohm’s theory is immediately refuted by special relativity, since one can instead take Bohm’s theory to show the need to add a universal standard of simultaneity to special relativity. Bell recognized this possibility. It is worth noting that even though Bohm’s theory requires instantaneous action at a distance, it also prevents these influences from being controlled so as to send a signal; there is no “Bell telephone”.

Bohm chooses positions as the properties described by the hidden variables of his theory. His reason for this is that it is plausible that it is the positions of things that we directly observe, and hence completing quantum mechanics via positions suffices to ensure that measurements have unique outcomes. But it is possible to construct measurements in which the outcome is recorded in some property other than position. As a response to this possibility, one might suggest adding hidden variables describing every property of the particles simultaneously, rather than just their positions. However, a theorem proved by Kochen and Specker in 1967 shows that no such theory can reproduce the predictions of quantum mechanics. A second response is to stick with Bohm’s theory as it is, and argue that while such measurements may initially lack a unique outcome, they will rapidly acquire a unique outcome as the recording device becomes correlated with the positions of the surrounding objects in the environment.

A final way to accommodate such measurements within a hidden variable theory is to make it a contingent matter which properties of a system are ascribed determinate values at a particular time. That is, rather than supplementing the wavefunction with variables describing a fixed property (the positions of things), one can let the wavefunction state itself determine which properties of the system are described by the hidden variables at that time. The idea is that the algorithm for ascribing hidden variables to a system is such that whenever a measurement is performed, the algorithm ascribes a determinate value to the property recording the outcome of the measurement. Such theories are known as modal theories. But while Bohm’s theory provides an explicit dynamical law describing the motion of the particles over time, modal theories generally do not provide a dynamical law governing their hidden variables, and this is regarded as a weakness of the approach.

Modal theories, like Bohm’s theory, evade Bell’s theorem by violating Bell’s locality assumption. In the modal case, the rule for deciding which properties of the system are made determinate depends on the complete wavefunction state at a particular instant, and this allows a measurement on one particle to affect the properties ascribed to another particle, however distant. As mentioned above, one can solve this problem by supplementing special relativity with a preferred standard of simultaneity. But this is widely regarded as an ad hoc and unwarranted addition to an otherwise elegant and well-confirmed physical theory. Indeed, the same charge is often levelled at the hidden variables themselves; they are an ad hoc and unwarranted addition to quantum mechanics. If hidden variable theories turn out to be the only viable interpretations of quantum mechanics, though, the force of this charge is reduced considerably.

Nevertheless, it may be possible to construct a hidden variable theory that does not violate locality. In order to evade Bell’s theorem, then, it will have to violate the independence assumption—the assumption that the properties of the particles are independent of which measurements will be performed on them. Since one can choose the measurements however one likes, it is initially hard to see how this assumption could be violated. But there are a couple of ways it might be done. First, one could simply accept that there are brute, uncaused correlations in the world. There is no causal link (in either direction) between my choice of which measurement to perform on a (currently distant) particle and its properties, but nevertheless there is a correlation between them. This approach requires giving up on the common cause principle—the principle that a correlation between two events indicates either that one causes the other or that they share a cause. However, there is little consensus concerning this principle anyway.

A second approach is to postulate a common cause for the correlation—a past event that causally influences both the choice of measurement and the properties of the particle. But absent some massive unseen conspiracy on the part of the universe, one can frequently ensure that there is no common cause in the past by isolating the measuring device from external influences. However, the measuring device and the particle to be measured will certainly interact in the future, namely when the measurement occurs. It has been proposed that this future event can constitute the causal link explaining the correlation between the particle properties and the measurements to be performed on them. This requires that later events can cause earlier events—that causation can operate backwards in time as well as forwards in time. For this reason, the approach is known as the retrocausal approach.

The retrocausal approach allows correlations between distant events to be explained without instantaneous action at a distance, since a combination of ordinary causal links and retrocausal links can amount to a causal chain that carries an influence between simultaneous distant events. No absolute standard of simultaneity is required by such explanations, and hence retrocausal hidden variable theories are more easily reconciled with special relativity than non-local hidden variable theories.

Bohm’s theory operates with a two-element ontology—a wave steering a set of particles. Retrocausal theories vary in their ontological presuppositions. Some—retrocausal Bohmian theories—incorporate two waves steering a set of particles; one wave carries the “forward-causal” influences on the particles from the initial state of the system, and the other carries the “backward-causal” influences on the particles from the final state of the system. But it may be possible to make do with the particles alone, with the wavefunction representing our knowledge of the particle positions rather than the state of a real object. The idea is that the interaction between the causal influences on the particles from the past and from the future can explain all the quantum phenomena we observe, including interference. However, at present this is just a promising research program; no explicit dynamical laws for such a theory have been formulated.

5. Spontaneous Collapse Theories

Hidden variable theories attempt to complete quantum mechanics by positing extra ontology in addition to (or perhaps instead of) the wavefunction. Spontaneous collapse theories, on the other hand, (at least initially) take the wavefunction to be a complete representation of the state of a system, and posit instead that the dynamical law of standard quantum mechanics—the Schrödinger equation—is not exactly right. The Schrödinger equation is linear; this means that if initial state A leads to final state A’ and initial state B leads to final state B’, then initial state A + B leads to final state A’ + B’. For example, if a measuring device fed a spin-up particle leads to a spin-up reading, and a measuring device fed a spin-down particle leads to a spin-down reading, then a measuring device fed a particle whose state is a sum of spin-up and spin-down states will end up in a state which is a sum of reading spin-up and reading spin-down. This is the multiplicity of measurement outcomes embraced by the many-worlds interpretation.
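The consequence of linearity for measurement can be made concrete with a toy model. In this minimal sketch a joint state of particle and device is a dictionary from (spin, device reading) pairs to amplitudes; the representation and all names are assumptions made for illustration, not a piece of standard formalism.

```python
def measure(state):
    """Apply the measurement interaction term by term, as linearity requires."""
    result = {}
    for (spin, device), amplitude in state.items():
        if device != "ready":
            raise ValueError("the device must start in its ready state")
        # A spin-up particle yields a spin-up reading, and likewise for down.
        outcome = (spin, "reads-" + spin)
        result[outcome] = result.get(outcome, 0.0) + amplitude
    return result


# A definite input gives a definite reading.
print(measure({("up", "ready"): 1.0}))

# A sum of spin-up and spin-down gives a sum of both readings.
print(measure({("up", "ready"): 0.6, ("down", "ready"): 0.8}))
```

Nothing in the linear dynamics selects one of the two readings; both terms survive in the final state with their original amplitudes.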

To avoid sums of distinct measurement outcomes, one needs to modify the basic dynamical equation of quantum mechanics so that it is non-linear. The first proposal along these lines was made by Gian Carlo Ghirardi, Alberto Rimini, and Tullio Weber in 1986; it has become known as the GRW theory. The GRW theory adds an irreducibly probabilistic “collapse” term to the otherwise deterministic Schrödinger dynamics. In particular, for each particle in a system there is a small chance per unit time of the wavefunction undergoing a process in which it is instantly and discontinuously localized in the coordinates of that particle. The localization process multiplies the wave state by a narrow Gaussian (bell curve), so that if the wave was initially spread out in the coordinates of the particle in question, it ends up concentrated around a particular point. The point on which this collapse process is centered is random, with a probability distribution given by the square of the pre-collapse wave amplitude (averaged over the Gaussian collapse curve).
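A single localization event can be sketched numerically. The example below is a simplification under stated assumptions: one particle in one dimension, positions restricted to a coarse grid, collapse centers restricted to grid points, and an arbitrary width for the Gaussian. All names are hypothetical.

```python
import math
import random


def gaussian(x, center, width):
    """Unnormalized Gaussian factor applied to the amplitude at position x."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def collapse(positions, psi, width, rng):
    """Perform one collapse: pick a random center, then localize around it."""
    # Weight of each candidate center: the squared pre-collapse amplitude,
    # averaged over the Gaussian collapse curve placed at that center.
    weights = [
        sum(abs(gaussian(x, c, width) * a) ** 2 for x, a in zip(positions, psi))
        for c in positions
    ]
    center = rng.choices(positions, weights=weights, k=1)[0]
    # Multiply the wave by the Gaussian and restore unit total probability.
    localized = [gaussian(x, center, width) * a for x, a in zip(positions, psi)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in localized))
    return center, [a / norm for a in localized]


# A wave with equal components at two well-separated locations.
positions = list(range(-10, 11))
psi = [2 ** -0.5 if x in (-5, 5) else 0.0 for x in positions]
center, post = collapse(positions, psi, width=1.0, rng=random.Random(0))
print(center, [round(abs(a) ** 2, 3) for a in post])
```

After the collapse almost all of the squared amplitude sits in one of the two components, with each component favored about half the time over many runs.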

The way this works is as follows. The collapse rate for a single particle is very low—about one collapse per hundred million years. So for individual particles (and systems consisting of small numbers of individual particles), we should expect that they obey the Schrödinger equation. And this is exactly what we observe; there are no known exceptions to the Schrödinger equation at the microscopic level. But macroscopic objects contain on the order of a trillion trillion particles, so we should expect about ten million collapses per second for such an object. Furthermore, in solid objects the positions of those particles are strongly correlated with each other, so a collapse in the coordinates of any particle in the object has the effect of localizing the wavefunction in the coordinates of every particle in the object. This means that if the wavefunction of a macroscopic object is spread over a number of distinct locations, it very quickly collapses to a state in which its wavefunction is highly localized around one location.
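The scaling from one particle to a macroscopic object is simple arithmetic, sketched below. The function name and the particular particle counts are illustrative assumptions, and the result is sensitive to the round figures fed in.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def collapses_per_second(n_particles, years_per_collapse):
    """Expected collapse rate for a system whose particles each collapse
    independently about once every years_per_collapse years."""
    per_particle_rate = 1.0 / (years_per_collapse * SECONDS_PER_YEAR)
    return n_particles * per_particle_rate


# A single particle: one collapse per hundred million years.
print(collapses_per_second(1, 1e8))

# Macroscopic objects, for two assumed particle counts.
print(collapses_per_second(1e23, 1e8))
print(collapses_per_second(1e24, 1e8))
```

With these inputs the macroscopic rate comes out somewhere between tens and hundreds of millions of collapses per second, depending on the particle count assumed. The exact figure matters less than its scale: the expected wait for the first collapse in a macroscopic object is far shorter than a microsecond.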

In the case of electron interference, then, each electron passes through the apparatus in the form of a spread-out wave. The collapse process is vanishingly unlikely to affect this wave, which is important, as its spread-out nature is essential to the explanation of interference: wave components traveling distinct paths must be able to come together and either reinforce each other or cancel each other out. But when the electron is detected, its position is indicated by something we can directly observe, for example, by the location of a macroscopic pointer. To measure the location of the electron, then, the position of the pointer must become correlated with the position of the electron. Since the wave representing the electron is spread out, the wave representing the pointer will initially be spread out too. But within a fraction of a second, the spontaneous collapse process will localize the pointer (and the electron) to a well-defined position, producing the unique measurement outcome we observe.

The spontaneous collapse approach is related to earlier proposals (for example, by John von Neumann) that the measurement process itself causes the collapse that reduces the multitude of pre-measurement wave branches to the single observed outcome. However, unlike previous proposals, it provides a physical mechanism for the collapse process in the form of a deviation from the standard Schrödinger dynamics. This mechanism is crucial; without it, as we have seen, there is no way for the measurement process to generate a unique outcome.

Note that, unlike in Bohm’s theory, there are no particles at the fundamental level in the GRW theory. In the electron interference case, particle behavior emerges during measurement; the measured system exhibits only wave-like behavior prior to measurement. Strictly speaking, to say that a system contains n particles is just to say that its wave representation has 3n dimensions, and to single out one of those particles is really just to focus attention on the form of the wave in three of those dimensions.

An immediate difficulty that faces the GRW theory is that the localization of the wave induced by collapse is not perfect. The collapse process multiplies the wave by a Gaussian, a function which is strongly peaked around its center but which is non-zero everywhere. No part of the pre-collapse wavefunction is driven to zero by this process; if the wavefunction represents a set of possible measurement results, the wave component corresponding to one result becomes large and the wave components corresponding to the others become small, but they do not disappear. Since one motivation for adopting a spontaneous collapse theory is the perceived failure of the many-worlds interpretation to recover probability claims, it cannot be argued that the small terms are intrinsically improbable. Instead, it looks like the GRW spontaneous collapse process fails to ensure that measurements have unique outcomes.
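The problem of the residual “tails” can be made explicit in a simplified one-dimensional sketch. The notation is introduced only for this illustration and is not drawn from the GRW papers: the wave is a sum of two components, \phi_1 concentrated near x_1 and \phi_2 concentrated near x_2, separated by a distance d = |x_2 - x_1|, and the collapse Gaussian has width a and happens to be centered on x_1.

```latex
\psi(x) = c_1\,\phi_1(x) + c_2\,\phi_2(x)
\;\longrightarrow\;
\psi'(x) \propto e^{-(x - x_1)^2 / 2a^2}\,\bigl[c_1\,\phi_1(x) + c_2\,\phi_2(x)\bigr],
\qquad
e^{-(x - x_1)^2 / 2a^2}\Big|_{x = x_2} = e^{-d^2 / 2a^2} > 0 .
```

When d is much larger than a, the factor multiplying the second component is extremely small, but it is never exactly zero. The second component is therefore suppressed rather than eliminated.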

A second difficulty with the GRW theory is that the wavefunction is not an object in a three-dimensional space, but an object occupying a high-dimensional space with three dimensions for each “particle” in the system concerned. David Albert has argued that this makes the three-dimensional world of experience illusory.

A third difficulty with the GRW theory is that the collapse process acts instantaneously on spatially separated parts of the system; it instantly multiplies the wavefunction everywhere by a Gaussian. Like Bohm’s theory, the GRW theory violates Bell’s locality assumption, since a measurement performed on one particle can instantaneously affect the state of a distant particle (although in the case of the GRW theory talk of “particles” has to be cashed out in terms of the coordinates of the wavefunction). As discussed in relation to Bohm’s theory, this requires an objective conception of simultaneity that is absent from special relativity, and hence it is hard to see how to reconcile the GRW theory with relativity.

One way of responding to these difficulties, advocated by Ghirardi, is to postulate a three-dimensional mass distribution in addition to and determined by the wavefunction, such that our experience is determined directly by the mass distribution rather than the wavefunction. This responds to the second difficulty, since the mass distribution that we directly experience is three-dimensional, and hence our experience of a three-dimensional world is veridical. It may also go some way towards resolving the first difficulty, since the mass density corresponding to non-actual measurement outcomes is likely to be negligible relative to the background mass density surrounding the actual measurement outcome (the mass density of air, for example). Ghirardi’s mass density is not intended to address the third difficulty; this requires modifying the collapse process itself, and several proposals for constructing a relativistic collapse process based on the GRW theory have been developed.

An alternative approach to the difficulties facing the GRW theory is to adapt a suggestion made by John Bell that the center of each collapse event should be regarded as a “flash of determinacy” out of which everyday objects and everyday experience are built. Roderich Tumulka has developed this suggestion into a “flashy” spontaneous collapse theory, in which the wavefunction is regarded instrumentally as that which connects the distribution of flashes at one time with the probability distribution of flashes at a later time. On this proposal, the small wave terms corresponding to non-actual measurement outcomes can be understood in a straightforwardly probabilistic way: there is only a small chance that a flash will be associated with such a term, and so only a small chance that the non-actual measurement outcome will be realized. The flashes are located in three-dimensional space, so there is no worry that three-dimensionality is an illusion. And since the flashes, unlike the wavefunction, are located at space-time points, it is easier to envision a reconciliation between the flashy theory and special relativity.

6. Other Interpretations

There are several other interpretations of quantum mechanics available that don’t fit neatly into one of the categories discussed above. Here are some prominent ones.

The consistent histories (or decoherent histories) interpretation developed by Robert Griffiths, Murray Gell-Mann and James Hartle, and defended by Roland Omnès, is mathematically something of a hybrid between collapse theories and hidden variable theories. Like spontaneous collapse theories, the consistent histories approach incorporates successive localizations of the wavefunction. But unlike spontaneous collapse theories, these localizations are not regarded as physical events, but just as a means of picking out a particular history of the system in question as actual, much as hidden variables pick out a particular history as actual. If the localizations all constrain the position of a particle, then the history picked out resembles a Bohmian trajectory. But the consistent histories approach also allows localizations to constrain properties other than position, resulting in a more general class of possible histories.

However, not all such sets of histories can be ascribed consistent probabilities: notably, interference effects often prevent the assignment of probabilities obeying the standard axioms to histories. For systems that interact strongly with their environment, though, interference effects are rapidly suppressed; this phenomenon is called decoherence. Decoherent histories can be ascribed consistent probabilities, hence the two alternative names of this approach. It is assumed that only consistent sets of histories can describe the world, but other than this consistency requirement, there is no restriction on the kinds of histories that are allowed. Indeed, Griffiths maintains that there is no unique set of possible histories: there are many ways of constructing sets of possible histories, where one among each set is actual, even if the alternative actualities so produced describe the world in mutually incompatible ways. Absent a many worlds ontology, however, some have worried about how such a plurality of true descriptions of the world could be coherent. Gell-Mann and Hartle respond to such concerns by arguing that organisms evolve to exploit the relative predictability of one among the competing sets of histories.

The transactional interpretation, initially developed by John Cramer, also incorporates elements of both collapse and hidden variable approaches. It starts from the observation that some versions of the dynamical equation of quantum mechanics admit wave-like solutions traveling backward in time as well as forward in time. Typically the former solutions are ignored, but the transactional interpretation retains them. Just as in retrocausal hidden variable theories, the backward-travelling waves can transmit information about the measurements to be performed on a system, and hence allow the transactional interpretation to evade the conclusion of Bell’s theorem.

The transactional interpretation posits rules according to which the backward and forward waves generate “transactions” between preparation events and measurement events, and one of these transactions is taken to represent the actual history of the system in question, where probabilities are assigned to transactions via a version of the Born rule. The formation of a transaction is somewhat reminiscent of the spontaneous collapse of the wavefunction, but due to the retrocausal nature of the theory, one might conclude that the wavefunction never exists in a pre-collapse form, since the completed transaction exists as a timeless element in the history of the universe. Hence some have questioned the extent to which the story involving forwards and backwards waves constitutes a genuine explanation of transaction formation, raising questions about the tenability of the transactional interpretation as a description of the quantum world. Ruth Kastner responds to these challenges by developing a possibilist transactional interpretation, embedding the transactional interpretation in a dynamic picture of time in which multiple future possibilities evolve to a single present actuality.

Relational interpretations, such as those developed by David Mermin and by Carlo Rovelli, take quantum mechanics to be about the relations between systems rather than the properties of the individual systems themselves. According to such an interpretation, there is no need to assign properties to individual particles to explain the correlations exhibited by Bell’s experiment, and hence one can evade Bell’s theorem without violating either locality or independence. Superficially, this approach resembles Everett’s, according to which systems have properties only relative to a given branch of the wavefunction. But whereas Everettians typically say that a relation such as an observer seeing a particular measurement result holds on the basis of the properties of the observer and of the measured system within a branch, Mermin denies that there are such relata; rather, the relation itself is fundamental. Hence this is not a many worlds interpretation, since world-relative properties provide the relata that relational interpretations deny. Without such relata, though, it is hard to understand relational quantum mechanics as a description of a single world either. However, citing analogies with spatiotemporal properties in relativistic theories, Rovelli insists that it is enough that quantum mechanics ascribe properties to a system relative to the state of a second system (for example, an observer).

Informational interpretations, such as those developed by Jeffrey Bub and by Carlton Caves, Christopher Fuchs and Rüdiger Schack, interpret quantum mechanics as describing constraints on our degrees of belief. They develop rules of quantum credence by analogy with the rules of classical information theory, expressing the difference between quantum systems and classical systems in informational terms, for example in terms of an unavoidable loss of information associated with a quantum measurement. Some proponents of an informational interpretation take an explicitly instrumentalist stance: quantum mechanics is just about the beliefs of observers, treated as external to the quantum systems under consideration. Others take their informational interpretation to be a realist one, in the sense that it can in principle be applied to the whole universe, with “information” serving as a new physical primitive. However, the adequacy of the informational approach as realist can be challenged, for example, on the basis that it does not provide a dynamics for the evolution of the actual state of the world over time. Bub responds that an account of the information-theoretic properties of our measurement results may be the deepest explanation we can hope for.

7. Choosing an Interpretation

Setting aside interpretations such as Copenhagen that eschew describing the quantum world, the interpretations discussed above present us with a number of very different ontological pictures. The many-worlds interpretation tells us that the underlying nature of physical objects is wave-like and branching. Bohm’s theory adds particles to this wave, and some hidden variable theories attempt to do away with the wave as a physical entity. The GRW theory, like the many-worlds interpretation, takes waves as fundamental, but rejects the many-worlds picture of a branching universe. Other spontaneous collapse theories add a mass density distribution to the wave, or replace the wave with point-like flashes. The GRW theory is indeterministic, casting quantum mechanical probabilities as genuine objective chances appearing in the fundamental physical laws. Bohm’s theory is deterministic, since the physical laws involve no chances, making quantum probabilities merely epistemic. The many-worlds interpretation involves no objective chances in the laws, but nevertheless (if successful) casts quantum mechanical probabilities as objective chances grounded in the branching process.

It seems, then, that we have a classic case of underdetermination: while the experimental data strongly confirm quantum mechanics, it is unclear whether those data confirm the metaphysical picture of many-worlds, Bohm, GRW or some other alternative. Since it has been doubted that underdetermination is ever actually manifested in the history of science, this is a striking example.

Nevertheless, the nature and even the existence of this underdetermination can be contested. It is worth noting that spontaneous collapse theories differ in their empirical predictions from standard quantum mechanics; the collapse process destroys interference effects, and the larger the object the more quickly one expects these effects to be detectable. At present, the differences between spontaneous collapse theories and standard quantum mechanics are beyond the reach of feasible experiments, since small objects cannot be kept isolated for long enough, and large objects cannot be kept isolated at all. Even so, the empirical underdetermination between spontaneous collapse theories and the other interpretations is not a matter of principle, and may be resolved in favor of one side or the other at some point.

The underdetermination between hidden variable theories and the many-worlds interpretation is of a different character. These two interpretations are empirically equivalent, and hence no experimental evidence could decide between them. It seems that here we have a case of underdetermination in principle. One could try to decide between them on the basis of non-empirical theoretical virtues like simplicity and elegance. On measures like this, the many-worlds interpretation would surely win, since hidden variable theories begin with the mathematical formalism of the many-worlds interpretation and add complicated and arguably ad hoc extra theoretical structure. But judging theories on the basis of non-empirical virtues is a controversial endeavor, particularly if we take the winner to be a guide to the metaphysical nature of the world.

Alternatively, it is not unreasonable to think that either the many-worlds interpretation or hidden variable theories could prove to be untenable. As noted above, it is unclear whether the many-worlds interpretation can account for the truth of probability claims, and if it cannot, then it fails to make contact with the empirical evidence. On the other hand, it is unclear whether any hidden variable theory can be made consistent with special relativity (and generalized to cover quantum field theory), and if not, then the hidden variable approach is arguably inadequate.

Some have argued that there is no underdetermination in the interpretation of quantum mechanics, since the many-worlds interpretation alone follows directly from a literal reading of the standard theory of quantum mechanics. It is true that both hidden variable theories and spontaneous collapse theories supplement or modify standard quantum mechanics, so perhaps only the many-worlds interpretation qualifies as an interpretation of standard quantum mechanics rather than a closely related theory. The many-worlds interpretation may be the only reasonable interpretation of quantum mechanics as it stands, and there may be good methodological reasons against modifying successful scientific theories. However, given the possibility that quantum mechanics according to the many-worlds interpretation is not in fact a successful scientific theory (because of the probability problem), it seems reasonable to consider modifications to the standard theory.

Nevertheless, it is certainly true that there may be no underdetermination in quantum mechanics, since it is possible that only one of the interpretations described here will prove to be tenable. Indeed, it is possible that none of these interpretations will prove to be tenable, since all of them face unresolved difficulties. Hence the interpretation of quantum mechanics is still very much an open question.

8. References and Further Reading

  • Albert, David Z. Quantum mechanics and experience. Harvard University Press, 1992.
    • Non-technical overview of the various interpretations of quantum mechanics and their problems.
  • Bell, John Stewart. Speakable and unspeakable in quantum mechanics: Collected papers on quantum philosophy. Cambridge University Press, 2004.
    • A mix of technical and non-technical papers, including the original 1964 proof of Bell’s theorem and discussions of various interpretations of quantum mechanics, especially hidden variable theories.
  • Bohm, David. Quantum theory. Prentice-Hall, 1951.
    • Classic quantum mechanics textbook, with early chapters covering the historical development of the theory.
  • Bohm, David, and Basil J. Hiley. The undivided universe: An ontological interpretation of quantum theory. Routledge, 1993.
    • A guide to Bohm’s theory and its implications by its originator. Technical in parts.
  • Bub, Jeffrey. Bananaworld: Quantum mechanics for primates. Oxford University Press, 2016.
    • Accessible introduction to the phenomena of entanglement, and an extended argument for an informational interpretation of quantum mechanics.
  • Cushing, James T. Quantum mechanics: Historical contingency and the Copenhagen hegemony. University of Chicago Press, 1994.
    • A comparison of the Copenhagen interpretation and Bohm’s theory, and a defense of the view that the former became canonical largely for social reasons.
  • Greaves, Hilary. “Probability in the Everett interpretation.” Philosophy Compass 2.1 (2007): 109-128.
    • Non-technical overview of the attempts to find a place for probability within Everett’s branching universe.
  • Kastner, Ruth. The transactional interpretation of quantum mechanics: The reality of possibility. Cambridge University Press, 2013.
    • Non-technical introduction to the transactional interpretation, and development of a “possibilist” version as a response to objections.
  • Maudlin, Tim. Quantum non-locality and relativity. Blackwell, 1994.
    • Non-technical guide to the problems of reconciling quantum mechanics with relativity.
  • Mermin, N. David. “Quantum mysteries for anyone.” The Journal of Philosophy 78 (1981): 397-408.
    • Non-technical exposition of Bell’s theorem and discussion of its implications.
  • Ney, Alyssa, and David Z. Albert, eds. The wavefunction: Essays on the metaphysics of quantum mechanics. Oxford University Press, 2013.
    • Essays on the ontological status of the wavefunction, including the issue of whether realism about the wavefunction makes the three-dimensional world of experience illusory.
  • Omnès, Roland. Understanding quantum mechanics. Princeton University Press, 1999.
    • Accessible (but in parts moderately technical) defense of the consistent histories approach.
  • Price, Huw. Time’s arrow & Archimedes’ point: New directions for the physics of time. Oxford University Press, 1997.
    • An extended, non-technical defense of the retrocausal hidden variable interpretation of quantum mechanics.
  • Rovelli, Carlo. “Relational quantum mechanics.” International Journal of Theoretical Physics 35 (1996): 1637-1678.
    • Exposition and defense of relational quantum mechanics. Moderately technical in parts.
  • Saunders, Simon, Jonathan Barrett, Adrian Kent, and David Wallace, eds. Many Worlds?: Everett, Quantum Theory, & Reality. Oxford University Press, 2010.
    • A collection of essays on the many-worlds interpretation, for and against, technical and non-technical. Includes an essay by Peter Byrne on the history of Everett’s interpretation.
  • Wallace, David. The emergent multiverse: Quantum theory according to the Everett interpretation. Oxford University Press, 2012.
    • An exposition and defense of the many-worlds interpretation, focusing especially on the issue of probability. Technical in parts.

 

Author Information

Peter J. Lewis
Email: plewis@miami.edu
University of Miami
U. S. A.

Human Dignity

The mercurial concept of human dignity features in ethical, legal, and political discourse as a foundational commitment to human value or human status. The source of that value and the nature of that status are both contested. The normative implications of the concept are also contested, and there are two partially, or even wholly, different deontic conceptions of human dignity implying virtue-based obligations on the one hand, and justice-based rights and principles on the other. Added to this, the different practical and philosophical presuppositions of law, ethics, and politics mean that definitive adjudication between different meanings is frustrated by disciplinary incommensurabilities.

What follows is an analysis of human dignity’s uses in law, ethics, and politics, and a critical description of the functions and tensions generated by human dignity within these fields. Crucial conceptual and methodological questions arise from the outset regarding whether human dignity can be reconstructed as one concept or must be treated as several concepts. It is argued here that a focal concept of human dignity can be reconstructed and that this concept provides the most illuminating perspective from which to view human dignity’s range of conceptions and uses.

Table of Contents

  1. Introduction
  2. Conceptual Background
  3. Themes
    1. Law
    2. Ethics
      1. General
      2. Philosophical Anthropology
      3. Tensions
    3. Politics
  4. Conceptual Analysis
    1. The Conceptual Features of Human Dignity
    2. The Credibility of an Interstitial Concept
    3. The Implications of an Interstitial Concept
  5. Conclusion
  6. References and Further Reading

1. Introduction

There are a number of competing conceptions of human dignity taking their meaning from the cosmological, anthropological, or political context in which human dignity is used. Human dignity can denote the special elevation of the human species, the special potentiality associated with rational humanity, or the basic entitlements of each individual.  There are, by extension, dramatically different normative uses to which the concept can be put. It is connected, variously, to ideas of sanctity, autonomy, personhood, flourishing, and self-respect, and human dignity produces, at different times, strict prohibitions and empowerment of the individual. It can also, potentially, be used to express the core commitments of liberal political philosophy as well as precisely those duty-based obligations to self and others that communitarian philosophers consider to be systematically neglected by liberal political philosophy.

As a consequence of these antagonistic currents of thought, philosophical analysis of human dignity cannot be separated from wider debates in moral, political, and legal philosophy. Nor can a certain level of selective reconstruction be avoided. The genealogy of the concept has been traced, tendentiously, through the whole history of Western, and sometimes non-Western, philosophical thought; such genealogies are not always illuminating at a conceptual level. More specifically, it is a desideratum of philosophical analysis of human dignity that the concept can be shown to have sufficient clarity to make a useful contribution to modern philosophical debate. This article therefore locates human dignity within a range of debates and suggests—using one important reconstruction of the concept—that human dignity represents a claim about human status that is intended to have a unifying effect on our ethical, legal and political practices.

We begin with an extended methodological and conceptual exploration, asking what should be taken as primary in examining human dignity. Because there is a particularly close relationship between contemporary uses of human dignity, international law, and human rights, this connection is treated as focal without assuming that it is definitive of the concept (for related but alternative starting points see Debes 2009; Waldron 2013; Donnelly 2015).

2. Conceptual Background

The use of human dignity in public international law is a marker for understanding the moral, legal and political discourse of human dignity. A characteristic expression is found in the Preamble of the International Covenant on Civil and Political Rights (1966) whose rights “derive from the inherent dignity of the human person” and whose animating principle is “recognition of the inherent dignity and of the equal and inalienable rights of all members of the human family [as] the foundation of freedom, justice and peace in the world.” This assertion and others like it form a common reference point in contemporary literature on human dignity. Importantly, this ‘inherent dignity’ represents a potential bridge between a number of different ideas and ideals, namely freedom, justice and peace.

In fact, it is this potential to bridge different fields of regulation (human rights, bioethics, humanitarian law, equality law and others) that we might take to be the most important function of human dignity in international law. We will refer to the concept that performs this bridging function as an interstitial concept of human dignity (IHD). This concept, arising from discourses and practices of international law, has a strong relationship with equality, liberty, and the basic status of the individual. And, crucially, it implies an interstitial or conjunctive function across our normative systems. It is where law, ethics, and politics meet and are practically and critically interrelated. It is where domestic, regional, and international regulation find a common principle. It is where positive law and morality become difficult to distinguish. And it is where specific norms and general principles are linked. By extension, this concept of human dignity is the concept we should treat as the foundation of human rights because any reconstruction of the complex menu of human rights in international law has to take account of their wide-ranging implications for legal, moral and political governance. Put another way, one necessary condition for a defensible, foundational account of human rights is that their foundational principle must have an interstitial function straddling these fields of normative practice.

Note that this does not capture, and is potentially in tension with, many existing linguistic and normative practices related to human dignity. For instance, discussion of ‘dignitarian harms’ relevant to healthcare law, or local prohibitions on degrading work, might well invoke the language of human dignity without intending any implications for other normative systems. They imply nothing about politics or about law more generally. These linguistic and normative manifestations of human dignity should be considered in their own terms and are returned to in what follows. But the question of why there are tensions between these uses and the IHD is a revealing line of enquiry in itself. It concerns genealogical changes in the concept but also, and more importantly, the ways in which norms and principles are shaped and conditioned within the different practices of law, ethics and politics. To be sure, an interstitial concept is treated here as the best vantage point for all the competing claims. But this is not to insist it is the only intelligible concept. What follows is a description of an IHD’s form, content, and normative uses and an initial comparison with competing characterizations.

First, the idea of form allows us to distinguish the IHD from other uses of ‘dignity.’ Human dignity in international law is associated with a cluster of closely related, but distinguishable, formal characteristics. Human dignity connotes universality (ascription to every human person), inalienability (it is a non-contingent implication of one’s status as human), unconditionality (a property requiring no performance or maintenance), and overridingness (having priority in normative disputes). These immediately assist in distinguishing an IHD concept from a behavioral description of dignity which would not be inalienable, a virtue ethical reading which would either not include ascription to every human person or would be contingent, or a healthcare ethics reading which might not insist on the overridingness of human dignity. Note that these formal criteria are not treated as necessary conditions for human dignity but are, rather, claims commonly associated with human dignity in international law. They assist, amongst other things, in distinguishing human dignity from dignity simpliciter with its associations with behavior and comportment. They also situate the IHD close to certain currents of Kantianism and deontology without assuming that Kant’s work is definitive of the concept.

Second, content encompasses the ‘what’ and the ‘who’ of human dignity. Invocation of human dignity invites us to ask what underlying conception of humanity is at work. The discourse of the ‘human person,’ often associated with human dignity in international law, captures the mixture of formal personhood and embodiment or vulnerability. The conjunction of human and person also produces potentially competing conceptual and ontological commitments, and we can draw a distinction between normative and taxonomical humanity in our discourse of human dignity (Donnelly 2015). Further complexity arises from strong species-based claims or discussions of transhumanism that are focused on potential changes in the ontology of humanity. Undoubtedly human dignity is associated with species claims but it is also intelligible to rely upon more formal claims about the characteristics of agents or persons in analysis of human dignity. Related to these questions of ascription, the ontological and normative commitments involved in a human dignity claim (the question of what) are varied. Human dignity could concern capacities, could include the direct requirement to exercise capacities, and might also concern a teleology for humanity (that is, the ontology of human dignity). Human dignity will—at least in the use of concern here—be closely linked to notions of autonomy, personhood and free will (that is, the correlates of human dignity). Related to this is a contrast (concerning what we might call the metaphysics of human dignity) between human dignity considered broadly as a property or as something arising relationally through recognition or respect.

Third, normative use concerns characteristic normative implications and normative functions. This has been usefully expressed as a distinction between empowerment and constraint (Beyleveld and Brownsword 2001). The IHD is commonly associated with empowerment through human rights. This is distinguishable from the constraint function commonly found in bioethics and healthcare ethics, often a peremptory ban on certain kinds of uses of human beings. It is less clear how the IHD functions regarding another common distinction, that between horizontal application (between individuals) and vertical application (between the state and individual). International human rights law predominantly concerns vertical application, but the IHD, particularly given its linking of law, morality and politics does not preclude (and may imply) horizontal application. We may also note at this point a common distinction between human dignity as status and value. This turns, in part, on what response is required in the light of human dignity: status demands respect but also rights, duties and privileges; the existence of a value potentially requires fostering or enhancement. Only the former rights, duties and privileges are likely to be treated as having systemic application (being justiciable or enforceable), at least within liberal political systems that refuse to enforce moral conduct. As a consequence, the normative use of any IHD concept is undoubtedly conditioned by liberal assumptions concerning the proper scope of legislation. Nonetheless there are many instances of enforcement of more perfectionist or self-regarding conceptions of human dignity (for instance in the prohibition of ‘dwarf tossing’).

The last point reveals the most important tension in the general philosophical study of human dignity, namely the seeming co-existence of the interstitial concept characteristic of international law on the one hand and a perfectionist, virtue or purely self-regarding concept on the other. The assumption made here, that the latter perfectionist claims are non-focal or non-standard, is contentious (for the opposing view see Hennette-Vauchez 2011). Nevertheless this would appear to make the best sense of the majority of post-World War Two literature and thinking. Indeed the important post-war legal instruments themselves represent an interstitial process or moment, and the reconfiguration of the international legal order was the seedbed in which a certain idea of human dignity was given international expression. Far from being an accident of drafting or the contingencies of finding consensus, the (re)assertion of a notion of human dignity can be seen as the intention to transcend the boundaries of the legal, moral and political. Accordingly, while the following analysis does point to some historically contingent aspects of the use of human dignity, this is less important than the fact that the drafting of the Universal Declaration of Human Rights (1948) [UDHR] took place when the foundations of the international legal and political order were undergoing massive upheaval and when the need for a unifying moral principle was acute. We begin with law as the normative system within which the putative interstitial concept arose.

3. Themes

a. Law

There is no doubt that an IHD concept finds its most important expression in post-World War Two international law and constitutional instruments (the Universal Declaration of Human Rights, the Twin Covenants, and others). As such, the nature and function of human dignity in law could be assumed to be clear and well documented. This is the case at the level of doctrinal analysis of human dignity, and there is important jurisprudence arising in particular from the European Court of Human Rights and from constitutions including those of Germany, South Africa and Hungary. The sum of this jurisprudential thought is a mixture of general thinking about the foundation of constitutional rights alongside specific focus on the prohibition of degradation and objectification. This however points to two areas of deeper complexity, one hermeneutical and one concerning the conditioning effects of legal systems. First, different jurisdictions and institutions have given such radically different functions to human dignity that it is not always clear that one concept, the IHD, is at work. Indeed more substantive and perfectionist notions are often in evidence in national legal settings. Second, the IHD seems an ideal candidate for a kind of Grundnorm or secondary rule in law: a norm giving validity to legal systems as a whole or a principle governing the application of all norms within a system. However, this is difficult to defend as anything other than a loose generalization. In principled terms, legal systems treat justice as their foundational norm and this means that consistency, rather than moral defensibility, guides adjudication. And, in practice, it is not at all clear how human dignity can or should function as a ‘higher’ norm. There is, in other words, something of a mismatch between the putative function of the concept and its actual potential.

The nature and content of international law can partially explain such tensions. The prominent place of human dignity in international human rights instruments, as the foundation of those rights, has given human dignity enormous symbolic and heuristic significance. The foundational significance of human dignity is frequently assumed to extend beyond international human rights law to the international legal system as a whole. Where there are tensions between different fields of international law, or emerging practices in international law, human dignity is an important tool for focusing on the normative forces at work, in particular the significance of the individual as transcending the boundaries of state authority and as justifying state authority. It is fair to say that at this level human dignity is of enormous symbolic importance though human dignity is not, in itself, an enforceable norm of international law (the exception to this is in international humanitarian law’s Common Article 3, a prohibition on “outrages upon personal dignity”).

At the regional and domestic levels the normative implications of human dignity become more precise. While the European Court of Human Rights takes from international law the assumption that human dignity is foundational, it has operationalized it within its jurisprudence as an interpretive tool generally, and with particular reference to the idea of “torture, inhuman or degrading treatment.” This association between human dignity and the worst forms of degradation and objectification is shared with international humanitarian law and with German constitutional thinking. It is also the focus of the US constitutional deployment of human dignity as an interpretive tool in Eighth Amendment jurisprudence (concerning “cruel and unusual punishment”). The merit of this association with degradation is to give human dignity a clearer normative implication: the absolute impermissibility of certain kinds of gross mistreatment of the individual. Conversely, it is difficult to reconcile this restrictive, prohibitive reading with the assumption that human dignity is broad and foundational.

This relates, in turn, to a tension between human dignity operationalized as a specific norm (or in some instances a right) and a more general principle in law. Consider, for instance, Article 23 of the Universal Declaration of Human Rights (1948) (“everyone who works has the right to just and favourable remuneration ensuring for himself and his family an existence worthy of human dignity”). Here human dignity is neither a principle nor clearly foundational of the right it is associated with (or any other right); instead, it is a telos or standard. That standard is, potentially, related to material sufficiency or to flourishing and could be seen, to that extent, to have an aspiration to being interstitial. Nevertheless it is (in fact) rare for human dignity to be enforced as a standard, and it is (in principle) unclear how this would amount to normative or conceptual unification of law, ethics and politics. It is possible that some instances of human dignity as a right or as a telos appear to have clear interstitial implications but nonetheless represent a different concept from the IHD because both their content and their normative implications differ (see Waldron 2013).

The complexities and possibilities that arise from human dignity being, in law, a right, standard or telos as well as a principle, value or status create an underlying uncertainty as to whether law contains a single concept, a number of conceptions or simply a confusion of several ideas. There are a number of proposed normative and conceptual solutions to this tension, though it is not obvious how we might adjudicate between them. First, we can assume that human dignity necessarily has a dual status as norm (a more or less prohibitive norm) and as principle (predominantly symbolic and heuristic) (Alexy 2009). Second, we can assume that law has a number of different conceptions at work, conceptions that are either incommensurable (McCrudden 2008) or loosely linked by family resemblance (Neal 2012). Third, we can assume that law now has two very different concepts at work, one ancient and honor-based and the second closer to the IHD. We give this last option closer attention.

While many domestic or constitutional uses of human dignity are closely related to autonomy, privacy and the protection of agency, there is no doubt that (human) dignity has also been used to impose limitations on acts that can be seen as voluntarily diminishing an individual’s own human dignity or violating duties to themselves. In the broadest terms, then, there is a tension between a permissive reading of human dignity that protects autonomous individual agency from state intrusion, and a conservative reading that allows law to protect individuals from themselves. (This partially resembles Beyleveld and Brownsword’s contrast between the empowerment and constraint conceptions of human dignity.) These kinds of tensions are explored by Stephanie Hennette-Vauchez (2011), who insists on the coexistence of a human dignity principle, which is in essence a principle of equality, and an older (ancient) notion which is closer to a hierarchical notion of honor and permits the enforcement of certain norms related to self-respect. The form, content, and normative implications of these two ideas are clearly very different. While the idea of respect is morally important, it is difficult to reconcile the enforcement of respect with the assumptions we would treat as definitive of liberal legal systems, namely formal equality and division between public and private obligations. As such the honorific manifestations of human dignity are distinct from the liberal concept of human dignity; they are only rarely treated as enforceable (through personality law or public morality provisions) and lack the universal or inalienable characteristics of the IHD. They are nevertheless an irreducible part of contemporary law.

In sum, international law is a source of much of our thinking about human dignity, and in particular it gives credence to the idea of an IHD concept that can link different fields of legislation and different jurisdictions. At the same time, international and domestic legal institutions exercise a conditioning force on the discourse of human dignity. The implications of this are two-fold. First, as argued by James Griffin, human dignity acts as the foundation of human rights and gives rise to a large range of rights related to personhood and agency; nevertheless, the menu of human rights potentially generated by human dignity must be reduced or rationalized given the equal importance of legal institutions in national legal systems as a source of settled norms and practices (Griffin 2008). Second, legal systems require normative precision, and positive law invoking human dignity often appears to fall short of that precision; this has meant that jurists have favored conceptualizing and operationalizing human dignity through an association with degradation (Kaufmann et al, 2011). As Beitz insists, these implications raise related questions:

human dignity seem to apply (differently) at two distinct levels of thought about human rights—as a feature of a public system of norms and as a more specific value that explains why certain ways of treating people are (almost?) always impermissible. If there could be a theory of human dignity, one of its desiderata would be to show what (if anything) these senses of human dignity have in common and how they hang together (if they do). (2013, 283)

Beitz’s own analysis retains a certain kind of bifurcation between prohibitive and empowering conceptions of human dignity (2013, 289–290), suggesting resilient problems in making sense of human dignity’s place in law. Does the overridingness of human dignity have, in legal systems, to be conditioned by the normal institutional limits on legal norms and principles or does it retain its (extra-legal) moral force? And what role does philosophical anthropology play in our ethical and legal thinking, and should this inform what we take to be enforceable in law? This is a question of what we hold to be distinctively human and how, if at all, this should inform our thinking about law. A philosophical anthropology, along with related moral commitments, may demand or prevent perfectionist readings of human dignity which, in turn, has implications for any putative interstitial concept.

b. Ethics

i. General

Those concerns with philosophical anthropology form a point of departure for reflection on ethics. For example, animal ethics concerns, sometimes explicitly but always at least implicitly, questions about the value of human beings in contrast to nonhuman animals. Answers to such questions will typically concern whether human beings have standing over animals, or whether human beings have an inner significance that animal beings lack. These two questions are ambiguous and the relation between them is far from clear. Supported by a tradition that has overshadowed much of our understanding of human dignity, the first question can be variously understood as the elevation of the human species, human dominion over nature, humanity as imago dei, or as the special worth of humanity relative to all other natural phenomena. In other words, it concerns human dignity as elevation rather than human dignity as human inner significance (compare Sensen, 2011). The second question, by contrast, leaves open the possibility that human beings and nonhuman animals have potentially incommensurable significances (Korsgaard, 2013; Nussbaum, 2006; Balzer, Rippe and Schaber, 2000; Kaldewaij, 2013). Each of these presumptions has a questionable relationship with an IHD.

Starting from the idea that human beings have a distinctive significance, at least two possibilities flow: the existence of duties of dignity that address its bearer, and duties of dignity that address others. Some philosophical theories deny a distinctive significance for human (and nonhuman) beings as such, but emphasize the contractual basis of our norms or argue that what matters morally is sentience (compare Gauthier, 1987; Singer, 2001). By contrast, philosophical views on human dignity emphasize that there is a distinctive significance to human beings and that this entails certain stringent ethical norms. Note that claiming a distinctive significance for human beings does not necessarily amount to prioritizing human beings over animals. (Claiming that human beings should be prioritized over animals would of course entail that human beings have a distinctive significance.) Indeed claims that both human nature and animal nature have their own distinctive significance can be interpreted both in terms of elevation and in terms of inner significance. When animal and human interests clash, one could try to compromise the interests of one to satisfy the same or even a different interest for the other, in line with or even as a matter of respect to their different dignities.

That being said, the claim of human significance has often found expression in philosophies that elevate human beings over animals. It should be noted that the very idea of a relative standing of human beings over nonhuman animals and nature does not entail that human beings should be protected for that dignity (Sensen, 2011). Rather, the relative elevation of a human being is conceived in terms of his distinctive human capacities that, given some teleological or religious background assumptions, entail for him a duty to exercise these. These capacities are, in turn, typically understood to be exercised by acting morally, that is, to act in line with a morality that concerns what one does to oneself, to other humans, or to God. It is these teleological or religious assumptions that generally benefit humans over animals. It has been argued that this view of humanity was central to Western traditional views of dignity including those of the ancients, medieval Christians, Renaissance and early Modern thinkers.

Within these moral schemes the question of what we should do to a human being is not (fully) decided by recognizing their dignity (as elevation), whereas the individual’s own duty to comply with that scheme is the main normative implication of the set of capacities that ground his dignity. He has initial dignity as subject to such a moral scheme, in particular by virtue of his capacity and correlated duty to live up to it. As such, his dignity may not entail any or all duties that others have to him, such as to respect or even support him. What we are to do to him depends on the content of the moral duty that we have as a result of our dignity grounding capacities, duties which are conceptualized in terms of cosmic principles or divine commands. That is to say, we are to respect each other not for our relative standing, our initial dignity, but given that and insofar as non-interference or support for beings that happen to have this standing is required by cosmic or divine principle. This principle specifies what we should value in the individual. As such, it specifies a type of dignity that comes closer to the inner significance view, which in turn may be, but does not necessarily require, an expression in terms of schemas that advance ideas of human elevation.

It is the inner significance view, not the human elevation view, that fits more easily within the formal features of the IHD. The normative significance view has found expressions in at least three ways: as a status (Habermas, 2010; Waldron and Dan-Cohen, 2012), a value (Rosen, 2012; Sulmasy, 2007) or a principle (Düwell, 2014). As a status, human dignity gives human beings a set of duties and rights. A value, by contrast, sets human dignity as something to sustain or promote. As a principle, human dignity sets a fundamental standard for action. These three types of specifications are featured in broader philosophical anthropologies that explain who has it and what should be protected in them—as well as entail implications for policy and law with regard to it. In other words, whether we treat human dignity as a value, status or principle will depend in large measure on the background assumptions—anthropological and/or cosmological—that we take to form the background of a claim about human dignity.

ii. Philosophical Anthropology

All three claims (status, value and principle) can be interpreted in terms of the formal features of the IHD (universal, unconditional, inalienable and overriding). At the same time, some views on the significance of humanity may deny one of these features, and this will affect the content and normative use of such a view of the significance of humanity considerably. In these respects, attempts to reconstruct non-Western traditional views on dignity should be especially sensitive not only to distinctions between status, value and principle, but particularly to the formal as well as substantive specifications of the significance of humanity in these traditions (Donnelly, 2009). It has been argued, for example, that the normatively relevant notion of humanity in the Confucian tradition should be understood in terms of dignity’s achievement through virtuous conduct, rather than in terms that make it independent of one’s character and conduct (Luo, 2014). This would touch on the issues of universality, unconditionality, inalienability and overridingness. In the Confucian tradition, dignity (qua ‘worth’) can be seen as a universal human potential that we may fail to cultivate: it is therefore universal but not unconditional; it can also be self-alienated and overridden.

It has been argued also that in certain Islamic traditions, Man has a God-given status as vicegerent on earth (Mozaffari, no date; Kamali, 2002; Maroth, 2014). This status may demand some respect, but how he is to be treated depends largely on what God has specified by law. If God demands—as some traditions seem to imply—respect for human individuals as a matter of their good deeds, piety or their living by the Book, then this would raise questions about consistency with the unconditionality and inalienability of an IHD. A further significantly different tradition, Hinduism, is sometimes interpreted to operate with a concept of dignity that a human individual shares because and insofar as his soul cannot be distinguished from the universe (Braarvig, 2014). On the one hand, this implies the significance of human individuals. On the other hand, given differentiations in the world of appearances we can distinguish degrees of dignity not only between individuals, but also between classes—which one can enter only through birth—specified by the presence of the universal whole in them. The possibility of rebirth in a higher caste—conditional on loyalty to the caste system or on pure chance—renders consistent this universal notion of dignity with the social one.

On top of these possible alternatives to an IHD at the formal level, it is also crucial to note the possibility of different accounts of the IHD in which these formal features may have different and incompatible contents, if not opposing implications for normative use. The differences concern not only questions about the nature of the subject of human dignity (a species, humanity or the human person) but also what is significant in him. Further differences emerge from answers to other questions: are we to grant him rights and impose on him duties; are we to value him, refrain from interfering with him and support him in perfecting himself; are we to respect him?

iii. Tensions

This mixture of concerns and foci (different background assumptions in terms of cosmology and anthropology, different assumptions in terms of the normative functioning of human dignity as status, principle, and value) gives rise to an expansive field of enquiry. Even if we were to consider how the IHD may or may not be present in ethical accounts of human dignity, this would have to encompass the two substantial fields of normative ethics and applied ethics and would require careful analysis of how and why further links between politics, ethics and law are at issue. For present purposes we narrow our concerns to applied ethics.

Applied ethics can be understood by reference to ethical problems that arise from concrete practices. These practices emerge or have their existence in society and as such require attention by politics and law—not only by philosophical ethics. What we typically see is that the ethical issue is addressed in terms of norms or principles accepted in the practice, and that politics or law let this happen and regulate only in their own terms—quite independent of an explicit assessment in terms of IHD, let alone in terms of a coherent integration of philosophical ethics, politics, law, empirical knowledge and practical constraints (compare Düwell, 2012).

‘Dignity’ has different usages in different applied ethical practices, and in some it has none (Beyleveld and Brownsword, 2001; Nordenfelt, 2004; Sulmasy, 2013). For example, in the life sciences dignity is used to legitimize a patient’s right to informed consent and to set constraints on her choices. Further, it is used to constrain her choice options, such as deciding when to die. It is also used to characterize the way a patient deals with and adapts to his condition, the way a patient is treated, and to emphasize the effects of his condition or of the actions of others on his identity. It is used to emphasize the value a person attaches to himself, the extent to which he respects himself (Dillon, 2013). Dignity is the central term in assessing technological developments for their application to human life (Human dignity and bioethics: essays commissioned by the President’s Council on Bioethics, 2008). Dignity is also used to argue against abortion and against pre-natal experimentation on early human life. It has been argued by some that all human life should be protected as a matter of dignity, whereas others emphasize protection of human life only if it will develop a personality. In this context, it is especially interesting to note that in debates on pre-natal enhancement, the notion of dignity is appealed to in defense of respecting the human species as such (Bostrom, 2005; Habermas, 2005). Here human dignity is said to be threatened by attempts to bring to life human beings enhanced in certain ways, such as enhanced to be more competent in certain abilities that are valued by parents or society. Here the worry concerns not only the dignity of the enhanced individual, whether it is violated or enhanced, but also the dignity of humanity as such: whether humanity is compromised by these interferences. It also concerns the dignity of non-enhanced human beings, whether it is threatened by the increased capacity of enhanced beings.
Not all of these usages express the same concept, let alone an IHD. Those that do may give only partial expression to competing versions of an IHD. Often, however, we see that problems are addressed without explicit recourse to an IHD, let alone via an integral assessment in terms of the philosophical commitments that come with such an IHD. It would make a significant difference if these discourses were orientated towards coherence with an IHD.

c. Politics

Already in discussion of applied ethics certain of the constraining and conservative uses of human dignity are in evidence. A ‘dignitarian alliance’ of conservative thinkers and activists has deployed a notion of dignity close to that of sanctity in order to oppose or constrain reproductive and biotechnological innovations (Brownsword 2003). Political discourse of the twentieth century also, by contrast, witnessed radical and liberation-focused discourses of human dignity. While the division between human dignity as empowerment and as constraint helps to partially map this contrast, this section draws a more general divide between power-focused conceptions of politics as opposed to principle-focused conceptions of politics. Principled accounts can in turn be divided between those who make ethics (and potentially human dignity) central to politics, and those who might accommodate other interstitial principles like justice or the rule of law.

In those accounts that make ethics clearly foundational to politics, human dignity could be conceived as a regulative idea, providing the trajectory of politics but not necessarily central to its practice. Slightly differently, human dignity could be treated as providing a conception of good politics and implying practical side-constraint within political systems. More directly, human dignity might be identified with the good, which would give human dignity a more clearly normative and perhaps perfectionist role (Boylan 2004). Efforts to synthesize aspects of pluralism with such accounts of the good have informed a capabilities approach intended to encompass both a substantial conception of the individual and the protections of agency and individuality characteristic of liberal thought. This itself is often expressed in the language of human dignity (Nussbaum 2006, Claassen 2014). This interpretation of human dignity in terms of capability-based flourishing has been reviewed and critically reinterpreted by reference to a different idea of dignity, that of dignity as a basic principle that demands recognition of the generic features of human agency as a matter of basic rights (Gewirth 1992). Far from being unrelated to the perfectionist notion of dignity, this latter notion of dignity functions as an underlying principle that may help us distinguish relevant from irrelevant human capabilities as well as rank them so as to prevent or settle clashes between them (Düwell 2009, Claassen and Düwell 2012). Such a take on capabilities would imply that possibilities for certain forms of flourishing should be protected as a matter of dignity, indeed the same kind of dignity that demands respect for freedom and well-being as basic features of agency. One further upshot of this approach would be that those things to be secured or provided might, in view of this principle, differ between persons as well as between contexts. That is to say, to protect a capability for one agent may require different or more resources than protecting it in someone else (Boylan 2004). Also, when possibilities of securing agency are scarce in a community, priority should be given to capabilities at the core of agency. It might be that this represents a manifestation of the IHD concept in that the idea is intended to have application across different systems and also be extended to other, new forms of moral and political challenges.

In contrast, those positions that give the right priority over the good place rights and a plurality of reasonable conceptions of the good at the center of just institutional design. Such a ‘community of rights’ is quite directly committed to an interstitial notion of human dignity cashed-out as both basic human rights and systems for preserving freedom and welfare across all normative systems (Gewirth 1998). Rawls’s position (2009) in contrast faces the challenge of reconciling commitment to human dignity with treating justice as a primary institutional virtue. Rawls’s two principles of justice—while expressed in the language of basic rights and institutional virtues—could intelligibly be taken as an expression of a politics based on human dignity. However, this should give rise to important hermeneutical and conceptual hesitations. First, little is added to our understanding of Rawls’s work by associating it with human dignity, and conversely the distinctive conceptual characteristics of human dignity are immediately lost in more general debates about liberal political theory. Second, in Rawls’s later work where “decent non-liberal” societies are insulated from criticism and intervention from liberal states, we might say that Rawls concedes that non-liberal states—states that would clearly not accept an IHD principle as foundational—are nonetheless morally and politically justified (2001). By extension, the links between liberal political theory and human dignity are enormously complex, and can be conditioned by the demands of realism or non-ideal theory. With that in mind we turn to more practice-based and power-focused links.

The concept of human dignity as it appeared in post-war international law was undoubtedly intended to mark a decisive political, not just legal, turning-point. The concept is closely associated with the commitment “never again”—that never again should there be atrocities of the kind in the Second World War—and we could see human dignity as a predominantly political idea focused on the impermissibility of widespread and systematic attacks on civilian populations and by extension fundamental limitations on states’ sovereignty. In this sense there is credibility to an interstitial reading of human dignity that links international law, politics and morality in supporting a more individual-focused, less state-focused account of international relations. This, in turn, strengthens a link between human dignity and (moral and institutional) cosmopolitanism given that the value of individuals transcends state boundaries.

Conversely, this—interstitial and cosmopolitan—reading of human dignity has important limitations. First, the interstitial understanding of human dignity could be assumed to be, at heart, an ideological reading of human rights discourse: it is the rhetoric of human rights that links international law and politics rather than any systemic or philosophically defensible normative framework related to dignity. Second, the cosmopolitan understanding of human dignity faces the general vulnerability of all cosmopolitan philosophies (the priority of local and natural attachments in our moral thinking) and a specific attack via the problem of statelessness. That is, unless human dignity rests on or implies a ‘right to have rights,’ any political and legal discourse of human dignity will be inadequate in comparison to the systematic and concrete protections offered to citizens by constitutions and constitutional rights. We return to the right to have rights later by way of a more general analysis of social theory.

Certain historical and sociological trends are important for understanding human dignity and its role in politics. The first and most obvious is a shift from hierarchical societies to more democratic societies and with this an emphasis on the equal status and rights of individuals. A clash between the notions of dignity as aristocratic bearing and dignity as fundamental status is a characteristic of debates concerning the French Revolution. The ‘dignity of Man’ as emblematic of political emancipatory projects finds its first major expression during this revolutionary period, and it allowed the articulation of new emancipatory projects as in Wollstonecraft’s appeal to the equal dignity of men and women ([1792]1982). The post-World War Two invocation of human dignity undoubtedly shares basic humanistic, enlightenment, and liberal assumptions with these currents of eighteenth and nineteenth century thought, though by the twentieth century the idea of the ‘dignity of Man’ was being opposed not directly by defenders of the Ancien Régime but by Marxist and communitarian critics of liberalism. What unites these latter positions is concern about the insensitivity of human dignity relative to pressing political problems including colonialism and minority rights, along with more fundamental concerns about the emptiness of the concept relative to collective interests that cannot be disaggregated into individual interests.

Sociological shifts are also crucial in understanding the competing functions of human dignity in political discourse. The characteristics of modernity, as charted by both Weber and Durkheim, involve changes in the conception of the individual (including for Durkheim the creation of an ‘ethic’ or ‘religion’ of humanity), changes in the concept of politics, and changes in the political significance of human dignity. On the one hand, the more technocratic and bureaucratic nature of politics was held to have yielded a demystifying, but also dangerously dehumanizing, relationship between the individual and political power. In the light of that and related concerns, Margalit (2009) and others use human dignity to stress the importance of retaining dignity qua self-respect within political and social practices. By the same token, Honneth’s work on the political conditions of recognition (1996) entwines respect with the basic conditions of individual and group identity. On the other hand, liberal institutions that were intended to preserve the basic status of the individual have been held to be inadequate to maintain the conditions of the possibility of ethical life. This has meant direct attacks on ‘liberal’ practices, including human rights, by communitarian theorists.

It is against this background that a different style of political theorizing about human dignity can be found in the second half of the twentieth century. Hannah Arendt’s Aristotle-inspired political theory emphasizes the importance of recognition in a political community and of strong constitutional rights with an equation between human dignity and the right to have rights (Arendt 1958). Arendt offers an influential internal critique of politico-legal understandings of human dignity. Broadly, Arendt is unsympathetic to any potential interstitial concept (given her views on the basic conditions of politics) and to generalizations about the rights of Man (given her writings on the emptiness of this notion, particularly with regard to the status of refugees). In contrast she stresses the basic importance of citizenship as a condition of protecting the basic status of the individual. There are nevertheless resources in Arendt’s work that are clearly sympathetic to human dignity and human rights as more expansive commitments, and human dignity could be seen as the best expression of that view of human dignity as opposition to atrocity and defensive of human status and human plurality (Menke 2014).

In the light of these competing currents of thought, and the complexities of the concept itself, human dignity does not map neatly onto the division between empowerment and constraint or between the priority of the good and the priority of the right. The IHD, to the extent that it is a recognizable component of political thinking, might be assumed to be closer to conceptions of politics focused on the rule of law rather than a substantive conception of the good. Understood as interstitial concepts, human dignity and the rule of law are intended precisely to express the importance of links between politics and law and the co-regulation of the two. The rule of law is important not only as an expression of self-restraint in politics but also as a necessary condition of a permissive politics of human agency, choice and self-creation. This might be otherwise expressed in terms of a defense of the public-private divide. It could be expressed in more sociological terms as a defense of functional differentiation, the coexistence of different social systems that an individual can move between. Or this might be linked to a libertarian defense of minimalism in the power of the state. The unifying idea here is that human dignity is a principle with significance for political, legal and moral systems and which preserves, one way or another, the freedom and self-creation of the individual. It has been the recurrent theme of communitarian critics of liberalism and human rights that such permissiveness undermines the self-constitution of the individual within a polity. Middle ground could, potentially, be found in the capabilities approach or in an Arendtean stress on the right to have rights.

4. Conceptual Analysis

a. The Conceptual Features of Human Dignity

It is desirable, but no simple task, to begin to draw more general conclusions about human dignity as a concept and as a component of normative debate. It is worth briefly contrasting how we might approach the analysis of human dignity with that of human rights. Discussion of human rights features settled debates concerning their moral or political justification, an appropriate theory of rights, and human rights’ tailoring to practice. Analysis of human dignity, in contrast, lacks such clearly defined parameters because it is plausible that there are competing concepts of human dignity and not just competing conceptions. That is, it is not simply that in academic debate different aspects of a single concept can be given special emphasis or that there are competing justificatory strategies for the same, shared, idea. Rather, ‘human dignity’ might encompass historically different, and antagonistic, ideas. For this reason, meta-studies of the uses of human dignity have difficulty yielding definitive analysis of the concept’s presuppositions and functions, or have mapped a number of functions that are difficult to cohere (Nordenfelt 2004; Sulmasy 2013). Bonding the many functions of human dignity may be possible, at best, only through performative analysis (O’Malley 2011) or family resemblance analysis (Neal 2012), but these involve abandoning a single idea of human dignity in favor of describing various local uses.

b. The Credibility of an Interstitial Concept

In contrast, we would argue that the three normative fields of law, morality and politics together offer at least the possibility of a distinctive, focal concept. The idea of the absolute status of every individual can intelligibly be held to frame our normative practices. Indeed, the magnitude of this commitment is such that it would have to be manifest in all of our social practices. Clearly, however, this is not without problems. Any conceivable defense of an IHD concept—one that, by definition, sits between and links different normative practices—faces the immediate problem of the conditioning assumptions of those disciplines and practices (including the local practices and settled dispositions and attitudes of those working within the fields). This can be treated as a three-fold problem. The validity of any legal norm is conditional on political will (the problem of the primacy of the political); the moral justification of the idea still requires further explanation and justification (the problem of the foundations of morality); and the legal notion itself will be conditioned by a legal system so that it can be consistently operationalized within the system (the problem of the demands of justice or the normative closure of law). These three problems are pressing problems for any IHD claim precisely because the concept must claim to transcend these conditioning aspects of our normative practices.

However, it can be argued that the possibility of an interstitial concept nevertheless has support within the fields. For example, the idea of a rule of law is intended to unify different fields of legal and political regulation (through demanding their consonance with good law consistent with human agency), and for that reason a number of theorists closely associate human dignity and the rule of law (Waldron 2008; Fuller 1964). Beyond this, human dignity might well inspire more productive and precise regulatory practices, be they related to global, social or procedural justice. If the rule of law is the minimal demand that there be a good match between regulation and agency, wider ‘projects’ conjoining law, ethics, and politics can be meaningfully expressed in the language of human dignity given its unifying function. Put more modestly, the idea of politics as an anomic practice is difficult to defend—after all, law and politics stand in a relation of productive co-constitution with politics making law and legal systems revising the content of that law and regulating political practices themselves—and our best reconstructions of the foundations of political practices and institutions are likely to involve commitment to the kinds of formal assumptions associated with human dignity (Rawls 2009; Habermas 2010). And moral theories can enforce duties which in turn generate institutional designs and procedural mechanisms intended to protect human dignity and render it immanent in social systems (Gewirth 1998). In sum the three problems associated with an IHD claim are not uniformly accepted and should not be treated as a refutation of interstitial claims in general or an IHD concept specifically.

Above all, a connection between human rights and human dignity gives critical force to human dignity and indicates precisely why the predominant concept of human dignity should be assumed to be an interstitial one. Conceptualizing human dignity as foundational is sometimes construed as bonding the existing body of human rights law with a moral claim that guarantees their force as moral, not just positive, rights. The most plausible explanation of such a guarantee is through deontological theory granting supreme moral importance to the individual and immunizing them from consequentialist determinations of the common good that would potentially sacrifice their rights and their status. Beyond this, the precise account of justification, rights, and practice is open to debate, but human dignity is the foremost expression of the deontological commitments sketched here. Even in this sketch it is clear that the normative fields of law, ethics, and politics are not intended to be absolutely divided but rather guided and judged by their consistency with the protection of human rights. It is this claim that lies at the heart of an interstitial concept of human dignity (and much else besides in international law). It remains to draw out the implications of this.

c. The Implications of an Interstitial Concept

Assuming that an IHD concept—sitting between normative fields, linking these fields, and conditioning them—is intelligible, then its implications are considerable. Let us assume that the commitments contained in such a concept are as follows. Human dignity is treated as having the formal features identified (universality, overridingness, and so forth); it has the characteristic content of human dignity claims (a species claim or a claim about human dignity being relational or a property); and it encompasses commitment to a distinctive normative use (for example, empowerment of the individual, expressed in terms of claim rights, that holds at least between the individual and all political institutions). The sum of this commitment would be as follows. In all interactions between state and individual, claim rights (expressible as human rights) can and should be exercised by all human persons, and the exercise of those rights would not be conditioned by any jurisdictional boundaries. This amounts to having significance in all possible interactions between the collective and the individual. It will imply that there is no interaction between individuals that is not at least potentially normatively governed by human dignity. And it implies that any special demands about normative priorities made by law, ethics or politics would be justified only to the extent that they were consistent with, or directly conditioned by, the overarching commitment to human dignity. This concept is, then, enormously demanding insofar as its fulfillment would not be discharged on the basis of respecting a single norm (be it a Grundnorm or an anti-atrocity norm) but would, rather, demand an ongoing commitment to subject every executive and administrative decision to scrutiny on the basis of its consonance with the content and implications of human dignity particularly as this is expressed through human rights.

What conceptual and practical problems does this imply? The actual enforceability of human dignity itself as a norm or right is potentially unclear here, and the idea of human dignity’s overridingness sits uneasily with many common legal, political and moral assumptions. For related reasons it is not clear if human dignity should be a named, explicit norm within a constitution. It would be impracticable (indeed perhaps senseless) to have a norm that trumped all other norms; human dignity cannot be assumed to function in a normative vacuum. And the function of an interstitial concept is to link and justify different normative fields, not to directly govern them through one explicit Grundnorm. In fact, having concrete implications for these fields demands a more complete explication of the concept in terms of human rights which themselves require clear institutional arrangements. What human dignity amounts to is an expression of the foundations of any and all of our normative practices and the demand that human rights and human dignity have a constitutive and not just regulative role in our social institutions and practices. Nevertheless, this is a demand for a far more substantial explication of human rights, institutions, and good—that is, human dignity preserving—interaction between law, morality and politics in practice.

If, despite such challenges, we accept this IHD reading, we should reject a number of other readings of human dignity as peripheral or incoherent. Common uses of human dignity in healthcare and medical ethics that treat human dignity as one amongst many ‘middle-level principles,’ or bioethical readings that treat human dignity as synonymous with sanctity, would be non-standard readings on these assumptions and intelligible only as idiosyncratic local uses. Common criticisms of human dignity as vacuous or empty (because human dignity apparently collapses into notions of autonomy) would be rejected as incoherent because they fail to distinguish an IHD from either idiosyncratic local uses or from irrelevant non-interstitial uses. There would remain, however, an important but complex line of enquiry concerning how human dignity and self-regarding duties should be thought to interact. On the one hand, the IHD concept has been detached from the perfectionist Stoic tradition invoking species norms which determine whether individuals are ‘fully human.’ On the other hand, the typical form, content, and normative implications of the IHD need not exclude the possibility of self-regarding duties arising from respecting one’s own status as a human person.

5. Conclusion

The foregoing analysis stressed the problems of using human dignity in philosophical and ethical thought. The concept itself is opaque, and one important modern usage faces the problem of aspiring to be interstitial within and between normative fields that are themselves resistant to the very idea of such interstitial concepts. Nevertheless, there are good reasons why such a far-reaching concept should be primary in our thinking, and for this reason human dignity is likely to remain a component of normative discourse despite its problematic characteristics.

6. References and Further Reading

  • Alexy, R. (2009) A theory of constitutional rights. Oxford University Press.
  • Arendt, H. (1958) Origins of Totalitarianism, Meridian Books.
  • Balzer, P., Rippe, K. P. and Schaber, P. (2000) ‘Two Concepts of Dignity for Humans and Non-Human Organisms in the Context of Genetic Engineering’, Journal of Agricultural and Environmental Ethics, 13(1), pp. 7–27. doi: 10.1023/A:1009536230634.
  • Beitz, C. (2013) ‘Human Dignity in the Theory of Human Rights: Nothing But a Phrase?’, Philosophy and Public Affairs, 41(3), pp. 259–290.
  • Beyleveld, D. and Brownsword, R. (2001) Human dignity in bioethics and biolaw. Oxford: Oxford University Press.
  • Bostrom, N. (2005) ‘In Defense of Posthuman Dignity’, Bioethics, 19(3), pp. 202–214. doi: 10.1111/j.1467-8519.2005.00437.x.
  • Boylan, M. (2004) A Just Society. Rowman & Littlefield Publishers.
  • Brownsword, R. (2003) ‘Bioethics today, bioethics tomorrow: stem cell research and the dignitarian alliance’, Notre Dame JL Ethics & Pub. Policy, 17, pp. 15–51.
  • Braarvig, J. (2014) ‘Hinduism: the universal self in a class society’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
  • Claassen, R. and Düwell, M. (2013) ‘The foundations of capability theory: comparing Nussbaum and Gewirth’, Ethical Theory and Moral Practice, 16(3), pp. 493–510.
  • Claassen, R. (2014) ‘Human Dignity in the Capability Approach’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
  • Debes, R. (2009) ‘Dignity’s gauntlet’, Philosophical Perspectives, 23(1), pp. 45–78.
  • Dillon, R. S. (2013) Dignity, Character and Self-Respect. Routledge.
  • Donnelly, J. (2009) ‘Human Dignity and Human Rights’, Commissioned by and Prepared for the Geneva Academy of International Humanitarian Law and Human Rights in the framework of the Swiss Initiative to Commemorate the 60th Anniversary of the Universal Declaration of Human Rights. Available at: http://www.udhr60.ch/report/donnelly-HumanDignity_0609.pdf.
  • Düwell, M. (2009) ‘On the Possibility of a Hierarchy of Moral Goods’, in Morality and Justice: Reading Boylan’s A Just Society, John-Stewart Gordon (ed.), Rowman & Littlefield Publishers, Inc: Lanham, MD.
  • Düwell, M. (2012) Bioethics: Methods, Theories, Domains. Routledge.
  • Düwell, M. (2014) ‘Human dignity: concepts, discussions, philosophical perspectives’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.004.
  • Fuller, L.L. (1964) The Morality of Law. Yale University Press.
  • Gauthier, D. (1987) Morals By Agreement. Oxford University Press, USA.
  • Gewirth, A. R. (1998) The community of rights. Springer Netherlands.
  • Habermas, J. (2005) Die Zukunft der menschlichen Natur: auf dem Weg zu einer liberalen Eugenik?. Frankfurt am Main: Suhrkamp.
  • Habermas, J. (2010) ‘The Concept of Human Dignity and the Realistic Utopia of Human Rights’, Metaphilosophy, 41(4), pp. 464–480. doi: 10.1111/j.1467-9973.2010.01648.x.
  • Hennette-Vauchez, S. (2011) ‘A human dignitas? Remnants of the ancient legal concept in contemporary dignity jurisprudence’, International journal of constitutional law, 9(1), pp. 32–57.
  • Honneth, A. (1996) The struggle for recognition: The moral grammar of social conflicts. MIT Press.
  • Human dignity and bioethics: essays commissioned by the President’s Council on Bioethics. (2008). Washington: [s.n.].
  • Kaldewaij, F. E. (2013) The animal in morality. Justifying duties to animals in Kantian moral philosophy. Department of Philosophy, Utrecht University. Available at: http://dspace.library.uu.nl/handle/1874/275543.
  • Kamali, P. M. H. (2002) The Dignity of Man: An Islamic Perspective. 2nd edition. Islamic Texts Society.
  • Kaufmann, Paulus, et al. (2011) ‘Human dignity violated: a negative approach–introduction’, in Kaufmann, P., Kuch, H., Neuhäuser, C., & Webster, E. (eds) Humiliation, Degradation, Dehumanization. Netherlands: Springer, pp. 1–5.
  • Korsgaard, C. M. (2013) ‘Kantian Ethics, Animals, and the Law’, Oxford Journal of Legal Studies, 33(4), pp. 629–648. doi: 10.1093/ojls/gqt028.
  • Luo, A. (2014) ‘Human dignity in traditional Chinese Confucianism’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.021.
  • Margalit, M. A. (2009) The decent society. Cambridge Mass.: Harvard University Press.
  • Maroth, M. (2014) ‘Human dignity in the Islamic world’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
  • McCrudden, C. (2008) ‘Human Dignity and Judicial Interpretation of Human Rights’, European Journal of International Law, 19(4), pp. 655–724.
  • Menke, C. (2014) ‘Human Dignity as the Right to Have Rights: Human Dignity in Hannah Arendt’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.004.
  • Mozaffari, M. H. (no date) ‘The concept of Human Dignity in the Islamic Thought’, Hekmat: International Journal of Academic Research, (4), pp. 11–28.
  • Neal, M. (2012) ‘Dignity, law and language-games’, International Journal for the Semiotics of Law-Revue internationale de Sémiotique juridique, 25(1), pp. 107–122.
  • Nordenfelt, L. (2004) ‘The varieties of dignity’, Health care analysis: HCA: journal of health philosophy and policy, 12(2), pp. 69–81; discussion 83–89. doi: 10.1023/B:HCAN.0000041183.78435.4b.
  • Nussbaum, M. C. (2006) Frontiers of justice: disability, nationality, species membership. Cambridge, Mass.: The Belknap Press: Harvard University Press.
  • O’Malley, M. J. (2011) ‘A Performative Definition of Human Dignity’, Facetten der Menschenwürde, pp. 75–101.
  • Rawls, J. (2001) The law of peoples: with, the idea of public reason revisited. Cambridge, Mass.: Harvard University Press.
  • Rawls, J. (2009) A theory of justice. Cambridge, Mass.: Harvard University Press.
  • Rosen, M. (2012) Dignity: its history and meaning. Cambridge, Mass.: Harvard University Press.
  • Sensen, O. (2011) ‘Human dignity in historical perspective: The contemporary and traditional paradigms’, European Journal of Political Theory, 10(1), pp. 71–91. doi: 10.1177/1474885110386006.
  • Singer, P. (2001) Animal Liberation. Ecco Press.
  • Sulmasy, D. P. (2007) ‘Human dignity and human worth’, in Perspectives on human dignity: A conversation. Springer, pp. 9–18. Available at: http://link.springer.com/content/pdf/10.1007/978-1-4020-6281-0_2.pdf.
  • Sulmasy, D. P. (2013) ‘The varieties of human dignity: a logical and conceptual analysis’, Medicine, health care, and philosophy, 16(4), pp. 937–944. doi: 10.1007/s11019-012-9400-1.
  • Waldron, J. (2008) ‘The Concept and the Rule of Law’, Georgia Law Review, 43(1), pp. 1–62.
  • Waldron, J. and Dan-Cohen, M. (2012) Dignity, rank, and rights. Oxford; New York: Oxford University Press.
  • Waldron, J. (2013) ‘Is dignity the foundation of human rights?’ NYU School of Law, Public Law Research Paper 12–73. doi: http://dx.doi.org/10.2139/ssrn.2196074.
  • Wollstonecraft, M. (1982) Vindication of the Rights of Woman. Ontario: Broadview Press.

 

Author Information

Stephen Riley
Email: stephenriley12@gmail.com
Utrecht University
Netherlands

and

Gerhard Bos
Email: bos.gerhard@gmail.com
Utrecht University
Netherlands

Luck

Winning a lottery, being hit by a stray bullet, and surviving a plane crash are all instances of a mundane phenomenon: luck. Mundane as it is, the concept of luck nonetheless plays a pivotal role in central areas of philosophy, either because it is the key element of widespread philosophical theses or because it gives rise to challenging puzzles. For example, a common claim in philosophy of action is that acting because of luck prevents free action. A platitude in epistemology is that coming to believe the truth by sheer luck is incompatible with knowing. If two people act in the same way but the consequences of one of their actions are worse due to luck, should we morally assess them in the same way? Is the inequality a person suffers unjust when it is caused by bad luck? These two complex issues are a matter of controversy in ethics and political philosophy, respectively.

A legitimate question is whether the concept of luck itself is worthy of philosophical investigation. One might think that it is not given (i) how acquainted we are with the phenomenon of luck in everyday life and (ii) the fact that progress has been made in the aforementioned debates on the assumption of a pre-theoretical understanding of the notion.

However, the idea that a rigorous analysis of the general concept of luck might serve to make further progress in areas of philosophy where the notion plays a fundamental role has motivated a recent and growing philosophical literature on the nature of luck itself. Although some might be skeptical that investigating the nature of luck in general can help shed some light on long-standing philosophical debates such as the nature of knowledge—see Ballantyne 2014—it is hardly sustainable that no general account of luck will be able to ground any substantive claim in areas of philosophy where the notion is constantly invoked but left undefined. This article gives an overview of current philosophical theorizing about the concept of luck itself.

Table of Contents

  1. Preliminary Remarks
    1. The Bearers of Luck
    2. The Target of the Analysis
    3. General Features of Luck
  2. Luck and Significance
  3. Probabilistic Accounts
    1. Objective Accounts
    2. Subjective Accounts
  4. Modal Accounts
  5. Lack of Control Accounts
  6. Hybrid Accounts
  7. Luck and Related Concepts
    1. Accidents
    2. Coincidences
    3. Fortune
    4. Risk
    5. Indeterminacy
  8. References and Further Reading

1. Preliminary Remarks

The following preliminary remarks will address three questions: (1) What are the bearers of luck? (2) What is the target of the analysis of current accounts of luck? (3) What general features of luck should an adequate analysis of luck be able to explain?

a. The Bearers of Luck

The best way to find out what the bearers of luck are is to consider the kinds of entities of which we predicate luck-involving terms and expressions such as “lucky,” “a matter of luck,” or “by luck.”

1. Agents. On the one hand, the term “lucky” can be predicated of agents, as in “Chloe is lucky to win the lottery.” In general, the beings to which we attribute luck are beings with objective or subjective interests such as self-preservation or desires (see Ballantyne 2012 for further discussion). In this sense, a human or a dog is lucky to survive a fortuitous rockfall, but a stick of wood or a car is not. Still, at least in some contexts, it seems correct to attribute luck to an object without interests, as when one says that one’s beloved car is lucky not to have been damaged by a fortuitous rockfall. However, assertions of this kind are felicitous only insofar as they are parasitic on our interests. No one would say that a stick of wood is lucky not to have been destroyed by a rockfall if its existence bore absolutely no significance to anyone’s interests, and if one did, one would only say it figuratively.

A related question is whether the agents to which we attribute luck are only individuals or whether luck can also be ascribed to collectives. There is certainly a sense in which a group of individuals can be said to be lucky, as when we say that a group of climbers is lucky to have survived an avalanche. Coffman (2007) suggests that there seems to be no reason why group luck cannot be reduced to or explained in terms of individual luck. But if one holds, with many theorists working on collective intentionality, that groups can be the bearers of intentional states, it might turn out that group luck cannot be so easily reduced to individual luck. For example, if it is by bad luck that a manufacturing company fails to achieve its yearly revenue goal, so that it is bad luck for the company, it does not necessarily follow that each and every one of its workers (for example, people working on the assembly line) is also unlucky if, say, they cannot be fired by law and the company is not compromised.

2. Events. On the other hand, the term “lucky” and expressions such as “a matter of luck” or “by luck” can be predicated of events—for example, “Chloe’s lottery win was lucky”—and states of affairs—for example, “It is a matter of luck that Chloe won” or “Chloe’s winning the lottery was by luck”; see Coffman (2014) for further discussion. Plausibly, luck-involving expressions can be also predicated of items belonging to related metaphysical categories such as accomplishments, achievements, actions, activities, developments, eventualities, facts, occurrences, performances, processes, and states. For presentation purposes, luck will be here described as a phenomenon that applies to agents and events, where by “agent” is meant any being with interests and by “event” any member of the previous categories.

b. The Target of the Analysis

1. Relational versus non-relational luck. We say things such as (1) an event E is lucky for an agent S and (2) S is lucky that E. We also say things such as (3) it is a matter of luck that E and (4) E is by luck. Milburn (2014) argues that (1) and (2) are plausibly equivalent: E is lucky for S if and only if S is lucky that E. (3) and (4) also seem equivalent: it is a matter of luck that E if and only if E is by luck. However, (1) and (2) are not equivalent to (3) and (4). Milburn is right in pointing out that this marks an important distinction that anyone in the business of analyzing luck should keep in mind.

The difference between (1) and (2), on the one hand, and (3) and (4), on the other, is that (1) and (2) denote a relation between an agent and an event, whereas (3) and (4) are not indicative of any relation and only apply to events. Call the kind of luck denoted by (1) and (2) relational luck and the kind of luck denoted by (3) and (4) non-relational luck. Milburn uses different terminology: he employs the expression “subject-relative luck” to refer to relational luck and “subject-involving luck” to refer to non-relational luck when the relevant event concerns an agent’s action.

Relational luck can be distinguished from non-relational luck regardless of the fact that the target event is an agent’s state or action. For instance, when the relevant event is an action by the agent—for example, that S scores a goal—the luck-involving expressions in (3) and (4) apply to the agent—for example, it is a matter of luck that S scores a goal—but fail to establish a relationship between the agent—S—and the event—S’s scoring of a goal. In contrast, if the target event is the agent’s action, (1) and (2) do establish a relationship between the agent and her action—for example, that S scores a goal is lucky for S.

In the literature, most accounts of luck try to explain what it takes for an event to be lucky for an agent. In other words, they focus on relational luck. But it might well be that in order to shed light on the special varieties of luck—for example, epistemic, moral, distributive luck—one might need to shift the focus of the analysis to non-relational luck—see Milburn (2014) for further discussion.

2. Synchronic versus diachronic luck. Most accounts of (relational and non-relational) luck focus on when an event is lucky—for an agent or simpliciter—at one point in time. However, Hales (2014) argues that luck may be predicated not only synchronically—that is, of an event’s occurrence at a certain time—but also diachronically—that is, of a series or streak of events occurring at different times. For example, synchronically, we say things such as “Joe was lucky to hit the baseball at the end of the game.” Diachronically, we say things such as “Joe was lucky to safely hit in 56 consecutive baseball games.” Hales’s point is that we can be lucky diachronically but not synchronically, and the other way around. By contrast, McKinnon (2013; 2014) argues that while we can determine the presence and degree of diachronic luck—for example, luck in a streak of successful performances—we do not have the ability to determine the presence of synchronic luck—that is, whether a concrete performance is by luck.

3. Strokes of luck. An important departure from standard analysis of relational and non-relational luck is Coffman (2014; 2015), who thinks that the notion of an event E being a stroke of luck for an agent S is more fundamental than the notion of E being lucky for S—or more simply, than the notion of lucky event—and that, therefore, the former should be the target of the analysis of any adequate account of luck. Nonetheless, Coffman’s account of strokes of luck features the same kind of conditions that other authors give in their analyses of the notion of lucky event. In view of this, Hales (2015) objects that Coffman’s approach unnecessarily adds an extra layer of complexity to the already complex analysis of luck and casts doubt on how an analysis of the notion of stroke of luck can shed any more light than an analysis of the notion of lucky event in those areas of philosophy where the concept of luck plays a significant role.

c. General Features of Luck

Before entering into further details, it is convenient to highlight three general features of luck that any adequate analysis of the concept should be able to explain.

Goodness and badness. Luck can be good or bad. This is clearly true of relational luck. For instance, we say things such as “Dylan was lucky to survive the car accident” or “Dylan was unlucky to die in the car accident” to mean, respectively, that it is good luck that he survived and bad luck that he died. Moreover, one and the same event can be both good and bad luck for an agent, which plausibly has to do with the fact that two or more interests of the agent are at stake—Ballantyne (2012). For example, losing one’s keys and having to spend the night outdoors is bad luck if one gets a cold as a consequence, but it is also good luck if one thereby avoids an explosion in one’s apartment.

By contrast, attributions of non-relational luck do not so clearly convey good or bad luck (for example, “The discovery of Pluto was a matter of luck”). This is plausibly due to the fact that such attributions do not denote any relationship between a lucky event and an agent or group of agents. To put it differently, if we interpret that sort of attribution as conveying good or bad luck, it is probably because we read it as denoting such a relationship. At any rate, accounting for why luck is good or bad is a desideratum at least for analyses of relational luck.

Finally, although the term “lucky” is ordinarily associated with good luck, in the philosophical literature, it is used to denote events that instantiate good luck as well as events that instantiate bad luck. This is done mainly for the sake of simplicity.

Vagueness. Luck is to some extent a vague notion. Not all instances of luck are as clear-cut as a lottery win. For example, goals from the corner kick in professional soccer matches are considered neither clearly lucky nor clearly produced by skill. Pritchard (2005: 143) gives another example: if someone drops her wallet, keeps walking and after five minutes realizes that she just lost her wallet, returns to the place where she dropped it and finds it, is that person lucky to have found her wallet? The answer is not clear. Accordingly, we should not expect an analysis of luck to remove this vagueness. On the contrary, an adequate account should predict borderline cases, that is, cases that are neither clearly lucky nor clearly non-lucky. This is a desideratum for accounts not only of relational luck but also of non-relational luck.

Gradualness. Luck is a gradual notion. In ordinary parlance, it is common to attribute different degrees of luck to different events. For example, winning one million dollars playing roulette is luckier than winning one dollar, even if the odds are the same. Interestingly, winning the prize of an ordinary lottery is also luckier than winning the same amount of money by tossing a coin, because the odds of winning the lottery are lower. An adequate analysis of luck should also be able to account for these differences in degree. Again, this concerns accounts of relational luck as well as of non-relational luck.

2. Luck and Significance

Several atomic nuclei joining and triggering off an explosion is an event that is neither lucky nor unlucky for anyone if it happens at the other end of the galaxy. But it is bad luck if the explosion takes place nearby. One way to account for the difference in luckiness is that while the former event is not significant to anyone, the latter is significant to whoever is nearby. Cases like this motivate philosophers who theorize about the concept of luck to endorse a significance condition, that is, a requirement to the effect that an event is lucky for an agent only if the event is significant to the agent.

Since the significance condition establishes a relationship between an agent and an event, whether one thinks that such a condition is needed or not depends on what the target of one’s account is. For instance, if one is in the business of analyzing relational luck, one will be willing to include a significance condition in one’s analysis. But if one’s aim is to account for non-relational luck instead (that is, to say when an event is lucky simpliciter), one will be reluctant to include such a condition in one’s analysis; see Pritchard (2014) for further discussion.

Although there is wide agreement that an adequate analysis of relational luck must include a significance condition, there is considerable disagreement about its specific formulation. Pritchard (2005: 132–3) formulates the significance condition as follows:

S1: An event E is lucky for an agent S only if S would ascribe significance to E, were S to be availed of the relevant facts.

S1 requires that lucky agents have the capacity to ascribe significance. But that is problematic insofar as the condition prevents sentient nonhuman beings (Coffman 2007) and human beings with diminished capacities like newborns or comatose adults (Ballantyne 2012) from being lucky.

Coffman (2007) proposes an alternative significance condition in terms of the positive or negative effect of lucky events on the agent:

S2: An event E is lucky for an agent S only if (i) S is sentient and (ii) E has some objective evaluative status for S—that is, E has some objectively good or bad, positive or negative, effect on S.

Ballantyne (2012) gives a counterexample to S2 by arguing, first, that (ii) should be read as follows:

(ii)* E has some objectively positive or negative effect on S’s mental states.

The reason given by Ballantyne is that if the event’s effect is not on the agent’s mental states, it is not obvious why clause (i) is required. With that in place, the counterexample to S2 goes as follows: an unlucky man has no inkling that scientists have randomly selected him to put his brain in a vat to feed his neural connections with real-world experiences. The case is allegedly troublesome for S2 because the event, which is bad luck for the man, has no impact on the man’s mental states and, in particular, on his interior life, which is not altered.

A reply might be that, although the fact that the man’s brain is put in a vat does not affect the man’s interior life and namely his phenomenal mental states, it certainly affects his representational mental states. In particular, most of them turn out false, which seems to be objectively negative for the man, just as S2 requires.

Ballantyne (2012) proposes an alternative formulation of the significance condition in terms of the positive or negative effect of lucky events on the agent’s interests:

S3: An event E is lucky for an agent S only if (i) S has a subjective or objective interest N and (ii) E has some objectively positive or negative effect on N—in the sense that E is good or bad for S.

S3 is more specific than S2 in the kind of attributes that are supposed to be positively or negatively affected by lucky events. While S2 does not say whether these need to be the qualitative states of sentient beings, or their representational states, or their physical condition, S3 is explicit that what lucky events affect are the subjective and objective interests of individuals.

Leaving aside the question of what the correct formulation of the significance condition is, it is interesting to see how a significance condition can help explain the three general features of luck outlined above, that is, goodness and badness, vagueness, and gradualness. Concerning goodness and badness, the explanation is straightforward: luck is good or bad because the significance that lucky events have for people is positive or negative. Concerning vagueness, significance is a vague concept, so including a significance condition in an analysis of luck at least does not remove its inherent vagueness. Concerning gradualness, it can be argued that the degree of luck of an event varies in proportion to its significance or value; see Latus (2003), Levy (2011: 36), and Rescher (1995: 211–12; 2014). Consider the previous example of winning one million dollars playing roulette versus winning one dollar when the probability of winning is the same: it can simply be argued that the former event is luckier than the latter because it is more significant.

3. Probabilistic Accounts

Paradigmatic lucky events—for example, winning a fair lottery—typically occur by chance. Probabilistic accounts of luck explicitly appeal to the probability of an event’s occurrence to explain why it is by luck. In addition, they typically include a significance condition to explain why events are lucky for agents. For discussion purposes, the analyses of luck below will be presented as analyses of significant events, so the relevant significance condition can be omitted.

a. Objective Accounts

Some accounts make use of objective probabilities to define luck, that is, the kind of probabilities that are not determined by an agent’s evidence or degree of belief, but by features of the world:

OP1: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low probability that E would occur at t.

OP1 says that lucky events are events whose occurrence was not objectively likely. A related way to formulate a probabilistic view, suggested by Baumann (2012), is by means of conditional objective probabilities:

OP2: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low objective probability conditional on C that E would occur at t.

C is whatever condition one uses to determine the probability that the event will occur. For example, the unconditional probability that Lionel Messi will score a goal in the soccer match is high, but given C (the fact that he is injured) the probability that he will score is low. Suppose that Messi ends up scoring by luck. The condition helps explain why: he was injured and therefore it was not very likely that he would score.

According to Hales (2014), probabilistic views of luck such as OP1 or OP2 are the most widespread among scientists and mathematicians. But they face at least two problems. First, a dominant, although not undisputed, idea is that necessary truths have probability 1. In view of this, Hales (2014) argues that probabilistic analyses cannot account for lucky necessities, which are maximally probable. For example, he contends that organisms, humans included, are lucky to be alive because the gravitational constant, G, has the value it actually has, but the probability that G made life possible is 1.

Second, another problem for probabilistic accounts is that, although rare, there are lucky events whose occurrence is highly probable; see Broncano-Berrocal (2015). Suppose that someone is the most wanted person in the galaxy and that billions of mercenaries are trying to kill her, but also that her combat skills drastically reduce the probability that each independent assassination attempt will succeed. Suppose that one such attempt succeeds for completely fortuitous reasons that have nothing to do with the exercise of her skills. That she is killed is obviously bad luck, but it was also very probable given how many mercenaries were trying to kill her: even if each killing attempt had a low probability of succeeding, the probability that at least one would succeed was high given the number of independent attempts; that is, the probability of the disjunction of all attempts was high. This shows, contrary to what OP1 and OP2 say, that luck does not entail low probability of occurrence.
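The arithmetic behind the mercenary case can be made explicit. The following is a minimal sketch, not part of Broncano-Berrocal's argument: it assumes that the attempts are independent and equally likely to succeed, and the function name and the figures (a one-in-a-billion chance per attempt, five billion attempts) are hypothetical values chosen only for illustration.

```python
def prob_at_least_one(p_single: float, n_attempts: int) -> float:
    """Probability that at least one of n independent attempts succeeds.

    Assumes every attempt has the same success probability p_single
    (an illustrative simplification, not something the example requires).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if n_attempts < 0:
        raise ValueError("n_attempts must be non-negative")
    # P(at least one success) = 1 - P(every attempt fails)
    return 1.0 - (1.0 - p_single) ** n_attempts


# Hypothetical figures: each attempt is extremely unlikely to succeed,
# but there are billions of attempts.
p_each = 1e-9
attempts = 5_000_000_000
p_any = prob_at_least_one(p_each, attempts)
print(f"single attempt: {p_each:.1e}, at least one success: {p_any:.4f}")
```

Under these assumed figures the probability of the disjunction comes out at roughly 0.99, even though each disjunct is about as improbable as an event can be. That is the structure the example relies on: a highly probable outcome that still seems to be a matter of bad luck.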

OP1 and OP2 are analyses of synchronic luck. McKinnon (2013; 2014) proposes a probabilistic account of diachronic luck instead. The view, called the expected outcome view, starts with the observation that we can determine the expected objective ratio of many events, including people’s performances. By way of illustration, the expected ratio of flipping a coin is 50 percent tails and 50 percent heads. On the other hand, the expected ratio of a certain basketball player’s free-throw shots being successful might be 90 percent. However, in real-life series of tosses or free-throw shots the outcomes typically deviate from those values. In the light of these considerations, McKinnon proposes the following view:

OP3: For any series A of events (E1, E2, …, En) that are significant to an agent S and for any objective expected ratio N of outcomes for events of type E, S is lucky proportionally to how much the actual ratio of outcomes in A deviates from N.

In a nutshell, McKinnon’s view is that we attribute any deviation from the expected ratio of outcomes to luck, and namely to good luck—if the deviation is positive—and to bad luck—if the deviation is negative. If the actual ratio is as expected, the ratio is fully attributable to skill. One key element of McKinnon’s view—and the reason why she rejects any attempt to give an account of synchronic luck—is that she thinks that, while we can know that the set of outcomes that deviate from the expected ratio are due to luck, we cannot know which one of the outcomes in that set is by luck. In other words, we can know whether we are diachronically lucky, but not whether we are synchronically lucky.
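The idea behind OP3 can be illustrated with a short sketch. OP3 only says that luck is proportional to the deviation from the expected ratio; measuring that deviation as a simple signed difference between the actual and the expected success ratio is an assumption made here for illustration, as are the function names and the free-throw figures.

```python
def ratio_deviation(outcomes, expected_ratio):
    """Signed difference between the actual success ratio and the expected one.

    outcomes: sequence of booleans (True = successful performance).
    The signed difference is an assumed measure; OP3 does not fix one.
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    actual_ratio = sum(1 for o in outcomes if o) / len(outcomes)
    return actual_ratio - expected_ratio


def classify(deviation, tolerance=1e-9):
    """Label a series as good luck, bad luck, or as expected."""
    if deviation > tolerance:
        return "good luck"
    if deviation < -tolerance:
        return "bad luck"
    return "as expected"


# Hypothetical player whose expected free-throw ratio is 90 percent.
expected = 0.9
hot_streak = [True] * 19 + [False]       # 19 of 20 made
cold_streak = [True] * 16 + [False] * 4  # 16 of 20 made
typical = [True] * 18 + [False] * 2      # 18 of 20 made

for name, series in [("hot", hot_streak), ("cold", cold_streak), ("typical", typical)]:
    print(name, classify(ratio_deviation(series, expected)))
```

Note that the sketch only looks at the series as a whole. It says nothing about which individual shot in the hot streak was the lucky one, which matches McKinnon's claim that diachronic luck can be determined while synchronic luck cannot.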

Before turning to a different type of probabilistic account, let us see how accounts modeling luck in terms of objective probability explain the three general features of luck outlined above. On the one hand, they can explain why luck is a gradual notion in a natural way. For instance, Rescher (1995: 211–12; 2014) thinks that luck varies with not only significance but also chance. If S is the value or significance of an event E, how lucky E is can be determined, according to Rescher, as follows:

Luck = S × (1 – Prob[E]).

In other words, Rescher thinks that luck increases in proportion to the value or significance that the event has for the agent and decreases as the probability of its occurrence increases.
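Rescher's formula can be applied to the two examples used earlier to introduce gradualness. The following is a minimal sketch: the formula is the one stated above, but measuring significance in dollars and the specific odds (a single-number roulette bet at 1 in 37, a lottery at 1 in a million) are assumptions made only for illustration.

```python
def rescher_luck(significance: float, prob: float) -> float:
    """Degree of luck as significance times (1 - probability of the event)."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be a probability between 0 and 1")
    return significance * (1.0 - prob)


# Same odds, different significance (dollar amounts stand in for significance).
roulette_odds = 1 / 37
big_win = rescher_luck(1_000_000, roulette_odds)
small_win = rescher_luck(1, roulette_odds)

# Same significance, different odds.
prize = 1_000
lottery_win = rescher_luck(prize, 1e-6)
coin_win = rescher_luck(prize, 0.5)

print(big_win > small_win)     # more significant, hence luckier
print(lottery_win > coin_win)  # less probable, hence luckier
```

On this measure an event that was certain to occur has a degree of luck of zero whatever its significance, which is why lucky necessities are a problem for views of this kind.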

On the other hand, defenders of objective probabilistic views might in principle explain why luck is a vague notion in epistemic terms. They might argue that knowing exactly how lucky someone is with respect to an event entails that the exact probability of the event’s occurrence is known. However, the relevant probabilities are typically unknown or, at best, approximately known, which might in principle help explain why, say, a goal from the corner kick is neither clearly lucky nor clearly produced by skill: prior to its occurrence, the probability that it would occur was unclear.

Finally, as we have seen, McKinnon thinks that her view also helps explain why luck is good or bad: luck is good or bad depending on whether the actual deviation from the expected ratio is positive or negative.

b. Subjective Accounts

A different way to model luck in probabilistic terms is by means of subjective probabilities, that is, the kind of probabilities that are determined by an agent’s evidence or degree of belief. One way to state this kind of view is that whether or not an event counts as lucky for an agent depends on the agent’s degree of belief in the occurrence of the event, that is, on how confident she is or how strongly she believes that the event will occur—see Latus (2003), Rescher (1995: 78–80), and Steglich-Petersen (2010) for relevant discussion. More precisely:

SP1: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S had a low degree of belief that E would occur at t.

A subjective probabilistic account might be also formulated in terms of the agent’s evidence for the occurrence of the event—see Steglich-Petersen (2010):

SP2: A significant event E is lucky for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, there was low probability that E would occur at t.

SP1 and SP2 characterize luck as a perspectival notion: if for A but not for B it is subjectively improbable that an event E will occur, then, if E occurs, E is lucky for A but not for B—Latus (2003) endorses this thesis. For example, suppose that someone receives a big check from a secret benefactor. From that person’s perspective, it is good luck that she has received the check, but from the perspective of the benefactor, it is not—the example is from Rescher (1995: 35). In addition, those who firmly believe in fate or whose evidence strongly points to its existence are never lucky according to these views, because everything that happens to them is highly probable from their perspective.

Stoutenburg (2015) gives a similar evidential account of degrees of luck. The idea is that an agent is lucky with respect to an event to the extent that her evidence does not guarantee its occurrence, in the sense that if the conditional probability of the occurrence of the event given the agent’s evidence is not maximal, she is lucky to some degree with respect to that event:

SP3: A significant event E is lucky to some degree for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, the probability that E would occur at t is not 1.

A problem for views such as SP1, SP2, and SP3 is that events are no less lucky if we have no evidence or have not thought about them—see Steglich-Petersen (2010). For example, someone would be clearly lucky if, unbeknownst to her, a bullet just missed her head by centimeters. Steglich-Petersen (2010) thinks that one way to fix this problem is to formulate a subjective view in terms of the agent’s total knowledge instead of her degree of belief or evidence for the occurrence of the event:

SP4: A significant event E is lucky for an agent S at time t if and only if, for all S knew just before the occurrence of E at t, there was low probability that E would occur at t.

SP4 is compatible with an event being lucky for the agent when she has no prior evidence or doxastic state about its occurrence. But SP4 might still not yield the right results. Consider a macabre lottery in which all the participants have been poisoned and the only way to survive is to win the prize, which is the antidote. The lottery draw is a fair one, so surviving is a pure matter of chance. Suppose that the only difference in knowledge between two participants, A and B, is that only A knows of herself that she has been poisoned and is a participant in the lottery. For all A knows, there is low probability that she will survive. In contrast, for all B knows, her survival is very likely, since she is a healthy person and has no reason to think that she has been poisoned. According to SP4, B would not be lucky if she won the lottery and survived as a result. Intuitively, however, A and B would be equally lucky if they won the lottery.

In general, this and other cases might be taken to illustrate that what is apparently lucky does not always coincide with what is actually lucky—see Rescher (2014) for the distinction between apparent and actual luck. A potential problem for subjective views is then that they might be only capturing intuitions about the former.

Steglich-Petersen (2010) advances a different account, which is not probabilistic in nature, but which is worth considering in this section, not only because it is a natural development of SP4, but also because, like SP2, SP3, and SP4, it characterizes luck as an epistemic notion. In particular, it analyzes luck in terms of the agent’s epistemic position with respect to the future occurrence of the lucky event:

SP5: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S was not in a position to know that E would occur at t.

Steglich-Petersen explains that we are in a position to know that an event will occur if, by taking up the belief that the event will occur, we thereby know that it will occur. SP5 yields the correct result in the macabre lottery case, which was troublesome for SP4. None of the participants is in a position to know that they would win the lottery and survive as a result. For that reason, the winner is lucky.

However, SP5 might not capture the intuitions of other cases correctly. Suppose that someone is the holder of a ticket in a fair lottery. During the lottery draw, a Laplacian demon predicts and tells that person that she will be the winner, so she comes to know in advance—and therefore is in a position to know—that she will be the winner. However, that person is not less lucky to win the lottery because of that knowledge or because of being in that position. After all, it is still a coincidence that she has purchased the ticket that corresponds to the accurate prediction of the demon. In sum, knowing that one will be lucky—and therefore being in a position to know it—does not necessarily prevent one from being lucky.

Before considering an alternative approach to luck, let us see how subjective probabilistic accounts explain the three general features of luck presented at the beginning of the article. On the one hand, they can account for degrees of luck in terms of degrees of subjective probability. As we have seen, SP3 says that an agent is increasingly lucky with respect to an event the less likely the occurrence of the event—conditional on her evidence—is. On the other hand, advocates of the subjective approach might explain borderline cases of luck by appealing to the fact that the relevant subjective probabilities are not always transparent, so if we cannot determine whether an event is lucky or non-lucky, it is plausibly because the relevant subjective probabilities cannot be determined either. Finally, to explain why luck is good or bad defenders of subjective accounts can simply include a significance condition on luck in their analyses.

4. Modal Accounts

A different approach to luck emphasizes the fact that paradigmatic instances of luck such as lottery wins could have easily failed to occur. Modal accounts accordingly explain luck in terms of the notion of easy possibility. As usual in areas of philosophy where the notion of possibility is invoked, advocates of the modal approach use possible worlds terminology to explain that notion and in turn the concept of luck. In this sense, that a lucky event could have easily not occurred means that, although it occurs in the actual world, it would fail to occur in close possible worlds.

Closeness is simply assumed to be a function of how intuitively similar possible worlds are to the actual world. For example, if an event E occurs at time t in the actual world, close possible worlds can be obtained by making a small change to the actual world at t and by seeing what happens to E at t or at times close to t—see Coffman (2007; 2014) for relevant discussion. One should keep in mind that although current modal views are closeness views, it is in principle possible to give a modal account of luck that ranges over distant possible worlds.

Several formulations of modal conditions on luck can be found in the literature, where the main point of disagreement concerns the proportion of close possible worlds in which an event must fail to occur in order for its actual occurrence to be by luck. For discussion purposes, those conditions will be presented here as if they constituted full-fledged analyses of luck, but it is important to keep in mind that modal conditions are typically considered necessary but not sufficient for a significant event to be by luck. A prominent exception is Pritchard (2005), who is the only author in the literature advocating a pure modal account of luck; in more recent work (2014), he drops the significance condition from his analysis, plausibly because he is mainly interested in giving an account of non-relational luck. Also for discussion purposes, the analyses of luck below will be presented, as before, as analyses of significant events. Without further ado, let us consider the following modal account by Pritchard (2005: 128):

M1: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a wide proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

According to M1, one is lucky to win a fair lottery because in a wide class of close possible worlds one would lose. M1 has two important features. The first one is that it does not treat every close possible world as relevant to determining whether an event is lucky or not: only those in which the relevant initial conditions are the same as in the actual world. According to Pritchard (2014), the relevant initial conditions for an event are specific enough to allow a correct assessment of the luckiness of the target event, but not so specific as to guarantee its occurrence. Nonetheless, Pritchard leaves as a contextual matter what features of the actual world need to be fixed in our evaluation of close possible worlds. For instance, when we assess the modal profile of lottery results, we typically keep fixed features such as the fairness and the odds of the lottery or the fact that one has decided to purchase a specific lottery number.

Riggs (2007) argues that M1 is defective precisely because there is no non-arbitrary way to fix the relevant initial conditions. In reply, Pritchard (2014) argues that an analysis of a concept should not be more precise than the concept that the analysis intends to account for. Given that luck is a vague notion, the somewhat vague clause on initial conditions might be after all doing some explanatory work.

The second important feature of M1 is that it requires that the lucky event fails to occur in a wide proportion of close possible worlds. Pritchard (2005: 130) explains that by “wide” he means at least approaching half the close possible worlds, where events that are clearly lucky would not obtain in most close possible worlds.

However, there are clearly lucky events, such as obtaining heads by flipping a coin, that would occur in a large proportion of close possible worlds: since the probability of heads is 0.5, we can suppose that in half the close possible worlds the outcome would still be heads. Perhaps the following slightly different formulation is to be preferred (see Coffman 2007):

M2: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in at least half the close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

However, Levy (2011: 17–18) argues that if we accept that an event that does not occur in half the close possible worlds is lucky, we can also accept that an event that does not occur in a little less than half the close possible worlds (for example, in 49 percent of them) is lucky as well. In view of this, Levy thinks that it is better not to commit one’s modal account to a precise view of the issue. Instead, Levy argues that there is no fixed proportion of close possible worlds where an event must not occur to be considered lucky in the actual world. His point is that there might be different “large enough” proportions of close possible worlds in which events must fail to occur to be considered lucky. According to Levy, what makes the threshold vary from case to case is the significance that the event has for the agent. A modal account in the spirit of Levy’s considerations would then be the following:

M3: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world, where the relevant proportion of close possible worlds is determined by the significance that E has for S.

Lackey (2008) raises two important objections to the modal approach. The first one challenges the idea that the easy possibility of an event not occurring is necessary for luck. She proposes a counterexample involving a modally robust lucky event. Suppose that (i) A buries a treasure at location L and that (ii) B independently places a plant in the ground of L. When digging, B discovers A’s treasure. Lackey’s point is that if we stipulate that A’s and B’s independent actions are sufficiently modally robust, in the sense that there is no chance that they would fail to occur in close possible worlds, B’s discovery, which is undeniably lucky, would occur in most close possible worlds.

Pritchard (2014) and Levy (2009) try to circumvent the objection in two steps. First, they distinguish between the notions of luck and fortune. Then, they propose an error theory according to which most people would be mistaken to say that B’s discovery is by luck: B’s discovery is in reality fortunate, not lucky—see section 7 for the specific way in which Pritchard and Levy distinguish luck from fortune.

Lackey’s second objection targets the idea that the easy possibility of an event not occurring is sufficient for luck. Lackey thinks that whimsical events—that is, events that result from actions that are done on a whim—show exactly this. For instance, suppose that someone decides to catch the next flight to Paris on a whim. That person’s going to Paris is not by luck—since it is the result of her self-conscious decision—but it would nevertheless fail to occur in most close possible worlds—since she has made the decision on a whim.

In reply, Broncano-Berrocal (2015) argues that Lackey’s objection overlooks the clause on initial conditions of modal accounts: if someone decides to go—and goes—to Paris on a whim, close possible worlds in which the relevant initial conditions for that trip are the same as in the actual world—that is, the only possible worlds that according to modal views are relevant to assess whether the trip is by luck—are worlds in which that person makes the decision to go to Paris. But, consistently with what modal accounts say, that person goes to Paris in most of those worlds. In a similar way, when it comes to evaluating whether someone in possession of a specific ticket is lucky to have won the lottery, we only consider close possible worlds in which she has decided to buy that specific ticket. Again, in most of those worlds, that ticket is a loser, just as modal accounts predict.

On the other hand, Hales (2014) thinks that cases of lucky necessities are problematic not only for objective probabilistic accounts but also for modal views. For example, if Jack the Ripper is terrorizing the neighborhood and it is one’s dearest friend Bob knocking on one’s door, one might be lucky that Bob is not Jack the Ripper, but it is metaphysically impossible that Bob is Jack the Ripper because things are self-identical—Hales gives credit to John Hawthorne for the example.

Before turning to lack of control views, let us see how modal accounts explain the three general features of luck. Concerning goodness or badness, modal views can simply include a significance condition—although, as noted, Pritchard (2014), one of the main advocates of the modal approach, thinks that a significance condition is not necessary for luck. In addition, we have seen that the clause on the relevant initial conditions of the event is vague enough to preserve the characteristic vagueness of the concept of luck.

On the other hand, modal views have at least two interesting ways to account for degrees of luck—the terminology below is from Williamson (2009), who applies it to the safety condition for knowledge. M1, M2, and M3 adopt what can be called the proportion view of the gradualness of luck: they cash out the degree of luck of an event in terms of the proportion of close possible worlds in which it would fail to occur—the larger the proportion of such close possible worlds is, the luckier the event is. Church (2013) argues that the proportion view should not be restricted to close possible worlds only: degrees of luck should be modeled in terms of all relevant possible worlds, although he also argues that more weight should be given to close ones.

The idea that more weight should be given to some possible worlds when fixing the degree of luck of an event serves to stipulate a different view of the gradualness of luck. The view, which can be called the distance view, says that the degree of luck of an event varies as a function of the distance to the actual world of possible worlds in which it would fail to occur. In this way, the closer those worlds are, the luckier the event is—Pritchard (2014) endorses the distance view.
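The contrast between the proportion view and the distance view can be pictured with a small Python sketch. It is only a toy model under stated assumptions: possible worlds are represented as hypothetical (distance, occurs) pairs, the numeric distances are invented, and the formula used for the distance view is one arbitrary choice among many. None of the authors cited proposes a numerical measure of this kind.

```python
from typing import List, Tuple

# Hypothetical representation: (distance from the actual world, does E occur there?)
World = Tuple[float, bool]


def proportion_luck(worlds: List[World]) -> float:
    """Proportion view: the share of the given worlds in which E fails to occur."""
    if not worlds:
        return 0.0
    failures = sum(1 for _, occurs in worlds if not occurs)
    return failures / len(worlds)


def distance_luck(worlds: List[World]) -> float:
    """Distance view: the closer the nearest world in which E fails, the luckier E is.

    The score 1 / (1 + d), with d the distance of the nearest failure world,
    is an illustrative assumption, not a formula taken from the literature.
    """
    failure_distances = [d for d, occurs in worlds if not occurs]
    if not failure_distances:
        return 0.0
    return 1.0 / (1.0 + min(failure_distances))


# Two invented scenarios with the same proportion of failure worlds,
# but with the failure world at different distances from actuality.
near_miss = [(0.1, False), (0.5, True), (0.9, True), (1.0, True)]
far_miss = [(0.1, True), (0.5, True), (0.9, True), (1.0, False)]
```

On these invented inputs the two scenarios receive the same score on the proportion view, while the distance view ranks the first as luckier because the world in which the event fails is closer to the actual world.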

On a related note, modal theorists can explore the relation between the significance of a lucky event and its modal profile. As we have seen, Levy (2011) thinks that the size of the proportion of close possible worlds in which an event need not occur to count as lucky is sensitive to the significance that the event has for the agent. Although Levy thinks that it is a mistake to seek much clarity about how the latter affects the former, he also believes that there is a relation of inverse proportionality between the two: the more significant an event is for an agent, the smaller the proportion of close possible worlds in which it would not occur needs to be for it to be considered lucky for the agent—Coffman (2014) calls this the inverse proportionality thesis; see Levy (2011: 36).

By way of illustration, compare surviving a round of Russian roulette with one bullet in the chamber of a revolver with a six-shot capacity—approximately 0.16 probability of being shot—with winning one dollar in a poker game after having called an all-in that one knew one only had a 0.16 probability of losing. In both cases, one would succeed—that is, one would survive or win—in most close possible worlds, but only the former case is considered clearly lucky. The inverse proportionality thesis accounts for the difference: surviving is such a significant event that the proportion of close possible worlds in which one dies need not be large for one’s actual survival to be considered lucky. However, Coffman (2015: 40) argues that the thesis is not sustainable precisely because it leads to the result that all extremely significant events count as lucky if there is at least a small non-zero chance that they will not happen—for example, the thesis seems to entail that we are lucky to survive every time we take a flight.
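The comparison between the roulette case and the poker case can be made concrete with a toy threshold rule. The sketch below is an assumption-laden illustration: the rule threshold = 0.5 / significance and the significance scores assigned to each case are invented here, and Levy explicitly resists giving any precise function of this sort.

```python
def luck_threshold(significance: float) -> float:
    """Hypothetical rule: the required proportion of failure worlds shrinks as significance grows.

    A significance of 1 yields the "half of close worlds" threshold; larger values lower it.
    """
    if significance < 1.0:
        raise ValueError("this sketch assumes significance scores of at least 1")
    return 0.5 / significance


def counts_as_lucky(failure_proportion: float, significance: float) -> bool:
    """E counts as lucky if it fails in a large enough proportion of close worlds."""
    return failure_proportion >= luck_threshold(significance)


# Both events fail in roughly one sixth of close possible worlds.
FAILURE_PROPORTION = 1 / 6

# Invented significance scores: survival matters far more than winning one dollar.
ROULETTE_SIGNIFICANCE = 100.0
POKER_SIGNIFICANCE = 1.0

roulette_lucky = counts_as_lucky(FAILURE_PROPORTION, ROULETTE_SIGNIFICANCE)
poker_lucky = counts_as_lucky(FAILURE_PROPORTION, POKER_SIGNIFICANCE)
```

Under these assumptions surviving the roulette round counts as lucky and winning the dollar does not. On this toy rule the threshold also tends toward zero as significance grows, which is one way to picture the worry that Coffman raises about extremely significant events.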

5. Lack of Control Accounts

One of the most widespread intuitions about luck is that lucky events are events beyond our control. For example, one way to explain why we are lucky to win the lottery is that the outcome of the lottery is beyond our control. In the literature, different lack of control views account for luck in those terms.

Some authors give pure lack of control accounts—for example, Broncano-Berrocal (2015), Riggs (2009). Other authors think that lack of control conditions are necessary but not sufficient for significant events to be by luck—for example, Coffman (2007; 2009), Latus (2003), Levy (2009; 2011). As in the case of modal conditions, and mainly for discussion purposes, the latter will be presented as if they constituted full-fledged analyses of luck—also as before, the analyses will be presented as analyses of significant events. That said, the simplest lack of control account has the following form:

LC1: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t.

Many lucky events are beyond our control, so LC1 seems to be on the right track. However, Lackey (2008) argues that the fact that a significant event is beyond our control is neither necessary nor sufficient for the event being lucky. Against the sufficiency claim, Lackey argues that many nomic necessities—for example, sunrises—are not under our control, but that does not mean that they are by luck—see also Latus (2003) for this objection. To prove that lack of control is not necessary for luck, Lackey proposes a case in which a demolition worker, A, succeeds in demolishing the warehouse she was planning to demolish when pressing the button of the demolition system she had designed to that effect only because the electrical current is accidentally restored after the damage caused by a mouse when chewing the connection wires. According to Lackey, the explosion is both under A’s control and by luck.

Coffman (2009) and Levy (2011), who think that lack of control is not sufficient for luck, argue that Lackey’s counterexample to the necessity claim rests on the false thesis—called by Coffman the luck infection thesis—that if luck affects the conditions that enable an exercise of control, then the exercise of control itself is by luck; more generally, if S is lucky to be in a position to ϕ and S ϕ-es, then S ϕ-es by luck. The thesis, according to Coffman, has blatant counterexamples. For example, a lifeguard who accidentally goes to work very early and sees a swimmer drowning is lucky to be in a position to save the swimmer, but if done competently, it is not by luck that she saves him.

To overcome this and other objections, lack of control theorists define the notion of control in different ways. For example, Coffman (2009) thinks that an event is under an agent’s control just in case she is free to do something that would help produce it and something that would help prevent it. Rescher (1969: 329) gives a similar account of control as the capacity to produce the occurrence of an event—what Rescher calls positive control—and the capacity to prevent it—what he calls negative control. While Rescher defends a probabilistic account of luck, Coffman thinks that lack of both negative and positive control—when understood in terms of freedom—is necessary for luck. The following is a lack of control view in the spirit of Coffman’s and Rescher’s respective conceptions of control:

LC2: A significant event E is lucky for an agent S at time t if and only if S is not both free to do something that would help produce E at t—or lacks the capacity to do it—and free to do something that would help prevent E at t—or lacks the capacity to do it.

An immediate problem for LC2 is that it is not the same to have control as to exercise it. We might have control over something in the sense that we are free or have the capacity to control it, but that does not mean that we actually exercise that capacity or freedom. For example, a competent pilot who is free or has the capacity to produce and prevent a plane crash but who refuses to take control of the plane for some reason is objectively lucky that a passenger manages to land the plane safely and that she survives as a result.

Levy (2011: chap. 5) understands control in terms similar to those of Coffman and Rescher, but he introduces additional epistemic constraints. For Levy, an event is under an agent’s control just in case there is a basic action that she could perform which she knows would bring about the event and how it would do so. This way to understand control can be supplemented with Rescher’s point that agents can also control an event by inaction, omission or inactivity (Rescher 1969: 369). Taking the latter into account, the following is a pure lack of control view in the spirit of Levy’s conception of control:

LC3: A significant event E is lucky for an agent S at time t if and only if S is not able to perform—or to omit performing—a basic action whose occurrence—or non-occurrence—is such that S knows would bring about—or prevent—E at t and how it would do so.

According to LC3, if we do not want to be exposed to the whims of luck, we not only have to be able to perform—or omit performing—actions that causally influence the world, but we also need to know that, and how, the world is sensitive to them.

A potential problem for LC3 is that we might be properly described as being in control of something when we act in a way that brings it to a desired state even though we do not know how exactly this happens. For example, a driver might know that by turning the steering wheel to the left she will avoid an obstacle in the road, but she might be completely mistaken about how exactly this works—for example, she might erroneously believe that, whenever she turns the steering wheel to the left, it is a magical dwarf who moves the car to the left. So, she knows that her basic action will bring about the desired effect while failing to know how. The problem is that if that person competently avoids the obstacle, the maneuver seems under her control, no matter that she mistakenly thinks that it is under the dwarf’s.

A different lack of control account is due to Riggs (2009), who tries to defend the lack of control approach from Lackey’s objection that the fact that an event is beyond our control does not suffice for the event being lucky. Riggs admits that although it is true that many nomic necessities—for example, sunrises—are beyond our control, we can still exploit them to our advantage. The idea is that if we exploit them for some purpose, they are not lucky for us even if they are not under our control. The following analysis accounts for luck in those terms:

LC4: A significant event E is lucky for an agent S at time t if and only if (i) E is beyond S’s control at t and (ii) S did not successfully exploit E, prior to E’s occurrence at t, for some purpose.

To illustrate how LC4 can distinguish between lucky and non-lucky physical events beyond our control, Riggs proposes a case in which two people, A and B, are about to be executed, but only A knows two important facts: first, that their captors believe that solar eclipses are in reality a message from the gods telling them to stop sacrifices; second, that, unbeknownst to their captors, a solar eclipse will take place at the exact time the execution is planned. Riggs thinks that, while B is lucky to be released, A is not. By being in a position to exploit the eclipse in her favor, A is in control of the situation.

Coffman (2015: 10) argues via counterexample that LC4 does not distinguish correctly between lucky and non-lucky physical events beyond our control. He proposes a case in which someone lives in an underground facility that is, unbeknownst to her, solar-powered. According to Coffman, that person, who has become completely oblivious to sunrises, is not lucky that the sun rises every morning and keeps her facility running, even though it is something that is neither under her control nor successfully exploited by her for some purpose.

Broncano-Berrocal (2015) gives a lack of control account in the spirit of Riggs’s, but with significant differences. According to Broncano-Berrocal, there are two ways in which something might be under our control. On the one hand, we exercise effective control over something by competently bringing it to a desired state—for example, by causally influencing it in a certain way. On the other hand, something is under our tracking control when we actively check or monitor that it is currently in a certain desired state, so that we are thereby disposed or in a position either (i) to exercise effective control over it or (ii) to act in a way that would allow us to achieve goals related to the thing controlled—for example, exploiting it to our advantage. By way of illustration, when flying on autopilot mode, a pilot does not exercise effective control over the plane—for example, she does not exert any causal influence on it—but the plane is under her tracking control if she is sufficiently vigilant. A key point of Broncano-Berrocal’s account is that, depending on the practical context, attributions of control such as “Event E is under S’s control” might refer either to effective control, to tracking control, or to both. The corresponding account of luck is the following:

LC5: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t, where E is beyond S’s control at t either if (i) S lacks effective control over E, or (ii) E is not under S’s tracking control, or (iii) both.

Lotteries are typically not under our tracking control—although they might be if a Laplacian demon tells us what the result will be. The reason why winning a fair lottery is a matter of luck is, according to LC5, that we are not able to causally influence the result in the desired way, that is, the fact that we lack effective control. By the same token, LC5 also counts as lucky winning a lottery that, unbeknownst to one, has been rigged in one’s favor.

LC5 allows for a different response to Lackey’s demolition case: Lackey’s intuition that the explosion is under A’s control can be explained in terms of the fact that A exercises effective control over the explosion by pressing the button. But the intuition that A is lucky to demolish the warehouse is parasitic on the fact that the explosion is not under A’s tracking control. In particular, the practical context provided by Lackey is such that A is responsible for the design of the demolition system but fails to check whether the connection wires are damaged—sometimes, tracking control might be very difficult to achieve. In a similar way, LC5 explains that, while we lack effective control over many physical events—for example, sunrises—the reason why they are not lucky is that they are under our tracking control, that is, they are things that we regularly monitor and thereby can exploit to our advantage.
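The way LC5 handles the lottery, demolition and sunrise cases can be summarized in a short Python sketch. It rests on an interpretive assumption that should be flagged: the practical context is modeled as a set of "relevant" kinds of control, and an event counts as beyond the agent's control when she lacks at least one contextually relevant kind. The class names, the field names and the assignment of relevant kinds to each case are hypothetical and are not part of Broncano-Berrocal's own formulation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Control(Enum):
    EFFECTIVE = "effective"  # competently bringing something to a desired state
    TRACKING = "tracking"    # actively monitoring that it is in a desired state


@dataclass(frozen=True)
class Case:
    name: str
    has: FrozenSet[Control]       # kinds of control the agent actually has
    relevant: FrozenSet[Control]  # kinds the practical context makes salient (assumption)


def is_lucky(case: Case) -> bool:
    """A significant event is lucky if some contextually relevant kind of control is missing.

    Significance is taken for granted here; only the control condition is modeled.
    """
    return not (case.relevant <= case.has)


fair_lottery = Case(
    "winning a fair lottery",
    has=frozenset(),
    relevant=frozenset({Control.EFFECTIVE}),
)
demolition = Case(
    "Lackey's demolition case",
    has=frozenset({Control.EFFECTIVE}),
    relevant=frozenset({Control.TRACKING}),
)
sunrise = Case(
    "a sunrise that we regularly monitor",
    has=frozenset({Control.TRACKING}),
    relevant=frozenset({Control.TRACKING}),
)
```

On this reading the lottery win and the demolition come out as lucky, while the monitored sunrise does not, which matches the verdicts described above.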

Coffman’s solar-powered facility case, the counterexample to LC4, is also a counterexample to LC5. Coffman’s point is that sunrises are not lucky for the person living in the solar-powered underground facility, even though they are not under her control—tracking or effective. In reply, defenders of lack of control views might argue that it is not unreasonable to say that such a person is lucky that the sun rises every morning and keeps, unbeknownst to her, her facility running. After all, there are similar attributions of luck in ordinary speech. For example, we say things such as “S is lucky to live in an earthquake-free region” even though S is unaware of it and is therefore lucky that an earthquake will not make her house collapse.

Finally, Hales (2014) thinks that there are cases of skillful achievements that lack of control accounts are compelled to consider lucky. For instance, he thinks that not even the best batter in history can plausibly be said to have control over whether he hits the ball, since there are many factors over which he cannot exercise any sort of control—for example, distractions, the pitches he receives, and the play of the opposing fielders. In reply, lack of control theorists might argue that Hales is illicitly raising the standards of control. After all, intuitions about whether the result of our actions is under our control go hand in hand with intuitions about whether the result of our actions is because of our skills.

As a final note, let us briefly consider how lack of control accounts explain the three general features of luck presented at the beginning of the article. Concerning goodness or badness, lack of control views can, like other views, simply include a significance condition. Concerning vagueness, the notion of control is not so precise as to remove all vagueness from the analysis of luck. Concerning gradualness, control, like luck, comes in degrees. In particular, lack of control theorists might endorse the view that the degree of luck of an event is inversely proportional to the degree of control that the agent has over it—see Latus (2003) for further discussion.

6. Hybrid Accounts

Some authors opt for giving accounts of luck that mix modal or probabilistic conditions with lack of control conditions. The rationale behind this move is, as Latus (2003) puts it, that although lack of control over an event often goes hand in hand with the event having low chance of happening—or with the event being modally fragile—there are non-lucky events that are either beyond our control—for example, sunrises—or have low chance of occurring—for example, rare significant events brought about by ability. Latus’s hybrid view features a lack of control condition and a subjective probabilistic condition:

H1: A significant event E is lucky for an agent S at time t if and only if, (i) just before the occurrence of E at t, S had a low degree of belief that E would occur at t, and (ii) E is beyond S’s control at t.

By contrast, Coffman (2007) and Levy (2011) opt for conjoining a lack of control condition with modal conditions. Coffman’s analysis is roughly the following—he includes several further refinements to handle specific cases of competing significant events:

H2: A significant event E is lucky for an agent S at time t if and only if, (i) E does not occur around t in at least half the possible worlds obtainable by making no more than a small change to the actual world at t, and (ii) E is beyond S’s control at t.

Levy’s hybrid analysis (2011) features a different modal condition:

H3: A significant event E is lucky for an agent S at time t if and only if, (i) E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds, where the relevant proportion of close possible worlds is inverse to the significance of E for S, and (ii) E is beyond S’s control at t.

Levy calls this kind of luck chancy luck, but argues that there also exists a non-chancy variety of luck, which is the kind of luck that affects one’s psychological traits or dispositions relative to a reference group of individuals—for example, human beings.

Any of the already discussed counterexamples to the necessity for luck of (i) subjective probabilistic conditions—for example, cases of agents without beliefs about events that are lucky for them, (ii) objective probabilistic conditions—for example, cases of highly probable lucky events, (iii) modal conditions—for example, Lackey’s buried treasure case, and (iv) lack of control conditions—for example, Lackey’s demolition case—are troublesome for hybrid views.

7. Luck and Related Concepts

There are several concepts that are closely related to the concept of luck. Here we will focus on the concepts of accident, coincidence, fortune, risk, and indeterminacy.

a. Accidents

The concept of accident is closely related to the concept of luck. After all, most accidents—for example, car crashes—involve luck—mostly bad luck. But as Pritchard (2005: 126) argues, there are paradigmatic cases of luck that involve no accidents. For example, if one self-consciously chooses a specific lottery ticket and wins the lottery, one’s winning is by luck, but it is not an accident given that one was trying to win.

From Pritchard’s example, we might infer that if an agent acts with the intention of bringing about some result, then if it occurs, it is not an accident. However, if someone prays with the intention of bringing about some event and the event occurs by sheer coincidence—because that person’s prayers are causally irrelevant to its occurrence—the event is accidental. But the mere causal relevance of an agent’s actions to an event’s occurrence is not sufficient for excluding accidentality either. If a pilot dancing in the cockpit unintentionally presses the depressurization button and as a result the plane crashes, the crash is an accident despite being caused by the pilot.

This suggests that what prevents the outcomes of an action from being accidental—but not from being lucky—is both the fact that an agent acts with the intention to bring about a certain outcome and the fact that her action is causally relevant to that outcome. For example, if someone wins a lottery in which participants have to pick a ball directly from the lottery drum with a blindfold on, that person’s winning is lucky but not accidental because it is brought about by her direct intentional action.

b. Coincidences

The concept of coincidence is also closely related to the concept of luck. Owens (1992) gives an account according to which a coincidence is an inexplicable event in the following sense: we cannot explain why its constituents come together because they are produced by independent causal factors—see also Riggs (2014) for a similar account. More specifically, coincidences are such that we cannot explain why they occur because there is no common nomological antecedent of their components or a nomological connection between them. For example, if someone prays for rain and it rains, that it rains is a coincidence because there is no nomological connection between that person’s prayers and the fact that it rains. On the other hand, how close or immediate an antecedent should be in order to prevent two events from constituting a coincidence is a matter that usually becomes clear in context. For example, we would regard as a coincidence the fact that someone wishes that her favorite team wins the final and that as a matter of fact it ends up winning the final, even though both events have some distant common nomological antecedent—for example, the Big Bang; see Riggs (2014) for further discussion.

Not all lucky events are coincidental events. For example, it is no coincidence that a coin lands heads when someone flips it. But that might be clearly lucky for that person. Just as causally relevant intentional action prevents an event from being an accident, it also seems to prevent a pair of events—someone’s flipping of the coin and the coin landing heads—from being a coincidence. By contrast, all coincidental events, if significant, are lucky. For example, if someone prays for rain because she is in need of water and it rains, the coincidental event that it rains is lucky for that person.

Probabilistic and modal views have difficulties when it comes to accounting for highly probable or modally robust lucky events arising out of coincidence. As Lackey’s buried treasure case illustrates, if the occurrence of the components of a coincidence—A’s burial of the treasure and B’s digging at the same location—is highly probable or modally robust, the occurrence of the resulting coincidental event—B’s discovery of A’s treasure—is also highly probable or modally robust. Yet, the event is lucky precisely because it arises out of a coincidence.

c. Fortune

In the literature, there is some disagreement concerning whether or not the concept of fortune is the same as the concept of luck. Most modal theorists think that luck and fortune are different and use the distinction to argue that Lackey’s buried treasure case is in reality a case of fortune, while their theories are theories of luck.

For example, Pritchard (2005: 144, n.15; 2014) thinks that fortunate events are events beyond our control that count in our favor, but unlike lucky events, they are not chancy or modally fragile. In this way, having good health or a good financial situation are instances of fortune, not of luck, while winning a fair lottery is only an instance of luck. Rescher (1995: 28–9) similarly thinks that we can be fortunate if something good happens to or for us in the natural course of things, but we are lucky only if such eventuality is chancy. In a similar vein, Coffman (2007; 2014) thinks that we are lucky to win a fair lottery—given how unlikely it was—but we are merely unfortunate to lose it—given how likely it was.

Finally, Levy (2009; 2011: 17) thinks that fortunate events are non-chancy events—hence non-lucky—but luck-involving, in the sense that they have luck in their causal history and, in particular, in their proximate causes. His reply to Lackey’s buried treasure case is that luck in the circumstances—the lucky coincidence that someone places a plant at the same location in which someone has buried a treasure—is not inherited by the actions performed in those circumstances or by the events resulting from them—for example, the discovery of the treasure. So while there is luck involved in the circumstances of the discovery, the discovery itself is merely fortunate.

Against the distinction between luck and fortune, Broncano-Berrocal (2015) and Stoutenburg (2015) argue that the terms “luck” and “fortune” can be interchanged in English sentences without any significant semantic difference. Moreover, since English speakers use the terms interchangeably, arguing that luck and fortune are two distinct concepts entails that speakers are systematically mistaken in their usage of the terms, which is a hardly tenable error theory. For example, we would be wrong in saying that someone is fortunate to win a raffle or lucky to win a lottery that, completely unbeknownst to her, has been rigged in her favor.

d. Risk

There is a close connection between the concepts of luck and risk. In fact, some theorists think that the connection is so close that the former can be explained in terms of the latter—see Broncano-Berrocal (2015), Coffman (2007), Pritchard (2014; 2015), and Williamson (2009) for relevant discussion. On the one hand, Pritchard (2015) explains that a risk or a risk event is a potential, unwanted event that is realistically possible—that is, something that could credibly occur—whereas a risky event is a potential, unwanted event that has higher risk than normal of occurring—for example, there is always a risk that one’s plane might crash, but flying by plane is not risky. With that distinction in place, Pritchard distinguishes two competing ways to understand the notion of risk or of risk event.

The probabilistic account of risk says that an event is at risk of occurring just in case there is non-zero objective probability that it will occur. How high its risk of occurrence is—that is, how risky it is—depends on how probable its occurrence is. The modal account of risk, by contrast, says that an event is at risk of occurring just in case it would occur in at least some close possible worlds—see also Coffman (2007) and Williamson (2009). How high its risk of occurrence is—that is, how risky it is—depends on how large the proportion of close possible worlds in which it would occur is—call this the proportion view of degrees of risk—or on how distant possible worlds in which it would occur are—call this the distance view of degrees of risk.

Pritchard contends that the probabilistic account fails to adequately account for degrees of risk. In particular, he argues that if two risk events E1 and E2 have the same probability of occurring but E1 is such that its occurrence is easily possible, E1 is riskier than E2, but the probabilistic account is committed to saying that they are equally risky.

Pritchard (2014; 2015) also argues that when risk is understood in modal terms, the notions of luck and risk are basically co-extensive, because both how lucky and how risky an event is depend on the modal profile of the event’s occurrence, that is, on the size of the proportion of close possible worlds in which it would not obtain, or the distance to the actual world of possible worlds in which it would not occur. According to Pritchard, the only two minor differences between the two notions are, on the one hand, that risk is typically associated with negative events, whereas luck can be predicated of both negative and positive events; on the other, that while we can talk of very low levels of risk, we cannot so clearly talk of low levels of luck.

Broncano-Berrocal (2015) makes a further distinction between two ways in which we think of risk: the risk that an event has of occurring—or event-relative risk—and the risk at which an agent is with respect to an event—or agent-relative risk. The distinction serves to delimit the scope of Pritchard’s account: his modal account of risk is an account of event-relative risk—the same applies to the probabilistic view. For Broncano-Berrocal, the modal and probabilistic accounts of event-relative risk are both correct: while the probabilistic conception is the one that is typically used or assumed in scientific and technical contexts, the modal conception better fits our everyday thinking about risky events. On the other hand, the best way to understand the agent-relative sense of risk is, according to Broncano-Berrocal, in terms of lack of control: an agent is at risk with respect to the possible occurrence of an event just in case its occurrence is beyond her control. He further argues that the agent-relative sense of risk is the one that really serves to account for luck: when risk is understood in terms of lack of control, the notions of luck and risk are basically co-extensive, because whether an event is lucky or risky for an agent depends on whether it is under the agent’s control.

e. Indeterminacy

In a causally deterministic world, events are necessitated as a matter of natural law by antecedent conditions. It might be thought that lucky events are events whose occurrence was not predetermined in that way. Against this idea, Pritchard (2005: 126–27) argues that at least some lucky events are not brought about by indeterminate factors. For example, given the position and momentum of the balls in a lottery drum at time t1 it might be fully determinate that a certain combination of balls will be the winning combination at t2. To make the point more vivid, Coffman (2007) proposes an example in which someone’s life depends on the fact that a ball remains perfectly balanced on the tip of a cone in a deterministic world. According to Coffman, that person can be properly described as being lucky if her stay in the deterministic world corresponds to the predetermined temporal interval in which the ball would remain balanced on the cone’s tip. Another example is the following: a Laplacian demon, who is able to predict the future given his knowledge of the complete state of a deterministic world at a prior time, might be unlucky to know in advance that he will die in a car accident. The moral of all these cases is that luck is—or at least seems—fully compatible with determinism.

8. References and Further Reading

  • Ballantyne, Nathan. 2014. Does luck have a place in epistemology? Synthese 191: 1391–1407.
    • Ballantyne argues that investigating the nature of luck does not help us better understand knowledge.
  • Ballantyne, Nathan. 2012. Luck and interests. Synthese 185: 319–334.
    • Ballantyne provides a detailed examination of the different ways to formulate the significance condition on luck.
  • Baumann, Peter. 2012. No luck with knowledge? On a dogma of epistemology. Philosophy and Phenomenological Research DOI: 10.1111/j.1933-1592.2012.00622.
    • Baumann defends an objective probabilistic condition.
  • Broncano-Berrocal, Fernando. 2015. Luck as risk and the lack of control account of luck. Metaphilosophy 46: 1–25.
    • Broncano-Berrocal proposes a lack of control account and argues that luck can be explained in terms of risk.
  • Coffman, E. J. 2015. Luck: Its nature and significance for human knowledge and agency. Palgrave Macmillan.
    • Coffman’s monograph includes extensive criticism of leading theories of luck and argues that luck can be explained in terms of the notion of stroke of luck; it also explores the applications in epistemology and philosophy of action of that idea.
  • Coffman, E. J. 2014. Strokes of luck. Metaphilosophy 45: 477–508.
    • Coffman proposes an account of strokes of luck.
  • Coffman, E. J. 2009. Does luck exclude control? Australasian Journal of Philosophy 87: 499–504.
    • Coffman defends a specific way to understand the lack of control condition on luck.
  • Coffman, E. J. 2007. Thinking about luck. Synthese 158: 385–398.
    • Coffman gives a hybrid account of luck in terms of easy possibility and lack of control.
  • Church, Ian M. 2013. Getting ‘Lucky’ with Gettier. European Journal of Philosophy 21: 37–49.
    • Church explores several ways to model degrees of luck in modal terms.
  • Hales, Steven D. 2015. Luck: Its Nature and Significance for Human Knowledge and Agency, by E.J. Coffman. The Philosophical Quarterly, DOI: 10.1093/pq/pqv093.
    • Critical book review of Coffman’s monograph.
  • Hales, Steven D. 2014. Why every theory of luck is wrong. Noûs, DOI: 10.1111/nous.12076.
    • Hales gives three kinds of counterexamples to probabilistic, modal, and lack of control accounts of luck.
  • Hales, Steven D. & Johnson, Jennifer Adrienne. 2014. Luck attributions and cognitive bias. Metaphilosophy 45: 509–528.
    • Hales and Johnson conduct an empirical investigation on luck attributions and suggest that the results might indicate that luck is a cognitive illusion.
  • Lackey, Jennifer. 2008. What luck is not. Australasian Journal of Philosophy 86: 255–267.
    • Lackey argues that the conditions of modal and lack of control analyses are neither sufficient nor necessary for luck.
  • Latus, Andrew. 2003. Constitutive luck. Metaphilosophy 34: 460–475.
    • Latus gives a hybrid account of luck that features subjective probabilistic and lack of control conditions and uses the account to show that the concept of constitutive luck is not incoherent.
  • Levy, Neil. 2011. Hard luck: How luck undermines free will and moral responsibility. Oxford University Press.
    • Levy proposes a hybrid account that conjoins a modal condition with a lack of control condition and argues that the epistemic requirements on control are so demanding that they are rarely met; he also applies this account to the free will debate.
  • Levy, Neil. 2009. What, and where, luck is: A response to Jennifer Lackey. Australasian Journal of Philosophy 87: 489–497.
    • Levy argues, by appeal to the distinction between luck and fortune, that Lackey’s buried treasure case poses no problem for modal accounts.
  • McKinnon, Rachel. 2014. You make your own luck. Metaphilosophy 45: 558–577.
    • McKinnon gives an answer to the question of what it means to say that someone creates her own luck and uses her account of diachronic luck to explain how we evaluate performances.
  • McKinnon, Rachel. 2013. Getting luck properly under control. Metaphilosophy 44: 496–511.
    • McKinnon proposes an account of diachronic luck in terms of the notion of expected value.
  • Milburn, Joe. 2014. Subject-involving luck. Metaphilosophy 45: 578–593.
    • Milburn distinguishes between subject-relative and subject-involving luck and argues that one of the upshots of focusing on the latter is that lack of control accounts of luck become more attractive.
  • Owens, David. 1992. Causes and coincidences. Cambridge University Press.
    • Owens gives an account of coincidences according to which a coincidence is an event whose constituents are nomologically independent of each other.
  • Pritchard, Duncan. 2015. Risk. Metaphilosophy 46: 436–461.
    • Pritchard argues that the standard way of conceptualizing risk in probabilistic terms is flawed and proposes an alternative modal conception.
  • Pritchard, Duncan. 2014. The modal account of luck. Metaphilosophy 45: 594–619.
    • Pritchard defends the modal account of luck from several objections.
  • Pritchard, Duncan. 2005. Epistemic luck. Oxford University Press.
    • Pritchard introduces the modal account of luck and gives corresponding accounts of epistemic and moral luck.
  • Pritchard, Duncan, & Smith, Matthew. 2004. The psychology and philosophy of luck. New Ideas in Psychology 22: 1–28.
    • Pritchard and Smith survey psychological research on luck and argue that it supports the modal account of luck.
  • Pritchard, Duncan, & Whittington, Lee John (eds.). 2015. The philosophy of luck. Wiley-Blackwell.
    • A volume with many of the papers contained in this bibliography.
  • Rescher, Nicholas. 2014. The machinations of luck. Metaphilosophy 45: 620–626.
    • Rescher defends an objective probabilistic account of luck.
  • Rescher, Nicholas. 1995. Luck: The brilliant randomness of everyday life. Farrar, Straus and Giroux.
    • Rescher provides an extensive examination of the concept of luck as well as of many other issues surrounding it.
  • Rescher, Nicholas. 1969. The concept of control. In Essays in Philosophical Analysis. University of Pittsburgh Press: 327–354.
    • Rescher provides an extensive examination of the concept of control.
  • Riggs, Wayne D. 2014. Luck, knowledge, and “mere” coincidence. Metaphilosophy 45: 627–639.
    • Riggs advances an account of coincidence and applies it to the theory of knowledge.
  • Riggs, Wayne. 2009. Knowledge, luck, and control. In Haddock, A., Millar, A. & Pritchard, D. (eds.). Epistemic value. Oxford University Press.
    • Riggs proposes a lack of control account of luck and replies to some objections.
  • Riggs, Wayne. 2007. Why epistemologists are so down on their luck. Synthese 158: 329–344.
    • Riggs criticizes the modal account of luck and defends a lack of control condition.
  • Steglich-Petersen, Asbjørn. 2010. Luck as an epistemic notion. Synthese 176: 361–377.
    • Steglich-Petersen gives an epistemic analysis of luck in terms of the notion of being in a position to know.
  • Stoutenburg, Gregory. 2015. The epistemic analysis of luck. Episteme, DOI:10.1017/epi.2014.35.
    • Stoutenburg gives an evidential account of degrees of luck.
  • Williamson, Timothy. 2009. Probability and danger. The Amherst Lecture in Philosophy 4: 1–35.
    • Williamson compares probabilistic and modal conceptions of safety and risk and discusses how they bear on the theory of knowledge.

 

Author Information

Fernando Broncano-Berrocal
Email: fernando.broncanoberrocal@kuleuven.be
University of Leuven (KU Leuven)
Belgium

Epistemic Justification

We often believe what we are told by our parents, friends, doctors, and news reporters. We often believe what we see, taste, and smell. We hold beliefs about the past, the present, and the future. Do we have a right to hold any of these beliefs? Are any supported by evidence? Should we continue to hold them, or should we discard some? These questions are evaluative. They ask whether our beliefs meet a standard that renders them fitting, right, or reasonable for us to hold. One prominent standard is epistemic justification.

Very generally, justification is the right standing of an action, person, or attitude with respect to some standard of evaluation. For example, a person’s actions might be justified under the law, or a person might be justified before God.

Epistemic justification (from episteme, the Greek word for knowledge) is the right standing of a person’s beliefs with respect to knowledge, though there is some disagreement about what that means precisely. Some argue that right standing refers to whether the beliefs are more likely to be true. Others argue that it refers to whether they are more likely to be knowledge. Still others argue that it refers to whether those beliefs were formed or are held in a responsible or virtuous manner.

Because of its evaluative role, justification is often used synonymously with rationality. There are, however, many types of rationality, some of which are not about a belief’s epistemic status and some of which are not about beliefs at all. So, while it is intuitive to say a justified belief is a rational belief, it is also intuitive to say that a person is rational for holding a justified belief. This article focuses on theories of epistemic justification and sets aside their relationship to rationality.

Many philosophers hold that justification is not only an evaluative concept but also a normative one. Having justified beliefs is better, in some sense, than having unjustified beliefs, and determining whether a belief is justified tells us whether we should, should not, or may believe a proposition. But this normative role is controversial, and some philosophers have rejected it for a more naturalistic, or science-based, role. Naturalistic theories focus less on belief-forming decisions (decisions from a subject’s own perspective) and more on describing, from an objective point of view, the relationship between belief-forming mechanisms and reality.

Regardless of whether justification refers to right belief or responsible belief, or whether it plays a normative or naturalistic role, it is still predominantly regarded as essential for knowledge. This article introduces some of the questions that motivate theories of epistemic justification, explains the goals that a successful theory must accomplish, and surveys the most widely discussed versions of these theories.

Table of Contents

  1. Starting Points
    1. The Dilemma of Inferential Justification
    2. Explaining How Beliefs are Justified
    3. Explaining the Role of Justification
    4. Explaining Why Justification is Valuable
    5. Justification and Knowledge
  2. Internalist Foundationalism
    1. Basic Beliefs
    2. Arguments For and Against Foundationalism
  3. Internalist Coherentism
    1. Varieties of Coherence
    2. Objections to Coherentism
  4. Infinitism
    1. Arguments for Infinitism
    2. Objections to Qualified Infinitism
  5. Types of Internalism and Objections
    1. Accessibilism and Mentalism
    2. Objections to Internalism
  6. The Gettier Era
    1. The History of the Gettier Problem
    2. Responses to the Gettier Problem
  7. Externalist Foundationalism
    1. Externalism, Foundationalism, and the DIJ
    2. Reliabilism
    3. Objections to Externalism
  8. Justification as Virtue
    1. Virtue Reliabilism
    2. Virtue Responsibilism
    3. Objections to Virtue Epistemology
  9. The Value of Justification
    1. The Truth Goal
    2. Alternatives to the Truth Goal
    3. Objections to the Polyvalent View
    4. Rejections of the Truth Goal
  10. Conclusion
  11. References and Further Reading

1. Starting Points

Consider your simplest, most obvious beliefs: the color of the sky, the date of your birth, what chocolate tastes like. Are these beliefs justified for you? What would explain the rightness or fittingness of these beliefs? One prominent account of justification is that a belief is justified for a person only if she has a good reason for holding it. If you were to ask me why I believe the sky is blue and I were to answer that I am just guessing or that my horoscope told me, you would likely not consider either a good reason. In either case, I am not justified in believing the sky is blue, even if it really is blue. However, if I were to say, instead, that I remember seeing the sky as blue or that I am currently seeing that it is blue, you would likely think better of my reason. So, having good reasons is a very natural explanation of how our beliefs are justified.

Further, the possibility that my belief that the sky is blue is not justified, even if it is true that the sky is blue, suggests that justification is more than simply having a true belief. All of my beliefs may be true, but if I obtained them accidentally or by faulty reasoning, then they are not justified for me; if I am seeking knowledge, I have no right to hold them. Further still, true belief may not even be necessary for justification. If I understand Newtonian physics, and if Newton’s arguments seem right to me, and if all contemporary physicists testify that Newtonian physics is true, it is plausible to think that my belief that it is true is justified, even if Einstein will eventually show that Newton and I are wrong. We can imagine this was the situation of many physicists in the late 1700s. If this is right, justification is fallible—it is possible to be justified in believing false propositions. Though some philosophers have, in the past, rejected fallibilism about justification, it is now widely accepted. Having good reasons, it turns out, does not guarantee having true beliefs.

But the idea that justification is a matter of having good reasons faces a serious obstacle. Normally, when we give reasons for a belief, we cite other beliefs. Take, for example, the proposition, “The cat is on the mat.” If you believe it and are asked why, you might offer the following beliefs to support it:

1. I see that the cat is on the mat.

2. Seeing that X implies that X.

Together, these seem to constitute a good reason for believing the proposition:

3. The cat is on the mat.

But does this mean that proposition 3 is epistemically justified for you? Even if the combination of propositions 1 and 2 counts as a good reason to believe 3, proposition 3 is not justified unless both 1 and 2 are also justified. Do we have good reasons for believing 1 and 2? If not, then according to the good reasons account of justification, propositions 1 and 2 are unjustified, which means that 3 is unjustified. If we do have good reasons for believing 1 and 2, do we in turn have good reasons for believing the propositions that serve as those reasons? How long does our chain of good reasons have to be before even one belief is justified? These questions lead to a classic dilemma.

a. The Dilemma of Inferential Justification

For simplicity, let’s focus on proposition 1: I see that the cat is on the mat.

Horn A: If there are no good reasons to believe proposition 1, then proposition 1 is unjustified, which means 3 is unjustified.

Horn B: If there is a good reason to believe proposition 1, say proposition 1a, then either 1a is unjustified or we need another belief, proposition 1b, to justify 1a. If this process continues infinitely, then 1 is ultimately unjustified, and, therefore, 3 is unjustified.

Either way, proposition 3 is unjustified.

Horn A of the dilemma is the problem of skepticism about justification. If our most obvious beliefs are unjustified, then no belief derived from them is justified; and if no belief is justified, we are left with an extreme form of skepticism. Horn B of the dilemma is called the regress problem. If every reason we offer requires a reason that also requires a reason, and so on, infinitely, then no belief is ultimately justified.

Both of these problems assume that all justification involves inferring beliefs from one or more other beliefs, so let’s call these two problems the dilemma of inferential justification (DIJ). And let’s call the assumption that all justification involves inference from other beliefs the inferential assumption (also called the doxastic assumption, Pollock 1986: 19).

Responses to this dilemma typically take one of two forms. On one hand, we might embrace Horn A, which is, in effect, to adopt skepticism and eschew any further attempts to justify our beliefs. This is the classic route of the Pyrrhonian skeptics, such as Sextus Empiricus, and some later Academic skeptics, such as Arcesilaus. (For more on these views, see Ancient Greek Skepticism.)

On the other hand, we might offer an explanation of how beliefs can be justified in spite of the dilemma. In other words, we might offer an account of epistemic justification that resolves the dilemma, either by constructing a third, less problematic option or by showing that Horn B is not as troublesome as philosophers have traditionally supposed. This non-skeptical route is the majority position and the focus of the remainder of this article.

Philosophers tend to agree that any adequate account of epistemic justification—that is, an account that resolves the dilemma—must do at least three things: (1) explain how a belief comes to be justified for a person, (2) explain what role justification plays in our belief systems, and (3) explain what makes justification valuable in a way that is not merely practically or aesthetically valuable.

b. Explaining How Beliefs are Justified

One of the central aims of theories of epistemic justification is to explain how a person’s beliefs come to be justified in a way that resolves the DIJ. Those who accept the inferential assumption argue either that a belief is justified if it coheres with—that is, stands in mutual support with—the whole set of a person’s beliefs (coherentism) or that an infinite chain of sequentially supported beliefs is not as problematic as philosophers have claimed (infinitism).

Among those who reject the inferential assumption, some argue that justification is grounded in special beliefs, called basic beliefs, that are either obviously true or supported by non-belief states, such as perceptions (foundationalism). Others who reject the inferential assumption argue that justification is either a function of the quality of the mechanisms by which beliefs are formed (externalism) or at least partly a function of certain qualities or virtues of the believer (virtue epistemology).

In addition to resolving the DIJ, theories of justification must explain what it is about forming or holding a belief that justifies it in order to explain how a belief is justified. Some argue that justification is a matter of a person’s mental states: a belief is justified only if a person has conscious access to beliefs and evidence that support it (internalism). Others argue that justification is a matter of a belief’s origin or the mechanisms that produce it: a belief is justified only if it was formed in a way that makes the belief likely to be true (externalism), whether through an appropriate connection with the state of affairs the belief is about or through reliable processes. The former view is called internalism because the justifying reasons—whether beliefs, experiences, testimony, and so forth—are internal mental states, that is, states consciously available to a person. The latter view is called externalism because the justifying states are outside a person’s immediate mental access; they are relationships between a person’s belief states and the states of the world outside the believer’s mental states (see Internalism and Externalism in Epistemology).

c. Explaining the Role of Justification

A second central aim of epistemology is to identify and explain the role that justification plays in our belief-forming behavior. Some argue that justification is required for the practical work of having responsible beliefs. Having certain reasons makes it possible for us to choose well which beliefs to form and hold and which to reject. This is called the guidance model of justification. Some philosophers who accept the guidance model, like René Descartes and W. K. Clifford, pair it with a strongly normative role according to which justification is a matter of fulfilling epistemic obligations. This combination is sometimes called the guidance-deontological model of justification, where “deontology” refers to one’s duties with respect to believing. Other epistemologists reject the guidance and guidance-deontological models for more descriptive models. Justification, according to these philosophers, is simply a feature of our psychology, and though our minds form beliefs more effectively under some circumstances than others, the conditions necessary for forming justified beliefs are outside of our access and control. This objective, naturalistic model of justification has it that our understanding of justification should be informed, in large part, by psychology and cognitive science.

d. Explaining Why Justification is Valuable

A third central aim of theories of justification is to explain why justification is epistemically valuable. Some epistemologists argue that justification is crucial for avoiding error and increasing our store of knowledge. Others argue that knowledge is more complicated than attaining true beliefs in the right way and that part of the value of knowledge is that it makes the knower better off. These philosophers are less interested in the truth-goal in its unqualified sense; they are more interested in intellectual virtues that position a person to be a proficient knower, virtues such as intellectual courage and honesty, openness to new evidence, creativity, and humility. Though justification increases the likelihood of knowledge under some circumstances, we may rarely be in those circumstances or may be unable to recognize when we are; nevertheless, these philosophers suggest, there is a fitting way of believing regardless of whether we are in those circumstances.

A minority of epistemologists reject any connection between justification and knowledge or virtue. Instead, they focus either on whether a belief fits into an objective theory about the world or whether a belief is useful for attaining our many and diverse cognitive goals. An example of the former involves focusing solely on the causal relationship between a person’s beliefs and the world; if knowledge is produced directly by the world, the concept of justification drops out (for example, Alvin Goldman, 1967). Other philosophers, whom we might call relativists and pragmatists, argue that epistemic value is best explained in terms of what most concerns us in practice.

Debates surrounding these three primary aims inspire many others. There are questions about the sources of justification: Is all evidence experiential, or is some non-experiential? Are memory and testimony reliable sources of evidence? And there are additional questions about how justification is established and overturned: How strong does a reason have to be before a belief is justified? What sort of contrary, or defeating, reasons can overturn a belief’s justification? In what follows, we look at the strengths and weaknesses of prominent theories of justification in light of the three aims just outlined, leaving these secondary questions to more detailed studies.

e. Justification and Knowledge

The type of knowledge primarily at issue in discussions of justification is knowledge that a proposition is true, or propositional knowledge. Propositional knowledge stands in contrast with knowledge of how to do something, or practical knowledge. (For more on this distinction, see Knowledge.) Traditionally, three conditions must be met in order for a person to know a proposition—say, “The cat is on the mat.”

First, the proposition must be true; there must actually be a state of affairs expressed by the proposition in order for the proposition to be known. Second, that person must believe the proposition, that is, she must mentally assent to its truth. And third, her belief that the proposition is true must be justified for her. Knowledge, according to this traditional account, is justified true belief (JTB). And though philosophers still largely accept that justification is necessary for knowledge, it turns out to be difficult to explain precisely how justification contributes to knowing.

Historically, philosophers regarded the relationship between justification and knowledge as strong. In Plato’s Meno, Socrates suggests that justification “tethers” true belief “with chains of reasons why” (97A-98A, trans. Holbo and Waring, 2002). This idea of tethering came to mean that justification, when one is genuinely justified, guarantees or significantly increases the likelihood that a belief is true, and, therefore, we can tell directly when we know a proposition. But a series of articles in the 1960s and 1970s demonstrated that this strong view is mistaken; justification, even for true beliefs, can be a matter of luck. For example, imagine the following three things are true: (1) it is three o’clock, (2) the normally reliable clock on the wall reads three o’clock, and (3) you believe it is three o’clock because the clock on the wall says so. But if the clock is broken, then even though you are justified in believing it is three o’clock, you are not justified in a way that constitutes knowledge. You got lucky; you looked at the clock at precisely the time it corresponded with reality, but its correspondence was not due to the clock’s reliability. Therefore, your justified true belief seems not to be an instance of knowledge. This sort of example is characteristic of what I call the Gettier Era (§6). During the Gettier Era, philosophers were pressed to revise or reject the traditional relationship.

In response, some have maintained that the relationship between justification and knowledge is strong, but they modify the concept of justification in an attempt to avoid lucky true beliefs. Others argue that the relationship is weaker than traditionally supposed: something is needed to increase the likelihood that a belief is knowledge, and justification is part of that, but justification is primarily about responsible belief. Still others argue that whether we can tell we are justified is irrelevant; justification is a truth-conducive relationship between our beliefs and the world, and we need not be able to tell, at least not directly, whether we are justified. The Gettier Era (§6) precipitated a number of changes in the conversation about justification’s relationship to knowledge, and these remain important to contemporary discussions of justification. But before we consider these developments, we address the DIJ.

2. Internalist Foundationalism

One way of resolving the DIJ is to reject the inferential assumption, that is, to reject the claim that all justification involves inference from other beliefs. The most prominent way of doing this while avoiding skepticism is to show that all chains of good inference culminate at a unique kind of belief called a basic belief. Basic beliefs are beliefs that need not be inferred from any other beliefs in order to be justified. This approach to resolving the dilemma is called foundationalism because basic beliefs serve as a foundation on which all other justified beliefs are supported; a person’s beliefs are related to one another like the parts of a building: beliefs justified by inference are analogous to the roof and walls, which are in turn supported by foundational basic beliefs (see Figure 1).

Foundationalism comprises a family of views, all of which claim, at minimum, that all justified beliefs are either basic or inferred from other justified beliefs. Classically, foundationalists combine this view with two further claims: that we can know whether a belief is justified (that is, whether it stands in an evidential chain that starts with a basic belief), and that knowing whether we are justified helps us fulfill our epistemic duties. In other words, we do well when we form or keep beliefs that are well supported and discard or refuse beliefs that are not; we do poorly when we do not.

The view that justification is a matter of having certain internal mental states is called internalism, and the family of views that combines it with foundationalism is called internalist foundationalism. There is a further debate among internalists as to whether justification requires simply having certain mental states (propositional justification) or whether justified beliefs must be based on those mental states (doxastic justification). Philosophers who reject internalism are called externalists (see §7 of this article). Another debate among internalists is whether justification helps us to fulfill epistemic duties, that is, whether it tells us which beliefs are epistemically permissible, obligatory, or impermissible (the deontological conception of justification), or whether it is simply a descriptive fact about our belief systems. (For an example of the latter, see Conee and Feldman 2004.)

Figure 1: Simple Foundationalist Justification. The dots represent beliefs; the arrows represent inferential relations.
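
The recursive structure that foundationalism describes, in which a belief is justified if it is basic or is inferred from other justified beliefs, can be made concrete with a small sketch. The following Python is only an illustration and not a formalization drawn from the literature: the function name is_justified, the premises mapping, and the choice to treat propositions 1 and 2 of the cat example as basic are all assumptions made for the example. Circular chains are counted as unjustified here, which reflects the foundationalist picture and would not suit a coherentist one.

```python
from typing import Dict, FrozenSet, List, Set


def is_justified(
    belief: str,
    premises: Dict[str, List[str]],
    basic: Set[str],
    seen: FrozenSet[str] = frozenset(),
) -> bool:
    """Foundationalist check: a belief is justified if it is basic, or if it
    is inferred from premises that are all themselves justified."""
    if belief in basic:
        return True   # the chain of reasons stops at a foundation
    if belief in seen:
        return False  # circular chain: it never reaches a foundation
    reasons = premises.get(belief, [])
    if not reasons:
        return False  # not basic and no supporting reasons (Horn A)
    return all(
        is_justified(reason, premises, basic, seen | {belief})
        for reason in reasons
    )


# Hypothetical encoding of the cat example: proposition 3 is inferred
# from propositions 1 and 2.
premises = {
    "cat_on_mat": ["i_see_cat_on_mat", "seeing_implies_truth"],
}
# Assumption for illustration only: propositions 1 and 2 are basic.
basic = {"i_see_cat_on_mat", "seeing_implies_truth"}

print(is_justified("cat_on_mat", premises, basic))  # True
print(is_justified("cat_on_mat", premises, set()))  # False
```

With the two supporting propositions treated as basic, the check for proposition 3 succeeds. With an empty set of basic beliefs it fails, which corresponds to Horn A of the DIJ under the inferential assumption.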

a. Basic Beliefs

It is one thing to say basic beliefs resolve the DIJ and quite another thing to explain how they do. René Descartes famously argued that some beliefs are basic because they are indubitable. If a belief is genuinely indubitable, Descartes argued, it cannot be false. As it is commonly understood, dubitability is a psychological, not epistemic, matter. It might be indubitable for me that my mother loves me, even if it is not true and even if it is the sort of belief that could be doubted, even perhaps by me. But Descartes used “indubitable” to describe a belief that is clear and distinct, which is supposed to guarantee that the belief is true. (See Harry Frankfurt, 1973 for a fuller discussion of clarity and distinctness.) Other foundationalists have explained how some beliefs might stop the regress in virtue of self-evidence, or their privileged role in our belief-forming systems, or their incorrigibility.

Long before Descartes, simple mathematical propositions, such as 2 + 2 = 4, and logical propositions, such as “no one is taller than herself,” were thought to be so obvious that they could not be false. These propositions, many claimed, are self-evidently true, that is, they need no supporting evidence because any attempt to support them would be weaker than their intuitive truth. Some philosophers include perceptual experiences among self-evident beliefs, experiences such as seeing red and hearing a ringing sound. Even if you misperceive a color or a sound, or misperceive what seems to be colored or what seems to be ringing, you cannot doubt that you are having the experience of seeing redness or hearing ringing.

Another explanation for why some beliefs are basic is that they play a privileged role in our belief-forming systems. A common example of beliefs privileged in this way are those formed on the basis of sensory perception: seeing a red ball, touching what feels like a rough surface, hearing a bell. You could be hallucinating these experiences, so it is not self-evident that there is a ball, bell, or surface to experience. Nevertheless, the world impresses itself on you in this way, and it would be difficult to imagine functioning without any sense perceptions whatsoever; they play a highly privileged role in our belief systems and, therefore, can justify other beliefs (hence the emphasis that scientists have traditionally placed on observation).

Further candidates for basicality are beliefs that are true in virtue of being believed, that is, if you believe them, they are true. For example, propositions about intentional states (in other words, states about a mental state, such as hoping, doubting, thinking, believing, and so forth) logically imply the existence of the subject who is in the state. So, anyone who, while thinking, believes the proposition “I think” can logically infer “I exist.” Beliefs that, if held, are true are called incorrigible. Other examples may include beliefs about introspective states such as what you believe or feel or remember. If incorrigible beliefs can be recognized as true without appeal to any other beliefs, they are good candidates for justifying other, non-basic beliefs.

Unfortunately, it is not easy to see how all of our many and various non-basic justified beliefs can be inferred from this relatively small set of basic beliefs, even if we accepted every type of basic belief just mentioned. For example, imagine you have been looking for your laptop computer. When you find it, you form the belief, “There’s my laptop.” Did seeing your computer elicit the basic belief, “I seem to be perceiving a laptop there,” from which you then inferred the belief “There’s my laptop”? Not obviously. Seeing the laptop allowed you directly—without any reasoning at all—to form the belief that you found your laptop.

Examples like these have motivated some foundationalists to expand their accounts of basic beliefs to include a wider variety of experiences. These weaker accounts allow that there are many types of non-inferentially justified beliefs, all of which are at least properly basic, where “properly basic” means a belief that is either basic in the classic sense or that meets some other condition that makes it non-inferentially justified for a person. As long as there are a sufficient number of properly basic beliefs, these philosophers argue, a certain sort of foundationalism remains plausible.

One example of how proper basicality might work is Alvin Plantinga’s (1983; 1993a) argument for the rationality of religious belief. Plantinga’s notion of proper basicality is supposed to be weak enough to avoid problems with classic basic beliefs but strong enough to avoid the DIJ. According to him, if a belief is properly basic for a person, it is rational for that person to accept it without appealing to other reasons. He uses rational instead of justified to distance himself from classical problems. (Sometimes Plantinga puts it even more weakly, such that, if a belief is properly basic for a person, that person is not irrational in holding it.) As an example, Plantinga argues that if a person is raised in a religious community where the central religious claims he hears are corroborated by the community and none of those claims is undermined by contrary experience or argument, he is not violating any epistemic duty in believing that, say, God exists. His experiences and circumstances can “call forth belief in God” in a way that does not require other beliefs and can serve as a reason to accept other beliefs (1983: 81). This is a controversial view, not least because it either changes the discussion from justification to rationality or conflates justification and rationality. Nevertheless, basic beliefs are controversial no matter how they are characterized, and Plantinga’s proper basicality is just one among several. For another attempt to defend classical foundationalism against objections, see Timothy McGrew (1995).

b. Arguments For and Against Foundationalism

Foundationalism has remained competitive in the history of justification largely because of its intuitive advantages over competing views. The most common argument for foundationalism is the positive argument that it explains how we actually form beliefs on the basis of evidence. I believe the sky is blue because I see that it is blue, not because I infer it from other beliefs about the sky. Roderick Chisholm offers a sophisticated version of this argument, concluding that “[t]hinking and believing provide us with paradigm cases of the directly evident” (1966: 28). In addition to this positive argument, foundationalists offer the negative argument that no alternative account—skepticism, coherentism, or infinitism—has the resources to satisfactorily resolve the DIJ, that is, to avoid both skepticism and an infinite regress (see BonJour and Sosa 2003). This is, perhaps, the more powerful of the arguments and merits some attention.

Skepticism motivated epistemologists to inquire into justification in the first place, so the skeptical option is generally considered a loss. As an alternative, coherentists (§3) maintain that a person’s beliefs are justified in virtue of their relationship to the person’s belief set (see Lehrer 1974). If a belief stands or can stand in a consistent, mutually supportive relationship with other beliefs—a “web of belief,” as W. V. O. Quine (1970) calls it—that belief is justified. However, there is reason to believe that, since all beliefs stand in mutually supportive relationships, at least some beliefs (perhaps all) will play an indispensable role in their own support, rendering any coherentist argument viciously circular. Since circular arguments are fallacious, if coherentism entails that justification is circular, coherentism cannot resolve the DIJ.

A more recent alternative to skepticism is infinitism (see §4), according to which all justified beliefs stand in infinite chains of inferential relations (see Klein 2005). Skepticism is avoided because every belief is justified by some other belief. Unfortunately, infinitism requires that we accept one of two questionable assumptions: either that there simply is an infinite number of justifying beliefs available (to which our minds, in virtue of being finite, do not have access) or that there is some algorithm that, for any belief, B, can direct us to a non-circular justifying belief for B. The problem with the former assumption is that it seems to depend on faith that there is an infinite series of justifiers, which is not obviously better than having no justification at all. And the problem with the latter is that it comes dangerously close to foundationalism, where the algorithm functions as a basic belief. If the infinitist cannot refute these objections, infinitism cannot resolve the DIJ.

These are simple concerns about coherentism and infinitism, and we consider more sophisticated objections in sections 3 and 4. But, if neither coherentism nor infinitism can provide an alternative means of resolving the original dilemma, foundationalism may be the most promising alternative to skepticism. Unfortunately for foundationalists, even if they are right that some account of basic belief would adequately resolve the dilemma of inferential justification, it is not clear that such an account is currently available. Further, there are at least two other serious objections to foundationalism.

First, there is some concern that foundationalism cannot be justified by its own account of justification, that is, foundationalism is self-defeating. Alvin Plantinga (1993b) offers a version of this objection. According to foundationalists, a belief is justified if and only if it is either basic or inferred from other justified beliefs. This criterion, though, is not itself basic on any classical conception of basic beliefs (indubitability, self-evidence, evident to the senses, or incorrigibility), and it is not clear how it could be supported by other justified beliefs.

One straightforward response to this objection is that the arguments above (the positive argument and the negative argument by elimination) do provide, contra Plantinga, inferential support for foundationalism. In fact, Plantinga (1983; 1993) expands his own notion of proper basicality precisely to avoid the self-defeat objection. Further, if sophisticated reasoning strategies like induction could be justified on foundationalist grounds, then foundationalism itself may be justified on such grounds. For example, Laurence BonJour (1998) defends rational insight as a basic source of evidence and then argues that induction is justified by rational insight. If foundationalism is roughly correct and there are arguments grounded in rational insight that justify foundationalism, foundationalism might be vindicated. Of course, there remain concerns about the circularity of such arguments.

Other philosophers use an inference to the best explanation to defend a type of basic evidence, though these views may rightly be regarded as hybrids of foundationalism and coherentism. For example, Earl Conee and Richard Feldman (2008) argue that “[p]erceptual experiences can contribute toward the justification of propositions about the world when the propositions are part of the best explanation of those experiences that is available to the person.” The idea that what have been called basic beliefs are connected with the world and how we are positioned in the world is a better explanation of why we have the evidence we have than traditional accounts of justification. Catherine Z. Elgin (2005) offers a similar account, arguing that, while perceptions have “initial tenability” given their privileged role in our belief formation, they do not obtain this tenability in isolation from our whole evidential context; over time, certain perceptual beliefs have proved themselves to have the plausibility that allows us to privilege them.

A second objection to foundationalism is the meta-justification argument. The idea is that basic beliefs cannot resolve the DIJ because, even if their justification does not depend on other beliefs, it does depend on reasons which themselves require reasons. If I believe a proposition because it is indubitable, then I must have some reason for thinking that indubitable beliefs are likely to be true. If I do not, I am stuck with Horn A, and if I do, I am stuck with Horn B. To demonstrate this problem, Peter Klein (2005) asks us to imagine an argument between Fred and Doris, where Fred has come to what he regards as the basic belief on which his argument depends; call it b.

According to Fred, b has autonomous justification, that is, is a type of basic belief. Doris happens to agree that b is autonomously justified but asks whether beliefs with autonomous warrant are likely to be true. As a foundationalist, the most plausible option for Fred is the following: “He can hold that autonomously warranted propositions are somewhat likely to be true in virtue of the fact that they are autonomously warranted” (2005: 133).

If Fred is right, however, b only works as a justification for the rest of his argument precisely because he has added something to b. What has he added? Namely, that he “has a very good reason for believing b, namely b has F and propositions with F are likely to be true.” These are propositions independent of b that serve to justify b. Klein continues: “Of course Fred, now, could be asked to produce his reasons for thinking that b has F and that basic propositions are somewhat likely to be true in virtue of possessing feature F” (2005: 134). If this is right, basic beliefs do not stop the regress of reasons (see also Smithies 2014).

One response to this criticism comes from Laurence BonJour, who argues that it is plausible to think that understanding b includes a sort of built-in awareness of the content of those additional premises Klein mentions, such that understanding b constitutes, in and of itself, a reason to hold b (BonJour and Sosa 2003: 60-68). If it is possible to have an evidential state that includes, non-inferentially, all the content necessary for having a reason to believe a proposition is true, foundationalists may be able to describe a basic belief that stops the regress and avoids skepticism. But explaining just what this state is remains a point of controversy.

Another response is to construct an inference to the best explanation, as mentioned above in response to the self-defeat objection (Elgin, 2005; Conee and Feldman, 2008). The result, again, is typically a hybrid view, which may be equivalent to giving up foundationalism. Conee and Feldman say their view is closer to a “non-traditional version of coherentism” (2008: 98). And Elgin calls her view “a very weak foundationalism or…a coherence theory” (2005: 166). This raises questions about the merits of coherentism, to which we now turn.

3. Internalist Coherentism

Like foundationalists, coherentists attempt to avoid skepticism while rejecting infinitism. But they find a further problem with foundationalism. Every sensory state (seeing red, smelling cinnamon, and so forth) must be understood in a mental context, that is, one must have a set of background experiences, beliefs, and vocabulary sufficiently large for forming and understanding beliefs. All sensory beliefs, such as “I see red” and “I smell cinnamon,” require an immensely complex set of assumptions about self-reference, seeing, colors, smelling, and scents. This means that individual beliefs are not isolated bits of information that act as bricks in a building; they are nodes of information that depend for their meaning and support on a web of relationships with other beliefs.

Many coherentists accept the inferential assumption and argue that the result is not an infinite regress of inferences, but a non-linear system of support from which justification emerges as a property of the combination of inferences. As Donald Davidson puts it, “[N]othing can count as a reason for holding a belief except another belief” (2000: 156). Other coherentists reject the inferential assumption and argue that the result is a non-linear system of support from which justification emerges as a property of the set as a whole. Keith Lehrer explains: “This does not make the belief self-justified, however, even though it might be non-inferential. The belief is not justified independently of relations to other beliefs. It is justified because of the way it coheres with other beliefs belonging to a system of beliefs” (1990: 89). As we see below (§3.b), some coherentists reject the belief requirement of the inferential assumption, arguing that perceptual experiences can play a justifying role in the set of mental states that includes a person’s beliefs.

Regardless of whether coherentists accept the inferential assumption, they can allow that some beliefs are non-inferentially generated, for example, by experiences, intuitions, hunches, and so forth. But they are committed to the idea that the justification for beliefs generated in these ways depends essentially on their relationship to the person’s complete set of beliefs. Construed in this way, coherentism is specifically a view about justification and should not be confused with coherentism about truth. Some philosophers have held coherentism about both truth and justification (Blanshard 1939 and Lewis 1946), but many who hold coherentism about justification reject coherentism about truth (see BonJour 1985, ch. 5, and Truth).

a. Varieties of Coherence

Broadly, coherentists argue that a belief is justified just in case it stands in a system of mutually supporting relationships with other beliefs in a person’s system of beliefs. For instance, my belief that the cat is on the mat involves a complicated set of beliefs: I am seeing a cat, I am seeing a mat, I am seeing a cat on a mat, a cat is a particular kind of mammal, a mat is a particular type of floor covering, my vision is generally reliable under normal circumstances, these are normal circumstances, and so forth. It is difficult to imagine arranging these in a linear, foundationalist fashion. In addition, it is not clear whether some of these beliefs are more basic than some others. Nevertheless, they all cohere, which means they are logically consistent with one another and with other beliefs in my belief set, and they mutually support one another. The challenge for coherentists is to explain just what “mutual support” amounts to.

Whereas foundationalists employ the metaphor of a building (or a pyramid, in some cases) to explain justificational relationships, coherentists employ the metaphor of a web (or, in some cases, a raft), according to which, each node (or plank) works alongside the others in a non-linear fashion to constitute a stable, interconnected whole (see Figure 2, as well as Neurath 1932, Quine 1970, and Sosa 1980). There are four candidates for how the web or raft holds together: logical consistency, logical entailment, inductive probability, and explanation.

Figure 2: Simple Coherentist Justification. P-S represent propositions; the arrows represent lines of inference.

The first candidate, logical consistency, is generally regarded as necessary for coherence but too weak to stand on its own. For example, the belief that P and the belief that probably not-P are logically consistent. But they are not coherent; if one of them is true, the other is not likely to be true (BonJour 1985, ch. 5). Therefore, some early coherentists added that the relationship must also include logical entailment. This view, which I will call entailment coherentism, has it that a belief is justified just in case it entails or is entailed by every other belief in a person’s belief set (Blanshard 1939). Most coherentists now reject this relationship as overly strict, primarily because it seems possible to have two very different beliefs, neither of which entails the other and yet which are both justified. For example, consider the beliefs “I am seeing a needle puncture my skin” and “I am feeling pain.” Neither belief entails the other; nevertheless, it is intuitively plausible that both belong to a coherent set of beliefs.
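The difference between consistency and entailment in the needle example can be checked mechanically with a small truth table. The following is only a minimal sketch: it assumes, for illustration, that the two beliefs can be treated as logically independent atomic propositions, and the names `needle`, `pain`, `consistent`, and `entails` are hypothetical labels introduced here rather than anything drawn from the literature.

```python
from itertools import product

def worlds(atoms):
    """Yield every assignment of True/False to the atomic propositions."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def consistent(beliefs, atoms):
    """Beliefs are consistent if at least one world makes all of them true."""
    return any(all(b(w) for b in beliefs) for w in worlds(atoms))

def entails(a, b, atoms):
    """a entails b if b is true in every world in which a is true."""
    return all(b(w) for w in worlds(atoms) if a(w))

# Assumption: the two beliefs are modelled as independent atoms.
atoms = ["needle", "pain"]
sees_needle = lambda w: w["needle"]   # "I am seeing a needle puncture my skin"
feels_pain = lambda w: w["pain"]      # "I am feeling pain"

print(consistent([sees_needle, feels_pain], atoms))
print(entails(sees_needle, feels_pain, atoms))
print(entails(feels_pain, sees_needle, atoms))
```

On this toy model the two beliefs pass the consistency test while neither entails the other, which is the situation that entailment coherentism has trouble accommodating.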

Because of the problems with mere consistency and consistency plus entailment, most coherentists allow that entailment is sufficient for coherence but not necessary. To capture weaker relationships, they expand the notion to include inductive probability. Inductive probability coherentism is the view that a belief is justified just in case it is a member of a set each of whose members is entailed by or made more probable by a subset of the rest. C. I. Lewis, calling this type of justification “congruence,” puts it eloquently: “A set of statements, or a set of supposed facts asserted, will be said to be congruent if and only if they are so related that the antecedent probability of any one of them will be increased if the remainder of the set can be assumed as given premises” (1962). With their emphasis on inferential relations among beliefs, entailment and inductive probability coherentism attempt to resolve the DIJ by capturing the intuitive plausibility of the inferential assumption while avoiding the difficulties with basic beliefs.
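Lewis's congruence condition can be read as a test on a probability distribution: for each statement, its probability given that all the other statements are true must exceed its antecedent probability. The sketch below is one way to make that reading concrete. The two joint distributions are invented purely for illustration, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def marginal(joint, i):
    """Antecedent probability that statement i is true."""
    return sum(p for world, p in joint.items() if world[i])

def conditional_on_rest(joint, i):
    """Probability that statement i is true given that all the others are true."""
    rest_true = [(w, p) for w, p in joint.items()
                 if all(v for j, v in enumerate(w) if j != i)]
    total = sum(p for _, p in rest_true)
    if total == 0:
        return None  # the remainder cannot be assumed as premises
    return sum(p for w, p in rest_true if w[i]) / total

def is_congruent(joint):
    """True if every statement is made more probable by the remainder."""
    n = len(next(iter(joint)))
    for i in range(n):
        cond = conditional_on_rest(joint, i)
        if cond is None or cond <= marginal(joint, i):
            return False
    return True

all_worlds = list(product([True, False], repeat=3))

# Assumed distribution 1: three statements that tend to stand or fall together.
correlated = {w: Fraction(1, 30) for w in all_worlds}
correlated[(True, True, True)] = Fraction(2, 5)
correlated[(False, False, False)] = Fraction(2, 5)

# Assumed distribution 2: three statements that are probabilistically independent.
independent = {w: Fraction(1, 8) for w in all_worlds}

print(is_congruent(correlated))
print(is_congruent(independent))
```

Exact fractions are used so that the strict inequality is not affected by floating-point rounding. The independent distribution fails the test because assuming the remainder leaves each statement's probability unchanged.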

Unfortunately, inductive probability coherentism faces problems similar to those that face entailment coherentism. First, it seems plausible for a person to hold two justified beliefs without the antecedent probability of either increasing the epistemic probability of the other, even when conjoined with other beliefs in the set. Consider, for example, your beliefs that “the Red Sox will win the Pennant” and “John F. Kennedy was shot in 1963.” Both beliefs are reasonably part of a person’s belief system, and yet it is difficult to see how one might contribute to a set of beliefs that makes the other more probable. Second, even if a subset of beliefs in a set increases the probability of each other member, the set might not be sufficiently comprehensive or well-connected with one’s experiences to justify one’s beliefs. Imagine a set of 100 beliefs, any 99 of which render the 100th member more probable than its antecedent probability. This set passes the inductive probability test and is, therefore, coherent on this account, but it includes very few beliefs. This suggests that, in order to maintain coherence, we could arbitrarily expand or contract our set of beliefs at will to avoid loss of rationality. The only guideline is that we preserve strong inductive inferences. Unfortunately, such arbitrary sets ignore important differences in the sources of beliefs; we can imagine two inductively coherent sets, one that includes sensory beliefs and one that does not. Inductive probability coherentism, without further qualification, implies that neither set is more rational than the other. As Catherine Z. Elgin puts it, “A good nineteenth-century novel is highly coherent, but not credible on that account. Even though Middlemarch is far more coherent than our regrettably fragmentary and disjointed views…, the best explanation of its coherence lies in the novelist’s craft, not in the truth…of the story” (2005: 159-60).

A third prominent account of coherence aimed at avoiding this criticism allows that entailment and inductive probability can contribute to coherence but only insofar as they function in a plausible explanation of the set of beliefs. According to this view, known as explanatory coherentism, beliefs are justified just in case they explain or are explained by the other beliefs of the same type (Harman 1986 and Poston 2014). This view is not committed to the inferential assumption and argues that justification is an emergent property of the explanatory relations among beliefs. Catherine Z. Elgin says that “epistemic justification is primarily a property of a suitably comprehensive, coherent account, when the best explanation of coherence is that the account is at least roughly true” (2005: 158). Elgin adds that the beliefs comprising a coherent system “must be mutually consistent, cotenable, and supportive. That is, the components must be reasonable in light of one another” (2005: 158).

Explanatory coherentism takes its motivation from responses to a problem in philosophy of science that was similar to the problem that faces inductive probability coherentism (Neurath 1932 and Hempel 1935). Not every proposition in a scientific theory is derived inferentially from others, and so there is some question as to whether such propositions could be believed justifiably. It turns out, though, that those propositions play an important explanatory role in the theory that organizes evidence and concepts in plausible ways, even if those propositions have no antecedent probability outside of the system. Elgin explains, “For example, although there is no direct evidence of positrons, symmetry considerations show that a physical theory that eschewed them would be significantly less coherent than one that acknowledged them. So physicists’ commitment to positrons is epistemically appropriate” (2005: 164). This suggests that explanations can play a justifying role independently of inferential relations, thus lending plausibility to coherentism.

Explanatory coherence avoids criticisms of earlier accounts in that it (1) maintains that consistency is an important constraint on a belief set, and (2) maintains that inferential relations contribute to explanatory power, while (3) also accounting for the intuitive connection of certain beliefs with sensory evidence and non-inferential coherence relations. Nevertheless, some criticisms have led philosophers like BonJour (1985), Lehrer (1974; 1990), and Poston (2014) to add other interesting and influential conditions to coherence theories, though space prevents us from exploring them here.

b. Objections to Coherentism

There are three prominent objections to coherentism. The first, which we already encountered in §2.b, is called the circularity problem. Since coherentism depends on mutual support relations, every particular belief will likely play an essential role in its own justification, rendering coherentist justification a form of circular argument (see Figure 3).

figure 3

The problem with circular justification is that it putatively undermines the goal of justification, which is to garner support for a claim. If a claim is inferred from itself (P → P), the concluding proposition has only as much support as the premise, but that is precisely what we do not know. Therefore, multiplying the inferences between a proposition and an inference to that proposition (for example, (P → Q); (Q → R); (R → S); (S → P)) cannot justify P.
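The structural point of the objection, that a belief in such a chain ends up among its own supports, can be illustrated by treating inferences as arrows in a directed graph and asking whether a belief is reachable from itself. This is only a sketch of the linear picture the objection presupposes; the graph encoding and the function names are assumptions introduced for illustration.

```python
def reachable_from(graph, start):
    """Return every belief reachable from `start` by following inference arrows."""
    seen = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen

def supports_itself(graph, belief):
    """True if some chain of inferences leads from the belief back to itself."""
    return belief in reachable_from(graph, belief)

# Each key points to the beliefs that are inferred from it.
circular = {"P": ["Q"], "Q": ["R"], "R": ["S"], "S": ["P"]}
linear = {"A": ["B"], "B": ["C"], "C": []}

print(supports_itself(circular, "P"))
print(supports_itself(linear, "A"))
```

Lengthening the cycle changes how many steps it takes to return to P, not whether P returns to itself, which is why adding intermediate inferences does not answer the objection.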

In response, some coherentists argue that the circularity objection oversimplifies the view. While it is true that a belief will almost certainly play a role in its own justification, this is only problematic if we assume the justificational relationship is linear. Properly understood, justification is a property that emerges from non-linear relationships among beliefs, whether inferential or non-inferential. For example, Catherine Z. Elgin tells a story about Meg (adapted from a story by Lewis 1946), whose logic textbook was stolen. There were three witnesses to the theft, but all are unreliable witnesses (one is aloof, one has severe vision problems, and one is a known liar). Nevertheless, all three witnesses agree that the thief had spiked green hair. Despite the fact that no one of the witnesses is reliable, their independent testimony to a single, unique proposition increases the likelihood that the proposition is true. As Elgin puts it, “This [agreement] makes a difference. … Their accord evidently enhances the epistemic standing of the individual reports” (2005: 157). If this is right, the antecedently low probability of the thief’s having spiked green hair can be added to the combined strength of the testimonies to create a justified belief without vicious circularity.
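One way to see why agreement among unreliable witnesses can matter is a simple Bayesian calculation. The sketch below is not Elgin's or Lewis's own formalization. It assumes that the witnesses report independently of one another, and every number in it (the low prior, the chance that an unreliable witness reports the true description, and the small chance that a witness would offer this unusual description if it were false) is invented for illustration.

```python
def posterior(prior, p_report_if_true, p_report_if_false, n_witnesses):
    """Posterior probability of the claim after n agreeing, independent reports."""
    true_branch = prior * p_report_if_true ** n_witnesses
    false_branch = (1 - prior) * p_report_if_false ** n_witnesses
    return true_branch / (true_branch + false_branch)

# Illustrative assumptions, not values taken from the source.
PRIOR = 0.01               # spiked green hair is antecedently unlikely
P_REPORT_IF_TRUE = 0.5     # an unreliable witness gets it right only half the time
P_REPORT_IF_FALSE = 0.02   # a false report would rarely land on this exact description

for n in range(4):
    print(n, round(posterior(PRIOR, P_REPORT_IF_TRUE, P_REPORT_IF_FALSE, n), 3))
```

Under these assumptions a single report leaves the claim improbable, while three agreeing reports make it highly probable. The effect depends on the independence assumption and on the description being specific enough that a false report is unlikely to match it by chance.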

A second objection to coherentism is called the isolation objection. Even if a collection of beliefs could explain, and thereby justify, its members, it is not obvious how this set of beliefs is connected with reality, that is, with the content the beliefs are about. In rejecting basic beliefs, coherentists reject privileging any particular cognitive state in the belief system, such as sensory experiences. All beliefs are treated equally and are evaluated according to whether they cohere with the belief set. But beliefs can cohere with one another regardless of whether their content expresses true propositions about reality. Coherence cannot guarantee that the set is not isolated from reality.

Some coherentists respond to this objection by making special provisions for beliefs that derive from coherence-increasing sources, such as sense experience. (BonJour (1985) calls such beliefs “cognitively spontaneous beliefs.”) This makes the degree of coherence partly a matter of how well the system of beliefs integrates sense perception. Others appeal to more abstract distinctions among types of justification. For example, Keith Lehrer (1986) distinguishes personal justification, which involves the traditional, internalist coherence requirement, from verific justification, which is an externalist requirement on coherence. While objective coherence may be outside a person’s ken, it nevertheless contributes, along with personal justification, to what Lehrer calls complete justification. This externalist requirement helps to ground a person’s system of beliefs in the world those beliefs are supposed to be about.

Another coherentist response to the isolation objection is to allow experience itself, not just beliefs about experience, to figure in the evaluation of coherence. Catherine Z. Elgin (2005) argues that we have good reasons to privilege some perceptual experiences over very coherent sets of beliefs. She argues that this is because perception does not—contra foundationalists—work in isolation from other sorts of evidence. She says, “Only observations we have reason to trust have the power to unseat theories. So it is not an observation in isolation, but an observation backed by reasons that actually discredits the theory” (162). This also explains how we are able to privilege some perceptual experiences over others (say, in unfavorable conditions), though she admits that her view includes “something other than coherence,” and allows that it is a very weak form of foundationalism. For a reply along these lines that maintains a more traditional version of coherentism, see Kvanvig and Riggs (1992).

A third objection is called the plurality objection. Because justification is determined solely by the internal coherence of a person’s beliefs, coherence theory cannot guarantee that there is “one uniquely justified system of beliefs” (BonJour 1985: 107). BonJour explains that this is because “on any plausible conception of coherence, there will always be many, probably infinitely many, different and incompatible systems of belief which are equally coherent” (ibid.). To show just how pernicious this problem is, Lehrer asks us to imagine one set of beliefs comprised of both necessary and contingent beliefs and then to imagine a second set created by negating all the contingent beliefs in the first set (1990: 90). This has the nasty implication that, if coherence is sufficient for justification, then “for any contingent statement a person is completely justified in accepting is such that he is also completely justified in accepting the denial of that statement” (ibid.).

One response to the plurality objection is to invoke a “total evidence” requirement on explanatory and probabilistic relations. While we can arbitrarily construct probabilistically and explanatorily coherent sets, there is a non-trivial sense in which non-belief states explain our beliefs: sensation, testimony, and so forth. A theory of explanation that includes the antecedent probabilities of the beliefs based on this evidence would be more coherent with our total evidence than an arbitrary set of beliefs that ignores them. Recent debates over the relationship between coherence and truth include sophisticated analyses of probabilistic assessments (Klein and Warfield 1994 and Fitelson 2003) and an interesting argument for the impossibility of coherence’s increasing the probability that a belief is true (Olsson 2009), but there is not space to develop these arguments here.

For more on coherentism, see Coherentism in Epistemology.

4. Infinitism

Infinitism is an internalist view that proposes to resolve the dilemma of inferential justification by showing that Horn B of the DIJ, properly construed, is an acceptable option. In fact, argue infinitists, there are no serious problems with an infinite chain of justifying beliefs.

Traditionally, epistemologists have rejected the idea that a belief’s linear chain of justifying beliefs can extend infinitely because it leaves all beliefs ultimately unjustified. Inferential justification is said to transmit justification, not create it; therefore, an infinite chain of justifying beliefs would have no source of support to transmit. Similarly, since one could not hold an infinite number of beliefs or mentally trace an infinitely long chain of beliefs, infinitism betrays a common internalist intuition that a person must be aware of good reasons for holding a belief.

Infinitists claim these criticisms are misguided. In practice, justification is not as tidy as epistemologists would have us believe. The traditional idea that the regress must stop or bottom out in basic beliefs is unrealistic and unnecessary. Few of us attempt to draw inferences long enough to arrive at basic beliefs. We often stop looking for reasons when we are content that we have fulfilled our epistemic responsibility, not because the chain has actually ended (Aikin 2011). Foundationalists and coherentists, then, are relatively unconcerned with ultimate justification in their own epistemic behavior, and holding epistemic justification to such high standards would render very few of our beliefs justified. To accommodate this messiness, infinitists might reject the inferential assumption, at least as classically understood. Like coherentists, infinitists may hold that justification is an emergent property of a set of beliefs and that justification comes in degrees such that, the longer the inferential chain, the stronger the degree of justification (Klein 2005).

a. Arguments for Infinitism

There are two main lines of argument for infinitism. The first is that foundationalism and coherentism cannot stop the structure of justification from regressing infinitely. For example, Peter Klein (2005) constructs a version of the meta-justification argument against foundationalism and argues that the most plausible version of coherentism (emergent justification accounts), because of its appeal to a basic assumption about the reliability of coherent sets, is merely a disguised form of foundationalism. If these arguments hit their mark, and if externalism is ruled out, infinitism may be the only non-skeptical option available.

The second main line of argument for infinitism is that the classic objections to infinitism are aimed at overly simplistic versions of the view; they do not threaten suitably qualified versions. For example, Scott Aikin (2009) argues that concerns about the regress arise because of a conflict between two types of intuition: (1) proceduralism, which includes our standard intuitions about good reasons and responsible believing, and (2) egalitarianism, which includes our intuitions that people are generally justified in believing a lot of things (beliefs about how to set DVRs and beliefs about how to get from home to work). Aikin claims that infinitists take the demands of proceduralism more seriously than egalitarian intuitions, maintaining that justification and knowledge are very difficult to attain. The more committed we are to following our chains of evidence, the more likely we are to attain our epistemic goals. However, we often stop far from what even foundationalists would take to be the end of those chains. And at every proposed stopping point, there is an infinite number of justificational questions about the appropriateness of the terms we are using, the reliability of our perceptions and concept attributions, and so forth. If this is right, infinitism may be the most plausible implication of our epistemic intuitions.

Similarly, Peter Klein (2014) argues that infinitism is a minimal thesis about what makes justification valuable, namely, that it renders our beliefs “reason-enhanced.” He says, “Infinitism holds that a belief-state is reason-enhanced whenever S deploys a reason for believing that p. Importantly, S can make a belief-state reason-enhanced even if the basis is another belief-state that is not (yet) reason-enhanced” (2014: 105). If this is right, then the process of inferring can create or produce original epistemic support, and we need not appeal to anything like basic beliefs for ultimate support. Further, infinitists do not object to a chain of inference’s stopping, for instance, when some presuppositions are explicit. For example, reasoning about Euclidean geometry may appropriately stop at Euclid’s axioms when we agree that they are our standard of evaluation. But we can also admit that those axioms can be challenged, and our reasoning could continue indefinitely. Infinitists simply argue that this is a standard feature of all justification.

b. Objections to Qualified Infinitism

Carl Ginet (2005) argues that even qualified infinitism is motivated on spurious grounds. One argument against foundationalism is that, even for basic beliefs, one needs a reason to believe they are true, and this initiates an infinite regress of reasons. Ginet objects, however, that this argument threatens foundationalism only if all reasons are inferential reasons. Of course, this is precisely what foundationalists reject. If some non-belief reasons are justified independently of any additional reasons for thinking they are true, that is, if they are inherently reasonable, the infinitist argument against foundationalism is question-begging.

In response, the infinitist might contend that, even if its critique of foundationalism is flawed, infinitism may yet be the more plausible alternative. If infinitism captures our intuitions about justification as adequately as foundationalism, and if it requires fewer controversial concepts (basic beliefs), infinitism may be an attractive competitor.

Another objection to infinitism is that, given our finite minds, we lack complete access to the infinite set of justifying beliefs. If a person has no access to his reason for belief, then infinitism is no longer internalist and, thereby, loses its means of defusing the DIJ. Of course, the infinitist may concede this and fall back on a mentalist account of epistemic access (see §5.a below). As Ginet puts it: a belief (L) “is available to S as a reason for so believing only if S is disposed, upon entertaining and accepting (L), to believe that the fact that (L) was among his reasons for so believing” (2005: 146). If this is right, a person may have a disposition to recognize further evidence for his justifying beliefs when prompted to do so.

Nevertheless, even this mentalist-enhanced infinitism faces the concern that the process of justification is never complete. An assumption behind the DIJ is that, if for any belief, there is not a reason to believe it is true, that belief and any beliefs inferred from it are unjustified. If this is right, and the justification condition for infinitism is never actually met, then we are left with skepticism.

A variation on this criticism is the idea that inferential justification can only transmit justification and cannot originate it. The idea is that all inference is conditional ((P → Q); (Q → R); (R → S)). Given this set of propositions, is S justified for us? That depends on whether P is justified. Telling us that P is justified by N, (N → P), though, does not answer the question of whether S is justified. We still need to know whether N is justified (Dancy 1985: 55). If this is right, then no matter how long the chain of inference is—even if it is infinite—no belief is justified.
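The transmission point can be made concrete with a small sketch. The Python below is illustrative only and is not drawn from Dancy: the function name `justified_by_transmission` and the encoding of conditionals as (antecedent, consequent) pairs are assumptions introduced here. It models inference as something that only passes justification along conditionals, so a conclusion counts as justified only if some independently justified starting point is supplied.

```python
def justified_by_transmission(conditionals, independently_justified):
    """Return the propositions that end up justified when inference can only
    transmit justification along (antecedent, consequent) pairs."""
    justified = set(independently_justified)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in conditionals:
            # Justification flows forward only from an already justified antecedent.
            if antecedent in justified and consequent not in justified:
                justified.add(consequent)
                changed = True
    return justified


# Hypothetical encoding of the chain (P -> Q), (Q -> R), (R -> S).
chain = [("P", "Q"), ("Q", "R"), ("R", "S")]

no_source = justified_by_transmission(chain, set())
extended = justified_by_transmission([("N", "P")] + chain, set())
with_source = justified_by_transmission(chain, {"P"})

print("S" in no_source, "S" in extended, "S" in with_source)
```

On this model, lengthening the chain by adding (N → P) changes nothing; only an independently justified starting point makes S justified. Whether justification really behaves this way is the point in dispute between infinitists and their critics.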

Infinitists may respond to this objection by arguing that the justification condition is not a matter of getting to a final, infinitely large set, but of increasing one’s epistemic reasons for the proposition in question. Peter Klein, using the term “warrant” for “justification,” says that infinitism is like coherentism in this respect. He says, “Infinitism is like the warrant-emergent form of coherentism because it holds that warrant for a questioned proposition emerges as the proposition becomes embedded in a set of propositions” (2005: 135). Further, Klein explains that “warrant increases not because we are getting closer to a basic proposition but rather because we are getting further from the questioned proposition” (137). This amounts to a rejection of the claim that inferential justification can only transmit justification and, therefore, that a justificational chain must be complete in order to be adequate (recall Catherine Z. Elgin’s story about Meg in §3.b above).

A worry for this response is similar to a worry for coherentism. Any criterion that implies the infinite set of beliefs is justified is either part of the set or independent of it, in which case, it, too, needs a justification. If some sort of justification-conferring awareness is built into the increasingly large set, infinitism seems like foundationalism in disguise.

A further worry is that, if infinitists do not require that a person actually have an infinite number of justifying beliefs or perform an infinite number of inferences, then infinitism seems committed to the idea that inference itself can create justification. This, however, seems implausible. Carl Ginet writes, “…acceptable inference preserves justification … [but] there is nothing in the inferential relation itself that contributes to making any of those beliefs justified” (2005: 148-49). If inference cannot produce justification, it is unclear how a belief in an infinite chain of inferences comes to be justified.

For a more detailed treatment of infinitism, see Infinitism in Epistemology.

5. Types of Internalism and Objections

As noted above (§2), the view that justification is something we can determine by directly consulting our mental states is called internalism. This view does not entail that all epistemic concepts are internal. John Greco gives an example to demonstrate the difference: “[S]uppose that someone learns the history of his country from unreliable testimony. Although the person has every reason to believe the books that he reads and the people that teach him, his understanding of history is in fact the result of systematic lies and other sorts of deception” (2005: 259). Objectively speaking, this person’s beliefs are not reliably connected with reality. Subjectively, though, he is following his evidence to its rational conclusion. Should we say this person’s beliefs are justified? Since the reliability of his sources is beyond his ability to evaluate, the internalist says he has fulfilled his epistemic duty: yes, he is justified.

For centuries, there was no serious alternative to internalism. As we will see in §6, the advent of the Gettier case in the 20th century constitutes a serious challenge to internalism, and it contributed to alternative, externalist accounts of knowledge and justification. This move to externalism also led to closer scrutiny of internalism, and new concerns about its adequacy arose. I review three of these here. But before doing so, it is helpful to distinguish two types of internalism: accessibilism and mentalism.

a. Accessibilism and Mentalism

According to accessibilists, in order for a belief to be justified for a person, that person must have “reflective access” to good reasons for holding that belief. To have reflective access is to be directly mentally aware of reasons for holding a belief. Some accessibilists argue that a person’s access must be occurrent, that is, she must be currently aware of her reasons for holding a belief (Conee and Feldman 2004). Others hold the looser requirement that, as long as a person has had direct access to the relevant justifying reasons, she is justified in holding the supported belief.

According to mentalists, reflective access may be sufficient for justification, but it is not necessary. All that is necessary for a belief to be justified is that a person has mental states that justify the belief, regardless of whether a person has reflective access to those states. Mentalists allow that some non-reflectively accessible mental states can justify beliefs.

Mentalism is supposed to have several advantages over accessibilism given the standard criticisms of internalism. For example, some have objected to internalism on the grounds that it cannot accommodate intuitive cases of stored or forgotten evidence. If, for example, you are driving and not thinking about whether Washington, D.C. is the capital of the United States, or you have forgotten any evidence for this belief, are you justified in believing that it is? If not, could we say that you know it is the capital? Accessibilists claim that a person must be able to access her evidence for a belief while she is currently thinking about it and presumably without prompting. Few of us, though, hold (or even could hold) a belief with all its attendant reasons in mind at once. Similarly, it seems reasonable to imagine that a person is justified in believing a proposition for which she has forgotten her evidence. Mentalists can handle these cases by claiming that the ability to access stored facts can constitute dispositional justification, and that even in cases of forgotten evidence, it could still be the case that the fact that it is justified is consciously available, either occurrently or dispositionally (Conee and Feldman 2004).

The worry for mentalism is that, in allowing non-occurrent mental states to count as reasons, mentalism betrays its claim to be internalist. For example, there may be a lot of evidence I could have that P is true if I were in the right place at the right time. But the existence of that evidence does not obviously justify P for me since being in such a place might be a matter of luck. Being at the right place at the right time may mean that the evidence that, say, “Washington is the capital,” is in a book nearby that I never happen to read or that the evidence is one of my mental states that I am not currently thinking about, even if I could when prompted. Specifying just what it means for evidence to be available but not occurrent turns out to be quite difficult. Richard Feldman (1988) argues that in neither of these examples am I justified in believing that Washington is the capital and that a mental state counts as evidence if and only if one is currently thinking of P. Feldman embraces the counterintuitive implication that “one does not know things such as that Washington is the capital when one is not thinking of them” (237). Despite these difficulties, the distinction between accessibilism and mentalism plays an important role in the debate over internalism.

For more on accessibilism and mentalism, see §1.c of Internalism and Externalism in Epistemology.

b. Objections to Internalism

In addition to the Gettier problem (§6), there are many other lines of argument that challenge internalism. Here, I review only three. One of these lines is called the access problem. Traditional foundationalists have accepted some version of accessibilism. For example, Roderick Chisholm writes that justification is “internal and immediate in that one can find out directly, by reflection, what one is justified in believing at any time” (1989: 7). But what if the belief P that justifies my current belief Q is tucked far back in the recesses of my memory and would require more time than I currently have to access it? Am I still justified in believing Q? Or worse, imagine that I have forgotten P; there is no possibility that I can directly access it. However, Q seems true to me, I remember that I had good reasons for believing it, and I do not have any reasons to doubt Q now. Am I justified in believing Q in this case?

Without some modification, the internalist must say no in both cases, since the relevant evidence is neither immediately nor reflectively available, though intuitively these are normal cases of justified belief. The standard response is two-fold. First, we must admit that justification comes in degrees: having more evidence can increase one’s justification, and some evidence is stronger than other evidence. Second, the states of seeming to be justified or remembering that I am justified can themselves constitute reasons for belief. Therefore, in these cases, the internalist might respond that, while the justifications are not as strong as we would prefer, they are, nonetheless, based on accessible mental states.

A second, related objection to internalism is what, following John Greco, I will call the etiology problem. Internalism tends to make justification so easy that it is unclear how one is able to distinguish between good and bad reasons. Consider an example from Greco (2005):

Charlie is a wishful thinker and believes that he is about to arrive at his destination on time. He has good reasons for believing this, including his memory of train schedules, maps, the correct time at departure and at various stops, etc. However, none of these things is behind his belief—he does not believe what he does because he has these reasons. Rather, it is his wishful thinking that causes his belief. Accordingly, he would believe that he is about to arrive on time even if he were not. (261)

Why is the combination of his beliefs about schedules, maps, and time a better reason for thinking he is about to arrive than wishful thinking? Presumably, it is because those things are reliable indicators of truth, whereas wishful thinking is not. Being a reliable indicator of truth, though, is an external relationship between the belief and the world—something to which Charlie has no access. We can arrive at a similar result from imagining that Charlie does base his beliefs on his beliefs about train schedules, and so forth, but stipulating that he formed those beliefs carelessly and haphazardly, and only accidentally arrived at the correct conclusion. Nevertheless, based on these beliefs, it seems clear to Charlie that the conclusion follows.

An internalist might respond that this objection depends on the mistaken assumption that internal factors exclude empirical evidence. To see how this assumption slips in, consider how an externalist might determine that train schedules are more fitting sources of evidence than wishful thinking. Presumably, externalists would evaluate the past track record of each source of evidence to see which more reliably indicates truth. The act of “reviewing their past track records,” however, involves appealing to internal states about what seems to be their track records and, therefore, is not obviously different from what an internalist would do; one has internal access to evidence that train arrivals correspond more reliably with train schedules than with wishes. By demanding that justification depend only on external features of the belief-forming process, and then appealing to internal features to evaluate external reliability, Greco is not denying that one must have good, accessible reasons for one’s beliefs; he is simply disguising the internal features by including them in the external conditions (Feldman 2005: 281). The result is a dilemma: either objective etiology is essential to justification, in which case, since no one has access to it, we are left with skepticism; or subjective access to evidence of reliable etiology is sufficient for justification, in which case the externalist criticism misses its mark.

Both the access problem and the etiology problem challenge the idea that we can determine whether we are justified by appeal to internal states. But even if this challenge can be answered, internalism is sometimes thought to imply that we can voluntarily control or change what we believe, that is, that we are guided but not determined by our evidence. The view that we have voluntary control over what we believe is called doxastic voluntarism (from the Greek doxa, for “what is given” and sometimes for “what is believed”). The idea is that internalism is intuitive partially because it allows us to take responsibility for our epistemic behavior. In fact, “[n]onvoluntarism is generally taken to rule out responsibility, since one is not responsible for what one does not control” (Adler 2002: 64). Taking responsibility implies we can decide to respond to evidence well or poorly. This suggests a third objection to internalism called the guidance problem. (For presentations of the guidance problem, see John Heil 1983 and William Alston 1989.)

It turns out that it is difficult to control what we believe: try to make yourself believe you are not reading this page or that you are not real. It is unclear what it would take to convince you that such things are true. That kind of shift would seem to require a complete change in your evidence. But if that is right, then our beliefs are tied strongly to factors outside our control; we cannot simply decide what evidence we have or whether to believe on the basis of that evidence. According to this critique, the idea that internalism explains how we take responsibility for our beliefs is misguided.

In response, contemporary internalists tend to accept that our beliefs are largely determined by the evidence we perceive ourselves to have, but they reject the idea that complete or even partial voluntary control is necessary for responsibility. Carl Ginet (2001) argues that our control over our beliefs is limited but that we nonetheless may decide what to believe in those cases where the evidence is indecisive, cases “where the subject has it open to her also to not come to believe it” (74). Further, Earl Conee and Richard Feldman (2004) argue that a person’s beliefs may appropriately fit her evidence even if she cannot control whether she forms those beliefs. For instance:

Suppose that a person spontaneously and involuntarily believes that the lights are on in the room, as a result of the familiar sort of completely convincing perceptual evidence. This belief is clearly justified, whether or not the person can voluntarily acquire, lose, or modify the cognitive process that led to the belief. (85)

For a more comprehensive treatment of the debate between internalists and externalists, see Internalism and Externalism in Epistemology.

6. The Gettier Era

The idea that justification is the crucial link between true belief and knowledge seems to be implicit in epistemology since Plato. In the Theaetetus, Socrates gives an example of a jury that has been persuaded by hearsay of a true judgment that can only be known by an eye-witness (201b-c). This example shows that “true judgment” is not the same thing as “knowledge,” and, therefore, that some other element is needed. Theaetetus suggests that knowledge is true judgment plus a logos, that is, an account or argument. Socrates considers three ways of giving an account of a true judgment but concludes that none is plausible. Nevertheless, from then until now, philosophers have generally thought something like Theaetetus’s suggestion must be right, and most of those accounts have been internalist. Socrates’s own suggestion, in Plato’s Meno, is that knowledge is a type of remembrance of what is true based on direct experience prior to being born. Descartes tries to close the gap between true belief and knowledge with the apprehension of clarity and distinctness. Kant attempts to bring them together with the transcendental apperception of the conditions for the possibility of veridical perception. In each case, the knower is assumed to have direct access to something that explains when true belief is knowledge.

Unfortunately, a thought experiment developed in the 20th century challenges the idea that any internal criteria can distinguish knowledge from accidentally true belief. This thought experiment was named the Gettier Problem after Edmund Gettier, who introduced the most influential examples in a famously brief 1963 paper. Examples from other philosophers proliferated after Gettier’s publication, but each new instance is standardly called a “Gettier Case.”

a. The History of the Gettier Problem

The idea is that there are cases where all three conditions on knowledge are met—a belief is justified and true—and yet that belief fails to be knowledge. Although some traditional internalists have allowed that a false belief can be justified, they have resisted the idea that a belief’s justification does not contribute to the likelihood of knowing. But if Gettier cases are successful, it is possible to be justified (in the classic internalist sense) in holding a true belief without that belief’s being knowledge.

The broken clock example in §1 is an early version of this problem, constructed by Bertrand Russell (1948). Here is another example Russell includes alongside his clock case:

There is the man who believes, truly, that the last name of the Prime Minister in 1906 began with a B, but believes this because he believes that Balfour was Prime Minister then, whereas in fact it was Campbell-Bannerman. … Such instances can be multiplied indefinitely, and show that you cannot claim to have known merely because you turned out to be right. (171)

The problem, though, contra Russell, is not merely that such a person turns out to be right; it is that the person’s belief is justified in cases where a belief turns out to be true by luck; justified true belief in these cases does not increase the likelihood that the belief is knowledge. The evidence that justifies the belief is not connected with the truth of the belief in the right way, and, recall from the introduction, believing in the right way is precisely the sort of thing justification is supposed to indicate.

Such cases trace at least as far back as Alexius Meinong (1906), but the most famous are Gettier’s. His cases are interesting because they show that such cases can occur even when our evidence includes logical entailment. In his first example, Gettier asks us to imagine that two men, Smith and Jones, have applied for the same job. Imagine also that Smith has very good reasons for believing: “Jones will get the job” and “Jones has 10 coins in his pocket.” From this, it follows logically that: “The man who will get the job has 10 coins in his pocket,” and Smith forms the belief that this is true. As it turns out, however, Smith has 10 coins in his pocket (though he does not know it) and he will get the job. So, Smith’s belief that the man who will get the job has 10 coins in his pocket is true, and he has good reasons for believing it, but his reasons are unconnected with the real reasons it is true. Most philosophers have concluded that, since Smith’s true belief is just a matter of luck (and not a function of his reasons’ connection with the state of affairs that makes it true), Smith does not know that the man who will get the job has 10 coins in his pocket.
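The structure of the case can be laid out in a short sketch. The Python below is only an illustration; the dictionary keys and function names are assumptions introduced here, not Gettier’s notation, and it assumes for simplicity that Jones really does have 10 coins. It contrasts the world as Smith has good reason to believe it is with the actual world, and shows that the inferred proposition is true in both even though the premise supporting it is false in the actual world.

```python
# World as Smith has good reason to believe it is (hypothetical encoding).
believed = {"gets_job": "Jones", "coins": {"Jones": 10}}

# Actual world: Smith gets the job and happens to have 10 coins himself.
actual = {"gets_job": "Smith", "coins": {"Smith": 10, "Jones": 10}}


def jones_gets_job(world):
    """Premise: Jones will get the job."""
    return world["gets_job"] == "Jones"


def job_winner_has_ten_coins(world):
    """Conclusion: the man who will get the job has 10 coins in his pocket."""
    winner = world["gets_job"]
    return world["coins"].get(winner) == 10


premise_true_in_fact = jones_gets_job(actual)
conclusion_true_as_believed = job_winner_has_ten_coins(believed)
conclusion_true_in_fact = job_winner_has_ten_coins(actual)

print(premise_true_in_fact, conclusion_true_as_believed, conclusion_true_in_fact)
```

The conclusion comes out true in the actual world only because of facts about Smith that play no role in Smith’s reasoning, which is one way of picturing why the belief seems true by luck.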

Because of the many possible variations on cases like these, the idea that justification is based on evidence to which we have direct access faces a serious challenge. There is no clear sense in which that sort of evidence always or even regularly increases the likelihood that a belief is knowledge.

b. Responses to the Gettier Problem

Some philosophers have tried to save strong internalist justification from Gettier cases. For example, D. M. Armstrong—although he ultimately defends an externalist theory of justification—argues that Gettier cases can be avoided by adding a requirement that all evidence for a belief must be, not merely justified, but also knowledge. In the Gettier case above, since it is false that Jones will get the job, this belief cannot be knowledge for Smith and, therefore, undermines Smith’s ability to know the man who will get the job has 10 coins in his pocket. (See Feldman 1974 for a counterexample.)

Others weaken the requirements on justification by arguing that, while knowledge may have constraints outside our conscious access, justification is more plausibly about responsible or apt belief than truth. Call this weak internalist justification (see Zagzebski 1996).

Still others argue that Gettier cases suggest either that justification is simply not an internal matter or that knowledge does not require justification. Those who argue that justification is external claim that whether a belief is justified depends on whether there is a law-like connection (conceptual or physical) between a belief and the state of affairs it is about (Bergmann 2006). This approach is externalist because it explains justification in terms of belief-forming processes outside the mental life of the believer. In adopting externalism, some treat internal mental states as irrelevant for justification, while others argue that internal states can play an indirect and partial role in justification. Ernest Sosa (1991), for example, argues that internal states can contribute to the state of affairs that grounds the reliability of certain belief-forming behaviors.

7. Externalist Foundationalism

Gettier cases, in addition to other challenges to internalism, have led some epistemologists to reject the idea that justification requires an internal condition. In its most minimal form, externalism is the view that internalism is false, that is, that some features external to the mental life of a person play a necessary role in justification (Greco 2005: 258). However, many versions of externalism also explicitly reject internal conditions for justification, at least for non-inferential knowledge. Some philosophers have developed externalist accounts of knowledge that lack any account of justification (compare Goldman 1967, though he has since given up this view). The debate between externalists and internalists, though, is primarily about justification. Externalist accounts of justification differ from internalist accounts by challenging the idea that justification is primarily or ultimately about good reasons when good reasons are construed as mental states.

To accommodate the external features that connect beliefs with states of the world, externalists modify what was traditionally meant by justification; rather than appealing to a person’s subjective perspective on her evidence, externalists appeal to the objective features of the belief-forming and -holding behavior. Epistemic standing is not about the reasons a person has; it is about the relationship between a belief and the world, how that belief is formed or how it is maintained, and where the relationship is not a guarantee of truth but a strong indicator of truth, typically because of a causal, lawful, conceptual, or counterfactual connection with the states of affairs the belief is about. The most prominent version of externalism is the view that a belief is justified just in case it is caused by a reliable process, where “reliable” means that the process produces more true beliefs than false.
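The reliability condition just stated can be illustrated with a toy calculation. The Python below is a sketch under simplifying assumptions: the function name `is_reliable`, the sample track records, and the reading of “more true beliefs than false” as a simple majority over a finite record are all introduced here for illustration and are not part of any reliabilist’s official account.

```python
def is_reliable(outcomes):
    """Return True when a process's record contains more true beliefs than false.

    outcomes: list of booleans, True when the process produced a true belief.
    """
    true_count = sum(1 for outcome in outcomes if outcome)
    false_count = len(outcomes) - true_count
    return true_count > false_count


# Hypothetical track records for two belief-forming processes.
vision_in_daylight = [True, True, True, False, True]
wishful_thinking = [False, True, False, False, True]

print(is_reliable(vision_in_daylight), is_reliable(wishful_thinking))
```

Note that the verdict depends only on the process’s record, not on whether the believer has any awareness of that record, which is what makes the condition external.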

a. Externalism, Foundationalism, and the DIJ

Externalists agree that, to resolve the DIJ, one needs to avoid infinite regress and skepticism. So, rather than grounding justification in other beliefs (as coherentists do) or in non-belief states (as classical foundationalists do), externalists ground justification and knowledge in the objective way the world contributes to belief formation or maintenance.

Some externalists, like Armstrong (1973) and Goldman (1979), make room for something like basic beliefs, from which something like non-basic beliefs are inferred. This means that contemporary externalists tend to accept the foundationalist structure—some beliefs are produced reliably by non-belief states, and some beliefs can be produced by other beliefs—though they reject the distinction between basic and non-basic beliefs. All belief-forming processes are states external to the knower’s mental states, and whether a belief is justified (and, therefore, knowledge) depends on the reliability of those processes.

Unlike classical foundationalists, who appeal to internal seemings, indubitability, or self-evidence as justifying these states, externalists like Goldman argue that these states are knowledge simply because they stand in a reliable relationship with the world. A non-inferential belief is knowledge when and because it is lawfully (Armstrong) or reliably (Goldman) produced.

b. Reliabilism

The concept of reliability is crucial to externalist theories of justification (in contrast to externalist theories of knowledge, for example, Goldman 1967, 1976 and Armstrong 1973). There are two types of reliabilist theories of justification. According to reliable indicator theories, a belief is justified just in case its reason or ground is a reliable indicator of the belief’s truth (Swain 1981 and Alston 1988). According to process reliabilism, a belief is justified just in case it was causally produced by reliable processes (Goldman 1979 and Bach 1985). Although he focuses primarily on externalist theories of knowledge, D. M. Armstrong’s “thermometer theory of knowledge” explains that certain mental states serve as reliable indicators or signs of knowledge, and therefore make the belief reasonable, or “justifiable.” Comparing non-inferential belief and a thermometer, Armstrong writes:

In some cases, the thermometer-reading will fail to correspond to the temperature of the environment. Such a reading may be compared to non-inferential false belief. In other cases, the reading will correspond to the actual temperature. Such a reading is like non-inferential true belief. (166)

There are a number of important qualifications to Armstrong’s view, but the central point is that a belief is justified independently of whether the person has reasons to believe it: “The subject’s belief is not based on reasons, but it might be said to be reasonable (justifiable), because it is a sign, a completely reliable sign, that the situation believed to exist does in fact exist” (183).

The benefit of Armstrong’s law-like account is that it suggests a counterfactual account of causal relations along the following lines: as long as a person has a means of distinguishing a proposition, P, from a mutually exclusive but very similar proposition Q, then the person is justified in believing P. For example, if Judy and Trudy are twins, and when Sam sees someone who looks like Judy, he would not mistake Trudy for Judy, then Sam is justified in believing that he sees Judy. “But if Sam frequently mistakes Judy for Trudy, and Trudy for Judy, he presumably does not have any way of distinguishing between them” (Goldman 1976: 778).

Unfortunately, reliable indicator theories tend to be overly strict in their analysis of cases. Goldman asks us to consider Oscar, who is standing in an open field and sees a Dachshund, from which he forms the belief that he sees a dog. As it happens, Oscar often mistakes wolves, which frequent the field, for certain dog breeds. If he were to see a wolf, he might easily mistake it for a dog. Now, is his seeing a Dachshund a reliable indicator of seeing a dog? Since Oscar would likely believe he is seeing a dog regardless of whether he is seeing a wolf or a Dachshund, reliable indicator theories (at least Armstrong’s) would say his seeing a Dachshund is not a reliable indicator of seeing a dog. Whether this criticism is ultimately successful or whether it applies to all reliable indicator theories, reliable process theories quickly overshadowed interest in this type of reliabilism.

Process reliabilism is the view that a belief is justified just in case it is produced by a reliable cognitive process, where a cognitive process may include either conscious reasoning processes or unconscious mechanisms. As I formulated it earlier in this article, reliabilism is a necessary and sufficient condition for justification (“just in case”), but some reliabilists formulate weaker versions. Goldman treats it as a sufficient condition (though he argues against the plausibility of alternative sufficient conditions): “If S’s believing p at t results from a reliable cognitive belief-forming process (or set of processes), then S’s belief in p at t is justified” (1979: 13). Kent Bach treats it as only a necessary condition: “The idea, roughly, is that to be justified a belief must be formed as the result of reliable processes…” (1985: 199). Despite these differences, externalists uniformly reject internalist conditions as sufficient for justification. This commitment, however, leaves them open to a number of interesting criticisms.

c. Objections to Externalism

Externalism putatively has the advantage of avoiding the Gettier problem (though this is controversial) and several other skeptical concerns, and of capturing some important intuitions about knowledge. Nevertheless, it faces several serious criticisms. On the basis of these criticisms, some internalists claim that externalists have simply changed the subject altogether and are not really talking about justification.

One famous criticism of externalism is called the generality problem. Earl Conee and Richard Feldman (1998) present an example to demonstrate the problem:

Suppose that Smith has good vision and is familiar with the visible differences among common species of trees. Smith looks out a house window one sunny afternoon and sees a plainly visible nearby maple tree. She forms the belief that there is a maple tree near the house. Assuming everything else in the example is normal, this belief is justified and Smith knows that there is a maple tree near the house. Process reliabilist theories reach the right verdict about this case only if it is true that the process that caused Smith’s belief is reliable. (372)

Is it reliable? That depends on which process formed the belief. Was it the unique causal set of events leading to that particular belief? If so, it is not reliable, since token, or one-time, events have no historical track record. Reliabilists respond to this challenge by saying it is the type of process that must be reliable in order for a belief to be justified, not the token. If that is right, then we face the problem of determining which type of process formed the belief. Was it the “visually initiated belief-forming process,” the “process of a retinal image of such-and-such specific characteristics leading to a belief that there is a maple tree nearby,” the “process of relying on a leaf shape to form a tree-classifying judgment,” the “perceptual process of classifying by species a tree located behind a solid obstruction,” or any number of others (373)? There are innumerable options, and even if a combination of types were involved, each type would have to meet reliability conditions. Conee and Feldman conclude, “Without a specification of the relevant type, process reliabilism is radically incomplete” (373).

A second objection to externalism is called the New Evil Demon Problem (NEDP) (Cohen and Lehrer 1983). In Descartes’s original evil demon problem, in order to motivate the problem of skepticism, we are asked to consider the possibility that all our current perceptions are the fictitious construction of a being intent on deceiving us such that all our perceptual and intuitive beliefs are false. Putting the thought experiment to a very different purpose, if the evil demon world is possible, we can imagine two worlds: (1) a non-deceptive world, where our perceptions are reliably produced by the world outside of our minds, and (2) an evil demon world, where there are people just like you and me, who have exactly the same mental states that we do but whose perceptions are systematically unreliable—they track nothing of truth at that world. There are no trees, buildings, bodies, and so forth. Whatever actually exists at that world, those people have no perception of it. According to externalists—process reliabilists, in particular—the beliefs of people in the real world are justified and those of people in the demon world are unjustified, despite the fact that their mental lives are identical. Yet it is difficult to imagine that demon world beliefs about looking both ways before crossing the street and getting a second opinion about a medical diagnosis are unjustified. People who believe such things are acting responsibly from their perspective on their evidence. This suggests that reliabilism is not really about justification at all.

A third objection to externalism is what Ernest Sosa (2001) calls the metaincoherence problem, which attempts to show that a person’s belief can be externally reliable while internally unjustified. In the literature, there are two versions of the metaincoherence problem. The first is what I call first-order metaincoherence, which attempts to show that externalism is insufficient for justification. The second is what I call second-order metaincoherence, which challenges the externalist’s reasons for holding externalism.

One famous example of first-order metaincoherence is a thought experiment given in various forms by Laurence BonJour (1985) and Keith Lehrer (1990). Consider Armstrong’s thermometer analogy from above. Imagine there were a human thermometer, that is, someone who “undergoes brain surgery by an experimental surgeon who invents a small device which is both a very accurate thermometer and a computational device capable of generating thoughts” (Lehrer 1990: 163). This person, whom Lehrer names Mr. Truetemp, is unaware of the device despite the fact that it regularly causes him to form reliable beliefs that he unreflectively accepts about the temperature. On a given day, he might reliably form and accept the belief that it is 104 degrees Fahrenheit outside. Is this belief knowledge? Lehrer concludes: “Surely not. He has no idea whether he or his thoughts about the temperature are reliable” (164). BonJour concludes similarly, “Part of one’s epistemic duty is to reflect critically upon one’s beliefs, and such critical reflection precludes believing things to which one has, to one’s knowledge, no reliable means of epistemic access” (1985: 42).

The second-order metaincoherence problem is stated by Barry Stroud (1989):

The scientific ‘externalist’ claims to have good reason to believe that his theory is true. It must be granted that if, in arriving at his theory, he did fulfill the conditions his theory says are sufficient for knowing things about the world, then if that theory is correct, he does in fact know that it is. But still, I want to say, he himself has no reason to think that he does have good reason to think that his theory is correct. (321)

The worry is that, since externalists claim that features of the world outside the mental life of a believer ultimately determine whether a belief is justified, then, if externalism is true, externalists have no reason to believe it is true; in fact, they are committed to believing that whether their belief that it is true is justified is outside their ability to determine from within their own perspective. Again, the belief may be externally reliable, but it is internally unjustified.

If these criticisms hit their mark, epistemologists must make some difficult decisions about which approach—internalism or externalism—has the fewest or least pernicious problems. In the 21st century, much work is underway to address these problems. If one remains unconvinced, there are recent developments that attempt to salvage some of the insights of internalism and externalism. A prominent example involves introducing character traits into the conditions for justification. We turn next to this view, called virtue epistemology.

8. Justification as Virtue

Classical theories of justification that imply a normative or belief-guiding dimension are modeled largely on normative ethical theories, whether teleological, or outcome-based, accounts or deontological, or duty-based, accounts. They ask whether people are rationally obligated to, permitted to, or obligated not to hold particular beliefs given their evidence. These are decision-based theories of rational normativity, as opposed to character-based theories. Just as virtue theory offers a non-decision-based alternative in ethics, it also suggests a non-decision-based alternative in epistemology. The attitudes and circumstances under which people form, maintain, and discard beliefs can be described as virtuous or vicious, and just as decision-based theories in epistemology are concerned with rational obligation (as opposed to moral obligation), character-based theories in epistemology are concerned either with intellectual character (as opposed to moral character), or with cognitive faculties understood as traits of a person (such as reason, perception, introspection, and memory). Of course, in matters of normativity, it is not a simple task to distinguish moral dimensions from rational or intellectual ones, but space prevents us from exploring that relationship here.

Virtue theories of justification hold that part of what justifies a belief is the intellectual traits with which a believer forms or holds the belief. Just as a person’s moral virtues (kindness, compassion, honesty) contribute to the goodness of an action, a person’s intellectual virtues contribute to the epistemic goodness of a belief. Virtue theorists, however, are sharply divided as to which intellectual virtues are relevant. One prominent view is that justification is a function of those virtues that enhance reliability, that is, they have a strong external component (Sosa 1980; 2007). This view is known as virtue reliabilism.

A second prominent view is that justification is a function of those intellectual virtues that contribute to more general epistemic goods, including intellectual well-being, social trust, and the righting of epistemic injustice. These virtue responsibilists regard the truth-goal in epistemology very differently than both traditional epistemologists and their virtue reliabilist counterparts (Code 1984; Montmarquet 1993; Zagzebski 2000).

a. Virtue Reliabilism

A prominent version of virtue reliabilism is offered by Ernest Sosa (1980) in an attempt to resolve the tension between foundationalists and coherentists. Sosa argues that if beliefs are grounded in truth-conducive intellectual virtues (where truth-conducive is conceived in process reliabilist terms), then foundationalists have empirically stable abilities or acquired habits that help explain the connection between sensory experience and non-inferential belief. Further, reliable virtues help explain how justification emerges from a coherent set of beliefs; coherence is a type of intellectual virtue.

What do these intellectual virtues look like for Sosa? Borrowing an example from his (2007), consider an archer who is aiming at a target. In order to be successful, the archer must have a degree of competence, which Sosa calls “adroitness,” and the shot must be accurate. These features are analogous to the epistemic state of having a true belief (accuracy) that is formed on the basis of good evidence (adroitness). These two features alone, though, are insufficient for the person to believe in the right way. The person must also exercise his adroitness in circumstances that increase his likelihood of having accurate beliefs, that is, his shot must be accurate because it is adroit. Sosa calls this third feature “aptness,” “its being true because competent” (2007: 23). Some of these circumstances will be outside the believer’s control—wind gusts in the archer’s case; causal ties to the world in the epistemic case. But some—for example, the virtues—are within the believer’s control.

Sosa explains:

Aptness depends on just how the adroitness bears on the accuracy. The wind may help some, for example…. If the shot is difficult, however, from a great distance, the shot might still be accurate sufficiently through adroitness to count as apt, though with some help from the wind. (2007: 79)

Notice that the role of the wind is analogous to certain external features of a person’s belief-forming state. Nevertheless, intellectual virtues like those mentioned above can increase one’s adroitness and thereby increase the likelihood of accuracy.

Imagine a person who has good evidence that P but who either does not appeal to that evidence when forming the belief that P, appealing instead to, say, wishful thinking, or who appeals to that evidence carelessly, refusing to consider alternatives or just how strong the evidence is. Despite this person’s having good evidence, her belief is not apt because the belief’s truth was not due to the person’s competence with the evidence.

Because of this external dimension, this branch of virtue epistemology is regarded as a form of reliabilism. Unlike externalist foundationalism, however, the reliability condition is not restricted to belief-forming processes; it is also highly dependent on context. Sosa says:

An archer might manifest sublime skill in a shot that does hit the bull’s-eye. This shot is then both accurate and adroit. But it could still fail to be accurate because adroit. The arrow might be diverted by some wind, for example, so that, if conditions remained normal thereafter, it would miss the target altogether. However, shifting winds might then ease it back on track towards the bull’s-eye. (79)

In epistemic cases, the believer must be suitably virtuous such that, under normal conditions, her beliefs are accurate because they are adroit.

b. Virtue Responsibilism

Sosa’s account has been well-received, though there is disagreement as to whether it is sufficient for solving the problems at issue. One prominent criticism is that Sosa does not take his use of virtues far enough. Rather than serving a more basic truth-goal, some argue that virtues should be conceived as central to the epistemic project.

Lorraine Code (1984) coined the term virtue responsibilism in contrast to Sosa’s reliabilism. It is the view that justification, or rather, being an intellectually responsible agent, is a matter of acting virtuously in the practice of inquiry. Code argues that epistemic responsibility is the central intellectual virtue. Similarly, James Montmarquet argues that “S is subjectively justified in believing p insofar as S is epistemically virtuous in believing p” (1993: 99). This means that virtue responsibilism is internalist through and through.

Not all virtue responsibilists, however, eschew the truth-goal. As Linda Zagzebski explains, “It would not do any good for a person to be attentive, thorough, and careful unless she was generally on the right track” (2009: 82). But unlike externalist foundationalism, “the right track,” according to virtue epistemologists, does not necessarily include producing more true beliefs than false. There is more than one virtuous outcome, for example, in cases of creativity or inventiveness. It may be that “only 5 per cent of a creative thinker’s original ideas turn out to be true,” Zagzebski explains. “Clearly, their truth conduciveness in the sense of producing a high proportion of true beliefs is much lower than that of the ordinary virtues of careful and sober inquiry, but they are truth conducive in that they are necessary for the advancement of knowledge” (2000: 465). This suggests that the conditions under which a subject is justified are highly contingent on changing context and the goal of our epistemic behaviors. And virtue epistemologists argue that this captures the typical contingency of our epistemic lives.

c. Objections to Virtue Epistemology

In addition to internal disputes between virtue reliabilists and responsibilists, there are more serious concerns with the adequacy of virtue epistemology. Virtue reliabilism faces many of the same criticisms that face traditional reliabilism, including the generality problem, the New Evil Demon Problem, and the metaincoherence problems. Further, although there is an intuitive sense in which a reliably functioning method of forming beliefs is virtuous (in the Aristotelian sense of “excellence”), it is not clear how virtue reliabilism is substantively different from classical reliabilism. To be sure, virtue reliabilists take special pains to explain the roles of context, luck, and the knower’s aptness in forming beliefs, but these do not seem unavailable to traditional reliabilists.

Similarly, virtue responsibilism faces many of the same problems as virtue ethics. There are questions about which intellectual states count as epistemic virtues (different responsibilists have different lists), whether some virtues should be privileged over others (for example, James Montmarquet (1992) argues that epistemic conscientiousness is the preeminent intellectual virtue), and the ontological status of virtues (whether they are real dispositions or simply heuristics for categorizing types of behavior). There are also serious concerns about some extreme versions of responsibilism that completely disconnect intellectual virtue from truth-seeking, as with Code’s account, rendering discussions of intellectual virtue the province of ethics rather than epistemology.

To alleviate some of these concerns, some virtue epistemologists defend a mixed theory, arguing that an adequate virtue epistemology requires both a reliability and a responsibility condition (Greco 2000).

A general concern for both types of virtue epistemology is that virtue theory associates justification too closely with the idea of credit or achievement, that is, with whether a person has formed beliefs well. Jennifer Lackey (2007, 2009), for example, argues that if knowledge is produced by the virtuous activity of others (like that of a reliable witness) or if knowledge is innate, then it is not obvious how a person’s belief-forming behavior can be virtuous or vicious, as there is no behavior involved. In the case of the reliable witness, a hearer simply accepts a belief on the basis of the witness’s testimony. In the case of innate knowledge, the knower does nothing to increase the likelihood that her beliefs are reliable; they are reliable for reasons outside her epistemic behavior. If these criticisms are right, virtue epistemology may be unable to explain a range of important types of knowledge.

For a more detailed treatment of virtue epistemology, see Virtue Epistemology.

9. The Value of Justification

Each of the theories of justification reviewed in this article presumes something about the value of justification, that is, about why justification is good or desirable. Traditionally, as in the case of Theaetetus noted above, justification is supposed to position us to understand reality, that is, to help us obtain true beliefs for the right reasons. Knowledge, we suppose, is valuable, and justification helps us attain it. However, skeptical arguments, the influence of external factors on our cognition, and the influence of various attitudes on the way we conduct our epistemic behavior suggest that attaining true beliefs for the right reasons is a forbidding goal, and it may not be one that we can access internally. Therefore, there is some disagreement as to whether justification should be understood as aimed at truth or some other intellectual goal or set of goals.

a. The Truth Goal

All the theories we have considered presume that justification is a necessary condition for knowledge, though there is much disagreement about what precisely justification contributes to knowledge. Some argue that justification is fundamentally aimed at truth, that is, it increases the likelihood that a belief is true. Laurence BonJour writes, “If epistemic justification were not conducive to truth in this way…then epistemic justification would be irrelevant to our main cognitive goal and of dubious worth” (1985: 8). Others argue that there are a number of epistemic goals other than truth and that in some cases, truth need not be among the values of justification. Jonathan Kvanvig explains:

[I]t might be the case that truth is the primary good that defines the theoretical project of epistemology, yet it might also be the case that cognitive systems aim at a variety of values different from truth. Perhaps, for instance, they typically value well-being, or survival, or perhaps even reproductive success, with truth never really playing much of a role at all. (2005: 285)

Given this disagreement, we can distinguish between what I will call the monovalent view, which takes truth as the sole, or at least fundamental, aim of justification, and the polyvalent view (or, as Kvanvig calls it, the plurality view), which allows that there are a number of aims of justification, not all of which are even indirectly related to truth.

b. Alternatives to the Truth Goal

One motive for preferring the monovalent view is that, if truth is not the primary goal of justification (that is, if justification does not connect belief with reality in the right way), then one is left only with goals that are not epistemic, that is, goals that cannot contribute to knowledge. The primary worry is that, in rejecting the truth goal, one is left with pragmatism. In response, those who defend polyvalence argue that, in practice, there are other cognitive goals that (1) are not merely pragmatic and (2) meet the conditions for successful cognition. Kvanvig explains that “not everyone wants knowledge…and not everyone is motivated by a concern for understanding. … We characterize curiosity as the desire to know, but small children lacking the concept of knowledge display curiosity nonetheless” (2005: 293). Further, much of our epistemic activity, especially in the sciences, is directed toward “making sense of the course of experience and having found an empirically adequate theory” (ibid., 294). Such goals can be pursued without appealing to truth at all. If this is right, justification aims at a wider array of cognitive states than knowledge.

Another argument for polyvalence allows that knowledge is the primary aim of justification but that much more is involved in justification than truth. The idea is that, even if one were aware of belief-forming strategies that are conducive to truth (following the evidence where it leads; avoiding fallacies), one might still not be able to use those strategies without having other cognitive aims, namely, intellectual virtues. Following John Dewey, Linda Zagzebski says that “it is not enough to be aware that a process is reliable; a person will not reliably use such a process without certain virtues” (2000: 463). As noted above, virtue responsibilists allow that the goal of having a large number of true beliefs can be superseded by the desire to create something original or inventive. Further still, following strategies that are truth-conducive under some circumstances can lead to pathological epistemic behavior. Amélie Rorty, for example, argues that belief-forming habits become pathological when they continue to be applied in circumstances no longer relevant to their goals (Zagzebski, ibid., 464). If this argument is right, then truth is, at best, an indirect aim of justification, and intellectual virtues like openness, courage, and responsibility may be more important to the epistemic project.

c. Objections to the Polyvalent View

One response to the polyvalent view is to concede that there are apparently many cognitive goals that fall within the purview of epistemology but to argue that all of these are related to truth in a non-trivial way. The goal of having true beliefs is a broad and largely indeterminate goal. According to Marian David, we might fulfill it by believing a truth, by knowing a truth, by having justified beliefs, or by having intellectually virtuous beliefs. All of these goals, argues David, are plausibly truth-oriented in the sense that they derive from, or depend on, a truth goal (David 2005: 303). David supports this claim by asking us to consider which of the following pairs is more plausible:

A1. If you want to have TBs [true beliefs] you ought to have JBs [justified beliefs].

A2. We want to have JBs because we want to have TBs.

B1. If you want to have JBs you ought to have TBs.

B2. We want to have TBs because we want to have JBs. (2005: 303)

David says, “[I]t is obvious that the A’s [sic] are way more plausible than the B’s. Indeed, initially one may even think that the B’s have nothing going for them at all, that they are just false” (ibid.). This intuition, he concludes, tells us that the truth-goal is more fundamental to the epistemic project than anything else, even if one or more other goals depend on it.

Almost all theories of epistemic justification allow that we are fallible, that is, that our justified beliefs, even if formed by reliable processes, may sometimes be false. Nevertheless, this does not detract from the claim that the aim of justification is true belief, so long as it is qualified as true belief held in the right way.

d. Rejections of the Truth Goal

In spite of these arguments, some philosophers explicitly reject the truth goal as essential to justification and cognitive success. Michael Williams (1991), for example, rejects the idea that truth even could be an epistemic goal when conceived of as “knowledge of the world.” Williams argues that in order for us to have knowledge of the world, there must be a unified set of propositions that constitute knowledge of the world. Yet, given competing uses of terms, vague domains of discourse, the failure of theoretical explanations, and the existence of domains of reality we have yet to encode into a discipline, there is not a single, unified reality to study. Williams argues that because of this, we do not necessarily have knowledge of the world:

All we know for sure is that we have various practices of assessment, perhaps sharing certain formal features. It doesn’t follow that they add up to a surveyable whole, to a genuine totality rather than a more or less loose aggregate. Accordingly, it does not follow that a failure to understand knowledge of the world with proper generality points automatically to an intellectual lack. (543)

In other words, our knowledge is not knowledge of the world—that is, access to a unified system of true beliefs, as the classical theory would have it. It is knowledge of concepts in theories putatively about the world, constructed using semantic systems that are evaluated in terms of other semantic systems. If this is, in fact, all there is to knowing, then truth, at least as classically conceived, is not a meaningful goal.

Another philosopher who rejects the truth goal is Stephen Stich (1988; 1990). Stich argues that, given the vast amount of disagreement among novices and experts about what counts as justification, and given the many failures of theories of justification to adequately ground our beliefs in anything other than calibration among groups of putative experts, it is simply unreasonable to believe that our beliefs track anything like truth. Instead, Stich defends pragmatism about justification, that is, justification just is practically successful belief; thus, truth cannot play a meaningful role in the concept of justification.

A response to both views might be that, in each case, the truth goal has not been abandoned but simply redefined or relocated. Correspondence theories of truth take it that propositions are true just in case they express the world as it is. If the world is not expressible propositionally, as Williams seems to suggest, then this type of truth is implausible. Nevertheless, a proposition might be true in virtue of being an implication of a theory, and so, for example, we might adopt a more semantic than ontological theory of truth, and it is not clear whether Williams would reject this sort of truth as the aim of epistemology.

Similarly, someone might object to Stich’s treating pragmatism as if it is not truth-conducive in any relevant sense. If something is useful, it is true that it is useful, even in the correspondence sense. Even if evidence does not operate in a classical representational manner, the success of beliefs in accomplishing our goals is, nevertheless, a truth goal. (See Kornblith 2001 for an argument along these lines.)

10. Conclusion

Epistemic justification is an evaluative concept about the conditions for right or fitting belief. A plausible theory of epistemic justification must explain how beliefs are justified, the role justification plays in knowledge, and the value of justification. A primary motive behind theories of justification is to solve the dilemma of inferential justification. To do this, one might accept the inferential assumption and argue that justification emerges from a set of coherent beliefs (internalist coherentism) or an infinite set of beliefs (infinitism). Alternatively, one might reject the inferential assumption and argue that justification derives from basic beliefs (internalist foundationalism) or through reliable belief-forming processes (externalist reliabilism). If none of these views is ultimately plausible, one might pursue alternative accounts. For example, virtue epistemology introduces character traits to help avoid problems with these classical theories. Other alternatives include hybrid views, such as Conee and Feldman’s (2008), mentioned above, and Susan Haack’s (1993) foundherentism.

11. References and Further Reading

  • Aikin, S. 2009. “Don’t Fear the Regress: Cognitive Values and Epistemic Infinitism.” Think, 23, 55-61.
  • Aikin, S. F. 2011. Epistemology and the Regress Problem. London: Routledge.
  • Alston, W. P. 1988. “An Internalist Externalism,” Synthese, 74, 265-283.
  • Alston, W. P. 1989. Epistemic Justification. Ithaca: Cornell University Press.
  • Armstrong, D. M. 1973. Belief, Truth, and Knowledge. Cambridge: Cambridge University Press.
  • Bach, K. 1985. “A Rationale for Reliabilism.” The Monist, 68, 246-63. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 199-213. Cited pages are to this anthology.
  • Bergmann, M. 2006. Justification Without Awareness. New York: Oxford.
  • Blanshard, B. 1939. The Nature of Thought. London: Allen & Unwin.
  • BonJour, L. 1980. “Externalist Theories of Empirical Knowledge.” Midwest Studies in Philosophy 5: Studies in Epistemology. Minneapolis: University of Minnesota Press, 53-73.
  • BonJour, L. 1985. The Structure of Empirical Knowledge. Cambridge: Harvard University Press.
  • BonJour, L. and E. Sosa. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Malden: Wiley-Blackwell.
  • Chisholm, R. 1966. Theory of Knowledge. Englewood Cliffs: Prentice Hall.
  • Chisholm, R. 1982. “A Version of Foundationalism,” in The Foundations of Knowing, ed. R. Chisholm. Minneapolis: University of Minnesota Press.
  • Chisholm, R. 1989. Theory of Knowledge, 3rd ed. Englewood Cliffs: Prentice Hall.
  • Code, L. 1984. “Toward a ‘Responsibilist’ Epistemology.” Philosophy and Phenomenological Research, 45 (1), 29–50.
  • Cohen, S. and K. Lehrer. 1983. “Justification, Truth, and Knowledge.” Synthese 55 (2), 191-207.
  • Conee, E. and R. Feldman. 2004. Evidentialism. New York: Oxford University Press.
  • Conee, E. and R. Feldman. 1998. “The Generality Problem for Reliabilism,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 372-386. Page numbers are to this anthology.
  • Dancy, J. 1985. Introduction to Contemporary Epistemology. Oxford: Basil Blackwell.
  • David, M. 2005. “Truth as the Primary Epistemic Goal: A Working Hypothesis,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 296-312.
  • Davidson, D. “A Coherence Theory of Truth and Knowledge,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 154-63.
  • Elgin, C. 2005. “Non-foundationalist Epistemology: Holism, Coherence, and Tenability,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 156-67.
  • Feldman, R. 1974. “An Alleged Defect in Gettier Counter-examples.” The Australasian Journal of Philosophy. 52, 68-69.
  • Feldman, R. 1988. “Having Evidence,” in Philosophical Analysis, ed. D. F. Austin. Kluwer Academic Publishers, 83-104.
  • Feldman, R. 2005. “Justification Is Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 270-84.
  • Fitelson, B. 2003. “A Probabilistic Measure of Coherence.” Analysis, 63, 194–199.
  • Frankfurt, H. 1973/2008. Demons, Dreamers, and Madmen: The Defense of Reason in Descartes’s Meditations. Princeton: Princeton University Press.
  • Gettier, E. 1963. “Is Justified True Belief Knowledge?” Analysis. 23, 121-23.
  • Ginet, C. 2001. “Deciding to Believe,” in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford: Oxford University Press, 63-76.
  • Ginet, C. 2005. “Infinitism Is not the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 140-149.
  • Goldman, A. 1967. “A Causal Theory of Knowing.” The Journal of Philosophy, 64, 357-72.
  • Goldman, A. 1976. “Discrimination and Perceptual Knowledge.” The Journal of Philosophy, 73, 771-91.
  • Goldman, A. 1979. “What Is Justified Belief?” in Knowledge and Justification, ed. George S. Pappas. Dordrecht, Holland: D. Reidel Publishing, 1-23.
  • Greco, J. 2005. “Justification Is Not Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 257-70.
  • Haack, S. 1993. Evidence and Inquiry. Malden: Blackwell Publishing.
  • Harman, G. 1986. Change in View. Cambridge: MIT Press.
  • Heil, J. 1983. “Doxastic agency.” Philosophical Studies, 43 (3), 355-364.
  • Hempel, C. 1935. “On the Logical Positivist’s Theory of Truth.” Analysis, 2 (4), 49-59.
  • Klein, P. 2005. “Infinitism Is the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 131-40.
  • Klein, P. 2014. “No Final End in Sight,” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 95-115.
  • Klein, P. and T. A. Warfield. 1994. “What Price Coherence?” Analysis, 54, 129–132.
  • Kornblith, H. 2001. Knowledge and Its Place in Nature. Oxford: Oxford University Press.
  • Kvanvig, J. L. and W. D. Riggs. 1992. “Can a Coherence Theory Appeal to Appearance States?” Philosophical Studies, 67, 197-217.
  • Kvanvig, J. 2005. “Truth Is not the Primary Epistemic Goal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 285-96.
  • Lackey, J. 2007. “Why we don’t deserve credit for everything we know.” Synthese, 158, 345–361.
  • Lackey, J. 2009. “Knowledge and credit,” Philosophical Studies, 142, 27–42.
  • Lehrer, K. 1974. Knowledge. Oxford: Clarendon Oxford Press.
  • Lehrer, K. 1986. “The Coherence Theory of Knowledge.” Philosophical Topics, 14, 5-25.
  • Lehrer, K. 1990. Theory of Knowledge. Boulder: Westview Press.
  • Lewis, C. I. 1946. An Analysis of Knowledge and Valuation. LaSalle: Open Court.
  • McGrew, T. 1995. The Foundations of Knowledge. Lanham: Rowman & Littlefield.
  • Meinong, A. 1906. “Über die Erfahrungsgrundlagen unseres Wissens” [“On the Experiential Foundations of Our Knowledge”], in Abhandlungen zur Didaktik und Philosophie der Naturwissenschaften, Band [Vol.] I, Heft [Issue] 6, Berlin: J. Springer. Reprinted in Meinong 1968–78, Vol. V: 367–481.
  • Montmarquet, J. 1987. “Epistemic Virtue.” Mind, 96, 482–497.
  • Neurath, O. 1983/1932. “Protocol Sentences,” in R. S. Cohen and M. Neurath, eds. Philosophical Papers 1913–1946. Dordrecht: Reidel.
  • Olsson, E. J. 2009. Against Coherence: Truth, Probability, and Justification. Oxford: Oxford University Press.
  • Plantinga, A. 1983. “Reason and Belief in God,” in A. Plantinga and N. Wolterstorff, eds. Faith and Rationality. Notre Dame: University of Notre Dame Press, 16-93.
  • Plantinga, A. 1993a. Warranted Christian Belief. New York: Oxford.
  • Plantinga, A. 1993b. Warrant: The Contemporary Debate. New York: Oxford.
  • Pollock, J. 1986. Contemporary Theories of Knowledge. Lanham: Rowman & Littlefield Publishers.
  • Poston, T. 2014. Reason and Explanation: A Defense of Explanatory Coherentism. Hampshire: Palgrave Macmillan.
  • Quine, W.V.O. 1970. Web of Belief. Cambridge: Harvard University Press.
  • Russell, B. 1948. Human Knowledge: Its Scope and Value. London: Routledge.
  • Smithies, D. 2014. “Can Foundationalism Solve the Regress Problem?” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 73-94.
  • Sosa, E. 1980. “The Raft and the Pyramid: Coherence Versus Foundations in the Theory of Knowledge.” Midwest Studies in Philosophy, 5 (1), 3–26.
  • Sosa, E. 1991. “Reliabilism and intellectual virtue” in Knowledge in Perspective: Selected Essays in Epistemology. New York: Cambridge University Press, 131-145.
  • Sosa, E. 2001. “Reliabilism and Intellectual Virtue,” in Epistemology: Internalism and Externalism, ed. Hilary Kornblith. Malden: Blackwell Publishers, 147-62.
  • Sosa, E. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume 1. Oxford: Oxford University Press.
  • Stich, S. 1988. “Reflective Equilibrium, Analytic Epistemology, and the Problem of Cognitive Diversity.” Synthese, 74, 391-413.
  • Stich, S. 1990. The Fragmentation of Reason. Cambridge: The MIT Press.
  • Stroud, B. 1989. “Understanding Human Knowledge in General,” in Knowledge and Skepticism, Marjorie Clay and Keith Lehrer, eds. Boulder: Westview, 31-50. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 307-323. Page numbers are to this anthology.
  • Swain, Marshall. 1981. Reasons and Knowledge. Ithaca: Cornell University Press.
  • Williams, M. 2000. “Epistemological Realism,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 536-555.
  • Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundation of Knowledge. Cambridge: Cambridge University Press.
  • Zagzebski, L. 2000. “Virtues of the Mind,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 457-467.
  • Zagzebski, L. 2009. On Epistemology. Belmont: Wadsworth.

 

Author Information

Jamie Carlin Watson
Email: jamie.c.watson@gmail.com
Broward College
U. S. A.

Laozi (Lao-tzu, fl. 6th cn. B.C.E.)

Laozi is the name of a legendary Daoist philosopher, the alternate title of the early Chinese text better known in the West as the Daodejing, and the moniker of a deity in the pantheon of organized “religious Daoism” that arose during the later Han dynasty (25-220 C.E.). Laozi is the pinyin romanization for the Chinese characters which mean “Old Master.” Laozi is also known as Lao Dan (“Old Dan”) in early Chinese sources (see Romanization systems for Chinese terms). The Zhuangzi (late 4th century B.C.E.) is the first text to use Laozi as a personal name and to identify Laozi and Lao Dan. The earliest materials to mention Laozi are in the Zhuangzi’s Inner Chapters (Chs. 1-7) in the narration of Lao Dan’s funeral in Ch. 3. Two other passages provide support for the linkage of Laozi and Lao Dan (in Ch. 14 and Ch. 27). There are seventeen passages in which Laozi plays a role in the Zhuangzi. Three are in the Inner Chapters, eight occur in chapters 11-14 in the Yellow Emperor sections of the text (chs. 11, 12, 13, 14), five are in chapters likely belonging to Zhuang Zhou’s disciples as the sources (chs. 21, 22, 23, 25, 27), and one is in the final concluding editorial chapter (ch. 33). In the Yellow Emperor sections in which Laozi is the main figure, four passages contain direct attacks on Confucius and the Confucian virtues of ren, yi, and li in the form of dialogues. The sentiments expressed by Laozi in these passages are reminiscent of remarks from the Daodejing and probably date from the period in which that collection was reaching some near final form. Some of these themes include the advocacy of wu-wei, rejection of discursive reasoning and mind meddling, condemnation of making discriminations, and valorization of forgetting and fasting of the mind. The earliest ascriptions of authorship of the Daodejing to Laozi are in the Hanfeizi and the Huainanzi.
Over time, Laozi became a principal figure in institutionalized forms of Daoism and he was often associated with the many transformations and incarnations of the dao itself.

Table of Contents

  1. Laozi and Lao Dan in the Zhuangzi
  2. Laozi and the Daodejing
  3. The First Biography and the Establishment of Laozi as the Founder of Daoism
  4. The Ongoing Laozi Myth
  5. References and Further Reading

1. Laozi and Lao Dan in the Zhuangzi

The Zhuangzi gives the following, probably fictional, account of Confucius’s impression of Laozi:

“Master, you’ve seen Lao Dan—what estimation would you make of him?” Confucius said, “At last I may say that I have seen a dragon—a dragon that coils to show his body at its best, that sprawls out to display his patterns at their best, riding on the breath of the clouds, feeding on the yin and yang. My mouth fell open and I couldn’t close it; my tongue flew up and I couldn’t even stammer. How could I possibly make any estimation of Lao Dan!” Zhuangzi, Ch. 14

Laozi’s relationship to Confucius is a major part of the Zhuangzi’s picture of the philosopher. Of the seventeen passages mentioning Laozi, Confucius figures as a dialogical partner or subject in nine. While it is clear that Confucius is thought to have a long way to go to become a zhenren (the Zhuangzi’s way of speaking about the perfected person), Lao Dan seems to feel sorry for Confucius in his reply to Wuzhi “No-Toes” in Ch. 5, The Sign of Virtue Complete. Laozi recommends to Wuzhi that he try to release Confucius from the fetters of his tendency to make rules and human discriminations (for example, right/wrong; beautiful/ugly) and set him free to wander with the dao.

Lao Dan addresses Confucius by his personal name “Qiu” in three passages. Since such a liberty is one that only a person with seniority and authority would take, this style invites us to believe that Confucius was a student of Lao Dan’s and thereby acknowledged Laozi as an authority. In one of these passages in which Lao Dan uses Confucius’s personal name Qiu, he cautions Confucius against clever arguments and making plans and strategies with which to solve life’s problems, telling him that such rhetoricians are simply like nimble monkeys and rat-catching dogs who are set aside when unable to perform (Ch. 12, Heaven and Earth). And on another occasion, Qiu claims that he knows the “six classics” thoroughly and that he has tried to persuade 72 kings to their truth, but they have been unmoved. Lao Dan’s reply is, “Good!” He tells Confucius not to occupy himself with such worn-out ways, and to instead live the dao himself (Ch. 14, Turning of Heaven).

In Sima Qian’s later attempt to provide an actual biography of Laozi (see below), Laozi’s vocation as a librarian figures prominently. If the ultimate source of this tradition is the Zhuangzi, we should not forget that the context of this record is as a component in the theme that Laozi taught Confucius, who was confused and having no success with his own teachings. Accordingly, the point of the story that mentions Laozi’s occupation as a librarian or an archivist (ch. 13) is that Confucius’s writings, offered to Laozi by Confucius himself, are simply not worthy to be put into a library. We cannot be sure, then, that any real memory of Laozi’s occupation is being preserved for us, as the story may be an entire fiction meant to make a point about the inadequacy of Confucius’s teachings.

Finally, in Ch. 14, Turning of Heaven, Lao Dan makes a direct attack not only on the rules and regulations of Confucius, but also on the teachings of the Mohists and the veneration of the ancient emperors and legendary sages of the past, displaying his preference for experiential oneness with dao over any teaching or tradition of philosophers or great minds of the past.

2. Laozi and the Daodejing

The ways in which the expressions of Laozi in the seventeen passages in which he occurs in the Zhuangzi sound like sentiments in the Daodejing (hereafter, DDJ) represent collectively one basis for the traditional identification of Laozi as author of the text. For example, at Laozi’s funeral in Ch. 3, Qin Shi valorizes Laozi by saying that he accomplished much, without appearing to do so, which is a reference both to the Old Master’s rejection of pursuit of fame and power and also praise for his conduct as wu-wei (effortless action) in oneness with dao. Qin Shi’s praise of Laozi is also consistent with Laozi’s teaching to Yangzi Ju in Ch. 7 not to seek fame and power. Such conduct and attitudes are encouraged strongly in DDJ 2, 7, 22, 24, 51 and 77. When Laozi tells Wuzhi to return to Confucius and set him free from the disease of problematizing life and tying himself in knots by helping him to empty himself of making discriminations (Zhuangzi ch. 5), this same teaching shows up in the DDJ in many places (for example, chs. 5 and 18). Likewise, when Laozi criticizes Confucius for trying to spread the classics (12 in number in ch. 13 and 6 in ch. 14) instead of valuing the wordless teaching, the DDJ has a ready parallel in Ch. 56. While Confucius is teaching his disciples to put forth effort and cultivate benevolence (ren) and appropriate conduct (yi), Laozi tells him that he should be teaching effortless action (wu-wei) (Zhuangzi chs. 13, 14, and 21). This teaching also shows up in the DDJ (chs. 2, 3, 20, 47, 48, 57, 63, and 64). Finally, if we take Zhuangzi Ch. 33 as an original part of the work, then Lao Dan (Laozi) actually quotes DDJ 28.

In addition to the ways in which Laozi’s teachings in the Zhuangzi sound like those of the DDJ, we should also note that both of the very early classical works known as the Hanfeizi and the Huainanzi contain passages that are direct quotes of or unmistakable allusions to teachings in the DDJ and attribute them to Lao Dan or Laozi by name. Tae Hyun Kim has made a study of these passages in the Hanfeizi, and the recent English translation of the Huainanzi by John Major and others makes it easy to locate these citations (for example, see Huainanzi, 11.3). All of these connections culminate in Sima Qian’s biography of Laozi (see below), which not only says that Laozi was the author of the DDJ, but explains that it was a written text of Laozi’s teachings given when he departed China to go to the West. So, by the 1st century B.C.E., it was accepted by tradition and lore that Laozi was the author of the DDJ.

However, the attribution of authorship of the DDJ to Laozi is much more complicated than it first appears.  The DDJ has 81 chapters and about 5,000 Chinese characters, depending on which text is used. Its two major divisions are the dao jing (chs. 1-37) and the de jing (chs. 38-81). But actually, this division probably rests on nothing other than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). Moreover, although the text has been studied by commentators in Chinese history for centuries, the general reverence shown to it, and the long standing tradition that it was the work of the great philosopher Laozi, were two factors militating against any critical literary analysis of its structure. What we know now is that in spite of the view that the text had a single author named Laozi, it is clear to textual critics that the work is a collection of smaller passages edited into sections and not the work of a single hand. Most of these probably circulated orally, perhaps as single teachings or in small collections. Later they were gathered and arranged by an editor.

The internal structure of the DDJ is only one ground for the denial of a single author for the text.  The fact that we also now know there were multiple versions of the DDJ, even as early as 300 B.C.E., also suggests that it is unlikely that a single author wrote just one book that we now know as the DDJ.  Consider that for almost 2,000 years the Chinese text used by commentators in China and upon which all except the most recent Western language translations were based has been called the Wang Bi, after the commentator who made a complete edition of the DDJ sometime between 226-249 C.E. Although Wang Bi was not a Daoist, the commentary he wrote after collecting and editing the text became a standard interpretive guide, and generally speaking even today scholars depart from his arrangement of the actual text only when they can make a compelling argument for doing so. However, based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s we have no doubt that there were several simultaneously circulating versions of the DDJ text that pre-dated Wang Bi’s compilation of what we now call the “received text.”

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries include two incomplete editions of the DDJ on silk scrolls (boshu) now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. First, there are some divergences in word choice. Second, the order of the two major divisions is reversed: chapters 38-81 of the Wang Bi come before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 41, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching.

The Guodian find consists of 730 inscribed bamboo slips found in a tomb near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ and corresponding only to Chapters 1-66. Based on the probable date of the closing of the tomb, the version of the DDJ found within it may date as early as c. 300 B.C.E.

3. The First Biography and the Establishment of Laozi as the Founder of Daoism

We have now arrived at the stage where studies of Laozi’s biography usually begin.

The first known attempt to write a biography of Laozi is in the Shiji (Historical Records) by Sima Qian (145-89 B.C.E.). According to this text, Laozi was a native of Chu, a southern state of the Zhou dynasty.  His surname was Li, and his personal name was Er, and his style name was Dan. Sima Qian reports that Laozi was a historiographer in charge of the archives of Zhou.  Moreover, Sima Qian tells us that Confucius had traveled to see Laozi to learn about the performance of rituals from him.  According to The Book of Rites (Liji), a master known as Lao Dan was an expert on mourning rituals. On four occasions, Confucius (Kongzi, Master Kong) is reported to have responded to questions by appealing to answers given by Lao Dan. The records even say that Confucius once assisted him in a burial service.  Just what date we can put on this record from The Book of Rites is uncertain, but it may have informed Sima Qian’s biography.

According to the biography, during the course of their conversations Laozi told Confucius to give up his prideful ways and seeking of power. When Confucius returned to his disciples, he told them that he was overwhelmed by the commanding presence of Laozi, which was like that of a mighty dragon. The biography goes on to say that Laozi cultivated the dao and its de. However, as the state of Zhou continued to decline, Laozi decided to leave China through the Western pass (toward India), and upon his departure he gave to the keeper of the pass, one Yin Xi, a book divided into two parts, one on dao and one on de, of 5,000 characters in length. After that, no one knew what became of him. This is perhaps the most familiar of the traditions narrated by Sima Qian, and it contains the core of almost every subsequent biography or hagiography of Laozi of significance. However, the biography did not end here. Sima Qian went on to record what other sources said about Laozi.

In the first biography, Sima Qian says some report that Laolaizi came from Chu, was a contemporary of Confucius, and he authored a work in fifteen sections which speaks of the practical uses of the Daoist teachings.  But Sima Qian leaves it undecided whether he thinks Laolaizi should be identified with Laozi, even if he does include this reference in the section on Laozi.

Sima Qian adds another layer to the biography without commenting on the degree of confidence he has in its truthfulness, according to which it is said that Laozi lived 160 years or even 200 years, as a result of cultivating the dao and nurturing his longevity.

An additional tradition included in the first biography is that Dan, the historiographer of Zhou, predicted in 479 B.C.E. that Zhou and Qin would break apart and that a new king would arise from Qin. The point of this tradition is that Dan (Lao Dan?) had the power to predict the political future of the people, including the fragmentation of the Zhou dynasty and the rise of the Qin in about 221 B.C.E. (that is, Qinshihuang, or the first emperor of China). But Sima Qian likewise refuses to identify Laozi with this Dan.

Finally, the first biography concludes with a reference to Laozi’s son and his descendants. Another movement in the evolution of the Laozi story was completed by about 240 B.C.E. This was necessitated by Lao Dan’s association with the grand historiographer Dan during the Zhou, who predicted the rise of the Qin state. This information, along with that of Laozi’s journey to the West and of the writing of the book for Yin Xi, won a favorable position for Laozi during the Qin dynasty. The association of Laozi with a text (the DDJ) that was becoming increasingly significant was important. However, with the demise of the Qin state, some realignment of Laozi’s connection with it was needed. So, Sima Qian’s final remarks about Laozi’s son helped to associate the philosopher’s lineage with the new Han ruling family. The journey to the West component now also had a new force. It explained why Laozi was not presently advising the Han rulers.

Overall, it seems that the earliest biography conforms closely to passages contained in Zhuangzi Chapters 11-14 and 26 in associating Laozi with the archivist or historiographer of Zhou, Laozi’s rebuke of Confucius’s prideful seeking of fame and pursuit of power, and the report that Confucius told his disciples that Laozi was like a great dragon. It is possible, then, that Zhuangzi is thus the ultimate source of Sima Qian’s information.

Sima Qian also says, “Laozi cultivated the dao and its virtue (de).” We recognize of course that “dao and its virtue” is dao and de and that this phrase is meant to solidify Laozi’s association with the Daodejing. What the Zhuangzi, Hanfeizi and Huainanzi only alluded to by putting near quotes from the DDJ in the mouth of Laozi, Sima Qian now makes into an explicit connection. He even tells us that when the Zhou kingdom began to decline, Laozi decided to leave China and head into the West. When he reached the mountain pass, the keeper of the pass (Yin Xi) insisted that he write down his teachings, so that the people would have them after he left. So, “Laozi wrote a book in two parts, discussing the ideas of the dao and of de in some 5,000 words, and departed. No one knows where he ended his life.” These remarks make an unmistakable connection between what Laozi is said to have delivered to Yin Xi and the two sectional divisions of the DDJ, and they give a very close approximation of its number of characters.

Sima Qian classified the Six Schools as Yin-Yang, Confucian, Mohist, Legalists, School of Names, and Daoists. Since his biography located Laozi in a time period predating the Zhuangzi, and the passages in the Zhuangzi seemed to be about a person who lived in the time of Confucius (and not to be simply a literary or traditional invention), then the inference was easy to make that Laozi was the founder of the Daoist school.

4. The Ongoing Laozi Myth

In The Lives of the Immortals (Liexian zhuan) by Liu Xiang (79-8 B.C.E.) there are separate entries for Laozi and Yin Xi. According to the extension of the story of Laozi’s leaving China through the Western pass found in Liu Xiang’s work, Yin became a disciple of Laozi and begged him to allow him to go to the West as well. Laozi told him that he could come along, but only after he cultivated the dao. Laozi instructed Yin to study hard and await a summons which would be delivered to him in the marketplace in the city of Chengdu. There is now a shrine at the putative location of this site dedicated to the “ideal disciple.” Additionally, in Liu Xiang’s text it is clear that Laozi is valorized as the preeminent immortal and as a superior daoshi (fangshi) who had achieved not only immortality through wisdom and the practice of techniques for longevity, but also mastery of the arts associated with the abilities and skills of one who was united with dao (compare the “Spirit Man” living in the Gushe mountains in Zhuangzi ch. 1 and Wang Ni’s remarks on the perfected person or zhenren in ch. 2).

Another important stage in the development of Laozi’s place in Chinese philosophical history occurred when Emperor Huan (147-167 C.E.) built a palace on the traditional site of Laozi’s birthplace and authorized veneration and sacrifice to Laozi. The “Inscription to Laozi” (Laozi ming) written by Pian Shao in c. 166 C.E. as a commemorative marker for the site goes well beyond Sima Qian’s biography. It makes the first apotheosis of Laozi into a deity. The text makes reference to the many cosmic metamorphoses of Laozi, portraying him as having been counselor to the great sage kings of China’s misty pre-history. Accordingly, during this period of the 2nd and 3rd centuries, the elite at the imperial court divinized Laozi and regarded him as an embodiment or incarnation of the dao, a kind of cosmic emperor who knew how to bring things into perfect harmony and peace by acting in wu-wei.

The Daoist cosmological belief in the powers of beings who experienced unity with the dao to effect transformation of their bodies and powers (for example, Huzi in Zhuangzi, ch.7) was the philosophical underpinning of the work, Classic on the Transformations of Laozi (Laozi bianhua jing, late 100s C.E., available now in a Dunhuang manuscript dating 612 C.E.). This work reflects some of the ideas in Pian Shao’s inscription, but takes them even further. It tells how Laozi transformed into his own mother and gave birth to himself, taking quite literally comments in the DDJ where the dao is portrayed as the mother of all things (DDJ, ch. 1). The work associates Laozi with various manifestations or incarnations of the dao itself.  In this text there is a complete apotheosis of Laozi into a numinal divinity. “Laozi rests in the great beginning, wanders in the great origin, floats through dark, numinous emptiness…He joins serene darkness before its opening, is present in original chaos before the beginnings of time….Alone and without relation, he has existed since before heaven and earth. Living deeply hidden, he always returns to be. Gone, the primordial; Present, a man” (Quoted in Kohn, “Myth,” 47). The final passage in this work is an address given by Laozi predicting his reappearance and promising liberation from trouble and the overthrow of the Han dynasty, an allusion that helps us fix the probable date of origin for the work.   The millennial cults of the second century believed Laozi was a messianic figure who appeared to their leaders and gave them instructions and revelations (for example, the hagiography of Zhang Daoling, founder of the Celestial Master Zhengyi movement contained in the 5th century work, Taiping Guangji 8).

The period of the Celestial Masters (c. 142-260 C.E.) produced documents enhancing the myth of Laozi, who came then to be called Laojun (Lord Lao) or Taishang Laojun (Most High Lord Lao). Laojun could manifest himself in any time of unrest and bring Great Peace (taiping). Yet, the Celestial Masters never claimed that Laojun had done so in their day. Instead of such a direct manifestation, the Celestial Masters practitioners taught that Laojun transmitted to them talismans, registers, and new scriptures in the form of texts to guide the creation of communities of heavenly peace. One work, very likely from the late 3rd or early 4th century C.E., entitled The Hundred and Eighty Precepts Spoken by Lord Lao (Laojun shuo yibai bashi jie), became the earliest set of behavioral guides for Celestial Masters communities. According to the text, Laozi delivered these precepts after returning from India and finding the people in a state of corruption.

During the reign of Emperor Huidi of the Western Jin dynasty (290-306 C.E.), Wang Fu, a master within the Daoist sectarian group known as the Celestial Masters, often debated with the Buddhist monk Bo Yuan about philosophical beliefs. Scholarly consensus holds that, as a result of these exchanges, Wang Fu compiled a one-scroll work entitled Classic of the Conversion of the Barbarians (Huahu jing, c. 300 C.E.). The work is also known by the title The Supreme Numinous Treasure’s Sublime Classic on Laozi’s Conversion of the Barbarians (Taishang lingbao Laozi huahu miaojing). Perhaps the most inflammatory claim of this work was its teaching that when Laozi left China through the Western pass he went to India, where he transformed into the historical Buddha and converted the barbarians. The basic implication of the book was that Buddhism was actually only a form of Daoism. This work inflamed Buddhists for centuries. In fact, both of the Tang Emperors Gaozong (649-683 C.E.) and Zhongzong (705-710 C.E.) gave imperial orders to prohibit its distribution. However, as bitter contention continued between Buddhism and Daoism, the Daoists actually expanded the Classic of the Conversion of the Barbarians, so that by 700 C.E. it was ten scrolls in length. Four of these were recovered in the Dunhuang cache of manuscripts. The much extended work came to include the account that Laozi entered the mouth of a queen in India and the next year was born from her right armpit to become the Buddha. He walked immediately after his birth, and “from then on Buddhist teaching came to flourish.” To those familiar with the hagiographies of the Buddha, virtually all of this birth account is recognizable as associated with Buddha, not Laozi.

In the course of the production of polemical writings on the Buddhist side of the debate, attempts were made to turn the tables on the Daoists.  Laozi was portrayed as a bodhisattva or disciple of the Buddha sent to convert the Chinese.  This theory had other desirable extensions from a Buddhist viewpoint, because it was also applied to Confucius, enabling Buddhist rhetoricians to hold that Confucius was an avatar of Buddhism and that Confucianism was actually a form of distorted Buddhism.

Most later writings about Laozi continued to base their appeals to Laozi’s authority on his ongoing transformations, but they likewise provide evidence of the growing tension between Daoism and Buddhism. The first mythological account of Laozi’s birth is in the Classic of the Inner Explanation of the Three Heavens (Santian neijie jing), a Celestial Master work dated about 420 C.E. In this text, Laozi has three births: as the manifestation of the dao from pure energy to become a deity in heaven; in human form as the ancient philosopher author of the DDJ; and as the Buddha after his journey to the West. In the first birth, his mother is known as “The Jade Maiden of Mystery and Wonder.” In his second, he is born to a human woman known as Mother Li. This was an eighty-one year pregnancy, after which he was born from her left armpit (there is a tradition that Buddha had been born from his mother’s right arm pit). At birth he had white hair and so he was called laozi (here meaning something more like lao haizi or Old Child). This birth is set in the time of the Shang dynasty, several centuries before the date Sima Qian reports. But the purpose of such a move in the Laozi legend is to allow him time to travel to the West and then become the Buddha. The third birth takes place in India as the Buddha.

In the Yuan dynasty (1285 C.E.), Emperor Shizu ordered the burning of the Daoist canon of texts, and according to lore, the first writing destroyed was the greatly extended version of Classic of the Conversion of the Barbarians in ten or more scrolls. Once again, though, the text and its story of Laozi seemed quite resilient. It reappeared in the form of an illustrated work entitled Eighty-one Transformations of Lord Lao (Laojun bashiyi hua tushuo). The Buddhist thinker Xiangmai wrote a detailed, but polemical, history of this text and few scholars trust its reliability. Whether the Eighty-one Transformations of Lord Lao still survives is arguable, although a work entitled Eighty-one Transformations of the Most High Lord Lao of Mysterious Origin of the Golden Portal (Jinque xuanyuan Taishang Laojun bashiyi hua tushuo) with illustrations and dating to 1598 is held in the Museum für Völkerkunde in Berlin. The version in Berlin provides an illustration for each of Laozi’s transformations, each accompanied by a short text. The first few depict his existence in cosmic time. It is not until the 11th transformation that he enters historical time during the era of Fu Xi by the name Yuhuazi. In his 34th transformation, Laozi sends Yin Xi to explain the sutras to the Indian barbarians. The 58th transformation is Laozi’s appearance in the clouds to Zhang Daoling, the founder of the Celestial Master Zhengyi sect of Daoism that still exists today.

Ge Hong’s (283-343 C.E.) The Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is arguably the most important Daoist philosophical work of the Jin dynasty. In this text, Ge Hong reports that in a state of visualization he saw Laozi, seven feet tall, with cloudlike garments of five colors, wearing a multi-tiered cap and carrying a sharp sword. According to Ge Hong, Laozi had a prominent nose, long eyebrows, and an elongated head. This physiological type was the template for portraying immortals in Daoist art. Liu Xiang’s Collected Biographies of the Immortals (Liexian zhuan, c. 18 B.C.E.) reports that Laozi was born during the Shang dynasty, served as an archivist under the Zhou, was a teacher of Confucius, and later made his way to the West, just as Sima Qian’s standard biography says. Ge Hong, for his part, also collected and edited the Biographies of Immortals (Shenxian zhuan). In its entry on Laozi, Ge Hong praises Laozi’s practice of stillness and wu-wei, but he also represents Laozi as a master of the techniques of immortality and the efficacy of external alchemy, herbs and control of qi. He attributes to Laozi what is called the alchemy of the nine cinnabars and eight minerals, as well as a vast knowledge of herbology and dietetics. Ge Hong also tells a story about one Xu Jia who was a retainer of Laozi. In the story, Laozi keeps Xu Jia alive by means of a powerful talisman placed in Xu’s mouth. Its removal causes Xu’s death. When it is replaced, Xu Jia lives again. In all this, Laozi is portrayed as a master of life and death by means of talismanic power, a practice used by the Celestial Masters and continued by Daoist masters as late as the Ming dynasty, if not into the present era.

Other reported manifestations of Laozi gave authority to new Daoist lineages or modifications of practice. For example, the Daoist master Kou Qianzhi reported a revelation received from Laozi in 415 C.E. which was a “New Code” for Daoist practitioners and communities. He wrote down the revelation in a text that became known as Classic on Precepts of Lord Lao Recited to the Melody of the Clouds (Laojun yinsong jiejing). This text contains 36 moral precepts, each of which traces its authority to the introductory phrase, “Lord Lao said….” Textual traces are not the only sources for the traditions and views of Laozi in Chinese philosophical history. Yoshiko Kamitsuka has done a study of how views about Laozi changed and were reflected in material culture, especially sculpture and inscription.

Laozi was also often looked to for political validation.  Throughout most of the Tang dynasty (618-907 C.E.), Laozi was regarded as the protector of the state because of the tradition that both the Tang ruling family and Laozi shared the surname Li and because of many reports of auspicious appearances of Laozi at the inauguration of the Tang dynasty in which he pledged his support during the rise and solidification of the ruling bureaucracy.

The hagiography of Laozi has continued to develop, down to the present day. There are even traditions that various natural geographic landmarks and features are the enduring imprint of Lord Lao on China and his face can be seen in them. It is more likely, of course, that Laozi’s immortality is in the mark made by the philosophical movement he has come to represent and the culture it created.

5. References and Further Reading

  • Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
  • Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
  • Boltz, William. (2005). “The Composite Nature of Early Chinese Texts.” In Text and Ritual in Early China, ed. Martin Kern. 50-78. Seattle: University of Washington Press.
  • Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York.
  • Giles, Lionel. (1948). A Gallery of Chinese Immortals. London: John Murray.
  • Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
  • Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
  • Graham, Angus. [1998 (1986)]. “The Origins of the Legend of Lao Tan.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 23-41. Albany: State University of New York Press.
  • Hansen, Chad. (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
  • Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
  • Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
  • Kamitsuka, Yoshiko. (1998). “Lao-Tzu in Six Dynasties Taoist Sculpture.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 63-89. Albany: State University of New York Press.
  • Kim, Tae Hyun. (2010). “Other Laozi Parallels in the Hanfeizi: An Alternative Approach to the Textual History of the Laozi and Early Chinese Thought.” Sino-Platonic Papers 199 (March 2010), ed. Victor H. Mair. Philadelphia: University of Pennsylvania Press.
  • Kohn, Livia. (2008). “Laojun yinsong jiejing [Classic on Precepts of Lord Lao, Recited to the Melody in the Clouds].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Kohn, Livia. (1998). “The Lao-Tzu Myth.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 41-63. Albany: State University of New York Press.
  • Kohn, Livia. (1996). “Laozi: Ancient Philosopher, Master of Longevity, and Taoist God.” In Religions of China in Practice, ed. Donald S. Lopez, 52-63. Princeton: Princeton University Press.
  • Kohn, Livia and LaFargue, Michael. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
  • Kohn, Livia and Roth, Harold. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
  • Nylan, Michael and Csikszentmihalyi, Mark. (2003). “Constructing Lineages and Inventing Traditions through Exemplary Figures in Early China.” T’oung Pao 89: 1-41.
  • Penny, Benjamin. (2008). “Laojun bashiyi huatu [Eighty-one Transformations of Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Penny, Benjamin. (2008). “Laojun shuo yibai bashi jie [The 180 Precepts Spoken by Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Smith, Kidder. (2003). “Sima Tan and the Invention of Daoism, ‘Legalism,’ et cetera.” The Journal of Asian Studies 62.1: 129-156.
  • Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
  • Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
  • Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.

 

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Daoist Philosophy

Along with Confucianism, “Daoism” (sometimes called “Taoism”) is one of the two great indigenous philosophical traditions of China. As an English term, Daoism corresponds to both Daojia (“Dao family” or “school of the Dao”), an early Han dynasty (c. 100s B.C.E.) term which describes so-called “philosophical” texts and thinkers such as Laozi and Zhuangzi, and Daojiao (“teaching of the Dao”), which describes various so-called “religious” movements dating from the late Han dynasty (c. 100s C.E.) onward. Thus, “Daoism” encompasses thought and practice that sometimes are viewed as “philosophical,” as “religious,” or as a combination of both. While modern scholars, especially those in the West, have been preoccupied with classifying Daoist material as either “philosophical” or “religious,” historically Daoists themselves have been uninterested in such categories and dichotomies. Instead, they have preferred to focus on understanding the nature of reality, increasing their longevity, ordering life morally, practicing rulership, and regulating consciousness and diet. Fundamental Daoist ideas and concerns include wuwei (“effortless action”), ziran (“naturalness”), how to become a shengren (“sage”) or zhenren (“perfected person”), and the ineffable, mysterious Dao (“Way”) itself.

Table of Contents

  1. What is Daoism?
  2. Classical Sources for Our Understanding of Daoism
  3. Is Daoism a Philosophy or a Religion?
  4. The Daodejing
  5. Fundamental Concepts in the Daodejing
  6. The Zhuangzi
  7. Basic Concepts in the Zhuangzi
  8. Daoism and Confucianism
  9. Daoism in the Han
  10. Celestial Masters Daoism
  11. Neo-Daoism
  12. Shangqing and Lingbao Daoist Movements
  13. Tang Daoism
  14. The Three Teachings
  15. The “Destruction” of Daoism
  16. References and Further Reading

1. What is Daoism?

Strictly speaking there was no Daoism before the literati of the Han dynasty (c. 200 B.C.E.) tried to organize the writings and ideas that represented the major intellectual alternatives available. The name daojia, “Dao family” or “school of the dao,” was a creation of the historian Sima Tan (d. 110 B.C.E.) in his Shi ji (Records of the Historian) written in the 2nd century B.C.E. and later completed by his son, Sima Qian (145-86 B.C.E.). In Sima Qian’s classification, the Daoists are listed as one of the Six Schools: Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoists. So, Daoism was a retroactive grouping of ideas and writings which were already at least one to two centuries old, and which may or may not have been ancestral to various post-classical religious movements, all self-identified as daojiao (“teaching of the dao”), beginning with the reception of revelations from the deified Laozi by the Celestial Masters (Tianshi) lineage founder, Zhang Daoling, in 142 C.E. This article privileges the formative influence of early texts, such as the Daodejing and the Zhuangzi, but accepts contemporary Daoists’ assertion of continuity between classical and post-classical, “philosophical” and “religious” movements and texts.

2. Classical Sources for Our Understanding of Daoism

Daoism does not name a tradition constituted by a founding thinker, even though the common belief is that a teacher named Laozi originated the school and wrote its major work, called the Daodejing, also sometimes known as the Laozi. The tradition is also called “Lao-Zhuang” philosophy, referring to what are commonly regarded as its two classical and most influential texts: the Daodejing or Laozi (3rd Cn. B.C.E.) and the Zhuangzi (4th-3rd Cn. B.C.E.). However, various streams of thought and practice were passed along by masters (daoshi) before these texts were finalized. There are two major source issues to be considered when forming a position on the origins of Daoism. 1) What evidence is there for beliefs and practices later associated with the kind of Daoism  recognized by Sima Qian prior to the formation of the two classical texts? 2) What is the best reconstruction of the classical textual tradition upon which later Daoism was based?

With regard to the first question, Isabelle Robinet thinks that the classical texts are only the most lasting evidence of a movement she associates with a set of writings and practices connected to the Songs of Chu (Chuci), and that she identifies as the Chuci movement. This movement reflects a culture in which male and female masters variously called fangshi, daoshi, zhenren, or daoren practiced techniques of longevity and used diet and meditative stillness to create a way of life that attracted disciples and resulted in wisdom teachings. While Robinet’s interpretation is controversial, there are undeniable connections between the Songs of Chu and later Daoist ideas. Some examples include a coincidence of names of immortals (sages), a commitment to the pursuit of physical immortality, a belief in the epistemic value of stillness and quietude, abstinence from grains, breathing and sexual practices used to regulate internal energy (qi), and the use of ritual dances that resemble those still done by Daoist masters (the step of Yu).

In addition to the controversial connection to the Songs of Chu, the Guanzi (350-250 B.C.E.) is a text older than both the Daodejing and probably all of the Zhuangzi, except the “inner chapters” (see below). The Guanzi  is a very important work of 76 “chapters.” Three of the chapters of the Guanzi are called the Neiye, a title which can mean “inner cultivation.” The self-cultivation practices and teachings put forward in this material may be fruitfully linked to several other important works: the Daodejing; the Zhuangzi; a Han dynasty Daoist work called the Huainanzi; and an early commentary on the Daodejing called the Xiang’er. Indeed, there is a strong meditative trend in the Daoism of late imperial China known as the “inner alchemy” tradition and the views of the Neiye seem to be in the background of this movement. Two other chapters of the Guanzi are called Xin shu (Heart-mind book). The Xin shu connects the ideas of quietude and stillness found in both the Daodejing and Zhuangzi to longevity practices. The idea of dao in these chapters is very much like that of the classical works. Its image of the sage resembles that of the Zhuangzi. It uses the same term (zheng) that Zhuangzi uses for the corrections a sage must make in his body, the pacification of the heart-mind, and the concentration and control of internal energy (qi). These practices are called “holding onto the One,” “keeping the One,” “obtaining the One,” all of which are phrases also associated with the Daodejing (chs. 10, 22, 39).

The Songs of Chu and Guanzi still represent texts which are themselves creations of actual practitioners of Daoist teachings and sentiments, just as do the Daodejing and Zhuangzi. Who these persons were we do not know with certainty. It is possible that we do have the names, remarks, and practices of some of these individuals (daoshi) embodied in the passages of the Zhuangzi. (For example, in Chs. 1-7 alone: Xu You, Ch. 1; Lianshu, Ch. 1; Ziqi, Ch. 2; Wang Ni, Ch. 2; Changwuzi, Ch. 2; Qu Boyu, Ch. 4; Carpenter Shi, Ch. 4; Bohun Wuren, Ch. 5; Nu Y, Ch. 6; Sizi, Yuzi, Lizi, Laizi, Ch. 6; Zi Sanghu, Meng Zifan, Zi Qinzan, Ch. 6; Yuzi and Sangzi, Ch. 6; Wang Ni and Putizi, Ch. 7; Jie Yu, Ch. 7; Lao Dan, Ch. 7; Huzi, Ch. 7.)

As for a reasonable reconstruction of the textual tradition upon which Daoism is based, we should not think of this task as simply determining the relationship between the Daodejing and the Zhuangzi, such as which text was first and which came later. These texts are composite. The Zhuangzi, for example, repeats in very similar form sayings and ideas found in the Daodejing, especially in the essay composing Zhuangzi Chs. 8-10. However, we are not certain whether this means that whoever was the source of this material in the Zhuangzi knew the Daodejing and quoted it, or if they both drew from a common source, or even if the Daodejing in some way depended on the Zhuangzi. In fact, one theory about the legendary figure Laozi is that he was created first in the Zhuangzi and later became associated with the Daodejing. There are seventeen passages in which Laozi (a.k.a. Lao Dan) plays a role in the Zhuangzi, and he is not mentioned by name in the Daodejing.

Based on what we know now, we could offer the following summary of the sources of early Daoism. Stage One: Zhuang Zhou’s “inner chapters” (chs. 1-7) of the Zhuangzi (c. 350 B.C.E.) and some components of the Guanzi, including perhaps both the Neiye and the Xin shu. Stage Two: The essay in Chs. 8-10 of the Zhuangzi and some collections of material which represent versions of our final redaction of the Daodejing, as well as Chs. 17-28 of the Zhuangzi representing materials likely gathered by Zhuang Zhou’s disciples. Stage Three: the “Yellow Emperor” (Huang-Lao) manuscripts from Mawangdui, the Yellow Emperor portions of the Zhuangzi (Chs. 11-19, and 22), and the text known as the Huainanzi (c. 139 B.C.E.).

3. Is Daoism a Philosophy or a Religion?

In the late 1970s Western and comparative philosophers began to point out that an important dimension of the historical context of Daoism was being overlooked because the previous generation of scholars had ignored or even disparaged connections between the classical texts and Daoist religious belief and practice not previously thought to have developed until the 2nd century C.E. We have to lay some of the responsibility for a prejudice against Daoism as a religion and the privileging of its earliest forms as a pure philosophy at the feet of the eminent translators and philosophers Wing-Tsit Chan and James Legge, who both spoke of Daoist religion as a degeneration of a pristine Daoist philosophy arising from the time of the Celestial Masters (see below) in the late Han period. Chan and Legge were instrumental architects in the West of the view that Daoist philosophy (daojia) and Daoist religion (daojiao) are entirely different traditions.

Actually, our interest in trying to separate philosophy and religion in Daoism is more revealing of the Western frame of reference we use than of Daoism itself. Daoist ideas fermented among master teachers who had a holistic view of life. These daoshi (Daoist masters) did not compartmentalize practices by which they sought to influence the forces of reality, increase their longevity, have interaction with realities not apparent to our normal way of seeing things, and order life morally and by rulership. They offered insights we might call philosophical aphorisms. But they also practiced meditative stillness and emptiness to gain knowledge, engaged in physical exercises to increase the flow of inner energy (qi), studied nature for diet and remedy to foster longevity, practiced rituals related to their view that reality had many layers and forms with whom/which humans could interact, wrote talismans and practiced divination, engaged in spellbinding of “ghosts,” led small communities, and advised rulers on all these subjects. The masters transmitted their teachings, some of them only to disciples and adepts, but gradually these teachings became more widely available, as is evidenced in the very creation of the Daodejing and Zhuangzi themselves.

The anti-supernaturalist and anti-dualist agendas that provoked Westerners to separate philosophy and religion, dating at least to the classical Greek period of philosophy, were not part of the preoccupation of Daoists. Accordingly, the question whether Daoism is a philosophy or a religion is not one we can ask without imposing a set of understandings, presuppositions, and qualifications that do not apply to Daoism. But the hybrid nature of Daoism is not a reason to discount the importance of Daoist thought. Quite to the contrary, it may be one of the most significant ideas classical Daoism can contribute to the study of philosophy in the present age.

4. The Daodejing

The Daodejing (hereafter, DDJ) is divided into 81 “chapters” consisting of slightly over 5,000 Chinese characters, depending on which text is used. In its received form from Wang Bi (see below), the two major divisions of the text are the dao jing (chs. 1-37) and the de jing (chs. 38-81). Actually, this division probably rests on little else than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). The text is a collection of short aphorisms that were not arranged to develop any systematic argument. The long-standing tradition about the authorship of the text is that the “founder” of Daoism, known as Laozi, gave it to Yin Xi, the guardian of the pass through the mountains that he used to go from China to the West (i.e., India) at some unknown date in the distant past. But the text is actually a composite of collected materials, most of which probably originally circulated orally, perhaps even in single aphorisms or small collections. These were then redacted as someone might string pearls into a necklace. Although D.C. Lau and Michael LaFargue have made preliminary literary and redaction critical studies of the texts, these are still insufficient to generate any consensus about whether the text was composed using smaller written collections or who were the probable editors.

For almost 2,000 years, the Chinese text used by commentators in China and upon which all except the most recent Western language translations were based has been called the Wang Bi, after the commentator who used a complete edition of the DDJ sometime between 226 and 249 C.E. Although Wang Bi was not a Daoist, his commentary became a standard interpretive guide, and generally speaking even today scholars depart from it only when they can make a compelling argument for doing so. Based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s we are certain that there were several simultaneously circulating versions of the Daodejing text as early as c. 300 B.C.E.

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries consist of two incomplete editions of the DDJ on silk scrolls (boshu) now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. Some word choice divergencies are present. The order of the chapters is reversed, with 38-81 in the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching (1989). Contemporary scholarship associates the Mawangdui versions with a type of Daoism known as the Way of the Yellow Emperor and the Old Master (Huanglao Dao).

The Guodian find consists of 730 inscribed bamboo slips found near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ and corresponding to Chapters 1-66. It may date as early as c. 300 B.C.E. If this is a correct date, then the Daodejing was already extant in a written form when the “inner chapters” (see below) of the Zhuangzi were composed. These slips contain more significant variants from the Wang Bi than do the Mawangdui versions. A complete translation and study of the Guodian cache has been published by Scott Cook (2013).

5. Fundamental Concepts in the Daodejing

The term Dao means a road, and is often translated as “the Way.” This is because sometimes dao is used as a nominative (that is, “the dao”) and other times as a verb (i.e. daoing). Dao is the process of reality itself, the way things come together, while still transforming. All this reflects the deep seated Chinese belief that change is the most basic character of things. In the Yi jing (Classic of Change) the patterns of this change are symbolized by figures standing for 64 relations of correlative forces and known as the hexagrams. Dao is the alteration of these forces, most often simply stated as yin and yang. The Xici is a commentary on the Yi jing formed in about the same period as the DDJ. It takes the taiji (Great Ultimate) as the source of correlative change and associates it with the dao. The contrast is not between what things are or that something is or is not, but between chaos (hundun) and the way reality is ordering (de). Yet, reality is not ordering into one unified whole. It is the 10,000 things (wanwu). There is the dao but not “the World” or “the cosmos” in a Western sense.

The Daodejing teaches that humans cannot fathom the Dao, because any name we give to it cannot capture it. It is beyond what we can express in language (ch. 1). Those who experience oneness with dao, known as “obtaining dao,” will be enabled to wu-wei. Wu-wei is a difficult notion to translate. Yet, it is generally agreed that the traditional rendering of it as “nonaction” or “no action” is incorrect. Those who wu-wei do act. Daoism is not a philosophy of “doing nothing.” Wu-wei means something like “act naturally,” “effortless action,” or “nonwillful action.” The point is that there is no need for human tampering with the flow of reality. Wu-wei should be our way of life, because the dao always benefits, it does not harm (ch. 81). The way of heaven (dao of tian) is always on the side of good (ch. 79) and virtue (de) comes forth from the dao alone (ch. 21). What causes this natural embedding of good and benefit in the dao is vague and elusive (ch. 35), not even the sages understand it (ch. 76). But the world is a reality that is filled with spiritual force, just as a sacred image used in religious ritual might be inhabited by numinal power (ch. 29). The dao occupies the place in reality that is analogous to the part of a family’s house set aside for the altar for venerating the ancestors and gods (the ao of the house, ch. 62). When we think that life’s occurrences seem unfair (a human discrimination), we should remember that heaven’s (tian) net misses nothing, it leaves nothing undone (ch. 37).

A central theme of the Daodejing is that correlatives are the expressions of the movement of dao. Correlatives in Chinese philosophy are not opposites, mutually excluding each other. They represent the ebb and flow of the forces of reality: yin/yang, male/female; excess/defect; leading/following; active/passive. As one approaches the fullness of yin, yang begins to horizon and emerge and vice versa. Its teachings on correlation often suggest to interpreters that the DDJ is filled with paradoxes. For example, ch. 22 says, “Those who are crooked will be perfected. Those who are bent will be straight. Those who are empty will be full.” While these appear paradoxical, they are probably better understood as correlational in meaning. The DDJ says, “straightforward words seem paradoxical,” implying, however, that they are not (ch. 78).

What is the image of the ideal person, the sage (sheng ren), or the perfected person (zhen ren) in the DDJ? Well, sages wu-wei (chs. 2, 63). They act effortlessly and spontaneously as one with dao and in so doing, they “virtue” (de) without deliberation or volitional challenge. In this respect, they are like newborn infants, who move naturally, without planning and reliance on the structures given to them by culture and society (ch. 15). The DDJ tells us that sages empty themselves, becoming void of the discriminations used in conventional language and culture. Sages concentrate their internal energies (qi). They clean their vision (ch. 10). They manifest naturalness and plainness, becoming like uncarved wood (pu) (ch. 19). They live naturally and free from desires rooted in the discriminations that human society makes (ch. 37). They settle themselves and know how to be content (ch. 46). The DDJ makes use of some very famous analogies to drive home its point. Sages know the value of emptiness as illustrated by how emptiness is used in a bowl, door, window, valley or canyon (ch. 11). They preserve the female (yin), meaning that they know how to be receptive to dao and its power (de) and are not unbalanced favoring assertion and action (yang) (ch. 28). They shoulder yin and embrace yang, blend internal energies (qi) and thereby attain harmony (he) (ch. 42). Those following the dao do not strive, tamper, or seek to control their own lives (ch. 64). They do not endeavor to help life along (ch. 55), or use their heart-mind (xin) to “solve” or “figure out” life’s apparent knots and entanglements (ch. 55). Indeed, the DDJ cautions that those who would try to do something with the world will fail, they will actually ruin both themselves and the world (ch. 29). Sages do not engage in disputes and arguing, or try to prove their point (chs. 22, 81). They are pliable and supple, not rigid and resistive (chs. 76, 78). They are like water (ch. 8), finding their own place, overcoming the hard and strong by suppleness (ch. 36). Sages act with no expectation of reward (chs. 2, 51). They put themselves last and yet come first (ch. 7). They never make a display of themselves (chs. 72, 22). They do not brag or boast (chs. 22, 24), and they do not linger after their work is done (ch. 77). They leave no trace (ch. 27). Because they embody dao in practice, they have longevity (ch. 16). They create peace (ch. 32). Creatures do not harm them (chs. 50, 55). Soldiers do not kill them (ch. 50). Heaven (tian) protects the sage and the sage’s spirit becomes invincible (ch. 67).

Among the most controversial of the teachings in the DDJ are those directly associated with rulers. Recent scholarship is moving toward a consensus that the persons who developed and collected the teachings of the DDJ played some role in advising civil administration, but they may also have been practitioners of ritual arts and what we would call religious rites. Be that as it may, many of the aphorisms directed toward rulers in the DDJ seem puzzling at first sight. According to the DDJ, the proper ruler keeps the people without knowledge, (ch. 65), fills their bellies, opens their hearts and empties them of desires (ch. 3). A sagely ruler reduces the size of the state and keeps the population small. Even though the ruler possesses weapons, they are not used (ch. 80). The ruler does not seek prominence. The ruler is a shadowy presence, never standing out (chs. 17, 66). When the ruler’s work is done, the people say they are content (ch. 17). This picture of rulership in the DDJ is all the more interesting when we remember that the philosopher and legalist political theorist named Han Feizi used the DDJ as a guide for the unification of China. Han Feizi was the foremost counselor of the first emperor of China, Qin Shihuangdi (r. 221-206 B.C.E.). However, it is a pity that the emperor used the DDJ’s admonitions to “fill the bellies and empty the minds” of the people to justify his program of destroying all books not related to medicine, astronomy or agriculture. When the DDJ says that rulers keep the people without knowledge, it probably means that they do not encourage human knowledge as the highest form of knowing but rather they encourage the people to “obtain oneness with the dao.”

6. The Zhuangzi

The second of the two most important classical texts of Daoism is the Zhuangzi. This text is a collection of stories and of remembered as well as imaginary conversations. The text is well known for its creativity and skillful use of language. Within the text we find longer and shorter treatises, stories, poetry, and aphorisms. The Zhuangzi may date as early as the 4th century B.C.E., and according to imperial bibliographies of a later date, it originally had 52 “chapters.” These were reduced to 33 by Guo Xiang in the 3rd century C.E., although he seems to have had the 52-chapter text available to him. Ronnie Littlejohn has argued that the later work Liezi may contain some passages from the so-called “Lost Zhuangzi” 52-chapter version. Unlike the Daodejing, which is ascribed to the mythological Laozi, the Zhuangzi may actually contain materials from a teacher known as Zhuang Zhou, who lived from about 370 to 300 B.C.E. Chapters 1-7 are those most often ascribed to Zhuangzi himself (a title meaning “Master Zhuang”), and these are known as the “Inner Chapters.” The remaining 26 chapters had other origins, and they sometimes take different points of view from the Inner Chapters. Although there are several versions of how the remainder of the Zhuangzi may be divided, one that is gaining currency is Chs. 1-7 (Inner Chapters), Chs. 8-10 (the “Daode” essay), Chs. 11-16 and parts of 18, 19, and 22 (Yellow Emperor Chapters), and Chs. 17-28 (Zhuang Zhou’s Disciples’ material), with the remains of the text attributable to the final redactor.

7. Basic Concepts in the Zhuangzi

Zhuangzi taught that a set of practices, including meditative stillness, helped one achieve unity with the dao and become a “perfected person” (zhenren). The way to this state is not the result of a withdrawal from life. However, it does require disengaging or emptying oneself of conventional values and the demarcations made by society. In Chapter 23 of the Zhuangzi, Nanrong Chu, inquiring of the character Laozi about the solution to his life’s worries, was answered promptly: “Why did you come with all this crowd of people?” The man looked around and confirmed he was standing alone, but Laozi meant that his problems were the result of all the baggage of ideas and conventional opinions he lugged about with him. This baggage must be discarded before anyone can be zhenren, move in wu-wei and express profound virtue (de).

Like the DDJ, Zhuangzi also valorizes wu-wei, especially in the Inner Chapters, the Yellow Emperor sections on rulership, and the Zhuangzi disciples’ materials in Ch. 19. For its examples of such living the Zhuangzi turns to analogies of craftsmen, athletes (swimmers), ferrymen, cicada-catching men, woodcarvers, and even butchers. One of the most famous stories in the text is that of Ding the Butcher, who learned what it means to wu wei through the perfection of his craft. When asked about his great skill, Ding says, “What I care about is dao, which goes beyond skill. When I first began cutting up oxen, all I could see was the ox itself. After three years I no longer saw the whole ox. And now—now I go at it by spirit and don’t look with my eyes. Perception and understanding have come to a stop and spirit moves where it wants. I go along with the natural makeup, strike in the big hollows, guide the knife through the big openings, and follow things as they are. So I never touch the smallest ligament or tendon, much less a main joint. A good cook changes his knife once a year—because he cuts. A mediocre cook changes his knife once a month—because he hacks. I’ve had this knife of mine for nineteen years and I’ve cut up thousands of oxen with it, and yet the blade is as good as though it had just come from the grindstone. There are spaces between the joints, and the blade of the knife has really no thickness….[I] move the knife with the greatest subtlety, until—flop! The whole thing comes apart like a clod of earth crumbling to the ground.” (Ch. 3, The Secret of Caring for Life)  The recurring point of all of the stories in Zhuangzi about wu-wei is that such spontaneous and effortless conduct as displayed by these many examples has the same feel as acting in wu-wei.  The point is not that wu-wei results from skill development.  Wu-wei is not a cultivated skill. It is a gift of oneness with dao.  
The Zhuangzi’s teachings on wu-wei are closely related to the text’s consistent rejection of the use of reason and argument as means to dao (chs. 2, 12, 17, 19).

Persons who exemplify such understanding are called sages, zhenren, and immortals. Zhuangzi describes the Daoist sage in such a way as to suggest that such a person possesses extraordinary powers. Just as the DDJ said that creatures do not harm the sages, the Zhuangzi also has a passage teaching that the zhenren exhibits wondrous powers, frees people from illness and is able to make the harvest plentiful (ch. 1). Zhenren are “spirit-like” (shen yi), cannot be burned by fire, do not feel cold in the freezing forests, and life and death have no effect on them (ch. 2). Just how we should take such remarks is not without controversy. To be sure, many Daoists in history took them literally, and an entire tradition of the transcendents or immortals (xian) was collected in text and lore.

Zhuangzi is drawing on a set of beliefs about master teachers that were probably regarded as literal by many, although some think he meant these to be taken metaphorically. For example, when Zhuangzi says that the sage cannot be harmed or made to suffer by anything that life presents, does he mean this to be taken as saying that the zhenren is physically invincible? Or, does he mean that the sage has so freed himself from all conventional understandings that he refuses to recognize poverty as any more or less desirable than affluence, to recognize blindness as worse than sight, to recognize death as any less desirable than life? As the Zhuangzi says in Chapter One, Free and Easy Wandering, “There is nothing that can harm this man.” This is also the theme of Chapter Two, On Making All Things Equal. In this chapter people are urged to “make all things one,” meaning that they should recognize that reality is one. It is a human judgment that what happens is beautiful or ugly, right or wrong, fortunate or not. The sage knows all things are one (equal) and does not judge. Our lives are snarled and jumbled so long as we make conventional discriminations, but when we set them aside, we appear to others as extraordinary and enchanted.

An important theme in the Zhuangzi is the use of immortals to illustrate various points. Did Zhuangzi believe some persons physically lived forever? Well, many Daoists did believe this. Did Zhuangzi believe that our substance was eternal and only our form changed? Almost certainly Zhuangzi thought that we were in a constant state of process, changing from one form into another (see the exchange between Master Lai and Master Li in Ch. 6, The Great and Venerable Teacher). In Daoism, immortality is the result of what may be described as a wu xing transformation. Wu xing means “five phases” and it refers to the Chinese understanding of reality according to which all things are in some state of combined correlation of qi as wood, fire, water, metal, and earth. This was not exclusively a “Daoist” physics. It underlay all Chinese “science” of the classical period, although Daoists certainly made use of it. Zhuangzi wants to teach us how to engage in transformation through stillness, breathing, and experience of numinal power (see ch. 6). And yet, perhaps Zhuangzi’s teachings on immortality mean that the person who is free of discrimination makes no difference between life and death. In the words of Lady Li in Ch. 2, “How do I know that the dead do not wonder why they ever longed for life?”

Huangdi (the Yellow Emperor) is the most prominent immortal mentioned in the text of the Zhuangzi, and he is a main character in the sections of the book called “the Yellow Emperor Chapters” noted above. He has long been venerated in Chinese history as a cultural exemplar and the inventor of civilized human life. Daoism is filled with other accounts designed to show that those who learn to live according to the dao have long lives. Pengzu, one of the characters in the Zhuangzi, is said to have lived eight hundred years. The most prominent female immortal is Xiwangmu (Queen Mother of the West), who was believed to reign over the sacred and mysterious Mount Kunlun.

The passages containing stories of the Yellow Emperor in Zhuangzi provide a window into the views of rulership in the text. On the one hand, the Inner Chapters (chs. 1-7) reject the role of ruler as a viable vocation for a zhenren and consistently criticize the futility of government and politics (ch. 7). On the other hand, the Yellow Emperor materials in Chs. 11-13 present rulership as valuable, so long as the ruler acts by wu-wei. This second position is also that taken in the work entitled the Huainanzi (see below).

The Daoists did not think of immortality as a gift from a god, or an achievement in the religious sense commonly thought of in the West. It was a result of finding harmony with the dao, expressed through wisdom, meditation, and wu-wei. Persons who had such knowledge were reputed to live in the mountains; thus the character for xian (immortal) is made up of two components, the one being shan “mountain” and the other being ren “person.” Undoubtedly, some removal to the mountains was a part of the journey to becoming a zhenren “true person.” Because Daoists believed that nature and our own bodies were correlations of each other, they even imagined their bodies as mountains inhabited by immortals. The struggle to wu-wei was an effort to become immortal, to be born anew, to grow the embryo of immortality inside. The disciplines of Daoism included imitation of the animals of nature, because they were thought to act without the intention and willfulness that characterized human decision making. Physical exercises included animal dances (wu qin xi) and movements designed to enable the unrestricted flow of the cosmic life force from which all things are made (qi). These movements designed to channel the flow of qi became associated with what came to be called taiji or qigong. Daoists practiced breathing exercises, used herbs and other pharmacological substances, and employed an instruction booklet for sexual positions and intercourse, all designed to enhance the flow of qi energy. They even practiced external alchemy, using burners to modify the composition of cinnabar into mercury and making potions to drink and pills to ingest for the purpose of adding longevity. Many Daoist practitioners died as a result of these alchemical substances, and even a few emperors who followed their instructions lost their lives as well, Qin Shihuangdi being the most famous.

The attitude and practices necessary to the pursuit of immortality made this life all the more significant. Butcher Ding is a master butcher because his qi is in harmony with the dao. Daoist practices were meant for everyone, regardless of their origin, gender, social position, or wealth. However, Daoism was a complete philosophy of life and not an easy way to learn.

When superior persons learn the Dao, they practice it with zest.

When average persons learn of the Dao, they are indifferent.

When petty persons learn of the Dao, they laugh loudly.

If they did not laugh, it would not be worthy of being the Dao. (DDJ, 41)

8. Daoism and Confucianism

Arguably, Daoism shared some emphases with classical Confucianism such as a this-worldly concern for the concrete details of life rather than speculation about abstractions and ideals. Nevertheless, it largely represented an alternative and critical tradition divergent from that of Confucius and his followers. While many of these criticisms are subtle, some seem very clear.

One of the most fundamental teachings of the DDJ is that human discriminations, such as those made in law, morality (good, bad) and aesthetics (beautiful, ugly), actually create the troubles and problems humans experience; they do not solve them (ch. 3a). The clear implication is that the person following the dao must cease ordering his life according to human-made distinctions (ch. 19). Indeed, it is only when the dao recedes in its influence that these demarcations emerge (chs. 18, 38), because they are a form of disease (ch. 74). In contrast, Daoists believe that the dao is untangling the knots of life, blunting the sharp edges of relationships and problems, and turning down the light on painful occurrences (ch. 4). So, it is best to practice wu-wei in all endeavors, to act naturally and not willfully try to oppose or tamper with how reality is moving or try to control it by human discriminations.

Confucius and his followers wanted to change the world and be proactive in setting things straight. They wanted to tamper, orchestrate, plan, educate, develop, and propose solutions. Daoists, on the other hand, take their hands off of life, whereas Confucians want their fingerprints on everything. Imagine this comparison. If the Daoist goal is to become like a piece of unhewn and natural wood, the goal of the Confucians is to become a carved sculpture. The Daoists put the piece before us just as it is found in its naturalness, and the Confucians polish it, shape it, and decorate it. This line of criticism is made very explicitly in the essay which makes up Zhuangzi Chs. 8-10.

Confucians think they can engineer reality, understand it, name it, control it. But the Daoists think that such endeavors are the source of our frustration and fragmentation (DDJ, chs. 57, 72). They believe the Confucians create a gulf between humans and nature that weakens and destroys us. Indeed, as far as the Daoists are concerned, the Confucian project is like a cancer that saps our very life. This is a fundamental difference in how these two great philosophical traditions think persons should approach life, and as shown above it is a consistent difference found also between the Zhuangzi and Confucianism.

The Yellow Emperor sections of the Zhuangzi in Chs. 12, 13 and 14 contain five text blocks in which Laozi is portrayed in dialogue with Confucius and pictured as Confucius’ master and teacher. These materials provide direct access to the Daoist criticism of the Confucian project.

9. Daoism in the Han

The teachings that were later called Daoism were closely associated with a stream of thought called Huanglao Dao (Yellow Emperor-Laozi Dao) in the 3rd and 2nd centuries B.C.E. The thought world transmitted in this stream is what Sima Tan meant by Daojia. The Huanglao school is best understood as a lineage of Daoist practitioners mostly residing in the state of Qi (modern Shandong area). Huangdi was the name for the Yellow Emperor, from whom the rulers of Qi said they were descended. When Emperor Wu, the sixth sovereign of the Han dynasty (r. 140-87 B.C.E.), elevated Confucianism to the status of the official state ideology and training in it became mandatory for all bureaucratic officials, the tension with Daoism became more evident. And yet, at court, people still sought longevity and looked to Daoist masters for the secrets necessary for achieving it. Wu continued to engage in many Daoist practices, including the use of alchemy, climbing sacred Taishan (Mt. Tai), and presenting talismanic petitions to heaven. Liu An, the Prince of Huainan and a nephew of Wu, is associated with the production of the work called the Masters of Huainan (Huainanzi, 180-122 B.C.E.). This is a highly synthetic work formed at what is known as the Huainan academy and greatly influenced by Yellow Emperor Daoism. John Major and a team of translators published the first complete English version of this text (2010). The text was an attempt to merge cosmology, Confucian ideals, and a political theory using “quotes” attributed to the Yellow Emperor, although the statements actually parallel closely the Daodejing and the Zhuangzi. All this is of added significance because in the later Han work Laozi bianhua jing (Book of the Transformations of Laozi), the Chinese physics according to which persons and objects change forms was employed in order to identify Laozi with the Yellow Emperor.

10. Celestial Masters Daoism

Even though Emperor Wu forced Daoist practitioners from court, Daoist teachings found a fertile ground in which to grow in the environment of discontent with the policies of the Han rulers and bureaucrats. Popular uprisings sprouted. The Yellow Turban movement tried to overthrow Han imperial authority in the name of the Yellow Emperor and promised to establish the Way of Great Peace (Tai ping). Indeed, the basic moral and philosophical text that provided the intellectual justification of this movement was the Classic of Great Peace (Taiping jing), provided in an English version by Barbara Hendrischke. The present version of this work in the Daoist canon is a later and altered iteration of the original text dating about 166 CE and attributed to transnormal revelations experienced by Zhang Jiao.

Easily the most important of the Daoist trends at the end of the Han period was the wudou mi dao (Way of Five Bushels of Rice) movement, best known as the Way of the Celestial Masters (tianshi dao). This movement is traceable to a Daoist hermit named Zhang Ling, also known as Zhang Daoling, who resided on a mountain near modern Chengdu in Sichuan. According to an account in Ge Hong’s Biographies of Spirit Immortals, Laozi appeared to Zhang (c. 142 CE) and gave him a commission to announce the imminent end of the world and the coming age of Great Peace (taiping). The revelation said that those who followed Zhang would become part of the Orthodox One Covenant with the Powers of the Universe (Zhengyi meng wei). Zhang began the movement that culminated in a Celestial Master state. The administrators of this state were called libationers (ji jiu), because they performed religious rites as well as political duties. They taught that personal illness and civil mishap were owing to the mismanagement of the forces of the body and nature. The libationers taught a strict form of morality and displayed registers of numinal powers they could access and control. Libationers were moral investigators, standing in for a greater celestial bureaucracy. The Celestial Master state developed against the background of the decline of the later Han dynasty. Indeed, when the empire finally decayed, the Celestial Master government was the only order in much of southern China.

When the Wei dynastic rulers became uncomfortable with the Celestial Masters’ power, they broke up the power centers of the movement. But this backfired because it actually served to disperse Celestial Masters followers throughout China. Many of the refugees settled near Xi’an in and around the site of Louguan tai. The movement remained strong because its leaders had assembled a canon of texts, the Statutory Texts of the One and Orthodox (Zhengyi fawen). This group of writings included philosophical, political, and ritual texts. It became a fundamental part of the later authorized Daoist canon.

11. Neo-Daoism

The resurgence of Daoism after the Han dynasty is often known as Neo-Daoism. As a result, Confucian scholars sought to annotate and reinterpret their own classical texts to move them toward greater compatibility with Daoism, and they even wrote commentaries on Daoist works. A new type of Confucianism known simply as the Way of Mysterious Learning (Xuanxue) emerged. It is represented by a set of scholars, including some of the most prominent thinkers of the period: Wang Bi (226-249), He Yan (d. 249), Xiang Xiu (223?-300), Guo Xiang (d. 312) and Pei Wei (267-300). In general, these scholars share an effort to reinterpret the social and moral understanding of Confucianism in ways that make it more compatible with Daoist philosophy. In fact, the extent to which Daoist influence is evident in the texts of these writers has led some scholars to call this movement ‘Neo-Daoism.’ Wang Bi and Guo Xiang, who wrote commentaries on the Daodejing and the Zhuangzi respectively, were the most important voices in this development. Traditionally, the famous “Seven Sages of the Bamboo Grove” (Zhulin qixian) have also been associated with the new Daoist way of life that expressed itself in culture and not merely in mountain retreats. These thinkers included landscape painters, calligraphers, poets, and musicians.

Among the philosophers of this period, the great representative of Daoism in southern China was Ge Hong (283-343 CE). He practiced not only philosophical reflection, but also external alchemy, manipulating mineral substances such as mercury and cinnabar in an effort to gain immortality. His work the Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is the most important Daoist philosophical work of this period. For him, longevity and immortality are not the same; the former is only the first step to the latter.

12. Shangqing and Lingbao Daoist Movements

After the invasion of China by nomads from Central Asia, Daoists of the Celestial Master tradition who had been living in the north were forced to migrate into southern China, where Ge Hong’s version of Daoism was strong. The mixture of these two traditions is represented in the writings of the Xu family. The Xu family was an aristocratic group from what is today the city of Nanjing. Seeking Daoist philosophical wisdom and the long life it promised, many of them moved to Maoshan (Mount Mao), near the city. There they claimed to receive revelations from immortals, who dictated new wisdom and morality texts to them. Yang Xi was the most prominent medium to receive the Maoshan revelations (360-370 CE). These revelations came from spirits who had been local heroes, the Mao brothers, but who had been transformed into deities. Yang Xi’s writings formed the basis for Highest Purity (Shangqing) Daoism. The writings were extraordinarily well done, and even the calligraphy in which they were written was beautiful.

The importance of these texts philosophically speaking is to be found in their idealization of the quest for immortality and transference of the material practices of the alchemical science of Ge Hong into a form of reflective meditation. In fact, the Shangqing school of Daoism is the beginning of the tradition known as “inner alchemy” (neidan), an individual mystical pursuit of wisdom.

Some thirty years after the Maoshan revelations, a descendant of Ge Hong named Ge Chaofu went into a mediumistic trance and authored a set of texts called the Numinous Treasure (Lingbao) teachings. These works were ritual recitation texts similar to Buddhist sutras, and indeed they borrowed heavily from Buddhism. At first, the Shangqing and Lingbao texts belonged to the general stream of the Celestial Masters and were not considered separate sects or movements within Daoism, although later lineages of masters emphasized the uniqueness of their teachings.

13. Tang Daoism

As the Lingbao texts illustrate, Daoism acted as a receiving structure for Buddhism. Many early translators of Buddhist texts used Daoist terms to render Indian ideas. Some Buddhists saw Laozi as an avatar of Shakyamuni (the Buddha), and some Daoists understood Shakyamuni as a manifestation of the dao, which also means he was a manifestation of Laozi. An often-made generalization is that Buddhism held north China in the 4th and 5th centuries, and Daoism the south. But gradually this intellectual current actually reversed. Daoism grew in scope and impact throughout China.

By the time of the Tang dynasty (618-906 CE) Daoism was the intellectual philosophy that underwrote the national understanding. The imperial family claimed to descend from Li (by lore, the family of Laozi). Laozi was venerated by royal decree. Officials received Daoist initiation as Masters of its philosophy, rituals, and practices. A major center for Daoist studies was created at Dragon and Tiger Mountain (longhu shan), chosen both for its feng shui and because of its strategic location at the intersection of numerous southern China trade routes. The Celestial Masters who held leadership at Dragon and Tiger Mountain were later called “Daoist popes” by Christian missionaries because they had considerable political power.

In aesthetics, two great Daoist intellectuals worked during the Tang. Wu Daozi developed the rules for Daoist painting and Li Bai became its most famous poet. Interestingly, Daoist alchemists invented gunpowder during the Tang. The earliest block-print book on a scientific subject is a Daoist work entitled Xuanjie lu (850 CE). As Buddhism gradually grew stronger during the Tang, Daoist and Confucian intellectuals sought to initiate a conversation with it. The Buddhism that resulted was a reformed version known as Chan (Zen in Japan).

14. The Three Teachings

During the Five Dynasties (907-960 CE) and Song periods (960-1279 CE) Confucianism enjoyed a resurgence, and Daoists found their place by teaching that principal thinkers of their tradition were Confucian scholars as well. Most notable among these was Lu Dongbin, a legendary Daoist immortal whom many believed to have been originally a Confucian teacher.

Daoism became a complete philosophy of life, reaching into religion, social action, and individual health and physical well-being. A huge network of Daoist temples known by the name Dongyue Miao (also called tianqing guan) was created throughout the empire, with a miao in virtually every town of any size. The Daoist masters who served these temples were often appointed as government officials. They also gave medical, moral, and philosophical advice, and led religious rituals, dedicated especially to the Lord of the Sacred Mountain of the East named Taishan. Daoist masters had wide authority. All this was obvious in the temple iconography. Taishan was represented as the emperor, the City God (cheng huang) was a high official, and the Earth God was portrayed as a prosperous peasant. Daoism of this period integrated the Three Teachings (sanjiao) of China: Confucianism, Buddhism, and Daoism. This process of synthesis continued throughout the Song and into the period of the Ming Dynasty.

Such a wide dispersal of Daoist thought and practice, taken together with its interest in merging Confucianism and Buddhism, eventually created a fragmented ideology. Into this confusion came Wang Zhe (1113-1170 CE), the founder of Quanzhen (Complete Perfection) Daoism. It was Wang’s goal to bring the three teachings into a single great synthesis. For the first time, Daoist teachers adopted monastic forms of life, created monasteries, and organized themselves in ways they saw in Buddhism. This version of Daoist thought interpreted the classical texts of the DDJ and the Zhuangzi to call for a rejection of the body and material world. The Quanzhen order became powerful as the main partner of the Mongols (Yuan dynasty), who gave their patronage to its expansion. Less frequently, the Mongol emperors favored the Celestial Masters and their leader at Dragon and Tiger Mountain in an effort to undermine the power of the Quanzhen leaders. For example, the Zhengyi (Celestial Master) master of Beijing in the 1220s was Zhang Liusun. Under patronage he was allowed to build a Dongyue Miao in the city in 1223 and make it the unofficial town hall of the capital. But by the time of Khubilai Khan (r. 1260-1294) the Buddhists were used against all Daoists. The Khan ordered all Daoist books except the DDJ to be destroyed in 1281, and he closed the Quanzhen monastery in the city known as White Cloud Monastery (Baiyun Guan).

When the Ming dynasty (1368-1644) emerged, the Mongols were expelled, and Chinese rule was restored. The emperors sponsored the creation of the first complete Daoist Canon (Daozang), which was edited between 1408 and 1445. This was an eclectic collection, including many Buddhist and Confucian related texts. Daoist influence reached its zenith.

15. The “Destruction” of Daoism

The Manchurian tribes that became rulers of China in 1644 and founded the Qing dynasty were already under the influence of conservative Confucian exiles. They stripped the Celestial Master of Dragon Tiger Mountain of his power at court. Only Quanzhen was tolerated. White Cloud Monastery (Baiyun Guan) was reopened, and a new lineage of thinkers was organized. They called themselves the Dragon Gate lineage (Longmen pai). In the 1780s, Western traders arrived, and so did Christian missionaries. In 1849, the Hakka people of Guangxi province, among China’s poorest citizens, rose in revolt. They followed Hong Xiuquan, who claimed to be Jesus’ younger brother. This millennial movement, built on a strange version of Chinese Christianity, sought to establish the Heavenly Kingdom of Peace (taiping). As the Taiping swept throughout southern China, they destroyed Buddhist and Daoist temples and texts wherever they found them. The Taiping army completely razed the Daoist complexes on Dragon Tiger Mountain. During most of the 20th century the drive to eradicate Daoist influence continued. In the 1920s, the “New Life” movement drafted students to go out on Sundays to destroy Daoist statues and texts. Accordingly, by the year 1926 only two copies of the Daoist Canon (Daozang) existed and Daoist philosophical heritage was in great jeopardy. But permission was granted to copy the canon kept at the White Cloud Monastery, and so the texts were preserved for the world. There are 1,120 titles in this collection in 5,305 volumes. Much of this material has yet to receive scholarly attention and very little of it has been translated into any Western language.

The Cultural Revolution (1966-1976) attempted to complete the destruction of Daoism. Masters were killed or “re-educated.” Entire lineages were broken up and their texts were destroyed. The miaos were closed, burned, or turned into military barracks. At one time, there were 300 Daoist sites in Beijing alone; now there are only a handful. However, Daoism is not dead. It survives as a vibrant philosophical system and way of life, as is evidenced by the revival of its practice and study in several new university institutes in the People’s Republic.

16. References and Further Reading

  • Ames, Roger and Hall, David. (2003). Daodejing: “Making This Life Significant” A Philosophical Translation. New York: Ballantine Books.
  • Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
  • Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
  • Boltz, Judith M. (1987). A Survey of Taoist Literature: Tenth to Seventeenth Centuries, China Research Monograph 32. Berkeley: University of California Press.
  • Chan, Alan. (1991). Two Visions of the Way: A Translation and Study of the Heshanggong and Wang Bi Commentaries on the Laozi. Albany: State University of New York Press.
  • Cook, Scott (2013). The Bamboo Texts of the Guodian: A Study & Complete Translation. New York: Cornell University East Asia Program.
  • Coutinho, Steve (2014). An Introduction to Daoist Philosophies. New York: Columbia University Press.
  • Creel, Herrlee G. (1970). What is Taoism? Chicago: University of Chicago Press.
  • Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York.
  • Girardot, Norman J. (1983). Myth and Meaning in Early Taoism: The Theme of Chaos (hun-tun). Berkeley: University of California Press.
  • Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
  • Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
  • Graham, Angus. (1979). “How much of the Chuang-tzu Did Chuang-tzu Write?” Journal of the American Academy of Religion, Vol. 47, No. 3.
  • Hansen, Chad (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
  • Hendrischke, Barbara (2015, reprint ed.). The Scripture on Great Peace: The Taiping jing and the Beginnings of Daoism. Berkeley: The University of California Press.
  • Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
  • Hochsmann, Hyun and Yang Guorong, trans. (2007). Zhuangzi. New York: Pearson.
  • Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
  • Kjellberg, Paul and Ivanhoe, Philip J., eds. (1996). Essays on Skepticism, Relativism, and Ethics in the Zhuangzi. Albany: State University of New York.
  • Kleeman, Terry (1998). Great Perfection: Religion and Ethnicity in a Chinese Millennial Kingdom. Honolulu: University of Hawaii Press.
  • Kohn, Livia, ed. (2004). Daoism Handbook, 2 vols. Boston: Brill.
  • Kohn, Livia (2009). Introducing Daoism. London: Routledge.
  • Kohn, Livia (2014). Zhuangzi: Text and Context. St. Petersburg: Three Pines Press.
  • Kohn, Livia and LaFargue, Michael., eds. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
  • Kohn, Livia and Roth, Harold., eds. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
  • Komjathy, Louis (2014). Daoism: A Guide for the Perplexed. London: Bloomsbury.
  • LaFargue, Michael. (1992). The Tao of the Tao-te-ching. Albany: State University of New York Press.
  • Lin, Paul J. (1977). A Translation of Lao-tzu’s Tao-te-ching and Wang Pi’s Commentary. Ann Arbor: University of Michigan.
  • Lau, D.C. (1982). Chinese Classics: Tao Te Ching. Hong Kong: Hong Kong University Press.
  • Littlejohn, Ronnie (2010). Daoism: An Introduction. London: I.B. Tauris.
  • Littlejohn, Ronnie (2011). “The Liezi’s Use of the Lost Zhuangzi.” Riding the Wind with Liezi: New Perspectives on the Daoist Classic. Eds. Ronnie Littlejohn and Jeffrey Dippmann. Albany: State University of New York.
  • Lynn, Richard John. (1999). The Classic of the Way and Virtue: A New Translation of the Tao-Te Ching of Laozi as Interpreted by Wang Bi. New York: Columbia University Press.
  • Mair, Victor, ed. (2010). Experimental Essays on Zhuangzi. St. Petersburg: Three Pines Press. New edition of University of Hawai’i, 1983.
  • Mair, Victor. (1990). Tao Te Ching: The Classic Book of Integrity and the Way. New York: Bantam Press.
  • Mair, Victor (1994). Wandering on the Way: Early Taoist Tales and Parables of Chuang Tzu. Honolulu: University of Hawai’i Press.
  • Major, John, Queen, Sarah, Meyer, Andrew Seth, and Roth, Harold, trans. (2010). The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia University Press.
  • Maspero, Henri. (1981). Taoism and Chinese Religion. Amherst: University of Massachusetts Press.
  • Miller, James (2003). Daoism: A Short Introduction. Oxford: Oxford University Press.
  • Moeller, Hans-Georg (2004). Daoism Explained: From the Dream of the Butterfly to the Fishnet Allegory. Chicago: Open Court.
  • Robinet, Isabelle. (1997). Taoism: Growth of a Religion. Stanford: Stanford University Press.
  • Roth, Harold (1999). Original Tao: Inward Training (Nei-yeh) and the Foundations of Taoist Mysticism. New York: Columbia University Press.
  • Roth, Harold D. (1992). The Textual History of the Huai Nanzi. Ann Arbor: Association of Asian Studies.
  • Roth, Harold D. (1991). “Who Compiled the Chuang Tzu?” In Chinese Texts and Philosophical Contexts, ed. Henry Rosemont, 84-95. La Salle: Open Court.
  • Schipper, Kristofer. (1993). The Taoist Body. Berkeley: University of California Press.
  • Slingerland, Edward. (2003). Effortless Action: Wu-Wei As Conceptual Metaphor and Spiritual Ideal in Early China. New York: Oxford University Press.
  • Waley, Arthur (1934). The Way and Its Power: A Study of the Tao Te Ching and its Place in Chinese Thought. London: Allen & Unwin.
  • Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
  • Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
  • Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.

 

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Slavoj Žižek (1949 —)

Slavoj Žižek is a Slovenian-born political philosopher and cultural critic. He was described by the British literary theorist Terry Eagleton as the “most formidably brilliant” recent theorist to have emerged from Continental Europe.

Žižek’s work is infamously idiosyncratic. It features striking dialectical reversals of received common sense; a ubiquitous sense of humor; a patented disrespect towards the modern distinction between high and low culture; and the examination of examples taken from the most diverse cultural and political fields. Yet Žižek’s work, as he warns us, has a very serious philosophical content and intention. He challenges many of the founding assumptions of today’s left-liberal academy, including the elevation of difference or otherness to ends in themselves, the reading of the Western Enlightenment as implicitly totalitarian, and the pervasive skepticism towards any context-transcendent notions of truth or the good.

One feature of Žižek’s work is its singular philosophical and political reconsideration of German idealism (Kant, Schelling and Hegel). Žižek has also reinvigorated the challenging psychoanalytic theory of Jacques Lacan, controversially reading him as a thinker who carries forward founding modernist commitments to the Cartesian subject and the liberating potential of self-reflective agency, if not self-transparency. Žižek’s works since 1997 have become more and more explicitly political, contesting the widespread consensus that we live in a post-ideological or post-political world, and defending the possibility of lasting changes to the new world order of globalization, the end of history, or the war on terror.

This article explains Žižek’s philosophy as a systematic, if unusually presented, whole; and it clarifies the technical language Žižek uses, which he takes from Lacanian psychoanalysis, Marxism, and German idealism. In line with how Žižek presents his own work, this article starts by examining Žižek’s descriptive political philosophy. It then examines the Lacanian-Hegelian ontology that underlies Žižek’s political philosophy. The final part addresses Žižek’s practical philosophy, and the ethical philosophy he draws from this ontology.

Table of Contents

  1. Biography
  2. Žižek’s Political Philosophy
    a. Criticism of Ideology as “False Consciousness”
    b. Ideological Cynicism and Belief
    c. Jouissance as Political Factor
    d. The Reflective Logic of Ideological Judgments (or How the King is King)
    e. Sublime Objects of Ideology
  3. Žižek’s Fundamental Ontology
    a. The Fundamental Fantasy & the Split Law
    b. Excursus: Žižek’s Typology of Ideological Regimes
    c. Kettle Logic, or Desire and Theodicy
    d. Fantasy as the Fantasy of Origins
    e. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)
  4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject
    a. Žižek’s Subject, Fantasy, and the Objet Petit a
    b. The Objet Petit a & the Virtuality of Reality
    c. Forced Choice & Ideological Tautologies
    d. The Substance is Subject, the Other Does Not Exist
    e. The Ethical Act Traversing the Fantasy
  5. Conclusion
  6. References and Further Reading
    a. Primary Literature (Books by Žižek)
    b. Secondary Literature (Texts on Žižek)

1. Biography

Slavoj Žižek was born in 1949 in Ljubljana, Slovenia. He grew up in the comparative cultural freedom of the former Yugoslavia’s self-managing socialism. Here—significantly for his work—Žižek was exposed to the films, popular culture and theory of the noncommunist West. Žižek completed his PhD at Ljubljana in 1981 on German Idealism, and between 1981 and 1985 studied in Paris under Jacques-Alain Miller, Lacan’s son-in-law. In this period, Žižek wrote a second dissertation, a Lacanian reading of Hegel, Marx and Kripke. In the late 1980s, Žižek returned to Slovenia, where he wrote newspaper columns for the Slovenian weekly “Mladina” and cofounded the Slovenian Liberal Democratic Party. In 1990, he ran for a seat on the four-member collective Slovenian presidency, narrowly missing office. Žižek’s first published book in English, The Sublime Object of Ideology, appeared in 1989. Since then, Žižek has published over a dozen books, edited several collections, published numerous philosophical and political articles, and maintained a tireless speaking schedule. His earlier works are of the type “Introductions to Lacan through popular culture / Hitchcock / Hollywood …” Since at least 1997, however, Žižek’s work has taken on an increasingly engaged political tenor, culminating in books on September 11 and the Iraq war. As well as being visiting professor at the Department of Psychoanalysis, Université Paris VIII in 1982-3 and 1985-6, Žižek has lectured at the Cardozo Law School, Columbia, Princeton, the New School for Social Research, the University of Michigan, Ann Arbor, and Georgetown. He is currently a returning faculty member of the European Graduate School, and founder and president of the Society for Theoretical Psychoanalysis, Ljubljana.

2. Žižek’s Political Philosophy

a. Criticism of Ideology as “False Consciousness”

In a way that is oddly reminiscent of Nietzsche, Žižek generally presents his work in a polemical fashion, knowingly striking out against the grain of accepted opinion. One untimely feature of Žižek’s work is his continuing defense and use of the unfashionable term “ideology.” According to the classical Marxist definition, ideologies are discourses that promote false ideas (or “false consciousness”) in subjects about the political regimes they live in. Nevertheless, because these ideas are believed by the subjects to be true, they assist in the reproduction of the existing status quo, in an exact instance of what Umberto Eco dubs “the force of the fake.” To critique ideology, according to this position, it is sufficient to unearth the truth(s) the ideologies conceal from the subject’s knowledge. Then, so the theory runs, subjects will become aware of the political shortcomings of their current regimes, and be able and moved to better them. As Žižek takes up in his earlier works, this classical Marxian notion of ideology has come under theoretical attack in a number of ways. First, to criticize a discourse as ideological implies access to a Truth about political things: the Truth that the ideologies, as false, would conceal. But it has been widely disputed in the humanities that there could ever be any One such theoretically accessible Truth. Secondly, the notion of ideology is held to be irrelevant to describe contemporary sociopolitical life, because of the increased importance of what Jürgen Habermas calls “media-steered subsystems” (the market, public and private bureaucracies), and also because of the widespread cynicism of today’s subjects towards political authorities. For ideologies to have political importance, critics comment, subjects would have to have a level of faith in public institutions, ideals and politicians which today’s liberal-cosmopolitan subjects lack. The widespread notoriety of left-leaning authors like Michael Moore or Noam Chomsky, as one example, bears witness to how subjects today can know very well what Moore claims is the “awful truth,” and yet act as if they did not know.

Žižek agrees with critics about this “false consciousness” model of ideology. Yet he insists that we are not living in a post-ideological world, as figures as different as Tony Blair, Daniel Bell or Richard Rorty have claimed. Žižek proposes instead that in order to understand today’s politics we need a different notion of ideology. In a typically bold reversal, Žižek’s position is that today’s widespread consensus that our world is post-ideological gives voice to what he calls the “archideological” fantasy. Because “ideology” has carried a pejorative sense since Marx, no one who is taken in by such an ideology has ever believed that they were so duped, Žižek comments. If the term “ideology” has any meaning at all, ideological positions are always what people impute to Others (for today’s left, for example, the political right are the dupes of one or another noble lie about natural community; for the right, the left are the dupes of well-meaning but utopian egalitarianism bound to lead to economic and moral collapse, and so forth). For subjects to believe in an ideology, it must have been presented to them, and been accepted, as non-ideological: indeed, as True and Right, and what anyone sensible would believe. As we shall see in 2e, Žižek is alert to the realist insight that there is no more effective political gesture than to declare some contestable matter above political contestation. Just as the third way is said to be post-ideological or national security is claimed to be extra-political, so Žižek argues that ideologies are always presented by their proponents as being discourses about Things too sacred to profane by politics. Hence, Žižek’s bold opening in The Sublime Object of Ideology is to claim that today ideology has not so much disappeared from the political landscape as come into its own. It is exactly because of this success, Žižek argues, that ideology has also been able to be dismissed in accepted political and theoretical opinion.

b. Ideological Cynicism and Belief

Today’s typical first world subjects, according to Žižek, are the dupes of what he calls “ideological cynicism.” Drawing on the German political theorist Sloterdijk, Žižek contends that the formula describing the operation of ideology today is not “they do not know it, but they are doing it,” as it was for Marx. It is “they know it, but they are doing it anyway.” If this looks like nonsense from the classical Marxist perspective, Žižek’s position is that nevertheless this cynicism indicates the deeper efficacy of political ideology per se. Ideologies, as political discourses, are there to secure the voluntary consent (or what La Boétie called servitude volontaire) of people about contestable political policies or arrangements. Yet, Žižek argues, subjects will only voluntarily agree to follow one or other such arrangement if they believe that, in doing so, they are expressing their free subjectivity, and might have done otherwise.

However false such a sense of freedom is, Žižek insists that it is nevertheless a political instance of what Hegel called an essential appearance. Althusser’s understanding of ideological identification suggests that an individual is wholly “interpellated” into a place within a political system by the system’s dominant ideology and ideological state apparatuses. Contesting this notion by drawing on Lacanian psychoanalysis, however, Žižek argues that it is a mistake to think that, for a political position to win people’s support, it needs to effectively brainwash them into thoughtless automatons. Rather, Žižek maintains that any successful political ideology always allows subjects to have and to cherish a conscious distance towards its explicit ideals and prescriptions—or what he calls, in a further technical term, “ideological disidentification.”

Again bringing the psychoanalytic theory of Lacan to bear in political theory, Žižek argues that the attitude of subjects towards authority revealed by today’s ideological cynicism resembles the fetishist’s attitude towards his fetish. The fetishist’s attitude towards his fetish has the peculiar form of a disavowal: “I know well that (for example) the shoe is only a shoe, but nevertheless, I still need my partner to wear the shoe in order to enjoy.” According to Žižek, the attitude of political subjects towards political authority evinces the same logical form: “I know well that (for example) Bob Hawke / Bill Clinton / the Party / the market does not always act justly, but I still act as though I did not know that this is the case.” In his famous “Ideology and Ideological State Apparatuses,” Althusser staged a kind of primal scene of ideology, the moment when a policeman (as bearer of authority) says “hey you!” to an individual, and the individual recognizes himself as the addressee of this call. In the “180 degree turn” of the individual towards this Other who has addressed him, the individual becomes a political subject, Althusser says. Žižek’s central technical notion of the “big Other” [grand Autre] closely resembles (to the extent that it is not modelled on) Althusser’s notion of the Subject (capital “S”) in the name of which public authorities (like the police) can legitimately call subjects to account within a regime: for example, “God” in a theocracy, “the Party” under Stalinism, or “the People” in today’s China. As the central chapter of The Sublime Object of Ideology specifies, ideologies for Žižek work to identify individuals with such important or rallying political terms as these, which Žižek calls “master signifiers.” The strange but decisive thing about these pivotal political words, according to Žižek, is that no one knows exactly what they mean or refer to, or has ever seen with their own eyes the sacred objects which they seem to name (for example: God, the Nation, or the People). This is one reason why Žižek, in the technical language he inherits (via Lacan) from structuralism, says that the most important words in any political doctrine are “signifiers without a signified” (that is, words that do not refer to any clear and distinct concept or demonstrable object).

This claim of Žižek’s is connected to two other central ideas in his work:

  • First: Žižek adapts the psychoanalytic notion that individuals are always “split” subjects, divided between the levels of their conscious awareness and the unconscious. Žižek contends throughout his work that subjects are always divided between what they consciously know and can say about political things, and a set of more or less unconscious beliefs they hold concerning individuals in authority, and the regime in which they live (see 3a). Even if people cannot say clearly and distinctly why they support some political leader or policy, for Žižek no less than for Edmund Burke, this fact is not politically decisive, as we will see in 2e below.
  • Second: Žižek makes a crucial distinction between knowledge and belief. Exactly where and because subjects do not know, for example, what “the essence” of “their people” is, the scope and nature of their beliefs on such matters is politically decisive, according to Žižek (again, see 2e below).

Žižek’s understanding of political belief is modelled on Lacan’s understanding of transference in psychoanalysis. The belief or “supposition” of the analysand in psychoanalysis is that the Other (his analyst) knows the meaning of his symptoms. This is obviously a false belief, at the start of the analytic process. But it is only through holding this false belief about the analyst that the work of analysis can proceed, and the transferential belief can become true (when the analyst does become able to interpret the symptoms). Žižek argues that this strange intersubjective or dialectical logic of belief in clinical psychoanalysis is also what characterizes people’s political beliefs. Belief is always “belief through the Other,” Žižek argues. If subjects do not know the exact meaning of those “master signifiers” with which they politically identify, this is because their political belief is mediated through their identifications with others. Although they each themselves “do not know what they do” (which is the title of one of Žižek’s books [Žižek, 2002]), the deepest level of their belief is maintained through the belief that nevertheless there are Others who do know. A number of features of political life are cast into new relief given this psychoanalytic understanding, Žižek claims:

  • First, Žižek contends that the key political function of holders of public office is to occupy the place of what he calls, after Lacan, “the Other supposed to know.” Žižek cites the example of priests reciting mass in Latin before an uncomprehending laity, who believe that the priests know the meaning of the words, and for whom this is sufficient to keep the faith. Far from presenting an exception to the way political authority works, for Žižek this scenario reveals the universal rule of how political consensus is formed.
  • Second, and in connection with this, Žižek contends that political power is primarily “symbolic” in its nature. What he means by this further technical term is that the roles, masks, or mandates that public authorities bear are more important politically than the true “reality” of the individuals in question (whether they are unintelligent, unfaithful to their wives, good family women, and so forth). According to Žižek, for example, fashionable liberal criticisms of George W. Bush the man are irrelevant to understanding or evaluating his political power. It is the office or place an individual occupies in their political system (or “big Other”) that ensures the political force of their words, and the belief of subjects in their authority. This is why Žižek maintains that the resort of a political leader or regime to “the real of violence” (such as war or police action) amounts to a confession of its weakness as a political regime. Žižek sometimes puts this thought by saying that people believe through the big Other, or that the big Other believes for them, despite what they might inwardly think or cynically say.

c. Jouissance as Political Factor

A further key point that Žižek takes from Louis Althusser’s later work on ideology is Althusser’s emphasis on the “materiality” of ideology, its embodiment in institutions and people’s everyday practices and lives. Žižek’s realist position is that all the ideas in the world can have no lasting political effect unless they come to inform institutions and subjects’ day-to-day lives. In The Sublime Object of Ideology, Žižek cites Blaise Pascal’s advice that doubting subjects should get down on their knees and pray, and then they will believe. Pascal’s position is not any kind of simple proto-behaviorism, according to Žižek. The deeper message of Pascal’s directive, he asserts, is to suggest that once subjects have come to believe through praying, they will also retrospectively see that they got down on their knees because they always believed, without knowing it. In this way, in fact, Žižek can be read as a consistent critic not only of the importance of knowledge in the formation of political consensus, but also of the importance of “inwardness” in politics per se, in the tradition of the younger Carl Schmitt.

Prior political philosophy has placed too little emphasis, Žižek asserts, on communities’ cultural practices that involve what he calls “inherent transgression.” These are practices sanctioned by a culture that nevertheless allow subjects some experience of what is usually exceptional to or prohibited in their everyday lives as civilized political subjects—things like sex, death, defecation, or violence. Such experiences involve what Žižek calls jouissance, another technical term he takes from Lacanian psychoanalysis. Jouissance is usually translated from the French as “enjoyment.” As opposed to what we talk of in English as “pleasure”, though, jouissance is an always sexualized, always transgressive enjoyment, at the limits of what subjects can experience or talk about in public. Žižek argues that subjects’ experiences of the events and practices wherein their political culture organizes its specific relations to jouissance (in first world nations, for example, specific sports, types of alcohol or drugs, music, festivals, films) are as close as they will get to knowing the deeper Truth intimated for them by their regime’s master signifiers: “nation”, “God”, “our way of life,” and so forth (see 2b above). Žižek, like Burke, argues that it is such ostensibly nonpolitical and culturally specific practices as these that irreplaceably single out any political community from its others and enemies. Or, as one of Žižek’s chapter titles in Tarrying With the Negative puts it, where and although subjects do not know their Nation, they “enjoy (jouis) their nation as themselves.”

d. The Reflective Logic of Ideological Judgments (or How the King is King)

According to Žižek, like and after Althusser, ideologies are thus political discourses whose primary function is not to make correct theoretical statements about political reality (as Marx’s “false consciousness” model implies), but to orient subjects’ lived relations to and within this reality. If a political ideology’s descriptive propositions turn out to be true (for example: “capitalism exploits the workers,” “Saddam was a dictator,” “the Spanish are the national enemy,” and so forth), this does not in any way reduce their ideological character, in Žižek’s estimation. This is because this character concerns the political issue of how subjects’ belief in these propositions, instead of those of opponents, positions subjects on the leading political issues of the day. For Žižek, political speech is primarily about securing a lived sense of unity or community between subjects, something like what Kant called sensus communis or Rousseau the general will. If political propositions seemingly do describe things in the world, Žižek’s position is that we nevertheless need always to understand them as Marx understood the exchange value of commodities—as “a relation between people being concealed behind a relation between things.” Or again: just as Kant thought that the proposition “this is beautiful” really expresses a subject’s reflective sense of commonality with all other subjects capable of being similarly affected by the object, so Žižek argues that propositions like “Go Spain!” or “the King will never stop working to secure our future” are what Kant called reflective judgments, which tell us as much or more about the subject’s lived relation to political reality as about this reality itself.

If ideological statements are thus performative utterances that produce political effects by their being stated, Žižek in fact holds that they are a strange species of performative utterance overlooked by speech act theory. Just because, when subjects say “the Queen is the Queen!” they are at one level reaffirming their allegiance to a political regime, Žižek at the same time holds that this does not mean that this regime could survive without appearing to rest on such deeper Truths about the way the world is. As we saw in 2b, Žižek maintains that political ideologies always present themselves as naming such deeper, extra-political Truths. Ideological judgments, according to Žižek, are thus performative utterances which, in order to perform their salutary political work, must yet appear to be objective descriptions of the way the world is (exactly as when a chairman says “this meeting is closed!” only thereby bringing this state of affairs into effect). In The Sublime Object of Ideology, Žižek cites Marx’s analysis of being a King in Das Kapital to illustrate his meaning. A King is only King because his subjects loyally think and act like he is King (think of the tragedy of Lear). Yet, at the same time, the people will only believe he is King if they believe that this is a deeper Truth about which they can do nothing.

e. Sublime Objects of Ideology

In line with Žižek’s ideas of “ideological disidentification” and “jouissance as a political factor” (see 2b and 2c above) and in a clear comparison with Derrida’s deconstruction, arguably the unifying thought in Žižek’s political philosophy is that regimes can only secure a sense of collective identity if their governing ideologies afford subjects an understanding of how their regime relates to what exceeds, supplements or challenges its identity. This is why Kant’s analytic of the sublime in The Critique of Judgment, as an analysis of an experience in which the subject’s identity is challenged, is of the highest theoretical interest for Žižek. Kant’s analytic of the sublime isolates two moments to its experience, as Žižek observes. In the first moment, the size or force of an object painfully impresses upon the subject the limitation of its perceptual capabilities. In a second moment, however, a “representation” arises where “we would least expect it,” which takes as its object the subject’s own failure to perceptually take the object in. This representation resignifies the subject’s perceptual failure as indirect testimony about the inadequacy of human perception as such to attain to what Kant calls Ideas of Reason (in Kant’s system, God, the Universe as a Whole, Freedom, the Good).

According to Žižek, all successful political ideologies necessarily refer to and turn around sublime objects that they themselves posit. These sublime objects are what political subjects take it that their regime’s ideologies’ central words mean or name: extraordinary Things like God, the Führer, the King, in whose name they will (if necessary) transgress ordinary moral laws and lay down their lives. When a subject believes in a political ideology, as we saw in 2b above, Žižek argues that this does not mean that they know the Truth about the objects which its key terms seemingly name—indeed, Žižek will finally contest that such a Truth exists (see 3c, d). Nevertheless, by drawing on a parallel with Kant on the sublime, Žižek makes a further and more radical point. Just as in the experience of the sublime, Kant’s subject resignifies its failure to grasp the sublime object as indirect testimony to a wholly “supersensible” faculty within herself (Reason), so Žižek argues that the inability of subjects to explain the nature of what they believe in politically does not indicate any disloyalty or abnormality. What political ideologies do, precisely, is provide subjects with a way of seeing the world according to which such an inability can appear as testimony to how Transcendent or Great their Nation, God, Freedom, and so forth are—surely far above the ordinary or profane things of the world. In Žižek’s Lacanian terms, these things are Real (capital “R”) Things (capital “T”), precisely insofar as they in this way stand out from the reality of ordinary things and events.

In the struggle of competing political ideologies, Žižek hence agrees with Ernesto Laclau and Chantal Mouffe, the aim of each is to elevate its particular political perspective (about what is just, best, and so forth) to the point where it can lay claim to name, give voice to or to represent the political whole (for example: the nation). In order to achieve this political feat, Žižek argues, each group must succeed in identifying its perspective with the extra-political, sublime objects accepted within the culture as giving body to this whole (for example: “the national interest,” “the dictatorship of the proletariat”). Or else, it must supplant the previous ideologies’ sublime objects with new such objects. In the absolute monarchies, as Ernst Kantorowicz argued, the King’s so-called “second” or “symbolic” body exemplified paradigmatically such sublime political objects as the unquestionable font of political authority (the particular individual who was King was contestable, but not the sovereign’s role itself). Žižek’s critique of Stalinism, in a comparable way, turns upon the thought that “the Party” had this sublime political status in Stalinist ideology. Class struggle in this society did not end, Žižek contends, despite Stalinist propaganda. It was only displaced from a struggle between two classes (for example, bourgeois versus proletarian) to one between “the Party” as representative of the people or the whole and all who disagreed with it, ideologically positioned as “traitors” or “enemies of the people.”

3. Žižek’s Fundamental Ontology

a. The Fundamental Fantasy & the Split Law

For Žižek, as we have seen, no political regime can sustain the political consensus upon which it depends, unless its predominant ideology affords subjects a sense both of individual distance or freedom with regard to its explicit prescriptions (2b), and that the regime is grounded in some larger or “sublime” Truth (2e). Žižek’s political philosophy identifies interconnected instances of these dialectical ideas: his notion of “ideological disidentification” (2b); his contention that ideologies must accommodate subjects’ transgressive experiences of jouissance (2c); and his conception of exceptional or sublime objects of ideology (2e). These ideas intersect in what is arguably the central notion of Žižek’s political philosophy, “ideological fantasy.” “Ideological fantasy” is Žižek’s technical name for the deepest framework of belief that structures how political subjects, and/or a political community, come to terms with what exceeds their norms and boundaries, in the various registers we examined above.

Like many of Žižek’s key notions, Žižek’s notion of the ideological fantasy is a political adaptation of an idea from Lacanian psychoanalysis: specifically, Lacan’s structuralist rereading of Freud’s psychoanalytic understanding of unconscious fantasy. As for Lacan, so for Žižek, the civilizing of subjects necessitates their founding sacrifice (or “castration”) of jouissance, enacted in the name of sociopolitical Law. Subjects, to the extent that they are civilized, are “cut” from the primal object of their desire. Instead, they are forced by social Law to pursue this special, lost Thing (in Žižek’s technical term, the “objet petit a”; see 4a, 4b) by observing their societies’ linguistically mediated conventions, deferring satisfaction, and accepting sexual and generational difference. Subjects’ “fundamental fantasies,” according to Lacan, are unconscious structures which allow them to accept the traumatic loss involved in this founding sacrifice. They turn around a narrative about the lost object, and how it was lost (see 3d). In particular, the fundamental fantasy of a subject resignifies the founding repression of jouissance by Law—which, according to Lacan, is necessary if the individual is to become a speaking subject—as if it were a merely contingent, avoidable occurrence. In the fantasy, that is, what for Žižek is a constitutive event for the subject is renarrated as the historical action of some exceptional individual (in Enjoy Your Symptom! the pre-Oedipal “anal father”). Equally, the jouissance the subject considers itself to have lost is posited by the fantasy as having been taken from it by this persecutory “Other supposed to enjoy” (see 3b).

In the notion of ideological fantasy, Žižek takes this psychoanalytic framework and applies it to the understanding of the constitution of political groups. If after Plato, political theory concerns the Laws of a regime, the Laws for Žižek are always split or double in kind. Each political regime has a body of more or less explicit, usually written Laws which demand that subjects forego jouissance in the name of the greater good, and according to the letter of its proscriptions (for example, the US or French constitutions). Žižek identifies this level of the Law with the Freudian ego ideal. But Žižek argues that, in order to be effective, a regime’s explicit Laws must also harbor and conceal a darker underside, a set of more or less unspoken rules which, far from simply repressing jouissance, implicate subjects in a guilty enjoyment in repression itself, which Žižek likens to the “pleasure-in-pain” associated with the experience of Kant’s sublime (see 2d). The Freudian superego, for Žižek, names the psychical agency of the Law, as it is misrepresented and sustained by subjects’ fantasmatic imaginings of a persecutory Other supposed to enjoy (like the archetypal villain in noir films). This darker underside of the Law, Žižek agrees with Lacan, is at its base a constant imperative to subjects to jouis!, by engaging in the “inherent transgressions” of their sociopolitical community (see 2b).

Žižek’s notion of the split in the Law in this way intersects directly with his notion of ideological disidentification examined in 2b. While political subjects maintain a conscious sense of freedom from the explicit norms of their culture, Žižek contends, this disidentification is grounded in their unconscious attachment to the Law as superego, itself an agency of enjoyment. If Althusser famously denied the importance of what people “have on their consciences” in the explanation of how political ideologies work, then for Žižek the role of guilt—as the way in which the subject enjoys his subjection to the laws—is vital to understanding subjects’ political commitments. Individuals will only turn around when the Law hails them, Žižek argues, insofar as they are finally subjects also of the unconscious belief that the “big Other” has access to the jouissance they have lost as subjects of the Law, and which they can accordingly reattain through their political allegiance (see 2b). It is this belief, what could be termed this “political economy of jouissance,” that the fundamental fantasies underlying political regimes’ worldviews are there to structure in subjects.

b. Excursus: Žižek’s Typology of Ideological Regimes

With these terms of Žižek’s Lacanian ontology in place, it becomes possible to lay out Žižek’s theoretical understanding of the differences between different types of ideological-political regimes. Žižek’s works maintain a lasting distinction between modern and premodern political regimes, which he contends are grounded in fundamentally different ways of organizing subjects’ relations to Law and jouissance (3a). In Žižek’s Lacanian terms, premodern ideological regimes exemplified what Lacan calls in Seminar XVII the discourse of the master. In these authoritarian regimes, the word and will of the King or master (in Žižek’s mathemes, S1) was sovereign—the source of political authority, with no questions asked. Her/His subjects, in turn, are supposed to know (S2) the edicts of the sovereign and the Law (as the classical legal notion has it, “ignorance is no excuse”). In this arrangement, while jouissance and fantasy are political factors, as Žižek argues, regimes’ quasi-transgressive practices remain exceptional to the political arena, glimpsed only in such carnivalesque events as festivals or the types of public punishment Michel Foucault (for example) describes in the introduction to Discipline and Punish.

Žižek agrees with both Foucault and Marx that modern political regimes exert a form of power that is both less visible and more far-reaching than that of the regimes they replaced. Modern regimes, both liberal capitalist and totalitarian, for Žižek, are no longer predominantly characterized by the Lacanian discourse of the master. Given that the Oedipus complex is associated by him with this older type of political authority, Žižek agrees with the Frankfurt School theorists that, contra Deleuze and Guattari, today’s subjectivity as such is already post- or anti-Oedipal. Indeed, in Plague of Fantasies and The Ticklish Subject, Žižek contends that the characteristic discontents of today’s political world—from religious fundamentalism to the resurgence of racism in the first world—are not archaic remnants of, or protests against, traditional authoritarian structures, but the pathological effects of new forms of social organization. For Žižek, the defining agency in modern political regimes is knowledge (or, in his Lacanian mathemes, S2). The Enlightenment represented the unprecedented political venture to replace belief in authority as the basis of polity with human reason and knowledge. As Schmitt also complained, the legitimacy of modern authorities is grounded not in the self-grounding decision of the sovereign. It is grounded in the ability of authorities to muster coherent chains of reasons to subjects about why they are fit to govern. Modern regimes hence always claim to speak not out of ignorance of what subjects deeply enjoy (“I don’t care what you want; just do what I say!”) but in the very name of subjects’ freedom and enjoyment.

Whether fascist or communist, Žižek argues in his early books, totalitarian (as opposed to authoritarian) regimes justified their rule by final reference to quasi-scientific metanarratives. These metanarratives—a narrative concerning racial struggle in Nazism, or the Laws of History in Stalinism—each claimed to know the deeper Truth about what subjects want, and accordingly could both justify the most striking transgressions of ordinary morality, and justify these transgressions by reference to subjects’ jouissance. The most disturbing or perverse features of these regimes can only be explained by reference to the key place of knowledge in these regimes. Žižek describes, for instance, the truly Catch-22-esque logic of the Soviet show trials, wherein it was not enough for subjects to be condemned by the authorities as enemies; they were also made to avow their “objective” error in opposing the party as agent of the laws of history.

Žižek’s statements on today’s liberal capitalism are complex, if not in mutual tension. At times, Žižek tries to formalize the economic generation of surplus value as a meaningfully “hysterical” social arrangement. Yet Žižek predominantly argues that the market-driven consumerism of later capitalist subjects is characterized by a marketing discourse which—like totalitarian ideologies—does not appeal to subjects in the name of any collective cause justifying individuals’ sacrifice of jouissance. Instead, as social conservatives complain, it musters the quasi-scientific discourses of marketing and public relations, or (increasingly) Eastern religion, in order to recommend products to subjects as necessary means in the liberal pursuit of happiness and self-fulfillment. In line with this change, Žižek contends in The Ticklish Subject that the paradigmatic type of leader today is not some inaccessible boss but the uncannily familiar figure of Bill Gates—more like a little brother than the traditional father or master. Again: for Žižek it is deeply telling that at the same time as the nuclear family is being eroded in the first world, other institutions, from the so-called “nanny” welfare state to private corporations, are increasingly becoming “familiarized” (with self-help sessions for employees, company days, casual days, and so forth).

c. Kettle Logic, or Desire and Theodicy

We saw how Žižek claims that the truth of political ideologies concerns what they do, not what they say (2d). At the level of what political ideologies say, Žižek maintains, a Lacanian critical theory holds that ideologies must be finally inconsistent. Freud famously talked of the example of a man who returns a borrowed kettle to its owner broken. The man adduces mutually inconsistent excuses which are united only in terms of his ignoble desire to evade responsibility for breaking the kettle: he never borrowed the kettle, the kettle was already broken when he borrowed it, and when he gave the kettle back it was not really broken anyway. As Žižek reads political ideologies, they function in the same way in the political field—this is the sense of the subtitle of his 2004 Iraq: The Borrowed Kettle. As we saw in 2d, Žižek maintains that the end of political ideologies is to secure and defend the idea of the polity as a wholly unified community. When political strife, uncertainty or division occur, political ideologies and the fundamental fantasies upon which they lean (3a) operate to resignify this political discontent so that the political ideal of community can be sustained, and to deny the possibility that this discontent might signal a fundamental injustice or flaw within the regime. In what amounts to a kind of political theodicy, Žižek’s work points to a number of logically inconsistent ideological responses to political discontents, which are united only by the desire that informs them, like Freud’s “kettle logic”:

  1. Saying that these divisions are politically unimportant, transient or merely apparent.
    Or, if this explanation fails:
  2. Saying that the political divisions are in any case contingent to the ordinary run of events, so that if their cause is removed or destroyed, things will return to normal.
    Or, more perilously:
  3. Saying that the divisions or problems are deserved by the people for the sake of the greater good (in Australia in the 90s, for example, we experienced “the recession we had to have”), or as punishment for their betrayal of the national Thing.

Žižek’s view of the political functioning of sublime objects of ideology can be charted exactly in terms of this political theodicy (see 2e). We saw in 3a how Žižek argues that subjects’ fantasy is what allows them to come to terms with the loss of jouissance fundamental to being social or political animals. Žižek centrally maintains that such narrative attempts at political self-understanding—whether of individuals or political regimes—are ultimately unable to achieve these ends, except at the price of telling inconsistencies.

As Žižek highlights in his analyses of the political discontents in former Yugoslavia following the fall of communism, each national or political community tends to claim that its sublime Thing is inalienable, and hence utterly incapable of being understood or destroyed by enemies. Nevertheless, the invariable correlative of this emphasis on the inalienable nature of one’s Thing, Žižek argues in Tarrying with the Negative (1993), is the notion that It is simultaneously deeply fragile if not under active threat. For Žižek, this mutual inconsistency is only theoretically resolvable if, despite first appearances, we posit a materialist teaching that says that the “substance” seemingly named by political regimes’ key rallying terms (see 2e) is only sustained in their lived communal practices (as we say when someone does not get a joke, “you had to be there”). Yet political ideologies, as such, cannot avow this possibility (see 2d). Instead, ideological fantasies posit various exemplars of a persecutory enemy or, as Žižek says, “the Other of the Other” to whom the explanation of political disunity or discontent can be traced. If only this other or enemy could be removed, the political fantasy contends, the regime would be fully equitable and just. Historical examples of such figures of the enemy include “the Jew” in Nazi ideology, or the “petty bourgeois” in Stalinism.

Again: a type of “kettle logic” applies to the way these enemies are represented in political ideologies, according to Žižek. “The Jew” in Nazi ideology, for example, was an inconsistent condensation of features of both the ruling capitalist class (money grabbing, exploitation of the poor) and of the proletariat (dirtiness, sexual promiscuity, communism). The only consistency this figure has, that is, is precisely as a condensation of everything that Nazi ideology’s Aryan Volksgemeinschaft (roughly, “national community”) was constructed in response and political opposition to.

d. Fantasy as the Fantasy of Origins

In a way that has drawn some critics (Bellamy, Sharpe) to question how finally political Žižek’s political philosophy is, Žižek’s critique of ideology ultimately turns on a set of fundamental ontological propositions about the necessary limitations of any linguistic or symbolic system. These propositions concern the widely known paradoxes that bedevil any attempt by a semantic system to explain its own limits, and/or how it came into being. If what preceded the system was radically different from what subsequently emerged, how could the system have emerged from it, and how can the system come to terms with it at all? If we name the limits of what the system can understand, do not we, in that very gesture, presuppose some knowledge of what is beyond these limits, if only enough to say what the system is not? The only manner in which we can explain the origin of language is within language, Žižek notes in For They Know Not What They Do. Yet we hence presuppose, again in the very act of the explanation, the very thing we were hoping to explain. Similarly, to take the example from political philosophy of Hobbes’ explanation of the origin of sociopolitical order, the only way we can explain the origin of the social contract is by presupposing that Hobbes’ wholly pre-social men nevertheless possessed in some way the very social abilities to communicate and make pacts that Hobbes’ position is supposed to explain.

For Žižek, fantasy as such is always fundamentally the fantasy of (one’s) origins. In Freud’s “Wolf Man” case, to cite the psychoanalytic example Žižek cites in For They Know Not What They Do, the primal scene of parental coitus is the Wolf Man’s attempt to come to terms with his own origin—or to answer the infant’s perennial question “where did I come from?” The problem here is this: who could the spectacle of this primal scene have been staged for or seen by, if it really transpired before the genesis of the subject that it would explain (see 3e, 4e)? The only answer is that the Wolf Man has imaginatively transposed himself back, if only as an impassive object-gaze, into the very primal scene whose historical occurrence he had hoped would explain his origin as an individual.

Žižek’s argument is that, in the same way, political or ideological systems cannot and do not avoid deep inconsistencies. No less than Machiavelli, Žižek is acutely aware that the act that founds a body of Law is never itself legal, according to the very order of Law it sets in place. He cites Bertolt Brecht: “what is the robbing of a bank, compared to the founding of a bank?” What fantasy does, in this register, is to try to historically renarrativize the founding political act as if it were or had been legal—an impossible application of the Law before the Law had itself come into being. No less than the Wolf Man’s false transposition of himself back into the primal scene that was to explain his origin, Žižek argues that the attempt of any political regime to explain its own origins in a political myth that denies the fundamental, extralegal violence of these origins is fundamentally false. (Žižek uses the example of the liberal myth of primitive accumulation to illustrate his position in For They Know Not What They Do, but we could cite here Plato’s myth of the reversed cosmos in the Laws and Statesman, or historical cases like the idea of terra nullius in colonial Australia).

e. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)

In a series of places, Žižek situates his ontological position in terms of a striking reading of Immanuel Kant’s practical philosophy. Žižek argues that in Religion Within the Bounds of Reason Alone Kant showed that he was aware of the paradoxes that necessarily attend any attempt to narrate the origins of the Law. The Judeo-Christian myth of the fall succumbs to precisely these paradoxes, as Kant analyses: if Adam and Eve were purely innocent, how could they have been tempted? If their temptation was wholly the fault of the tempter, why then has God punished humans with the weight of original sin? But if Adam and Eve were not purely innocent when the snake lured them, in what sense was this a fall at all? According to Žižek, Kant’s text also provides us with theoretical parameters which allow us to explain and avoid these paradoxes. The problems for the mythical narrative, Kant argues, stem from its nature as a narrative—or how it tries to render in a historical story what he argues is truly a logical or transcendental priority. For Kant, human beings are, as such, radically evil. They have always already chosen to assert their own self-conceit above the moral Law. This choice of radical evil, however, is not itself a historical choice either for individuals or for the species, for Kant. This choice is what underlies and opens up the space for all such historical choices. However, as Žižek argues, Kant withdraws from the strictly diabolical implications of this position. The key place in which this withdrawal is enacted is in the postulates of The Critique of Practical Reason, wherein Kant defends the immortality of the soul as a likely story on the basis of our moral experience. Because of radical evil, Kant argues, it is impossible for humans ever to act purely out of duty in this life—this is what Kant thinks our irremovable sense of moral guilt attests. But because people can never act purely in this life, Kant suggests, it is surely reasonable to hope and even to postulate that the soul lives on after death, striving ever closer towards the perfection of its will.

Žižek’s contention is that this argument does not prove the immortality of a disembodied soul. It proves the immortality of an embodied individual soul, always struggling guiltily against its selfish corporeal impulses (this, incidentally, is one reason why Žižek argues, after Lacan, that de Sade is the truth of Kant). In order to make his proof even plausible, Žižek notes, Kant has to tacitly smuggle the spatiotemporal parameters of embodied earthly existence into the postulated hereafter so that the guilty subject can continue endlessly to struggle against his radically evil nature towards good. In this way, though, Kant himself has to speak as if he knew what things are like on the other side of death—which is to say, from the impossible, because impossibly neutral, perspective of someone able to impassively see the spectacle of the immortal subject striving guiltily towards the good (see 4d). But in this way, also, Žižek argues that Kant enacts exactly the type of fantasmatic operation his reading of the fall (as a) narrative declaims, and which represents in nuce the basis operation also of all political ideologies.

4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject

a. Žižek’s Subject, Fantasy, and the Objet Petit a

Perhaps Žižek’s most radical challenge to accepted theoretical opinion is his defense of the modern, Cartesian subject. Žižek knowingly and polemically positions his writings against virtually all other contemporary theorists, with the significant exception of Alain Badiou. Yet for Žižek, the Cartesian subject is not reducible to the fully self-assured “master and possessor of nature” of Descartes’ Discourse on Method. It is what Žižek calls in “Kant With (Or Against) Kant,” an out of joint ontological excess or clinamen. Žižek takes his bearings here as elsewhere from a Lacanian reading of Kant, and the latter’s critique of Descartes’ cogito ergo sum. In the “Transcendental Dialectic” in The Critique of Pure Reason, Kant criticized Descartes’ argument that the self-guaranteeing “I think” of the cogito must be a thinking thing (res cogitans). For Kant (as for Žižek), while the “I think” must be capable of accompanying all of the subject’s perceptions, this does not mean that it is itself such a substantial object. The subject that sees objects in the world cannot see itself seeing, Žižek notes, any more than a person can jump over her own shadow. To the extent that a subject can reflectively see itself, it sees itself not as a subject but as one more represented object, what Kant calls the “empirical self” or what Žižek calls the “self” (versus the subject) in The Plague of Fantasies. The subject knows that it is something, Žižek argues. But it does not and can never know what Thing it is “in the Real,” as he puts it (see 2e). This is why it must seek clues to its identity in its social and political life, asking the question of others (and of the big Other; see 2b) which Žižek argues defines the subject as such: che vuoi? (what do you want from me?). In Tarrying With the Negative, Žižek hence reads the Director’s Cut of Ridley Scott’s Blade Runner as revelatory of the Truth of the subject. Within this version of the film, as Žižek emphasizes, the main character Deckard literally does not know what he is—a robot that perceives itself to be human. According to Žižek, the subject is a “crack” in the universal field or substance of being, not a knowable thing (see 4d). This is why Žižek repeatedly cites in his books the disturbing passage from the young Hegel describing the modern subject not as the “light” of the modern enlightenment, but “this night, this empty nothing …”

It is crucial to Žižek’s position, though, that he denies the apparent implication that the subject is some kind of supersensible entity (for example, an immaterial and immortal soul). The subject is not a special type of Thing outside of the phenomenal reality we can experience, for Žižek. As we saw in 1e above, such an idea would in fact reproduce in philosophy the type of thinking which, he argues, characterizes political ideologies and the subject’s fundamental fantasy (see 3a). The subject is more like a fold or crease in the surface of this reality, as Žižek puts it in Tarrying With the Negative, the point within the substance of reality wherein that substance is able to look at itself, and see itself as alien to itself. According to Žižek, Hegel and Lacan add to Kant’s reading of the subject as the empty “I think” that accompanies any individual’s experience the caveat that, because objects thus appear to a subject, they always appear in an incomplete or biased way. Žižek’s “formula” of the fundamental fantasy (see 2a, 2d), “$ <> a,” tries to formalize exactly this thought. Its meaning is that the subject ($), in its fundamental fantasy, misrecognizes itself as a special object (the objet petit a or lost object (see 2a)) within the field of objects that it perceives. In terms which unite this psychoanalytic notion with Žižek’s political philosophy, we can say that the objet petit a is exactly a sublime object (2e). It is an object that is elevated or, in Freudian terms, “sublimated” by the subject to the point where it stands as a metonymic representative of the jouissance the subject unconsciously fantasizes was taken from her/him at castration (3a). It hence functions as the object-cause of the subject’s desire: that exceptional “little piece of the Real” that s/he seeks out in all of her/his love relationships. Its psychoanalytic paradigms are, to cite the title of a collection Žižek edited, “the voice and gaze as love objects.” Examples of the voice as objet petit a include the persecutor’s voice in paranoia, or the very silence that some TV advertisements now use, and which captures our attention by making us wonder whether we may not have missed something. The preeminent Lacanian illustration of the gaze as objet petit a is the anamorphotic skull at the foot of Holbein’s Ambassadors, which can only be seen by a subject who looks at it awry, or from an angle. Importantly, then, neither the voice nor the gaze as objet petit a attests to the subject’s sovereign ability to wholly objectify (and hence control) the world it surveys. In the auditory and visual fields (respectively), the voice and the gaze as objet petit a represent objects, like Kant’s sublime things, that the subject cannot wholly get its head around, as we say. The fact that they can only be seen or heard from particular perspectives indicates exactly how the subject’s biased perspective—and so his/her desire, what s/he wants—has an effect on what s/he is able to see. They thereby bear witness to how s/he is not wholly outside of the reality s/he sees. Perhaps the most mundane but telling example of this subjective objet petit a of Lacanian theory is someone in love, of whom we commonly say that they are able to see in their lover something special, an “X factor,” which others are utterly blind to. In the political field, similarly—and as we saw in part 2c—subjects of a particular political community will claim that others cannot understand their regime’s sublime objects. Indeed, as Žižek comments about the resurgence of racism across the first world today, it is often precisely the strangeness of others’ particular ethnic or national Things that animates subjects’ hatred towards them.

b. The Objet Petit a & the Virtuality of Reality

In Žižek’s theory, the objet petit a stands as the exact opposite of the object of the modern sciences, which can only be seen clearly and distinctly if it is approached wholly impersonally. If the objet petit a is not looked at from a particular, subjective perspective—or, in the words of one of Žižek’s titles, by “looking awry”—it cannot be seen at all. This is why Žižek believes this psychoanalytic notion can be used to structure our understanding of the sublime objects postulated by ideologies in the political field, which as we saw in 3c show themselves to be finally inconsistent when they are looked at dispassionately. What Žižek’s Lacanian critique of ideology aims to do is to demonstrate such inconsistencies, and thereby to show us that the objects most central to our political beliefs are Things whose very sublime appearance conceals from us our active agency in constructing and sustaining them. (We will return to this thought in 4d and 4e below.)

Žižek argues that the first place that the objet petit a appeared in the history of Western philosophy was with Kant’s notion of the transcendental object in The Critique of Pure Reason. Analyzing this Kantian notion allows us to elaborate more precisely the ontological status of the objet petit a. Kant defines the transcendental object as “the completely indeterminate thought of an object in general.” Like the objet petit a, then, Kant’s transcendental object is not a normal phenomenal object, although it has a very specific function in Kant’s epistemological conception of the subject. The avowedly anti-Humean function of this Kantian positing in the “Transcendental Deduction” is to ensure that the purely formal categories of the subject’s understanding can actually affect and indeed structure the manifold of the subject’s sensuous intuition. As Žižek stresses, that is, the transcendental object functions in Kant’s epistemology to guarantee that sense will continue to emerge for the subject, no matter what particular objects s/he might encounter.

We saw in 3c how Žižek argues that ideologies adduce ultimately inconsistent reasons to support the same goal of political unity. According to Žižek, as we can now elaborate, this is because the deepest political function of sublime objects of ideology is to ensure that the political world will make sense for subjects no matter what events transpire, in a way that he directly compares with Kant’s transcendental object. No matter what evidence someone might produce that not all Jewish people are acquisitive, capitalist, or cunning, for example, a true Nazi will be able to immediately resignify this evidence by reference to his ideological notion of “the Jew”: “surely it is part of their cunning to appear as though they are not truly cunning,” and so forth. Importantly, it follows for Žižek that political community is always, in its very structure, an anticipated community. Subjects’ sense of political belonging is always mediated, according to him, by their shared belief in their regime’s key words or master signifiers. But these are words whose only “meaning” lies finally in their function, which is to guarantee that there will (continue to) be meaning. There is, Žižek argues, ultimately no actual, Real Thing better than the other real things subjects encounter that these words name (2e). It is only by acting as if there were such a Thing that community is maintained. This is why Žižek specifies in The Indivisible Remainder that political identification can only be, “at its most basic, identification with the very gesture of identification”:

…the coordination [between subjects in a political community] concerns not the level of the signified [of some positive shared concern] but the level of the signifier. [In political ideologies], undecidability with regard to the signified (do others really intend the same as me?) converts into an exceptional signifier, the empty Master-Signifier, the signifier-without-signified. ‘Nation’, ‘Democracy’, ‘Socialism’ and other Causes stand for that ‘something’ about which we are never sure what, exactly, it is – the point is, rather, that identifying with the Nation we signal our acceptance of what others accept, with a Master-Signifier which serves as the rallying point for all the others. (Žižek, 1996: 142)

This is the sense also in which Žižek claims in Plague of Fantasies that today’s virtual reality is “not virtual enough.” It is not virtual enough because the many options it offers subjects to enjoy (jouis) are transgressive or exotic possibilities. VR leaves nothing to the imagination or, in Žižek’s Lacanian terms, to fantasy. Fantasy, as we saw in 2a, operates to structure subjects’ beliefs about the jouissance which must remain only the stuff of imagination, purely “virtual” for subjects of the social law. For Žižek, then, it is identification with this law, as mediated via subjects’ anticipatory identifications with what they suppose others believe, that involves true virtuality.

c. Forced Choice & Ideological Tautologies

As 4b confirms (and as we commented in 1c), Žižek’s political philosophy turns around the idea that the central words of political ideologies are at base “signifiers without signified,” words that only appear to refer to exceptional Things, and which thereby facilitate the identification between subjects. As Žižek argues, these sublime objects of ideology have exactly the ontological status of what Kant called “transcendental illusions”: illusions whose semblance conceals that there is nothing behind them to conceal. Ideological subjects do not know what they do when they believe in them, Žižek contends. Yet, through the presupposition that the Other(s) know (2c), and their participation in the practices involving inherent transgression of their political community (2c), they “identify with the very gesture of identification” (4b). Hence, their belief, coupled with these practices, is politically efficacious.

One of Žižek’s most difficult, but also deepest, claims is that the particular sublime objects of ideology with which subjects identify in different regimes (the Nation, the People, and so forth) each give particular form to a meta-law (law about all other laws) that binds any political community as such. This is the meta-law that says simply that subjects must obey all the other laws. In 2b above, we saw how Žižek holds that political ideologies must allow subjects the sense of subjective distance from their explicit directives. Žižek’s critical position is that this apparent freedom ideologies thereby allow subjects is finally a lure. Like the choice offered Yossarian by the “catch 22” of Joseph Heller’s novel, the only option truly available to political subjects is to continue to abide by the laws. No regime can survive if it waives this meta-law. The Sublime Object of Ideology hence cites with approval Kafka’s comment that it is not required that subjects think the law is just, only that it is necessary. Yet, despite Kafka, Žižek agrees with Plato that no regime can directly avow its own basis in such naked self-assertion without risking the loss of all legitimacy. This is why it must ground itself in ideological fantasies (3a) which at once sustain subjects’ sense of individual freedom (2c) and the sense that the regime itself is grounded extra-politically in the Real, and some transcendent, higher Good (2e).

This thought underlies the importance Žižek accords in For They Know Not What They Do to Hegel’s difficult notion of tautology as the highest instance of contradiction in The Science of Logic. If you push a subject hard enough about why they abide by the laws of their regime, Žižek holds that their responses will inevitably devolve into some logical variant of Exodus 3:14’s “I am that I am”: statements of the form “because the Law (God / the People / the Nation) is … the Law (God / the People / the Nation).” In such tautological statements, our expectation that the predicates in the second half of the sentence will add something new to the (logical) subject given at its beginning is “contradicted,” Hegel argues. There is indeed something even sinister when someone utters such a sentence in response to our enquiries, Žižek notes, as if, when (for example) “the Law” is repeated dumbly as its own predicate (“because the law is the law”), it intimates the uncanny dimension of jouissance the law as ego ideal usually proscribes (3a). What this uncanny effect of sense attests to, Žižek argues in For They Know Not What They Do, is the usually “primordially repressed” force of the universal meta-law (that everyone must obey the laws) being expressed in the different, particular languages of political regimes: “because the People are the People,” “because the Nation is the Nation,” and so forth.

Žižek’s ideology critique hence contends that all political regimes’ ideologies always devolve finally around a set of such tautological propositions concerning their particular sublime objects. In The Sublime Object of Ideology, Žižek gives the example of a key Stalinist proposition: “the people always support the party.” On its surface, this proposition looks like a proposition that asserts something about the world, and which might be susceptible of disproof: perhaps there are some Soviet citizens who do not support the party, or who disagree with this or that of the party’s policies. What such an approach misses, however, is how in this ideology, what is referred to as “the people” in fact means “all those who support the party.” In Stalinism, that is, “the party” is the fetishized particular that stands for the people’s true interests (see 1e). Hence, the sentence “the people always support the party” is a concealed form of tautology. Any apparent people who in fact do not support the party by that fact alone are no longer “people” within Stalinist ideology.

d. The Substance is Subject, the Other Does Not Exist

In 4b, we saw how Žižek argues that political identification is identification with the gesture of identification. In 4c, we saw how the ultimate foundation of a regime’s laws is a tautologous assertion of the bare political fact that there is law. What unites these two positions is the idea that the sublime objects of a political regime and the ideological fantasies that give narratives about their content conceal from subjects the absence of any final ground for Law beyond the fact of its own assertion, and the fact that subjects take it to be authoritative. Here as elsewhere, Žižek’s work surprisingly approaches leading motifs in the political philosophy of Carl Schmitt.

Importantly, once this position is stated, we can also begin to see how Žižek’s post-Marxist project of a critique of ideology intersects with his philosophical defense of the Cartesian subject. At several points in his oeuvre, Žižek cites Hegel’s statement in the “Introduction” to the Phenomenology of Spirit that “the substance is subject” as a rubric that describes the core of his own political philosophy. According to Žižek, critics have misread this statement by taking it to repeat the founding, triumphalist idea of modern subjectivity as such, namely that the subject can master all of nature or “substance.” Žižek contends, controversially, that Hegel’s claim ought to be read in a directly opposing sense. For him, it indicates the truth that there can be no dominant political regime or, in Hegel’s terms, no “social substance” that does not depend for its authority upon the active, indeed finally anticipatory (4c) investment of subjects in it. Like the malign computer machines in The Matrix that literally run off the human jouissance they drain from deluded subjects, for Žižek the big Other of any political regime does not exist as a self-sustaining substance. It must ceaselessly run on the belief and actions of its subjects, and their jouissance (2c). Or, to recur to the example we looked at in 2d, the King will not be the King, for Žižek, unless he has his subjects. It is certainly telling that the leading examples of ideological tautology that For They Know Not What They Do discusses invoke precisely some subject’s will or decision, as when a parent says to a child “do this … because I said so,” or when people do something “… because the King said so,” which means that no more questions can be asked.

In 4a, we saw how Žižek denies that the subject, because it is not itself a perceptible object, belongs to an order of being wholly outside of the order of experience. To elevate such a wholly Other order would, he argues, reproduce the elementary operation of the fundamental fantasy. We can now add to this thought the further position that the Cartesian subject is, according to Žižek, finally nothing other than the irreducible point of active agency responsible for the always minimally precipitous political gesture of laying down a regime’s law. For Žižek, accordingly, the critical question to be asked of any theoretical or political position that posits some exceptional Beyond, as we saw in his reading of Kant (2e), is: from which subject-position do you speak when you claim a knowledge of this Beyond? As we saw in 2e, Žižek’s Lacanian answer is that the perspective that one always presupposes when one speaks in this manner is one that is always “superegoic” (see 2a), tied to what he terms in Metastases of Enjoyment a “malevolently neutral” God’s eye view from nowhere. It is deeply revealing, from Žižek’s perspective, that the very perspective which allows the Kantian subject in the “dynamic sublime” to resignify its own finitude as itself a source of pleasure-in-pain (jouissance) is precisely one which identifies with the supersensible moral Law, before which the sensuous subject remains irredeemably guilty, infinitely striving to pay off its moral debt. As Žižek cites Hegel’s Phenomenology of Spirit:

It is manifest that beyond the so-called curtain [of phenomena] which is supposed to conceal the inner world, there is nothing to be seen unless we go behind it ourselves as much in order that we may see, as that there may be something behind there which can be seen. (Žižek, 1989: 196, emphasis added)

In other words, Žižek’s final position about the sublime objects of political regimes’ ideologies is that these belief-inspiring objects are so many ways in which the subject misrecognizes its own active capacity to challenge existing laws, and to found new laws altogether. Žižek repeatedly argues that the most uncanny or abyssal Thing in the world is the subject’s own active subjectivity, which is why he also repeatedly cites the Eastern saying “Thou art that.” It is finally the singularity of the subject’s own active agency that subjects misperceive in fantasies concerning the sublime objects of their regimes’ ideologies, in the face of which they can do nothing but reverentially abide by the rules. In this way, it is worth noting, Žižek’s work can claim a heritage not only from Hegel, but also from the Left Hegelians, and from Marx’s and Feuerbach’s critiques of religion.

e. The Ethical Act Traversing the Fantasy

Žižek’s technical term for the process whereby we can come to recognize how the sublime objects of our political regimes’ ideologies are, like Marx’s commodities, fetish objects that conceal from subjects their own political agency is “traversing of the fantasy.” Traversing the fantasy, for Žižek, is at once the political subject’s deepest form of self-recognition, and the basis for his own radical political position or defense of the possibility of such positions. Žižek’s entire theoretical work directs us towards this “traversing of the fantasy” in the many different fields on which he has written, and despite the widespread consensus at the beginning of the new century that fundamental political change is no longer possible or desirable.

Insofar as political ideologies for Žižek, as for Althusser (see 2c), remain viable only because of the ongoing practices and beliefs of political subjects, this traversal of fantasy must always involve an active, practical intervention in the political world, which changes a regime’s political institutions. As for Kant, so for Žižek, the practical bearing of critical reason comes first, in his critique of ideology, and last, in his advocacy of the possibility of political change. Žižek hence also repeatedly speaks of traversing the fantasy in terms of an “Act” (capital “A”), which differs from normal human speech and action. Everyday speech and action typically do not challenge the framing sociopolitical parameters within which they take place, Žižek observes. By contrast, what he means by an Act is an action which “touches the Real” (as he says) of what a sociopolitical regime has politically repressed or wiped its hands of, and which it cannot publicly avow without risking fundamental political damage (see 2c). In this way, the Žižekian Act extends and changes the very political and ideological parameters of what is permitted within a regime, in the hope of bringing into being new parameters in the light of which its own justice will be able to be retrospectively seen. This is a point of significant parallel with the work of Alain Badiou, whose influence Žižek has increasingly avowed in his more recent books. Notably, as Žižek specifies in The Indivisible Remainder, the Act effectively repeats the very act that he claims founds all political regimes as such, namely, the excessive, law-founding gesture we examined in 4c. Just as the current political regime originated in a founding gesture excessive with regard to the laws it set in place, Žižek argues, so too can this political regime itself be superseded, and a new one replace it. In his reading of Walter Benjamin’s “Theses on the Philosophy of History” in The Sublime Object of Ideology, Žižek indeed argues that such a new Act also effectively repeats all previous, failed attempts at changing an existing political regime, which otherwise would be consigned forever to historical oblivion.

5. Conclusion

Slavoj Žižek’s work represents a striking challenge within the contemporary philosophical scene. Žižek’s very style, and his prodigious ability to write and examine examples from widely divergent fields, is a remarkable thing. His work reintroduces and reinvigorates for a wider audience ideas from the works of German Idealism. Žižek’s work is framed in terms of a polemical critique of other leading theorists within today’s new left or liberal academy (Derrida, Habermas, Deleuze), which claims to unmask their apparent radicality as concealing a shared recoil from the possibility of a subjective, political Act which in fact sits comfortably with a passive resignation to today’s political status quo. Not the least interesting feature of his work, politically, is indeed how Žižek’s critique of the new left significantly mirrors criticisms from conservative and neoconservative authors, yet hails from an avowedly opposed political perspective. In political philosophy, Žižek’s Lacanian theory of ideology presents a radically new descriptive perspective that affords us a unique purchase on many of the paradoxes of liberal consumerist subjectivity, which is at once politically cynical (as the political right laments) and politically conformist (as the political left struggles to come to terms with). Prescriptively, Žižek’s work challenges us to ask questions about the possibility of sociopolitical change that have otherwise rarely been asked after 1989, including what forms such changes might take, and what might justify them or make them possible.

Looked at in a longer perspective, it is of course too soon to judge what the lasting effects of Žižek’s philosophy will be, especially given Žižek’s own comparative youth as a thinker (Žižek was born in 1949). In terms of the history of ideas, in particular, while Žižek’s thought certainly turns on their heads many of today’s widely accepted theoretical notions, it is surely an open question whether his work represents any lasting break with the parameters that Kant’s critical philosophy set out in the three Critiques.

6. References and Further Reading

a. Primary Literature (Books by Žižek)

  • Iraq: The Borrowed Kettle, New York: Verso, 2004.
  • Organs Without Bodies: On Deleuze and Consequences, New York, London: Routledge, 2003.
  • The Puppet and the Dwarf, New York: Routledge, 2003.
  • Did Somebody Say Totalitarianism? Five Essays on the (Mis)Use of a Notion, London; New York: Verso, 2001.
  • The Fright of Real Tears, Kieslowski and The Future, Bloomington: Indiana University Press, 2001.
  • On Belief, London: Routledge, 2001.
  • The Fragile Absolute or Why the Christian Legacy is Worth Fighting For, London; New York: Verso, 2000.
  • The Art of the Ridiculous Sublime, On David Lynch’s Lost Highway, Walter Chapin Center for the Humanities: University of Washington, 2000.
  • Contingency, Hegemony, Universality: Contemporary Dialogues on the Left, Judith Butler, Ernesto Laclau and SZ. London; New York: Verso, 2000.
  • Enjoy Your Symptom! Jacques Lacan in Hollywood and Out, second expanded edition, New York: Routledge, 2000.
  • The Ticklish Subject: The Absent Centre of Political Ontology, London; New York: Verso, 1999.
  • The Abyss of Freedom / Ages of the World, with F.W.J. von Schelling, Ann Arbor: University of Michigan Press, 1997.
  • The Plague of Fantasies, London; New York: Verso, 1997.
  • Gaze And Voice As Love Objects, Renata Salecl and SZ editors. Durham: Duke University Press, 1996.
  • The Indivisible Remainder: An Essay On Schelling And Related Matters, London; New York: Verso, 1996.
  • The Metastases Of Enjoyment: Six Essays On Woman And Causality (Wo Es War), London; New York: Verso, 1994.
  • Mapping Ideology, SZ editor. London; New York: Verso, 1994.
  • Tarrying With The Negative: Kant, Hegel And The Critique Of Ideology, Durham: Duke University Press, 1993.
  • Enjoy Your Symptom! Jacques Lacan In Hollywood And Out, London; New York: Routledge, 1992.
  • Everything You Always Wanted to Know about Lacan (But Were Afraid To Ask Hitchcock), SZ editor. London; New York: Verso, 1992.
  • Looking Awry: an Introduction to Jacques Lacan through Popular Culture, Cambridge, Mass.: MIT Press, 1991.
  • For They Know Not What They Do: Enjoyment As A Political Factor, London; New York: Verso, 1991.
  • The Sublime Object of Ideology, London; New York: Verso, 1989.

b. Secondary Literature (Texts on Žižek)

  • Slavoj Žižek: A Little Piece of the Real, Matthew Sharpe, Hants: Ashgate, 2004.
  • Slavoj Žižek: A Critical Introduction, Ian Parker, London: Pluto Press, 2004.
  • Slavoj Žižek: Live Theory, Rex Butler, London: Continuum, 2004.
  • Žižek: A Critical Introduction, Sarah Kay, London: Polity, 2003.
  • Slavoj Žižek (Routledge Critical Thinkers), Tony Myers, London: Routledge, 2003.

 

Author Information

Matthew Sharpe
Email: matthew.sharpe@dewr.gov.au

Australia

Karl Popper: Philosophy of Science

Karl Popper (1902-1994) was one of the most influential philosophers of science of the 20th century. He made significant contributions to debates concerning general scientific methodology and theory choice, the demarcation of science from non-science, the nature of probability and quantum mechanics, and the methodology of the social sciences. His work is notable for its wide influence within the philosophy of science, within science itself, and within a broader social context.

Popper’s early work attempts to solve the problem of demarcation and offer a clear criterion that distinguishes scientific theories from metaphysical or mythological claims. Popper’s falsificationist methodology holds that scientific theories are characterized by entailing predictions that future observations might reveal to be false. When theories are falsified by such observations, scientists can respond by revising the theory, by rejecting the theory in favor of a rival, or by maintaining the theory as is and changing an auxiliary hypothesis. In any case, however, this process must aim at the production of new, falsifiable predictions. While Popper recognizes that scientists can and do hold onto theories in the face of failed predictions when there are no predictively superior rivals to turn to, he holds that scientific practice is characterized by its continual effort to test theories against experience and make revisions based on the outcomes of these tests. By contrast, theories that are permanently immunized from falsification by the introduction of untestable ad hoc hypotheses can no longer be classified as scientific. Among other things, Popper argues that his falsificationist proposal allows for a solution of the problem of induction, since inductive reasoning plays no role in his account of theory choice.

Along with his general proposals regarding falsification and scientific methodology, Popper is notable for his work on probability and quantum mechanics and on the methodology of the social sciences. Popper defends a propensity theory of probability, according to which probabilities are interpreted as objective, mind-independent properties of experimental setups. Popper then uses this theory to provide a realist interpretation of quantum mechanics, though its applicability goes beyond this specific case. With respect to the social sciences, Popper argued against the historicist attempt to formulate universal laws covering the whole of human history and instead argued in favor of methodological individualism and situational logic.

Table of Contents

  1. Background
  2. Falsification and the Criterion of Demarcation
    1. Popper on Physics and Psychoanalysis
    2. Auxiliary and Ad Hoc Hypotheses
    3. Basic Sentences and the Role of Convention
    4. Induction, Corroboration, and Verisimilitude
  3. Criticisms of Falsificationism
  4. Realism, Quantum Mechanics, and Probability
  5. Methodology in the Social Sciences
  6. Popper’s Legacy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Background

Popper began his academic studies at the University of Vienna in 1918, and he focused on both mathematics and theoretical physics. In 1928, he received a PhD in Philosophy. His dissertation, On the Problem of Method in the Psychology of Thinking, dealt primarily with the psychology of thought and discovery. Popper later reported that it was while writing this dissertation that he came to recognize “the priority of the study of logic over the study of subjective thought processes” (1976, p. 86), a sentiment that would be a primary focus in his more mature work in the philosophy of science.

In 1935, Popper published Logik der Forschung (The Logic of Research), his first major work in the philosophy of science. Popper later translated the book into English and published it under the title The Logic of Scientific Discovery (1959). In the book, Popper offered his first detailed account of scientific methodology and of the importance of falsification. Many of the arguments in this book, as well as throughout his early work, are directed against members of the so-called “Vienna Circle,” such as Moritz Schlick, Otto Neurath, Rudolf Carnap, Hans Reichenbach, Carl Hempel, and Herbert Feigl, among others. Popper shared these thinkers’ concern with general issues of scientific methodology, and he sympathized with their distrust of traditional philosophical methodology. His proposed solutions to the problems arising from these concerns, however, were significantly different from those favored by the Vienna Circle.

Popper stayed in Vienna until 1937, when he took a teaching position at Canterbury University College in Christchurch, New Zealand, and he stayed there throughout World War II. His major works on the philosophy of science from this period include the articles that would eventually make up The Poverty of Historicism (1957). In these articles, he offered a highly critical analysis of the methodology of the social sciences, in particular, of attempts by social scientists to formulate predictive, explanatory laws.

In 1946, Popper took a teaching position at the London School of Economics, where he stayed until he retired in 1969. While there, he continued to work on a variety of issues relating to the philosophy of science, including quantum mechanics, entropy, evolution, and the realism vs. anti-realism debate, along with the issues already mentioned. His major works from this period include “The Propensity Interpretation of Probability” (1959) and Conjectures and Refutations (1963). He continued to publish until shortly before his death in 1994. In The Philosophy of Karl Popper (1974), Popper offers responses to many of his most important critics and provides clarifications of his mature views. His intellectual autobiography Unended Quest (1976) gives a detailed account of Popper’s evolving views, especially as they relate to the philosophy of science.

2. Falsification and the Criterion of Demarcation

Much of Popper’s early work in the philosophy of science focuses on what he calls the problem of demarcation, or the problem of distinguishing scientific (or empirical) theories from non-scientific theories. In particular, Popper aims to capture the logical or methodological differences between scientific disciplines, such as physics, and non-scientific disciplines, such as myth-making, philosophical metaphysics, Freudian psychoanalysis, and Marxist social criticism.

Popper’s proposals concerning demarcation can be usefully seen as a response to the verifiability criterion of demarcation proposed by logical empiricists, such as Carnap and Schlick. According to this criterion, a statement is cognitively meaningful if and only if it is, in principle, possible to verify. This criterion is intended to, among other things, capture the idea that the claims of empirical science are meaningful in a way that the claims of traditional philosophical metaphysics are not. For example, this criterion entails that claims about the locations of mid-sized objects are meaningful, since one can, in principle, verify them by going to the appropriate location. By contrast, claims about the fundamental nature of causation are not meaningful.

While Popper shares the belief that there is a qualitative difference between science and philosophical metaphysics, he rejects the verifiability criterion for several reasons. First, it counts existential statements (like “unicorns exist”) as scientific, even though there is no way of definitively showing that they are false. After all, the mere fact that one has failed to see a unicorn in a particular place does not establish that unicorns could not be observed in some other place. Second, it inappropriately counts universal statements (like “all swans are white”) as meaningless simply because they can never be conclusively verified. These sorts of universal claims, though, are common within science, and certain observations (like the observation of a black swan) can clearly show them to be false. Finally, the verifiability criterion is by its own lights not meaningful, since it cannot be verified.
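
The logical asymmetry behind the first two objections can be written out schematically. The following is only an illustrative sketch in first-order notation, not Popper's own formalism: the predicate letters S (swan), W (white), and U (unicorn) and the individual constants a, a_1, ..., a_n are labels introduced here, and the domain is assumed to be open-ended, so that no finite list of observations exhausts it.

```latex
% Requires amsmath and amssymb (for \nvdash). All symbols are illustrative.
\begin{align*}
H_1 &:\ \forall x\,(Sx \rightarrow Wx)
  & Sa \wedge \neg Wa &\vdash \neg H_1
  & \textstyle\bigwedge_{i=1}^{n} (Sa_i \wedge Wa_i) &\nvdash H_1 \\
H_2 &:\ \exists x\,Ux
  & Ua &\vdash H_2
  & \textstyle\bigwedge_{i=1}^{n} \neg Ua_i &\nvdash \neg H_2
\end{align*}
```

On this reading, a single black swan refutes the universal hypothesis H_1, while no finite run of white swans proves it. Conversely, a single unicorn would verify the existential hypothesis H_2, while no finite run of unicorn-free observations refutes it.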

Partially in response to worries such as these, the logical empiricists’ later work abandons the verifiability criterion of meaning and instead emphasizes the importance of the empirical confirmation of scientific theories. Popper, however, argues that verification and confirmation play no role in formulating a satisfactory criterion of demarcation. Instead, Popper proposes that scientific theories are characterized by being bold in two related ways. First, scientific theories regularly disagree with accepted views of the world based on common sense or previous theoretical commitments. To an uneducated observer, for example, it may seem obvious that Earth is stationary, while the sun moves rapidly around it. However, Copernicus posited that Earth in fact revolves around the sun. In a similar way, it does not seem as though a tree and a human share a common ancestor, but this is what Darwin’s theory of evolution by natural selection claims. As Popper notes, however, this sort of boldness is not unique to scientific theories, since most mythological and metaphysical theories also make bold, counterintuitive claims about the nature of reality. For example, the accounts of world creation provided by various religions would count as bold in this sense, but this does not mean that they thereby count as scientific theories.

With this in mind, he goes on to argue that scientific theories are distinguished from non-scientific theories by a second sort of boldness: they make testable claims that future observations might reveal to be false. This boldness thus amounts to a willingness to take a risk of being wrong. On Popper’s view, scientists investigating a theory make repeated, honest attempts to falsify the theory, whereas adherents of pseudoscientific or metaphysical theories routinely take measures to make the observed reality fit the predictions of the theory. Popper describes his proposal as follows:

Thus my proposal was, and is, that it is this second boldness, together with the readiness to look for tests and refutations, which distinguished “empirical” science from non-science, and especially from pre-scientific myths and metaphysics (1974, pp. 980-981)

In other places, Popper calls attention to the fact that scientific theories are characterized by possessing potential falsifiers—that is, that they make claims about the world that might be discovered to be false. If these claims are, in fact, found to be false, then the theory as a whole is said to be falsified. Non-scientific theories, by contrast, do not have any such potential falsifiers—there is literally no possible observation that could serve to falsify these theories.
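
The notion of a potential falsifier can be illustrated with a deliberately crude toy model. The sketch below is not Popper's formalism: it assumes that a theory can be represented as a function saying whether it permits a given observation report, and that the conceivable reports form a small finite list. The names `potential_falsifiers`, `all_swans_are_white`, and `whatever_happens_was_fated` are hypothetical labels chosen for the illustration.

```python
from typing import Callable, List, Tuple

# An observation report is modelled as a (kind, colour) pair. This is an
# illustrative simplification, not anything found in Popper's texts.
Observation = Tuple[str, str]
Theory = Callable[[Observation], bool]  # True if the theory permits the report


def potential_falsifiers(theory: Theory,
                         conceivable: List[Observation]) -> List[Observation]:
    """Return the conceivable observation reports that the theory forbids."""
    return [obs for obs in conceivable if not theory(obs)]


def all_swans_are_white(obs: Observation) -> bool:
    """A universal claim: it forbids any swan that is not white."""
    kind, colour = obs
    return kind != "swan" or colour == "white"


def whatever_happens_was_fated(obs: Observation) -> bool:
    """A claim compatible with every report: it forbids nothing."""
    return True


CONCEIVABLE: List[Observation] = [
    ("swan", "white"),
    ("swan", "black"),
    ("raven", "black"),
]

swan_falsifiers = potential_falsifiers(all_swans_are_white, CONCEIVABLE)
fate_falsifiers = potential_falsifiers(whatever_happens_was_fated, CONCEIVABLE)
print(swan_falsifiers)
print(fate_falsifiers)
```

In this toy setting the swan hypothesis rules out exactly one conceivable report, the black swan, whereas the second claim rules out nothing, and so has no potential falsifiers at all.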

Popper’s falsificationist proposal differs from the verifiability criterion in several important ways. First, Popper does not hold that non-scientific claims are meaningless. Instead, he argues that such unfalsifiable claims can often serve important roles in both scientific and philosophical contexts, even if we are incapable of ascertaining their truth or falsity. Second, while Popper is a realist who holds that scientific theories aim at the truth (see Section 4), he does not think that empirical evidence can ever provide us grounds for believing that a theory is either true or likely to be true. In this sense, Popper is a fallibilist who holds that while the particular unfalsified theory we have adopted might be true, we could never know this to be the case. For these same reasons, Popper holds that it is impossible to provide justification for one’s belief that a particular scientific theory is true. Finally, where others see science progressing by confirming the truth of various particular claims, Popper describes science as progressing on an evolutionary model, with observations selecting against unfit theories by falsifying them.

a. Popper on Physics and Psychoanalysis

In order to see how falsificationism works in practice, it will help to consider one of Popper’s most memorable examples: the contrast between Einstein’s theory of general relativity and the theories of psychoanalysis defended by Sigmund Freud and Alfred Adler. We might roughly summarize the theories as follows:

General relativity (GR): Einstein’s theory of special relativity posits that the observed speed of light in a vacuum will be the same for all observers, regardless of which direction or at what velocity these observers are themselves moving. GR allows this theory to be applied to cases where acceleration or gravity plays a role, specifically by treating gravity as a sort of distortion or bend in space-time created by massive objects.

Psychoanalysis: The theory of psychoanalysis holds that human behavior is driven at least in part by unconscious desires and motives. For example, Freud posited the existence of the id, an unconscious part of the human psyche that aims toward gratifying instinctive desires, regardless of whether this is rational. However, the desires of the id might be mediated or superseded in certain circumstances by its interaction with both the self-interested ego and the moral superego.

As we can see, both theories make bold, counter-intuitive claims about the fundamental nature of reality. Moreover, both theories can account for previously observed phenomena; for example, GR allows for an accurate description of the observed perihelion of Mercury, while psychoanalysis entails that it is possible for people to consistently act in ways that are against their own long-term best interest. Finally, both of these theories enjoyed significant support among their academic peers when Popper was first writing about these issues.

Popper argues, however, that GR is scientific while psychoanalysis is not. The reason for this has to do with the testability of Einstein’s theory. As a young man, Popper was especially impressed by Arthur Eddington’s 1919 test of GR, which involved observing during a solar eclipse the degree to which the light from distant stars was shifted when passing by the sun. Importantly, the predictions of GR regarding the magnitude of this shift disagreed with those of the then-dominant theory of Newtonian mechanics. Eddington’s observation thus served as a crucial experiment for deciding between the theories, since it was impossible for both theories to give accurate predictions. Of necessity, at least one theory would be falsified by the experiment, which would provide strong reason for scientists to accept its unfalsified rival. On Popper’s view, the continual effort by scientists to design and carry out these sorts of potentially falsifying experiments played a central role in theory choice and clearly distinguished scientific theorizing from other sorts of activities. Popper also takes care to note that insofar as GR was not a unified field theory, there was no question of GR’s being the complete truth, as Einstein himself repeatedly emphasized. The scientific status of GR, then, had nothing to do with either (1) the truth of GR as a general theory of physics (the theory was already known to be false) or (2) the confirmation of GR by evidence (one cannot confirm a false theory).

In contrast to such paradigmatically scientific theories as GR, Popper argues that non-scientific theories such as Freudian psychoanalysis do not make any predictions that might allow them to be falsified. The reason for this is that these theories are compatible with every possible observation. On Popper’s view, psychoanalysis simply does not provide us with adequate details to rule out any possible human behavior. Absent these sorts of precise predictions, the theory can be made to fit with, and to provide a purported explanation of, any observed behavior whatsoever.

To illustrate this point, Popper offers the example of two men, one who pushes a child into the water with the intent of drowning it, and another who dives into the water in order to save the child. Popper notes that psychoanalysis can explain both of these seemingly contradictory actions. In the first case, the psychoanalyst can claim that the action was driven by a repressed component of the (unconscious) id and in the second case, that the action resulted from a successful sublimation of this exact same sort of desire by the ego and superego. The point generalizes: regardless of how a person actually behaves, psychoanalysis can be used to explain the behavior. This, in turn, prevents us from formulating any crucial experiments that might serve to falsify psychoanalysis. Popper writes:

The point is very clear. Neither Freud nor Adler excludes any particular person’s acting in any particular way, whatever the outward circumstances. Whether a man sacrificed his life to rescue a drowning child (a case of sublimation) or whether he murdered the child by drowning (a case of repression) could not possibly be predicted or excluded by Freud’s theory (1974, p. 985).

Popper allows that there are often legitimate purposes for positing non-scientific theories, and he argues that theories which start out as non-scientific can later become scientific, as we determine methods for generating and testing specific predictions based on these theories. Popper offers the example of Copernicus’s theory of a sun-centered universe, which initially yielded no potentially falsifying predictions, and so would not have counted as scientific by Popper’s criteria. However, later astronomers determined ways of testing Copernicus’s hypothesis, thus rendering it scientific. For Popper, then, the demarcation between scientific and non-scientific theories is not grounded on the nature of entities posited by theories, by the truth or usefulness of theories, or even by the degree to which we are justified in believing in such theories. Instead, falsification provides a methodological distinction based on the unique role that observation and evidence play in scientific practice.

b. Auxiliary and Ad Hoc Hypotheses

While Popper consistently defends a falsification-based solution to the problem of demarcation throughout his published work, his own explications of it include a number of qualifications to ensure a better fit with the realities of scientific practice. It is in this context that Popper introduces several of his more notable contributions to the philosophy of science, including auxiliary versus ad hoc hypotheses, basic sentences, and degrees of verisimilitude.

One immediate objection to the simple proposal regarding falsification sketched in the previous section is based on the Duhem-Quine thesis, according to which it is in many cases impossible to test scientific theories in isolation. For example, suppose that a group of investigators uses GR to deduce a prediction about the perihelion of Mercury, but then discovers that this prediction disagrees with their measurements. This failure might lead them to conclude that GR is false; however, the failure of the prediction might also plausibly be blamed on the falsity of some other proposition that the scientists relied on to deduce the apparently falsifying prediction. There are generally a large number of such propositions, concerning everything from the absence of human error to the accuracy of the scientific theories underlying the construction and application of the measuring equipment.
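The logical structure of this objection can be set out schematically. In the sketch below, T stands for the theory under test, A_1, ..., A_n for the other propositions relied on in the deduction, and O for the predicted observation; these symbols are introduced here purely for illustration and are not Popper’s own notation.

```latex
\begin{align*}
&(T \land A_1 \land \dots \land A_n) \rightarrow O
  && \text{the prediction is deduced from theory plus auxiliaries} \\
&\neg O
  && \text{the prediction fails} \\
&\neg (T \land A_1 \land \dots \land A_n)
  && \text{by modus tollens} \\
&\neg T \lor \neg A_1 \lor \dots \lor \neg A_n
  && \text{by De Morgan's law}
\end{align*}
```

The final line is a disjunction: logic alone tells the investigators that at least one premise is false, but not which one.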

Popper recognizes that scientists routinely attribute the failure of experiments to factors such as this, and further grants that there is in many cases nothing objectionable about their doing so. On Popper’s view, the distinctive mark of scientific inquiry concerns the investigators’ responses to failed predictions in cases where they do not abandon the falsified theory altogether. In particular, Popper argues that a scientific theory can be legitimately saved from falsification by the introduction of an auxiliary hypothesis that allows for the generation of new, falsifiable predictions. Popper offers an example taken from the early 19th century, when astronomers noticed that the orbit of Uranus deviated significantly from what Newtonian mechanics seemed to predict. In this case, the scientists did not treat Newton’s laws as being falsified by such an observation. Instead, they considered the auxiliary hypothesis that there existed an additional and so far unobserved planet that was influencing the orbit of Uranus. They then used this auxiliary hypothesis, together with equations of Newtonian mechanics, to predict where this planet must be located. Their predictions turned out to be successful, and Neptune was discovered in 1846.

Popper contrasts this legitimate, scientific method of theory revision with the illegitimate, non-scientific use of ad hoc hypotheses to rescue theories from falsification. Here, an ad hoc hypothesis is one that does not allow for the generation of new, falsifiable predictions. Popper gives the example of Marxism, which he argues had originally made definite predictions about the evolution of society: the capitalist, free-market system would self-destruct and be replaced by joint ownership of the means of production, and this would happen first in the most highly developed economies. By the time Popper was writing in the mid-20th century, however, it seemed clear to him that these predictions were false: free market economies had not self-destructed, and the first communist revolutions happened in relatively undeveloped economies. The proponents of Marxism, however, neither abandoned the theory as falsified nor introduced any new, falsifiable auxiliary hypotheses that might account for the failed predictions. Instead, they adopted ad hoc hypotheses that immunized Marxism against any potentially falsifying observations whatsoever. For example, the continued persistence of capitalism might be blamed on the action of counter-revolutionaries but without providing an account of which specific actions these were, or what specific new predictions about society we should expect instead. Popper concludes that, while Marxism had originally been a scientific theory:

It broke the methodological rule that we must accept falsification, and it immunized itself against the most blatant refutations of its predictions. Ever since then, it can be described only as non-science—as a metaphysical dream, if you like, married to a cruel reality (1974, p. 985).

c. Basic Sentences and the Role of Convention

A second complication for the simple theory of falsification just described concerns the character of the observations that count as potential falsifiers of a theory. The problem here is that decisions about whether to accept an apparently falsifying observation are not always straightforward. For example, there is always the possibility that a given observation is not an accurate representation of the phenomenon but instead reflects theoretical bias or measurement error on the part of the observer(s). Examples of this sort of phenomenon are widespread and occur in a variety of contexts: students getting the “wrong” results on lab tests, a small group of researchers reporting results that disagree with those obtained by the larger research community, and so on.

In any specific case in which bias or error is suspected, Popper notes that researchers might introduce a falsifiable, auxiliary hypothesis allowing us to test this. And in many cases, this is just what they do: students redo the test until they get the expected results, or other research groups attempt to replicate the anomalous result obtained. Popper argues that this technique cannot solve the problem in general, however, since any auxiliary hypotheses researchers introduce and test will themselves be open to dispute in just the same way, and so on ad infinitum. If science is to proceed at all then, there must be some point at which the process of attempted falsification stops.

In order to resolve this apparently vicious regress, Popper introduces the idea of a basic statement, which is an empirical claim that can be used to both determine whether a given theory is falsifiable and thus scientific and, where appropriate, to corroborate falsifying hypotheses. According to Popper, basic statements are “statements asserting that an observable event is occurring in a certain individual region of space and time” (1959, p. 85). More specifically, basic statements must be both singular and existential (the formal requirement) and be testable by intersubjective observation (the material requirement). On Popper’s view, “there is a raven in space-time region k” would count as a basic statement, since it makes a claim about an individual raven whose existence, or lack thereof, could be determined by appropriately located observers. By contrast, the negative existential claim “there are no ravens in space-time region k” does not do this, and thus fails to qualify as a basic statement.

In order to avoid the infinite regress alluded to earlier, where basic statements themselves must be tested in order to justify their status as potential falsifiers, Popper appeals to the role played by convention and what he calls the “relativity of basic statements.” He writes as follows:

Every test of a theory, whether resulting in its corroboration or falsification, must stop at some basic statement or other which we decide to accept. If we do not come to any decision, and do not accept some basic statement or other, then the test will have led nowhere… This procedure has no natural end. Thus if the test is to lead us anywhere, nothing remains but to stop at some point or other and say that we are satisfied, for the time being. (1959, p. 86)

From this, Popper concludes that a given statement’s counting as a basic statement requires the consensus of the relevant scientific community—if the community decides to accept it, it will count as a basic statement; if the community does not accept it as basic, then an effort must be made to test the statement by using it together with other statements to deduce a statement that the relevant community will accept as basic. Finally, if the scientific community cannot reach a consensus on what would count as a falsifier for the disputed statement, the statement itself, despite initial appearances, may not actually be empirical or scientific in the relevant sense.

d. Induction, Corroboration, and Verisimilitude

Falsification also plays a key role in Popper’s proposed solution to David Hume’s infamous problem of induction. On Popper’s interpretation, Hume’s problem involves the impossibility of justifying belief in general laws based on evidence that concerns only particular instances. Popper agrees with Hume that inductive reasoning in this sense could not be justified, and he thus rejects the idea that empirical evidence regarding particular individuals, such as successful predictions, is in any way relevant to confirming the truth of general scientific laws or theories. This places Popper’s view in explicit contrast to logical empiricists such as Carnap and Hempel, who had developed extensive, mathematical systems of inductive logic intended to explicate the degree of confirmation of scientific theories by empirical evidence.

Popper argues that there are in fact two closely related problems of induction: the logical problem of induction and the psychological problem of induction. The first problem concerns the possibility of justifying belief in the truth or falsity of general laws based on empirical evidence that concerns only specific individuals. Popper holds that Hume’s argument concerning this problem “establishes for good that all our universal laws or theories remain forever guesses, conjectures, [and] hypotheses” (1974, p. 1019). However, Popper claims that while a successful prediction is irrelevant to confirming a law, a failed prediction can immediately falsify it. On Popper’s view, then, observing 1,000 white swans does nothing to increase our confidence that the hypothesis “all swans are white” is true; however, the observation of a single black swan can, subject to the caveats mentioned in previous sections, falsify this same hypothesis.
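The asymmetry between confirmation and falsification in the swan example can be illustrated with a minimal Python sketch. The function names and the representation of observations as colour strings are assumptions made for illustration; the sketch also sets aside the caveats about auxiliary hypotheses and basic statements discussed in previous sections.

```python
from typing import Callable, Iterable, Optional


def first_counterexample(
    hypothesis: Callable[[str], bool],
    observations: Iterable[str],
) -> Optional[str]:
    """Return the first observation that contradicts the hypothesis, or None."""
    for observation in observations:
        if not hypothesis(observation):
            return observation
    return None


def all_swans_are_white(swan_colour: str) -> bool:
    """The universal hypothesis, applied to a single observed swan."""
    return swan_colour == "white"


# 1,000 white swans: the hypothesis survives, but nothing here proves it true.
white_only = ["white"] * 1000
unfalsified = first_counterexample(all_swans_are_white, white_only) is None

# A single black swan is enough to falsify the same hypothesis.
with_black = white_only + ["black"]
falsifier = first_counterexample(all_swans_are_white, with_black)

print("Unfalsified after 1,000 white swans:", unfalsified)
print("Falsifying observation:", falsifier)
```

However long the list of white swans grows, the check can only ever report that no counterexample has turned up so far, which is the sense in which the hypothesis remains a conjecture.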

In contrast to the logical problem of induction, the psychological problem of induction concerns the possibility of explaining why reasonable people nevertheless have the expectation that unobserved instances will obey the same general laws as did previously observed instances. Hume tries to resolve the psychological problem by appeal to habit or custom, but Popper rejects this solution as inadequate, since it suggests that there is a “clash between the logic and the psychology of knowledge” (1974, p. 1019) and hence that people’s beliefs in general laws are fundamentally irrational.

Popper proposes to solve these twin problems of induction by offering an account of theory preference that does not rely upon inductive inference and thus avoids Hume’s problems altogether. While the technical details of this account evolve throughout his writings, he consistently emphasizes two main points. First, he holds that a theory with greater informative content is to be preferred to one with less content. Here, informative content is a measure of how much a theory rules out; roughly speaking, a theory with more informative content makes a greater number of empirical claims, and thus has a higher degree of falsifiability. Second, Popper holds that a theory is corroborated by passing severe tests, or “by predictions which were highly improbable in the light of our previous knowledge (previous to the theory which was tested and corroborated)” (1963, p. 220).

It is important to distinguish Popper’s claim that a theory is corroborated by surviving a severe test from the logical empiricist view that a theory is inductively confirmed by successfully predicting events that, were the theory to have been false, would have been highly unlikely. According to the latter view, a successful prediction of this sort, subject to certain caveats, provides evidence that the theory in question is actually true. The question of theory choice is tightly tied to that of confirmation: scientists should adopt whichever theory is most probable in light of the available evidence. On Popper’s view, by contrast, corroboration provides no evidence whatsoever that the theory in question is true, or even that the theory is preferable to a so-far-untested but still unfalsified rival. Instead, a corroborated theory has shown merely that it is the sort of theory that could be falsified and thus can be legitimately classified as scientific. While a corroborated theory should obviously be preferred to an already falsified rival (see Section 2), the real work here is being done by the falsified theory, which has taken itself out of contention.

While Popper consistently rejects the idea that we are justified in believing that non-falsified, well-corroborated scientific theories with high levels of informative content are either true or likely to be true, his work on degrees of verisimilitude explores the idea that such theories are closer to the truth than were the falsified theories that they had replaced. The basic idea is as follows:

  1. For a given statement H, let the content of H be the class of all of the logical consequences of H. So, if H is true, then all of the members of this class would be true; if H were false, however, then only some members of this class would be true, since every false statement has at least some true consequences.
  2. The content of H can be broken into two parts: the truth content, consisting of all of the true consequences of H, and the falsity content, consisting of all of the false consequences of H.
  3. The verisimilitude of H is defined as the difference between the truth content of H and the falsity content of H. This is intended to capture the idea that a theory with greater verisimilitude will entail more truths and fewer falsehoods than does a theory with less verisimilitude.
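The three steps above can be written compactly as follows. The symbols Cn, Ct_T, and Ct_F, and the function m (standing in for some measure of the size of a class of statements), are notational assumptions made for this sketch rather than a reproduction of Popper’s own formalism.

```latex
\begin{align*}
Cn(H)   &= \{\, s : H \vdash s \,\}
  && \text{content: all logical consequences of } H \\
Ct_T(H) &= \{\, s \in Cn(H) : s \text{ is true} \,\}
  && \text{truth content} \\
Ct_F(H) &= \{\, s \in Cn(H) : s \text{ is false} \,\}
  && \text{falsity content} \\
Vs(H)   &= m\bigl(Ct_T(H)\bigr) - m\bigl(Ct_F(H)\bigr)
  && \text{verisimilitude}
\end{align*}
```

On this reading, if H is true then Ct_F(H) is empty and Cn(H) = Ct_T(H), which matches the observation in step 1 that every consequence of a true statement is true.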

With this definition in hand, it might now seem that Popper could incorporate truth into his account of theory preference: non-falsified theories with high levels of informative content were closer to the truth than either the falsified theories they replaced or their unfalsified but less informative competitors. Unfortunately, however, this definition does not work, as arguments from Tichý (1974), Miller (1974), Harris (1974), and others show. Tichý and Miller in particular demonstrate that Popper’s proposed definition cannot be used to compare the relative verisimilitude of false theories, which is Popper’s main purpose in introducing the notion of verisimilitude. While Popper (1976) explores ways of modifying his proposal to deal with these problems, he is never able to provide a satisfactory formal definition of verisimilitude. His work in this area is nevertheless invaluable in identifying a problem that has continued to interest many contemporary researchers.

3. Criticisms of Falsificationism

While Popper’s account of scientific methodology has continued to be influential, it has also faced a number of serious objections. These objections, together with the emergence of alternative accounts of scientific reasoning, have led many philosophers of science to reject Popper’s falsificationist methodology. While a comprehensive list of these criticisms and alternatives is beyond the scope of this entry, interested readers are encouraged to consult Kuhn (1962), Salmon (1967), Lakatos (1970, 1980), Putnam (1974), Jeffrey (1975), Feyerabend (1975), Hacking (1983), and Howson and Urbach (1989).

One criticism of falsificationism involves the relationship between theory and observation. Thomas Kuhn, among others, argues that observation is itself strongly theory-laden, in the sense that what one observes is often significantly affected by one’s previously held theoretical beliefs. Because of this, those holding different theories might report radically different observations, even when they are observing the same phenomena. For example, Kuhn argues that those working within the paradigm provided by classical, Newtonian mechanics may genuinely make different observations than those working within the very different paradigm of relativistic mechanics.

Popper’s account of basic sentences suggests that he clearly recognizes both the existence of this sort of phenomenon and its potential to cause problems for attempts to falsify theories. His solution to it, however, crucially depends on the ability of the overall scientific community to reach a consensus as to which statements count as basic and thus can be used to formulate tests of the competing theories. This remedy, however, looks less attractive to the extent that advocates of different theories consistently find themselves unable to reach an agreement on what sentences count as basic. For example, it is important to Popper’s example of the Eddington experiment that both proponents of classical mechanics and those of relativistic mechanics could recognize Eddington’s reports of his observations as basic sentences in the relevant sense—that is, certain possible results would falsify the Newtonian laws of classical mechanics, while other possible results would falsify GR. If, by contrast, adherents of rival theories consistently disagreed on whether or not certain reports could be counted as basic sentences, this would prevent observations such as Eddington’s from serving any important role in theory choice. Instead, the results of any such potentially falsifying experiment would be interpreted by one part of the community as falsifying a particular theory, while a different section of the community would demand that these reports themselves be subjected to further testing.  In this way, disagreements over the status of basic sentences would effectively prevent theories from ever being falsified.

This purported failure to clearly distinguish the basic statements that formed the empirical base from other, more theoretical, statements would also have consequences for Popper’s proposed criterion of demarcation, which holds that scientific theories must allow for the deduction of basic sentences whose truth or falsity can be ascertained by appropriately located observers. If, contrary to Popper’s account, there is no distinct category of basic sentences within actual scientific practice, then his proposed method for distinguishing science from non-science fails.

A second, related criticism of falsifiability contends that falsification fails to provide an accurate picture of scientific practice. Specifically, many historians and philosophers of science have argued that scientists only rarely give up their theories in the face of failed predictions, even in cases where they are unable to identify testable auxiliary hypotheses. Conversely, it has been suggested that scientists routinely adopt and make use of theories that they know are already falsified. Instead, scientists will generally hold on to such theories unless and until a better alternative theory emerges.

For example, Lakatos (1970) describes a hypothetical case where pre-Einsteinian scientists discover a new planet whose behavior apparently violates classical mechanics. Lakatos argues that, in such a case, the scientists would surely attempt to account for these observed discrepancies in the way that Popper advocates—for example, by hypothesizing the existence of a hitherto unobserved planet or dust cloud. In contrast to what he takes Popper to be arguing, however, Lakatos contends that the failure of such auxiliary hypotheses would not lead them to abandon classical mechanics, since they had no alternative theory to turn to.

In a similar vein, Putnam (1975) argues that the initial widespread acceptance of Newtonian mechanics had little or nothing to do with falsifiable predictions, since the theory made very few of these. Instead, scientists were impressed by the theory’s success in explaining previously established phenomena, such as the orbits of the planets and the behavior of the tides. Putnam argues that, on Popper’s view, accepting such an uncorroborated theory would seem to be irrational. Finally, Hacking (1983) argues that many aspects of ordinary scientific practice, including a wide variety of observations and experiments, cannot plausibly be construed as attempts to falsify or corroborate any particular theory or hypothesis. Instead, scientists regularly perform experiments that have little or no bearing on their current theories and measure quantities about which these theories do not make any specific claims.

When considering the cogency of such criticisms, it is worth noting several things. First, it is worth recalling that Popper defends falsificationism as a normative, methodological proposal for how science ought to work in certain sorts of cases and not as an empirical description intended to accurately capture all aspects of historical scientific practice. Second, Popper does not commit himself to the implausible thesis that theories yielding false predictions about a particular phenomenon must immediately be abandoned, even if it is not apparent which auxiliary hypotheses must change. This is especially true in the absence of any rival theory yielding a correct prediction. For example, Newtonian mechanics had well-known problems with predicting certain sorts of phenomena, such as the orbit of Mercury, in the years preceding Einstein’s proposals regarding special and general relativity. Popper’s proposal does not entail that these failures of prediction should have led nineteenth century scientists to abandon this theory.

This being said, Popper himself argues that the methodology of falsificationism has played an important role in the history of science and that adopting his proposal would not require a wholesale revision of existing scientific methodology. If it turns out that scientists rarely, if ever, make theory choices on the basis of crucial experiments that falsify one theory or another, then Popper’s methodological proposal looks considerably less appealing.

A final criticism concerns Popper’s account of corroboration and the role it plays in theory choice. Popper’s deductive account of theory testing and adoption posits that it is rational to choose highly informative, well-corroborated theories, even though we have no inductive grounds for thinking that these theories are likely to be true. For example, Popper explicitly rejects the idea that corroboration is intended as an analogue to the subjective probability or logical probability that a theory is true, given the available evidence. This idea is central to both Popper’s proposed solution to the problem of induction and to his criticisms of competing inductivist or “Bayesian” programs.

Many philosophers of science, however, including Salmon (1967, 1981), Jeffrey (1975), Howson (1984a), and Howson and Urbach (1989), have objected to this aspect of Popper’s account. One line of criticism has focused on the extent to which Popper’s falsificationism offers a legitimate alternative to the inductivist proposals that Popper criticizes. For example, Jeffrey (1975) points out that it is just as difficult to conclusively falsify a hypothesis as it is to conclusively verify it, and he argues that Bayesianism, with its emphasis on the degree to which empirical evidence supports a hypothesis, is much more closely aligned to scientific practice than Popper’s program.

A related line of objection has focused on Popper’s contention that it is rational for scientists to rely on corroborated theories, a claim that plays a central role in his proposed solution to the problem of induction. Urbach (1984) argues that, insofar as Popper is committed to the claim that every universal hypothesis has zero probability of being true, he cannot explain the rationality of adopting a corroborated theory over an already falsified one, since both have the same probability (zero) of being true. Taking a different tack, Salmon (1981) questions whether, on Popper’s account, it would be rational to use corroborated hypotheses for the purposes of prediction. After all, corroboration is entirely a matter of hypotheses’ past performance—a corroborated hypothesis is one that has survived severe empirical tests. Popper’s account, however, does not provide us with any reason for thinking that this hypothesis will have more accurate predictions about the future than any one of the infinite number of competing uncorroborated hypotheses that are also logically compatible with all of the evidence observed up to this point.

If these objections concerning corroboration are correct, it looks as though Popper’s account of theory choice either (1) is vulnerable to the same sorts of problems and puzzles that plague accounts of theory choice based on induction or (2) does not work as an account of theory choice at all.

While the sorts of objections mentioned here have led many to abandon falsificationism, David Miller (1998) provides a recent, sustained attempt to defend a Popperian-style critical rationalism. For more details on debates concerning confirmation and induction, see the entries on Confirmation and Induction and Evidence.

4. Realism, Quantum Mechanics, and Probability

While Popper holds that it is impossible for us to justify claims that particular scientific theories are true, he also defends the realist view that “what we attempt in science is to describe (and so far as possible) explain reality” (1975, p. 40). While Popper grants that realism is, according to his own criteria, an irrefutable metaphysical view about the nature of reality, he nevertheless thinks we have good reasons for accepting realism and for rejecting anti-realist views such as idealism or instrumentalism. In particular, he argues that realism is both part of common sense and entailed by our best scientific theories. By contrast, he contends that the most prominent arguments for anti-realism are based on a “mistaken quest for certainty, or for secure foundations on which to build” (1975, p. 42). Once one accepts the impossibility of securing such certain knowledge, as Popper contends we ought to do, the appeal of these sorts of arguments is considerably diminished.

Popper consistently emphasizes that scientific theories should be interpreted as attempts to describe a mind-independent reality. Because of this, he rejects the Copenhagen interpretation of quantum mechanics, in which the act of human measurement is seen as playing a fundamental role in collapsing the wave-function and randomly causing a particle to assume a determinate position or momentum. In particular, Popper opposes the idea, which he associates with the Copenhagen interpretation, that the probabilistic equations describing the results of potential measurements of quantum phenomena are about the subjective states of the human observers, rather than concerning mind-independent existing physical properties such as the positions or momenta of particles.

It is in the context of this debate over quantum mechanics that Popper first introduces his propensity theory of probability. This theory’s applicability, however, extends well beyond the quantum world, and Popper argues that it can be used to interpret the sorts of claims about probability that arise both in other areas of science and in everyday life. Popper’s propensity theory holds that probabilities are objective claims about the mind-independent external world and that it is possible for there to be single-case probabilities for non-recurring events.

Popper proposes his propensity theory as a variant of the relative frequency theories of probability defended by logical positivists such as Richard von Mises and Hans Reichenbach. According to simple versions of frequency theory, the probability of an event of type e can be defined as the relative frequency of e in a large, or perhaps even infinite, reference class. For example, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as the claim that, in a long sequence of rolls with a fair die (the reference class), six would come up 1/6 of the time. The main alternatives to frequency theory that concern Popper are logical and subjective theories of probability, according to which claims about probability should be understood as claims about the strength of evidence for or degree of belief in some proposition. On these views, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as a claim about our lack of evidence: if all we know is that the die is fair, then we have no reason to think that any particular number, such as a six, is more likely to come up on the next roll than any of the other five possible numbers.
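
The frequency reading of the die example can be illustrated with a short simulation. This is only a minimal sketch, not anything found in Popper, von Mises, or Reichenbach: it assumes that a pseudo-random number generator is an adequate stand-in for a fair die, and the function name relative_frequency_of_six, the seed, and the sample sizes are hypothetical choices made for illustration.

```python
import random


def relative_frequency_of_six(num_rolls, seed=0):
    """Return the fraction of simulated fair-die rolls that come up six."""
    # A fixed seed makes the run repeatable; it is an illustrative choice only.
    rng = random.Random(seed)
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls


if __name__ == "__main__":
    # The reference class is the whole sequence of rolls, not any single roll.
    for n in (60, 6_000, 600_000):
        print(n, relative_frequency_of_six(n))
```

On the frequency view, the probability claim is about the long sequence as a whole: as the number of rolls grows, the computed fraction should settle near 1/6, while the function has nothing to say about any one particular roll.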

Like other defenders of frequency theories, Popper argues that logical or subjective theories incorrectly interpret scientific claims about probability as being about the scientific investigators, and the evidence they have available to them, rather than the external world they are investigating. However, Popper argues that traditional frequency theories cannot account for single-case probabilities. For example, a frequency theorist would have no problem answering questions about “the probability that it will rain on an arbitrarily chosen August day,” since August days form a reference class. By contrast, questions about the probability that it will rain on a particular, future August day raise problems, since each particular day occurs only once. At best, frequency theories allow us to say that the probability of rain on that specific day is either 0 or 1, though we do not know which.

On Popper’s view, the failure to provide adequate treatment of single-case probabilities is a serious one, especially given what he saw as the centrality of such probabilities in quantum mechanics. To resolve this issue, Popper proposes that probabilities should be treated as the propensities of experimental setups to produce certain results, rather than as being derived from the reference class of results that were produced by running these experiments. On the propensity view, the results of experiments are important because they allow us to test hypotheses concerning the values of certain probabilities; however, the results are not themselves part of the probability itself. Popper argues that this solves the problem of single-case probability, since propensities can exist even for experiments that only happen once. Importantly, Popper does not require that these experiments utilize human intervention—instead, nature can itself run experiments, the results of which we can observe. For example, the propensity theory should, in theory, be able to make sense of claims about the probability that it will rain on a particular day, even though the experimental setup in this case is constituted by naturally occurring, meteorological phenomena.

Popper argues that the propensity theory of probability helps provide the grounds for a realist solution to the measurement problem within quantum mechanics. As opposed to the Copenhagen interpretation, which posits that the probabilities discussed in quantum mechanics reflect the ignorance of the observers, Popper argues these probabilities are in fact the propensities of the experimental setups to produce certain outcomes. Interpreted this way, he argues that they raise no interesting metaphysical dilemmas beyond those raised by classical mechanics and that they are equally amenable to a realist interpretation. Popper gives the example of tossing a penny, which he argues is strictly analogous to the experiments performed in quantum mechanics: if our experimental setup consists of simply tossing the penny, then the probability of getting heads is 1/2. If the experimental setup, however, is expanded to include the results of our looking at the penny, and thus includes the outcome of the experiment itself, then the probability will be either 0 or 1. This does not, though, involve positing any collapse of the wave-function caused merely by the act of human observation. Instead, what has occurred is simply a change in the experimental setup. Once we include the measurement result in our setup, the probability of a particular outcome will trivially become 0 or 1.
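
The arithmetic of the penny example can be sketched in a few lines. The sketch below rests on a simplifying assumption that is not Popper's own formalism: it treats "expanding the experimental setup to include the observed result" as restricting the outcomes that the setup's description allows. The names FAIR_PENNY, probability, and setup_allows are hypothetical and introduced only for illustration.

```python
from fractions import Fraction

# Assumption: a fair penny has exactly two outcomes with equal weight.
FAIR_PENNY = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}


def probability(event, setup_allows=None, weights=FAIR_PENNY):
    """Probability of `event` relative to a described experimental setup.

    setup_allows lists the outcomes compatible with the setup's description.
    None means the setup is simply "toss the penny".
    """
    allowed = set(weights) if setup_allows is None else set(setup_allows)
    total = sum(weights[outcome] for outcome in allowed)
    if total == 0:
        raise ValueError("the described setup allows no possible outcome")
    favourable = weights[event] if event in allowed else Fraction(0)
    return favourable / total


if __name__ == "__main__":
    print(probability("heads"))               # setup: toss only
    print(probability("heads", {"heads"}))    # setup includes seeing heads
    print(probability("heads", {"tails"}))    # setup includes seeing tails
```

Under these assumptions, the value changes from 1/2 to 1 or 0 only because the description of the setup has changed, which is the point of the analogy: nothing about the penny itself is altered by looking at it.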

5. Methodology in the Social Sciences

Much of Popper’s early work on the methodology of science is concerned with physics and closely related fields, especially those where experimentation plays a central role. On Popper’s view, which was discussed in detail in previous sections, these sciences make progress by formulating a theory and then carefully designing experiments and observations aimed at falsifying the purported theory. The ever-present possibility that a theory might be falsified by these sorts of tests is, on Popper’s view, precisely what differentiates legitimate sciences, such as physics, from non-scientific activities, such as philosophical metaphysics, Freudian psychoanalysis, or myth-making.

This picture becomes somewhat more complicated, however, when we consider methodology in social sciences such as sociology and economics, where experimentation plays a much less central role. On Popper’s view, there are significant problems with many of the methods used in these disciplines. In particular, Popper argues against what he calls historicism, which he describes as “an approach to the social sciences which assumes that historical prediction is their principal aim, and which assumes that this aim is attainable by discovering the ‘rhythms’ or ‘patterns’, the ‘laws’ or ‘trends’ that underlie the evolution of history” (1957, p. 3).

Popper’s central argument against historicism contends that, insofar as the whole of human history is a singular process that occurs only once, it is impossible to formulate and test any general laws about history. This stands in stark contrast to disciplines such as physics, where the formulation and testing of laws plays a central role in making progress. For example, potential laws of gravitation can be tested by observations of planetary motions, by controlled experiments concerning the rates of falling objects near the earth’s surface, or in numerous other ways. If the relevant theories are falsified, scientists can easily respond, for instance, by changing one or more auxiliary hypotheses, and then conducting additional experiments on the new, slightly modified theory. By contrast, a law that purports to describe the future progress of history in its entirety cannot easily be tested in this way. Even if a particular prediction about the occurrence of some particular event is incorrect, there is no way of altering the theory to retest it: each historical event occurs only once, which rules out the possibility of carrying out further tests regarding this event. Popper also rejects the claim that it is possible to formulate and test laws of more limited scope, such as those that purport to describe an evolutionary process that occurs in multiple societies, or that attempt to capture a trend within a given society.

Popper’s opposition to historicism is also evident in his objections to what he calls utopian social engineering, which involves attempts by governments to fundamentally restructure the whole of society based on an overall plan or blueprint. On Popper’s view, the problem again concerns the impossibility of carrying out critical tests of the effectiveness of such plans. This impossibility stems from the holism of utopian plans, which involve changing everything at the same time. When the planners’ actions fail to achieve their predicted results (as Popper thinks is inevitably the case with human interventions in society), the planners have no method for determining what in particular went wrong with their plan. This lack of testability, in turn, means that there is no way for the utopian engineers to improve their plans. This argument, among others, plays a central role in Popper’s critique of Marxism and totalitarianism in The Open Society and its Enemies (1945). More details on Popper’s political philosophy, including his critique of totalitarian societies, can be found in the separate entry on Popper’s political philosophy.

In place of historicism and utopian holism, Popper argues that the social sciences should embrace both methodological individualism and situational analysis. On Popper’s definition, methodological individualism is the view that the behavior of social institutions should be analyzed in terms of the behaviors of the individual humans that make them up. This individualism is motivated, in part, by Popper’s contention that many important social institutions, such as the market, are not the result of any conscious design but instead arise out of the uncoordinated actions of individuals with widely disparate motives. Scientific hypotheses about the behavior of such unplanned institutions, then, must be formulated in terms of the constituent participants. Popper’s presentation and defense of methodological individualism is closely related to that provided by the Austrian economist Friedrich von Hayek (1942, 1943, 1944), with whom Popper maintained close personal and professional relationships throughout most of his life. For both Popper and Hayek, the defense of methodological individualism within the social sciences plays a key role in their broader argument in favor of liberal, market economies and against planned economies.

While Popper endorses methodological individualism, he rejects the doctrine of psychologism, according to which laws about social institutions must be reduced to psychological laws concerning the behavior of individuals. Popper objects to this view, which he associates with John Stuart Mill, on the grounds that it ends up collapsing into a form of historicism. The argument can be summarized as follows: once we begin trying to explain or predict the behavior of currently existing institutions in terms of individuals’ psychological motives, we quickly notice that these motives themselves cannot be understood without reference to the broader social environment within which these individuals find themselves. In order to eliminate the reference to the particular social institutions that make up this environment, we are then forced to demonstrate how these institutions were themselves a product of individual motives that had operated within some other previously existing social environment. This, though, quickly leads to an unsustainable regress, since humans always act within particular social environments, and their motives cannot be understood without reference to these environments. The only way out for the advocate of psychologism is to posit that both the origin and evolution of all human institutions can be explained purely in terms of human psychology. Popper argues that there is no historical support for the idea that there was ever such an origin of social institutions. He also argues that this is a form of historicism, insofar as it commits us to discovering laws governing the evolution of society as a whole. As such, it inherits all of the problems mentioned previously.

In place of psychologism, Popper endorses a version of methodological individualism based on situational analysis. On this method, we begin by creating abstract models of the social institutions that we wish to investigate, such as markets or political institutions. In keeping with methodological individualism, these models will contain, among other things, representations of individual agents. However, instead of stipulating that these agents will behave according to the laws governing individual human psychology, as psychologism does, we animate the model by assuming that the agents will respond appropriately according to the logic of the situation. Popper calls this constraint on model building within the social sciences the rationality principle.

Popper recognizes that both the rationality principle and the models built on the basis of it are empirically false; after all, real humans often respond to situations in ways that are irrational and inappropriate. Popper also rejects, however, the idea that the rationality principle should be thought of as a methodological principle that is a priori immune to testing, since part of what makes theories in the social sciences testable is the fact that they make definite claims about individual human behavior. Instead, Popper defends the use of the rationality principle in model building on the grounds that it is generally good policy to avoid blaming the falsification of a model on the inaccuracies introduced by the rationality principle and that we can learn more if we blame the other assumptions of our situational analysis (1994, p. 177). On Popper’s view, the errors introduced by the rationality principle are generally small ones, since humans are generally rational. More importantly, holding the rationality principle fixed makes it much easier for us to formulate crucial tests of rival theories and to make genuine progress in the social sciences. By contrast, if the rationality principle were relaxed, he argues, there would be almost no substantive constraints on model building.

6. Popper’s Legacy

While few of Popper’s individual claims have escaped criticism, his contributions to philosophy of science are immense. As mentioned earlier, Popper was one of the most important critics of the early logical empiricist program, and the criticisms he leveled against it helped shape the future work of both the logical empiricists and their critics. In addition, while his falsification-based approach to scientific methodology is no longer widely accepted within philosophy of science, it played a key role in laying the ground for later work in the field, including that of Kuhn, Lakatos, and Feyerabend, as well as contemporary Bayesianism. It is also plausible that the widespread popularity of falsificationism, both within and outside of the scientific community, has had an important role in reinforcing the image of science as an essentially empirical activity and in highlighting the ways in which genuine scientific work differs from so-called pseudoscience. Finally, Popper’s work on numerous specialized issues within the philosophy of science (including verisimilitude, quantum mechanics, the propensity theory of probability, and methodological individualism) has continued to influence contemporary researchers.

7. References and Further Reading

Popper Selections (1985) is an excellent introduction to Popper’s writings for the beginner, while The Philosophy of Karl Popper (Schilpp 1974) contains an extensive bibliography of Popper’s work published before that date, together with numerous critical essays and Popper’s responses to these. Finally, Unended Quest (1976) is an expanded version of the “Intellectual Autobiography” from Schilpp (1974), and it provides a helpful, non-technical overview of many of Popper’s main works in his own words.

a. Primary Sources

  • 1945. The Open Society and Its Enemies. 2 volumes. London: Routledge.
  • 1957. The Poverty of Historicism. London: Routledge. Originally published as a series of three articles in Economica 42, 43, and 46 (1944-1945).
  • 1959. The Logic of Scientific Discovery. London: Hutchinson. This is an English translation of Logik der Forschung, Vienna: Springer (1935).
  • 1959. “The Propensity Interpretation of Probability.” The British Journal for the Philosophy of Science 10 (37): 25–42.
  • 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge. Fifth edition 1989.
  • 1970. “Normal Science and Its Dangers.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 51–58. Cambridge: Cambridge University Press.
  • 1972. Objective Knowledge: An Evolutionary Approach. Oxford: Clarendon Press. Revised edition 1979.
  • 1974. “Replies to My Critics” and “Intellectual Autobiography.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp. 2 volumes. La Salle, Ill: Open Court.
  • 1976. Unended Quest. London: Fontana. Revised edition 1984.
  • 1976. “A Note on Verisimilitude.” The British Journal for the Philosophy of Science 27 (2): 147–59.
  • 1978. “Natural Selection and the Emergence of Mind.” Dialectica 32 (3-4): 339–55.
  • 1982. The Open Universe: An Argument for Indeterminism. Edited by W. W. Bartley III. London: Routledge.
  • 1982. Quantum Theory and the Schism in Physics. Edited by W. W. Bartley III. New York: Routledge.
  • 1983. Realism and the Aim of Science. Edited by W. W. Bartley III. New York: Routledge.
  • 1985. Popper Selections. Edited by David W Miller. Princeton: Princeton University Press.
  • 1994. The Myth of the Framework: In Defense of Science and Rationality. Edited by Mark Amadeus Notturno. London: Routledge.
  • 1999. All Life Is Problem Solving. London: Routledge.

b. Secondary Sources

  • Ackermann, Robert John. 1976. The Philosophy of Karl Popper. Amherst: University of Massachusetts Press.
  • Agassi, Joseph. 2014. Popper and His Popular Critics: Thomas Kuhn, Paul Feyerabend and Imre Lakatos. 2014 edition. New York: Springer.
  • Blaug, Mark. 1992. The Methodology of Economics: Or, How Economists Explain. 2nd edition. New York: Cambridge University Press.
  • Caldwell, Bruce J. 1991. “Clarifying Popper.” Journal of Economic Literature 29 (1): 1–33.
  • Carnap, Rudolf. 1936. “Testability and Meaning.” Philosophy of Science 3 (4): 419–71. Continued in Philosophy of Science 4 (1): 1-40.
  • Carnap, Rudolf. 1995. An Introduction to the Philosophy of Science. New York: Dover. Originally published as Philosophical Foundations of Physics (1966).
  • Carnap, Rudolf. 2003. The Logical Structure of the World and Pseudoproblems in Philosophy. Translated by Rolf A. George. Chicago and La Salle, Ill: Open Court. Originally published in 1928 as Der logische Aufbau der Welt and Scheinprobleme in der Philosophie.
  • Catton, Philip, and Graham MacDonald, eds. 2004. Karl Popper: Critical Appraisals. New York: Routledge.
  • Currie, Gregory, and Alan Musgrave, eds. 1985. Popper and the Human Sciences. Dordrecht: Martinus Nijhoff.
  • Edmonds, David, and John Eidinow. 2002. Wittgenstein’s Poker: The Story of a Ten-Minute Argument Between Two Great Philosophers. Reprint edition. New York: Harper Perennial.
  • Feyerabend, Paul. 1975. Against Method. London; New York: New Left Books. Fourth edition 2010.
  • Fuller, Steve. 2004. Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
  • Gattei, Stefano. 2010. Karl Popper’s Philosophy of Science: Rationality without Foundations. London; New York: Routledge.
  • Grünbaum, Adolf. 1976. “Is Falsifiability the Touchstone of Scientific Rationality? Karl Popper Versus Inductivism.” In Essays in Memory of Imre Lakatos, edited by R. S. Cohen, P. K. Feyerabend, and M. W. Wartofsky, 213–52. Dordrecht: Springer Netherlands.
  • Hacking, Ian. 1983. Representing and Intervening: Introductory Topics in the Philosophy of Natural Science. Cambridge; New York: Cambridge University Press.
  • Hacohen, Malachi Haim. 2002. Karl Popper: The Formative Years, 1902-1945: Politics and Philosophy in Interwar Vienna. Cambridge: Cambridge University Press.
  • Hands, Douglas W. 1985. “Karl Popper and Economic Methodology: A New Look.” Economics and Philosophy 1 (1): 83–99.
  • Harris, John H. 1974. “Popper’s Definitions of ‘Verisimilitude.’” The British Journal for the Philosophy of Science 25 (2): 160–66.
  • Hausman, Daniel M. 1985. “Is Falsificationism Unpractised or Unpractisable?” Philosophy of the Social Sciences 15 (3): 313–19.
  • Hayek, Friedrich von. 1942. “Scientism and the Study of Society. Part I.” Economica, New Series, 9 (35): 267–91.
  • Hayek, Friedrich von. 1943. “Scientism and the Study of Society. Part II.” Economica, New Series, 10 (37): 34–63.
  • Hayek, Friedrich von. 1944. “Scientism and the Study of Society. Part III.” Economica, New Series, 11 (41): 27–39.
  • Hempel, Carl G. 1945a. “Studies in the Logic of Confirmation (I.).” Mind, New Series, 54 (213): 1–26.
  • Hempel, Carl G. 1945b. “Studies in the Logic of Confirmation (II.).” Mind, New Series, 54 (214): 97–121.
  • Howson, Colin. 1984a. “Popper’s Solution to the Problem of Induction.” The Philosophical Quarterly 34 (135): 143–47.
  • Howson, Colin. 1984b. “Probabilities, Propensities, and Chances.” Erkenntnis 21 (3): 279–93.
  • Howson, Colin, and Peter Urbach. 1989. Scientific Reasoning: The Bayesian Approach. Chicago: Open Court Publishing. Third edition 2006.
  • Hudelson, Richard. 1980. “Popper’s Critique of Marx.” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 37 (3): 259–70.
  • Hume, David. 1993. An Enquiry Concerning Human Understanding: With Hume’s Abstract of A Treatise of Human Nature and A Letter from a Gentleman to His Friend in Edinburgh. Edited by Eric Steinberg. 2nd ed. Indianapolis: Hackett Publishing Company, Inc.
  • Jeffrey, Richard C. 1975. “Probability and Falsification: Critique of the Popper Program.” Synthese 30 (1/2): 95–117.
  • Keuth, Herbert. 2004. The Philosophy of Karl Popper. New York: Cambridge University Press.
  • Kuhn, Thomas S. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press. Third edition 1996.
  • Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 91–196. Cambridge: Cambridge University Press.
  • Lakatos, Imre. 1980. The Methodology of Scientific Research Programmes: Volume 1: Philosophical Papers. Cambridge: Cambridge University Press.
  • Lakatos, Imre, and Alan Musgrave, eds. 1970. Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press.
  • Levi, Isaac. 1963. “Corroboration and Rules of Acceptance.” The British Journal for the Philosophy of Science 13 (52): 307–13.
  • Magee, Bryan. 1985. Philosophy and the Real World: An Introduction to Karl Popper. La Salle, Ill: Open Court.
  • Maher, Patrick. 1990. “Why Scientists Gather Evidence.” The British Journal for the Philosophy of Science 41 (1): 103–19.
  • Miller, David. 1974. “Popper’s Qualitative Theory of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 166–77.
  • Miller, David. 1998. Critical Rationalism: A Restatement and Defense. Chicago: Open Court.
  • Munz, Peter. 1985. Our Knowledge of the Growth of Knowledge: Popper or Wittgenstein?. London; New York: Routledge.
  • O’Hear, Anthony. 1996. Karl Popper: Philosophy and Problems. Cambridge; New York: Cambridge University Press.
  • Putnam, Hilary. 1974. “The ‘corroboration’ of Theories.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp, 221–40. La Salle, Ill: Open Court.
  • Rowbottom, Darrell. 2010. Popper’s Critical Rationalism: A Philosophical Investigation. New York: Routledge.
  • Runde, Jochen. 1996. “On Popper, Probabilities, and Propensities.” Review of Social Economy 54 (4): 465–85.
  • Ruse, Michael. 1977. “Karl Popper’s Philosophy of Biology.” Philosophy of Science 44 (4): 638–61.
  • Salmon, Wesley. 1967. The Foundations of Scientific Inference. Pittsburgh: University of Pittsburgh Press.
  • Salmon, Wesley. 1981. “Rational Prediction.” The British Journal for the Philosophy of Science 32 (2): 115–25.
  • Schilpp, Paul Arthur, ed. 1974. The Philosophy of Karl Popper. 2 volumes. La Salle, Ill: Open Court.
  • Thornton, Stephen. 2014. “Karl Popper.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta.
  • Tichý, Pavel. 1974. “On Popper’s Definitions of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 155–60.
  • Urbach, Peter. 1978. “Is Any of Popper’s Arguments against Historicism Valid?” The British Journal for the Philosophy of Science 29 (2): 117–30.

 

Author Information

Brendan Shea
Email: Brendan.Shea@rctc.edu
Rochester Community and Technical College, Minnesota Center for Philosophy of Science
U. S. A.

Albert Camus (1913-1960)

Albert Camus was a French-Algerian journalist, playwright, novelist, philosophical essayist, and Nobel laureate. Though he was neither by advanced training nor profession a philosopher, he nevertheless made important, forceful contributions to a wide range of issues in moral philosophy in his novels, reviews, articles, essays, and speeches, from terrorism and political violence to suicide and the death penalty. He is often described as an existentialist writer, though he himself disavowed the label. He began his literary career as a political journalist and as an actor, director, and playwright in his native Algeria. Later, while living in occupied France during WWII, he became active in the Resistance and from 1944-47 served as editor-in-chief of the newspaper Combat. By mid-century, based on the strength of his three novels (The Stranger, The Plague, and The Fall) and two book-length philosophical essays (The Myth of Sisyphus and The Rebel), he had achieved an international reputation and readership. It was in these works that he introduced and developed the twin philosophical ideas, the concept of the Absurd and the notion of Revolt, that made him famous. These are the ideas that people immediately think of when they hear the name Albert Camus spoken today. The Absurd can be defined as a metaphysical tension or opposition that results from the presence of human consciousness, with its ever-pressing demand for order and meaning in life, in an essentially meaningless and indifferent universe. Camus considered the Absurd to be a fundamental and even defining characteristic of the modern human condition. The notion of Revolt refers to both a path of resolved action and a state of mind. It can take extreme forms such as terrorism or a reckless and unrestrained egoism (both of which are rejected by Camus), but basically, and in simple terms, it consists of an attitude of heroic defiance or resistance to whatever oppresses human beings.

In awarding Camus its prize for literature in 1957, the Nobel Prize committee cited his persistent efforts to “illuminate the problem of the human conscience in our time.” He was honored by his own generation, and is still admired today, for being a writer of conscience and a champion of imaginative literature as a vehicle of philosophical insight and moral truth. He was at the height of his career (at work on an autobiographical novel, planning new projects for theatre, film, and television, and still seeking a solution to the lacerating political turmoil in his homeland) when he died tragically in an automobile accident in January 1960.

Table of Contents

  1. Life
  2. Literary Career
  3. Camus, Philosophical Literature, and the Novel of Ideas
  4. Works
    1. Fiction
    2. Drama
    3. Essays, Letters, Prose Collections, Articles, and Reviews
  5. Philosophy
    1. Background and Influences
    2. Development
    3. Themes and Ideas
      1. The Absurd
      2. Revolt
      3. The Outsider
      4. Guilt and Innocence
      5. Christianity vs. “Paganism”
      6. Individual vs. History and Mass Culture
      7. Suicide
      8. The Death Penalty
  6. Existentialism
  7. Camus, Colonialism, and Algeria
  8. Significance and Legacy
  9. References and Further Reading
    1. Works by Albert Camus
    2. Critical and Biographical Studies

1. Life

Albert Camus was born on November 7, 1913, in Mondovi, a small village near the seaport city of Bône (present-day Annaba) in the northeast region of French Algeria. He was the second child of Lucien Auguste Camus, a military veteran and wine-shipping clerk, and of Catherine Hélène (Sintès) Camus, a house-keeper and part-time factory worker. (Note: Although Camus believed that his father was Alsatian and a first-generation émigré, research by biographer Herbert Lottman indicates that the Camus family was originally from Bordeaux and that the first Camus to leave France for Algeria was actually the author’s great-grandfather, who in the early 19th century became part of the first wave of European colonial settlers in the new melting pot of North Africa.)

Shortly after the outbreak of WWI, when Camus was less than a year old, his father was recalled to military service and, on October 11, 1914, died of shrapnel wounds suffered at the first battle of the Marne. As a child, about the only thing Camus ever learned about his father was that he had once become violently ill after witnessing a public execution. This anecdote, which surfaces in fictional form in the author’s novel The Stranger and is also recounted in his philosophical essay “Reflections on the Guillotine,” strongly affected Camus and influenced his lifelong opposition to the death penalty.

After his father’s death, Camus, his mother, and his older brother moved to Algiers where they lived with his maternal uncle and grandmother in her cramped second-floor apartment in the working-class district of Belcourt. Camus’s mother Catherine, who was illiterate, partially deaf, and afflicted with a speech pathology, worked in an ammunition factory and cleaned homes to help support the family. In his posthumously published autobiographical novel The First Man, Camus recalls this period of his life with a mixture of pain and affection as he describes conditions of harsh poverty (the three-room apartment had no bathroom, no electricity, and no running water) relieved by hunting trips, family outings, childhood games, and scenic flashes of sun, seashore, mountain, and desert.

Camus attended elementary school at the local École Communale, and it was there that he encountered the first in a series of teacher-mentors who recognized and nurtured the young boy’s lively intelligence. These father figures introduced him to a new world of history and imagination and to literary landscapes far beyond the dusty streets of Belcourt and working-class poverty. Though stigmatized as a pupille de la nation (that is, a war veteran’s child dependent on public welfare) and hampered by recurrent health issues, Camus distinguished himself as a student and was eventually awarded a scholarship to attend high school at the Grand Lycée. Located near the famous Kasbah district, the school brought him into close proximity with the native Muslim community and thus gave him an early recognition of the idea of the “outsider” that would dominate his later writings.

It was in secondary school that Camus became an avid reader (absorbing Gide, Proust, Verlaine, and Bergson, among others), learned Latin and English, and developed a lifelong interest in literature, art, theatre, and film. He also enjoyed sports, especially soccer, of which he once wrote (recalling his early experience as a goal-keeper): “I learned . . . that a ball never arrives from the direction you expected it. That helped me in later life, especially in mainland France, where nobody plays straight.” It was also during this period that Camus suffered his first serious attack of tuberculosis, a disease that was to afflict him, on and off, throughout his career.

By the time he finished his Baccalauréat degree in June 1932, Camus was already contributing articles to Sud, a literary monthly, and looking forward to a career in journalism, the arts, or higher education. The next four years (1933-37) were an especially busy period in his life during which he attended college, worked at odd jobs, married his first wife (Simone Hié), divorced, briefly joined the Communist party, and effectively began his professional theatrical and writing career. Among his various employments during the time were stints of routine office work where one job consisted of a Bartleby-like recording and sifting of meteorological data and another involved paper shuffling in an auto license bureau. One can well imagine that it was as a result of this experience that his famous conception of Sisyphean struggle, heroic defiance in the face of the Absurd, first began to take shape within his imagination.

In 1933, Camus enrolled at the University of Algiers to pursue his diplome d’etudes superieures, specializing in philosophy and gaining certificates in sociology and psychology along the way. In 1936, he became a co-founder, along with a group of young fellow intellectuals, of the Théâtre du Travail, a professional acting company specializing in drama with left-wing political themes. Camus served the company as both an actor and director and also contributed scripts, including his first published play Revolt in Asturia, a drama based on an ill-fated workers’ revolt during the Spanish Civil War. That same year Camus also earned his degree and completed his dissertation, a study of the influence of Plotinus and neo-Platonism on the thought and writings of St. Augustine.

Over the next three years Camus further established himself as an emerging author, journalist, and theatre professional. After his disillusionment with and eventual expulsion from the Communist Party, he reorganized his dramatic company and renamed it the Théâtre de l’Equipe (literally the Theater of the Team). The name change signaled a new emphasis on classic drama and avant-garde aesthetics and a shift away from labor politics and agitprop. In 1938 he joined the staff of a new daily newspaper, the Alger Républicain, where his assignments as a reporter and reviewer covered everything from contemporary European literature to local political trials. It was during this period that he also published his first two literary works—Betwixt and Between, a collection of five short semi-autobiographical and philosophical pieces (1937) and Nuptials, a series of lyrical celebrations interspersed with political and philosophical reflections on North Africa and the Mediterranean.

The 1940s witnessed Camus’s gradual ascendance to the rank of world-class literary intellectual. He started the decade as a locally acclaimed author and playwright, but he was a figure virtually unknown outside the city of Algiers; however, he ended the decade as an internationally recognized novelist, dramatist, journalist, philosophical essayist, and champion of freedom. This period of his life began inauspiciously—war in Europe, the occupation of France, official censorship, and a widening crackdown on left-wing journals. Camus was still without stable employment or steady income when, after marrying his second wife, Francine Faure, in December of 1940, he departed Lyons, where he had been working as a journalist, and returned to Algeria. To help make ends meet, he taught part-time (French history and geography) at a private school in Oran. All the while he was putting finishing touches to his first novel The Stranger, which was finally published in 1942 to favorable critical response, including a lengthy and penetrating review by Jean-Paul Sartre. The novel propelled him into immediate literary renown.

Camus returned to France in 1942 and a year later began working for the clandestine newspaper Combat, the journalistic arm and voice of the French Resistance movement. During this period, while contending with recurrent bouts of tuberculosis, he also published The Myth of Sisyphus, his philosophical anatomy of suicide and the absurd, and joined Gallimard Publishing as an editor, a position he held until his death.

After the Liberation, Camus continued as editor of Combat, oversaw the production and publication of two plays, The Misunderstanding and Caligula, and assumed a leading role in Parisian intellectual society in the company of Sartre and Simone de Beauvoir among others. In the late 1940s his growing reputation as a writer and thinker was enlarged by the publication of The Plague, an allegorical novel and fictional parable of the Nazi Occupation and the duty of revolt, and by lecture tours to the United States and South America. In 1951 he published The Rebel, a reflection on the nature of freedom and rebellion and a philosophical critique of revolutionary violence. This powerful and controversial work, with its explicit condemnation of Marxism-Leninism and its emphatic denunciation of unrestrained violence as a means of human liberation, led to an eventual falling out with Sartre and, along with his opposition to the Algerian National Liberation Front, to his being branded a reactionary in the view of many European Communists. Yet his position also established him as an outspoken champion of individual freedom and as an impassioned critic of tyranny and terrorism, whether practiced by the Left or by the Right.

In 1956, Camus published the short, confessional novel The Fall, which unfortunately would be the last of his completed major works and which in the opinion of some critics is the most elegant and most underrated of all his books. During this period he was still afflicted by tuberculosis and was perhaps even more sorely beset by the deteriorating political situation in his native Algeria—which had by now escalated from demonstrations and occasional terrorist and guerilla attacks into open violence and insurrection. Camus still hoped to champion some kind of rapprochement that would allow the native Muslim population and the French pied noir minority to live together peaceably in a new de-colonized and largely integrated, if not fully independent, nation. Alas, by this point, as he painfully realized, such an outcome was becoming increasingly unlikely.

In the fall of 1957, following publication of Exile and the Kingdom, a collection of short fiction, Camus was shocked by news that he had been awarded the Nobel Prize for literature. He absorbed the announcement with mixed feelings of gratitude, humility, and amazement. On the one hand, the award was obviously a tremendous honor. On the other, not only did he feel that his friend and esteemed fellow novelist Andre Malraux was more deserving, he was also aware that the Nobel itself was widely regarded as the kind of accolade usually given to artists at the end of a long career. Yet, as he indicated in his acceptance speech at Stockholm, he considered his own career as still in mid-flight, with much yet to accomplish and even greater writing challenges ahead:

Every person, and assuredly every artist, wants to be recognized. So do I. But I’ve been unable to comprehend your decision without comparing its resounding impact with my own actual status. A man almost young, rich only in his doubts, and with his work still in progress…how could such a man not feel a kind of panic at hearing a decree that transports him all of a sudden…to the center of a glaring spotlight? And with what feelings could he accept this honor at a time when other writers in Europe, among them the very greatest, are condemned to silence, and even at a time when the country of his birth is going through unending misery?

Of course Camus could not have known as he spoke these words that most of his writing career was in fact behind him. Over the next two years, he published articles and continued to write, produce, and direct plays, including his own adaptation of Dostoyevsky’s The Possessed. He also formulated new concepts for film and television, assumed a leadership role in a new experimental national theater, and continued to campaign for peace and a political solution in Algeria. Unfortunately, none of these latter projects would be brought to fulfillment. On January 4, 1960, Camus died tragically in a car accident while he was a passenger in a vehicle driven by his friend and publisher Michel Gallimard, who also suffered fatal injuries. The author was buried in the local cemetery at Lourmarin, a village in Provence where he and his wife and daughters had lived for nearly a decade.

Upon hearing of Camus’s death, Sartre wrote a moving eulogy in the France-Observateur, saluting his former friend and political adversary not only for his distinguished contributions to French literature but especially for the heroic moral courage and “stubborn humanism” which he brought to bear against the “massive and deformed events of the day.”

2. Literary Career

According to Sartre’s perceptive appraisal, Camus was less a novelist and more a writer of philosophical tales and parables in the tradition of Voltaire. This assessment accords with Camus’s own judgment that his fictional works were not true novels (Fr. romans), a form he associated with the densely populated and richly detailed social panoramas of writers like Balzac, Tolstoy, and Proust, but rather contes (“tales”) and recits (“narratives”) combining philosophical and psychological insights.

In this respect, it is also worth noting that at no time in his career did Camus ever describe himself as a deep thinker or lay claim to the title of philosopher. Instead, he nearly always referred to himself simply, yet proudly, as un ecrivain—a writer. This is an important fact to keep in mind when assessing his place in intellectual history and in twentieth-century philosophy, for by no means does he qualify as a system-builder or theorist or even as a disciplined thinker. He was instead (and here again Sartre’s assessment is astute) a sort of all-purpose critic and modern-day philosophe: a debunker of mythologies, a critic of fraud and superstition, an enemy of terror, a voice of reason and compassion, and an outspoken defender of freedom—all in all a figure very much in the Enlightenment tradition of Voltaire and Diderot. For this reason, in assessing Camus’s career and work, it may be best simply to take him at his own word and characterize him first and foremost as a writer—advisedly attaching the epithet “philosophical” for sharper accuracy and definition.

3. Camus, Philosophical Literature, and the Novel of Ideas

To pin down exactly why and in what distinctive sense Camus may be termed a philosophical writer, we can begin by comparing him with other authors who have merited the designation. Right away, we can eliminate any comparison with the efforts of Lucretius and Dante, who undertook to unfold entire cosmologies and philosophical systems in epic verse. Camus obviously attempted nothing of the sort. On the other hand, we can draw at least a limited comparison between Camus and writers like Pascal, Kierkegaard, and Nietzsche—that is, with writers who were first of all philosophers or religious writers, but whose stylistic achievements and literary flair gained them a special place in the pantheon of world literature as well. Here we may note that Camus himself was very conscious of his debt to Kierkegaard and Nietzsche (especially in the style and structure of The Myth of Sisyphus and The Rebel) and that he might very well have followed in their literary-philosophical footsteps if his tuberculosis had not side-tracked him into fiction and journalism and prevented him from pursuing an academic career.

Perhaps Camus himself best defined his own particular status as a philosophical writer when he wrote (with authors like Melville, Stendhal, Dostoyevsky, and Kafka especially in mind): “The great novelists are philosophical novelists”; that is, writers who eschew systematic explanation and create their discourse using “images instead of arguments” (The Myth of Sisyphus 74).

By his own definition then Camus is a philosophical writer in the sense that he has (a) conceived his own distinctive and original world-view and (b) sought to convey that view mainly through images, fictional characters and events, and via dramatic presentation rather than through critical analysis and direct discourse. He is also both a novelist of ideas and a psychological novelist, and in this respect, he certainly compares most closely to Dostoyevsky and Sartre, two other writers who combine a unique and distinctly philosophical outlook, acute psychological insight, and a dramatic style of presentation. (Like Camus, Sartre was a productive playwright, and Dostoyevsky remains perhaps the most dramatic of all novelists, as Camus clearly understood, having adapted both The Brothers Karamazov and The Possessed for the stage.)

4. Works

Camus’s reputation rests largely on the three novels published during his lifetime—The Stranger, The Plague, and The Fall—and on his two major philosophical essays—The Myth of Sisyphus and The Rebel. However, his body of work also includes a collection of short fiction, Exile and the Kingdom; an autobiographical novel, The First Man; a number of dramatic works, most notably Caligula, The Misunderstanding, The State of Siege, and The Just Assassins; several translations and adaptations, including new versions of works by Calderon, Lope de Vega, Dostoyevsky, and Faulkner; and a lengthy assortment of essays, prose pieces, critical reviews, transcribed speeches and interviews, articles, and works of journalism. A brief summary and description of the most important of Camus’s writings is presented below as preparation for a larger discussion of his philosophy and world-view, including his main ideas and recurrent philosophical themes.

a. Fiction

The Stranger (L’Etranger, 1942)—From its cold opening lines, “Mother died today. Or maybe yesterday; I can’t be sure,” to its bleak concluding image of a public execution set to take place beneath the “benign indifference of the universe,” Camus’s first and most famous novel takes the form of a terse, flat, first-person narrative by its main character Meursault, a very ordinary young man of unremarkable habits and unemotional affect who, inexplicably and in an almost absent-minded way, kills an Arab and then is arrested, tried, convicted, and sentenced to death. The neutral style of the novel—typical of what the critic Roland Barthes called “writing degree zero”—serves as a perfect vehicle for the descriptions and commentary of its anti-hero narrator, the ultimate “outsider” and a person who seems to observe everything, including his own life, with almost pathological detachment.

The Plague (La Peste, 1947)—Set in the coastal town of Oran, Camus’s second novel is the story of an outbreak of plague, traced from its subtle, insidious, unheeded beginnings and horrible, seemingly irresistible dominion to its eventual climax and decline, all told from the viewpoint of one of the survivors. Camus made no effort to conceal the fact that his novel was partly based on and could be interpreted as an allegory or parable of the rise of Nazism and the nightmare of the Occupation. However, the plague metaphor is both more complicated and more flexible than that, extending to signify the Absurd in general as well as any calamity or disaster that tests the mettle of human beings, their endurance, their solidarity, their sense of responsibility, their compassion, and their will. At the end of the novel, the plague finally retreats, and the narrator reflects that a time of pestilence teaches “that there is more to admire in men than to despise,” but he also knows “that the plague bacillus never dies or disappears for good,” that “the day would come when, for the bane and the enlightening of men, it would rouse up its rats again” and send them forth yet once more to spread death and contagion into a happy and unsuspecting city.

The Fall (La Chute, 1956)—Camus’s third novel, and the last to be published during his lifetime, is in effect an extended dramatic monologue spoken by M. Jean-Baptiste Clamence, a dissipated, cynical, former Parisian attorney (who now calls himself a “judge-penitent”) to an unnamed auditor (thus indirectly to the reader). Set in a seedy bar in the red-light district of Amsterdam, the work is a small masterpiece of compression and style: a confessional (and semi-autobiographical) novel, an arresting character study and psychological portrait, and at the same time a wide-ranging philosophical discourse on guilt and innocence, expiation and punishment, good and evil.

b. Drama

Camus began his literary career as a playwright and theatre director and was planning new dramatic works for film, stage, and television at the time of his death. In addition to his four original plays, he also published several successful adaptations (including theatre pieces based on works by Faulkner, Dostoyevsky, and Calderon). He took particular pride in his work as a dramatist and man of the theatre. However, his plays never achieved the same popularity, critical success, or level of incandescence as his more famous novels and major essays.

Caligula (1938, first produced 1945)—“Men die and are not happy.” Such is the complaint against the universe pronounced by the young emperor Caligula, who in Camus’s play is less the murderous lunatic, slave to incest, narcissist, and megalomaniac of Roman history than a theatrical martyr-hero of the Absurd: a man who carries his philosophical quarrel with the meaninglessness of human existence to a kind of fanatical but logical extreme. Camus described his hero as a man “obsessed with the impossible” willing to pervert all values, and if necessary destroy himself and all those around him in the pursuit of absolute liberty. Caligula was Camus’s first attempt at portraying a figure in absolute defiance of the Absurd, and through three revisions of the play over a period of several years he eventually achieved a remarkable composite by adding to Caligula’s original portrait touches of Sade, of revolutionary nihilism, of the Nietzschean Superman, of his own version of Sisyphus, and even of Mussolini and Hitler.

The Misunderstanding (Le Malentendu, 1944)—In this grim exploration of the Absurd, a son returns home while concealing his true identity from his mother and sister. The two women operate a boarding house where, in order to make ends meet, they quietly murder and rob their patrons. Through a tangle of misunderstanding and mistaken identity they wind up murdering their unrecognized visitor. Camus has explained the drama as an attempt to capture the atmosphere of malaise, corruption, demoralization, and anonymity that he experienced while living in France during the German occupation. Despite the play’s dark themes and bleak style, he described its philosophy as ultimately optimistic: “It amounts to saying that in an unjust or indifferent world man can save himself, and save others, by practicing the most basic sincerity and pronouncing the most appropriate word.”

State of Siege (L’Etat de Siege, 1948)—This odd allegorical drama combines features of the medieval morality play with elements of Calderon and the Spanish baroque; it also has apocalyptic themes, bits of music hall comedy, and a collection of avant-garde theatrics thrown in for good measure. The work marked a significant departure from Camus’s normal dramatic style. It also resulted in virtually universal disapproval and negative reviews from Paris theatre-goers and critics, many of whom came expecting a play based on Camus’s recent novel The Plague. The play is set in the Spanish seaport city of Cadiz, famous for its beaches, carnivals, and street musicians. By the end of the first act, the normally laid-back and carefree citizens fall under the dominion of a gaudily beribboned and uniformed dictator named Plague (based on Generalissimo Franco) and his officious, clip-board wielding Secretary (who turns out to be a modern, bureaucratic incarnation of the medieval figure Death). One of the prominent concerns of the play is the Orwellian theme of the degradation of language via totalitarian politics and bureaucracy (symbolized onstage by calls for silence, scenes in pantomime, and a gagged chorus). As one character observes, “we are steadily nearing that perfect moment when nothing anybody says will rouse the least echo in another’s mind.”

The Just Assassins (Les Justes, 1950)—First performed in Paris to largely favorable reviews, this play is based on real-life characters and an actual historical event: the 1905 assassination of the Russian Grand Duke Sergei Alexandrovich by Ivan Kalyayev and fellow members of the Combat Organization of the Socialist Revolutionary Party. The play effectively dramatizes the issues that Camus would later explore in detail in The Rebel, especially the question of whether acts of terrorism and political violence can ever be morally justified (and if so, with what limitations and in what specific circumstances). The historical Kalyayev passed up his original opportunity to bomb the Grand Duke’s carriage because the Duke was accompanied by his wife and two young nephews. However, this was no act of conscience on Kalyayev’s part but a purely practical decision based on his calculation that the murder of children would prove a setback to the revolution. After the successful completion of his bombing mission and subsequent arrest, Kalyayev welcomed his execution on similarly practical and purely political grounds, believing that his death would further the cause of revolution and social justice. Camus’s Kalyayev, on the other hand, is a far more agonized and conscientious figure, neither so cold-blooded nor so calculating as his real-life counterpart. Upon seeing the two children in the carriage, he refuses to toss his bomb not because doing so would be politically inexpedient but because he is overcome emotionally, temporarily unnerved by the sad expression in their eyes. Similarly, at the end of the play he embraces his death not so much because it will aid the revolution, but almost as a form of karmic penance, as if it were indeed some kind of sacred duty or metaphysical requirement that must be performed in order for true justice to be achieved.

c. Essays, Letters, Prose Collections, Articles, and Reviews

Betwixt and Between (L’Envers et l’endroit, 1937)—This short collection of semi-autobiographical, semi-fictional, philosophical pieces might be dismissed as juvenilia and largely ignored if it were not for the fact that it represents Camus’s first attempt to formulate a coherent life-outlook and world-view. The collection, which in a way serves as a germ or starting point for the author’s later philosophy, consists of five lyrical essays. In “Irony” (“L’Ironie”), a reflection on youth and age, Camus asserts, in the manner of a young disciple of Pascal, our essential solitariness in life and death. In “Between yes and no” (“Entre Oui et Non”) he suggests that to hope is as empty and as pointless as to despair, yet he goes beyond nihilism by positing a fundamental value to existence-in-the-world. In “Death in the soul” (“La Mort dans l’ame”) he supplies a sort of existential travel review, contrasting his impressions of central and Eastern Europe (which he views as purgatorial and morgue-like) with the more spontaneous life of Italy and Mediterranean culture. The piece thus affirms the author’s lifelong preference for the color and vitality of the Mediterranean world, and especially North Africa, as opposed to what he perceives as the soulless cold-heartedness of modern Europe. In “Love of life” (“Amour de vivre”) he claims there can be no love of life without despair of life and thus largely re-asserts the essentially tragic, ancient Greek view that the very beauty of human existence is largely contingent upon its brevity and fragility. The concluding essay, “Betwixt and between” (“L’Envers et l’endroit”), summarizes and re-emphasizes the Romantic themes of the collection as a whole: our fundamental “aloneness,” the importance of imagination and openness to experience, the imperative to “live as if….”

Nuptials (Noces, 1938)—This collection of four rhapsodic narratives supplements and amplifies the youthful philosophy expressed in Betwixt and Between. That joy is necessarily intertwined with despair, that the shortness of life confers a premium on intense experience, and that the world is both beautiful and violent—these are, once again, Camus’s principal themes. “Summer in Algiers,” which is probably the best (and best-known) of the essays in the collection, is a lyrical, at times almost ecstatic, celebration of sea, sun, and the North African landscape. Affirming a defiantly atheistic creed, Camus concludes with one of the core ideas of his philosophy: “If there is a sin against life, it consists not so much in despairing as in hoping for another life and in eluding the implacable grandeur of this one.”

The Myth of Sisyphus (Le Mythe de Sisyphe, 1943)—If there is a single non-fiction work that can be considered an essential or fundamental statement of Camus’s philosophy, it is this extended essay on the ethics of suicide (eventually translated and repackaged for American publication in 1955). It is here that Camus formally introduces and fully articulates his most famous idea, the concept of the Absurd, and his equally famous image of life as a Sisyphean struggle. From its provocative opening sentence—“There is but one truly serious philosophical problem, and that is suicide”—to its stirring, paradoxical conclusion—“The struggle itself toward the heights is enough to fill a man’s heart. One must imagine Sisyphus happy”—the book has something interesting and challenging on nearly every page and is shot through with brilliant aphorisms and insights. In the end, Camus rejects suicide: the Absurd must not be evaded either by religion (“philosophical suicide”) or by annihilation (“physical suicide”); the task of living should not merely be accepted, it must be embraced.

The Rebel (L’Homme Revolte, 1951)—Camus considered this work a continuation of the critical and philosophical investigation of the Absurd that he began with The Myth of Sisyphus. Only this time his primary concern is not suicide but murder. He takes up the question of whether acts of terrorism and political violence can be morally justified, which is basically the same question he had addressed earlier in his play The Just Assassins. After arguing that an authentic life inevitably involves some form of conscientious moral revolt, Camus winds up concluding that only in rare and very narrowly defined instances is political violence justified. Camus’s critique of revolutionary violence and terror in this work, and particularly his caustic assessment of Marxism-Leninism (which he accused of sacrificing innocent lives on the altar of History), touched nerves throughout Europe and led in part to his celebrated feud with Sartre and other French leftists.

Resistance, Rebellion, and Death (1960)—This posthumous collection is of interest to students of Camus mainly because it brings together an unusual assortment of his non-fiction writings on a wide range of topics, from art and politics to the advantages of pessimism and the virtues (from a non-believer’s standpoint) of Christianity. Of special interest are two pieces that helped secure Camus’s worldwide reputation as a voice of liberty: “Letters to a German Friend,” a set of four letters originally written during the Nazi Occupation, and “Reflections on the Guillotine,” a denunciation of the death penalty cited for special mention by the Nobel committee and eventually revised and re-published as a companion essay to go with fellow death-penalty opponent Arthur Koestler’s “Reflections on Hanging.”

5. Philosophy

To re-emphasize a point made earlier, Camus considered himself first and foremost a writer (un ecrivain). Indeed, Camus’s dissertation advisor penciled onto his dissertation the assessment “More a writer than a philosopher.” And at various times in his career he also accepted the labels journalist, humanist, novelist, and even moralist. However, he apparently never felt comfortable identifying himself as a philosopher—a term he seems to have associated with rigorous academic training, systematic thinking, logical consistency, and a coherent, carefully defined doctrine or body of ideas.

This is not to suggest that Camus lacked ideas or to say that his thought cannot be considered a personal philosophy. It is simply to point out that he was not a systematic, or even a notably disciplined thinker and that, unlike Heidegger and Sartre, for example, he showed very little interest in metaphysics and ontology, which seems to be one of the reasons he consistently denied that he was an existentialist. In short, he was not much given to speculative philosophy or any kind of abstract theorizing. His thought is instead nearly always related to current events (e.g., the Spanish War, revolt in Algeria) and is consistently grounded in down-to-earth moral and political reality.

a. Background and Influences

Though he was baptized, raised, and educated as a Catholic and invariably respectful towards the Church, Camus seems to have been a natural-born pagan who showed almost no instinct whatsoever for belief in the supernatural. Even as a youth, he was more of a sun-worshipper and nature lover than a boy notable for his piety or religious faith. On the other hand, there is no denying that Christian literature and philosophy served as an important influence on his early thought and intellectual development. As a young high school student, Camus studied the Bible, read and savored the Spanish mystics St. Theresa of Avila and St. John of the Cross, and was introduced to the thought of St. Augustine. St. Augustine would later serve as the subject of his dissertation and become—as a fellow North African writer, quasi-existentialist, and conscientious observer-critic of his own life—an important lifelong influence.

In college Camus absorbed Kierkegaard, who, after Augustine, was probably the single greatest Christian influence on his thought. He also studied Schopenhauer and Nietzsche—undoubtedly the two writers who did the most to set him on his own path of defiant pessimism and atheism. Other notable influences include not only the major modern philosophers from the academic curriculum—from Descartes and Spinoza to Bergson—but also, and just as importantly, philosophical writers like Stendhal, Melville, Dostoyevsky, and Kafka.

b. Development

The two earliest expressions of Camus’s personal philosophy are his works Betwixt and Between (1937) and Nuptials (1938). Here he unfolds what is essentially a hedonistic, indeed almost primitivistic, celebration of nature and the life of the senses. In the Romantic poetic tradition of writers like Rilke and Wallace Stevens, he offers a forceful rejection of all hereafters and an emphatic embrace of the here and now. There is no salvation, he argues, no transcendence; there is only the enjoyment of consciousness and natural being. One life, this life, is enough. Sky and sea, mountain and desert, have their own beauty and magnificence and constitute a sufficient heaven.

The critic John Cruickshank termed this stage in Camus’s thinking “naïve atheism” and attributed it to his ecstatic and somewhat immature “Mediterraneanism.” Naïve seems an apt characterization for a philosophy that is romantically bold and uncomplicated yet somewhat lacking in sophistication and logical clarity. On the other hand, if we keep in mind Camus’s theatrical background and preference for dramatic presentation, there may actually be more depth and complexity to his thought here than meets the eye. That is to say, just as it would be simplistic and reductive to equate Camus’s philosophy of revolt with that of his character Caligula (who is at best a kind of extreme or mad spokesperson for the author), so in the same way it is possible that the pensées and opinions presented in Nuptials and Betwixt and Between are not so much the views of Camus as they are poetically heightened observations of an artfully crafted narrator—an exuberant alter ego who is far more spontaneous and free-spirited than his more naturally reserved and sober-minded author.

In any case, regardless of this assessment of the ideas expressed in Betwixt and Between and Nuptials, it is clear that these early writings represent an important, if comparatively raw and simple, beginning stage in Camus’s development as a thinker where his views differ markedly from his more mature philosophy in several noteworthy respects. In the first place, the Camus of Nuptials is still a young man of twenty-five, aflame with youthful joie de vivre. He favors a life of impulse and daring as it was honored and practiced in both Romantic literature and the streets of Belcourt. Recently married and divorced, raised in poverty and in close quarters, beset with health problems, this young man develops an understandable passion for clear air, open space, colorful dreams, panoramic vistas, and the breath-taking prospects and challenges of the larger world. Consequently, the Camus of the period 1937-38 is a decidedly different writer from the Camus who will ascend the dais at Stockholm nearly twenty years later.

The young Camus is more of a sensualist and pleasure-seeker, more of a dandy and aesthete, than the more hardened and austere figure who will endure the Occupation while serving in the French underground. He is a writer passionate in his conviction that life ought to be lived vividly and intensely—indeed rebelliously (to use the term that will take on increasing importance in his thought). He is also a writer attracted to causes, though he is not yet the author who will become world-famous for his moral seriousness and passionate commitment to justice and freedom. All of which is understandable. After all, the Camus of the middle 1930s had not yet witnessed and absorbed the shattering spectacle and disillusioning effects of the Spanish Civil War, the rise of Fascism, Hitlerism, and Stalinism, the coming into being of total war and weapons of mass destruction, and the terrible reign of genocide and terror that would characterize the period 1938-1945. It was under the pressure of, and in direct response to, the events of this period that Camus’s mature philosophy—with its core set of humanistic themes and ideas—emerged and gradually took shape. That mature philosophy is no longer a “naïve atheism” but a very reflective and critical brand of unbelief. It is proudly and inconsolably pessimistic, but not in a polemical or overbearing way. It is unbending, hardheaded, determinedly skeptical. It is tolerant and respectful of world religious creeds, but at the same time wholly unsympathetic to them. In the end it is an affirmative philosophy that accepts and approves, and in its own way blesses, our dreadful mortality and our fundamental isolation in the world.

c. Themes and Ideas

Regardless of whether he is producing drama, fiction, or non-fiction, Camus in his mature writings nearly always takes up and re-explores the same basic philosophical issues. These recurrent topoi constitute the key components of his thought. They include themes like the Absurd, alienation, suicide, and rebellion that almost automatically come to mind whenever his name is mentioned. Hence any summary of his place in modern philosophy would be incomplete without at least a brief discussion of these ideas and how they fit together to form a distinctive and original world-view.

i. The Absurd

Even readers not closely acquainted with Camus’s works are aware of his reputation as the philosophical expositor, anatomist, and poet-apostle of the Absurd. Indeed, as even sitcom writers and stand-up comics apparently understand (odd fact: the comic-bleak final episode of Seinfeld has been compared to The Stranger, and Camus’s thought has been used to explain episodes of The Simpsons), it is largely through the thought and writings of the French-Algerian author that the concept of absurdity has become a part not only of world literature and twentieth-century philosophy but also of modern popular culture.

What then is meant by the notion of the Absurd? Contrary to the view conveyed by popular culture, the Absurd (at least in Camus’s terms) does not simply refer to some vague perception that modern life is fraught with paradoxes, incongruities, and intellectual confusion. (Although that perception is certainly consistent with his formula.) Instead, as he himself emphasizes and tries to make clear, the Absurd expresses a fundamental disharmony, a tragic incompatibility, in our existence. In effect, he argues that the Absurd is the product of a collision or confrontation between our human desire for order, meaning, and purpose in life and the blank, indifferent “silence of the universe.” (“The absurd is not in man nor in the world,” Camus explains, “but in their presence together . . . it is the only bond uniting them.”)

So here we are: poor creatures desperately seeking hope and meaning in a hopeless, meaningless world. Sartre, in his essay-review of The Stranger, provides an additional gloss on the idea: “The absurd, to be sure, resides neither in man nor in the world, if you consider each separately. But since man’s dominant characteristic is ‘being in the world,’ the absurd is, in the end, an inseparable part of the human condition.” The Absurd, then, presents itself in the form of an existential opposition. It arises from the human demand for clarity and transcendence on the one hand and a cosmos that offers nothing of the kind on the other. Such is our fate: we inhabit a world that is indifferent to our sufferings and deaf to our protests.

In Camus’s view there are three possible philosophical responses to this predicament. Two of these he condemns as evasions, and the other he puts forward as a proper solution.

The first choice is blunt and simple: physical suicide. If we decide that a life without some essential purpose or meaning is not worth living, we can simply choose to kill ourselves. Camus rejects this choice as cowardly. In his terms it is a repudiation or renunciation of life, not a true revolt.

The second choice is the religious solution of positing a transcendent world of solace and meaning beyond the Absurd. Camus calls this solution “philosophical suicide” and rejects it as transparently evasive and fraudulent. To adopt a supernatural solution to the problem of the Absurd (for example, through some type of mysticism or leap of faith) is to annihilate reason, which in Camus’s view is as fatal and self-destructive as physical suicide. In effect, instead of removing himself from the absurd confrontation of self and world like the physical suicide, the religious believer simply removes the offending world and replaces it, via a kind of metaphysical abracadabra, with a more agreeable alternative.

The third choice—in Camus’s view the only authentic and valid solution—is simply to accept absurdity, or better yet to embrace it, and to continue living. Since the Absurd in his view is an unavoidable, indeed defining, characteristic of the human condition, the only proper response to it is full, unflinching, courageous acceptance. Life, he says, can “be lived all the better if it has no meaning.”

The example par excellence of this option of spiritual courage and metaphysical revolt is the mythical Sisyphus of Camus’s philosophical essay. Doomed to eternal labor at his rock, fully conscious of the essential hopelessness of his plight, Sisyphus nevertheless pushes on. In doing so he becomes for Camus a superb icon of the spirit of revolt and of the human condition. To rise each day to fight a battle you know you cannot win, and to do this with wit, grace, compassion for others, and even a sense of mission, is to face the Absurd in a spirit of true heroism.

Over the course of his career, Camus examines the Absurd from multiple perspectives and through the eyes of many different characters—from the mad Caligula, who is obsessed with the problem, to the strangely aloof and yet simultaneously self-absorbed Meursault, who seems indifferent to it even as he exemplifies and is finally victimized by it. In The Myth of Sisyphus, Camus traces it in specific characters of legend and literature (Don Juan, Ivan Karamazov) and also in certain character types (the Actor, the Conqueror), all of whom may be understood as in some way a version or manifestation of Sisyphus, the archetypal absurd hero.

[Note: A rather different, yet possibly related, notion of the Absurd is proposed and analyzed in the work of Kierkegaard, especially in Fear and Trembling and Repetition. For Kierkegaard, however, the Absurd describes not an essential and universal human condition, but the special condition and nature of religious faith—a paradoxical state in which matters of will and perception that are objectively impossible can nevertheless be ultimately true. Though it is hard to say whether Camus had Kierkegaard particularly in mind when he developed his own concept of the absurd, there can be little doubt that Kierkegaard’s knight of faith is in certain ways an important predecessor of Camus’s Sisyphus: both figures are involved in impossible and endlessly agonizing tasks, which they nevertheless confidently and even cheerfully pursue. In the knight’s quixotic defiance and solipsism, Camus found a model for his own ideal of heroic affirmation and philosophical revolt.]

ii. Revolt

The companion theme to the Absurd in Camus’s oeuvre (and the only other philosophical topic to which he devoted an entire book) is the idea of Revolt. What is revolt? Simply defined, it is the Sisyphean spirit of defiance in the face of the Absurd. More technically and less metaphorically, it is a spirit of opposition against any perceived unfairness, oppression, or indignity in the human condition.

Rebellion in Camus’s sense begins with a recognition of boundaries, of limits that define one’s essential selfhood and core sense of being and thus must not be infringed—as when a slave stands up to his master and says in effect “thus far, and no further, shall I be commanded.” This defining of the self as at some point inviolable appears to be an act of pure egoism and individualism, but it is not. In fact Camus argues at considerable length to show that an act of conscientious revolt is ultimately far more than just an individual gesture or an act of solitary protest. The rebel, he writes, holds that there is a “common good more important than his own destiny” and that there are “rights more important than himself.” He acts “in the name of certain values which are still indeterminate but which he feels are common to himself and to all men” (The Rebel 15-16).

Camus then goes on to assert that an “analysis of rebellion leads at least to the suspicion that, contrary to the postulates of contemporary thought, a human nature does exist, as the Greeks believed.” After all, “Why rebel,” he asks, “if there is nothing permanent in the self worth preserving?” The slave who stands up and asserts himself actually does so for “the sake of everyone in the world.” He declares in effect that “all men—even the man who insults and oppresses him—have a natural community.” Here we may note that the idea that there may indeed be an essential human nature is actually more than a “suspicion” as far as Camus himself was concerned. Indeed for him it was more like a fundamental article of his humanist faith. In any case it represents one of the core principles of his ethics and is one of the tenets that sets his philosophy apart from existentialism.

True revolt, then, is performed not just for the self but also in solidarity with and out of compassion for others. And for this reason, Camus is led to conclude that revolt too has its limits. If it begins with and necessarily involves a recognition of human community and a common human dignity, it cannot, without betraying its own true character, treat others as if they were lacking in that dignity or not a part of that community. In the end it is remarkable, and indeed surprising, how closely Camus’s philosophy of revolt, despite the author’s fervent atheism and individualism, echoes Kantian ethics with its prohibition against treating human beings as means and its ideal of the human community as a kingdom of ends.

iii. The Outsider

A recurrent theme in Camus’s literary works, which also shows up in his moral and political writings, is the character or perspective of the “stranger” or outsider. Meursault, the laconic narrator of The Stranger, is the most obvious example. He seems to observe everything, even his own behavior, from an outside perspective. Like an anthropologist, he records his observations with clinical detachment at the same time that he is warily observed by the community around him.

Camus came by this perspective naturally. As a European in Africa, an African in Europe, an infidel among Muslims, a lapsed Catholic, a Communist Party drop-out, an underground resister (who at times had to use code names and false identities), a “child of the state” raised by a widowed mother (who was illiterate and virtually deaf and dumb), Camus lived most of his life in various groups and communities without really being integrated within them. This outside view, the perspective of the exile, became his characteristic stance as a writer. It explains both the cool, objective (“zero-degree”) precision of much of his work and also the high value he assigned to longed-for ideals of friendship, community, solidarity, and brotherhood.

iv. Guilt and Innocence

Throughout his writing career, Camus showed a deep interest in questions of guilt and innocence. Once again Meursault in The Stranger provides a striking example. Is he legally innocent of the murder he is charged with? Or is he technically guilty? On the one hand, there seems to have been no conscious intention behind his action. Indeed the killing takes place almost as if by accident, with Meursault in a kind of absent-minded daze, distracted by the sun. From this point of view, his crime seems surreal and his trial and subsequent conviction a travesty. On the other hand, it is hard for the reader not to share the view of other characters in the novel, especially Meursault’s accusers, witnesses, and jury, in whose eyes he seems to be a seriously defective human being—at best, a kind of hollow man and at worst, a monster of self-centeredness and insularity. That the character has evoked such a wide range of responses from critics and readers—from sympathy to horror—is a tribute to the psychological complexity and subtlety of Camus’s portrait.

Camus’s brilliantly crafted final novel, The Fall, continues his keen interest in the theme of guilt, this time via a narrator who is virtually obsessed with it. The significantly named Jean-Baptiste Clamence (a voice in the wilderness calling for clemency and forgiveness) is tortured by guilt in the wake of a seemingly casual incident. While strolling home one drizzly November evening, he shows little concern and almost no emotional reaction at all to the suicidal plunge of a young woman into the Seine. But afterwards the incident begins to gnaw at him, and eventually he comes to view his inaction as typical of a long pattern of personal vanity and as a colossal failure of human sympathy on his part. Wracked by remorse and self-loathing, he gradually descends into a figurative hell. Formerly an attorney, he is now a self-described “judge-penitent” (a combination sinner, tempter, prosecutor, and father-confessor) who shows up each night at his local haunt, a sailor’s bar near Amsterdam’s red light district, where, somewhat in the manner of Coleridge’s Ancient Mariner, he recounts his story to whoever will hear it. In the final sections of the novel, amid distinctly Christian imagery and symbolism, he declares his crucial insight that, despite our pretensions to righteousness, we are all guilty. Hence no human being has the right to pass final moral judgment on another.

In a final twist, Clamence asserts that his acid self-portrait is also a mirror for his contemporaries. Hence his confession is also an accusation—not only of his nameless companion (who serves as the mute auditor for his monologue) but ultimately of the hypocrite lecteur as well.

v. Christianity vs. “Paganism”

The theme of guilt and innocence in Camus’s writings relates closely to another recurrent tension in his thought: the opposition of Christian and pagan ideas and influences. At heart a nature-worshipper, and by instinct a skeptic and non-believer, Camus nevertheless retained a lifelong interest in and respect for Christian philosophy and literature. In particular, he seems to have recognized St. Augustine and Kierkegaard as intellectual kinsmen and writers with whom he shared a common passion for controversy, literary flourish, self-scrutiny, and self-dramatization. Christian images, symbols, and allusions abound in all his work (probably more so than in the writing of any other avowed atheist in modern literature), and Christian themes—judgment, forgiveness, despair, sacrifice, passion, and so forth—permeate the novels. (Meursault and Clamence, it is worth noting, are presented not just as sinners, devils, and outcasts, but in several instances explicitly, and not entirely ironically, as Christ figures.)

Meanwhile alongside and against this leitmotif of Christian images and themes, Camus sets the main components of his essentially pagan worldview. Like Nietzsche, he maintains a special admiration for Greek heroic values and pessimism and for classical virtues like courage and honor. What might be termed Romantic values also merit particular esteem within his philosophy: passion, absorption in pure being, an appreciation for and indeed a willingness to revel in raw sensory experience, the glory of the moment, the beauty of the world.

As a result of this duality of influence, Camus’s basic philosophical problem becomes how to reconcile his Augustinian sense of original sin (universal guilt) and rampant moral evil with his personal ideal of pagan primitivism (universal innocence) and with his conviction that the natural world and our life in it have intrinsic beauty and value. Can an absurd world have intrinsic value? Is authentic pessimism compatible with the view that there is an essential dignity to human life? Such questions raise the possibility that there may be deep logical inconsistencies within Camus’s philosophy, and some critics (notably Sartre) have suggested that these inconsistencies cannot be surmounted except through some sort of Kierkegaardian leap of faith on Camus’s part—in this case a leap leading to a belief not in God but in man.

Such a leap is certainly implied in an oft-quoted remark from Camus’s “Letter to a German Friend,” where he wrote: “I continue to believe that this world has no supernatural meaning…But I know that something in the world has meaning—man.” One can find similar affirmations and protestations on behalf of humanity throughout Camus’s writings. They are almost a hallmark of his philosophical style. Oracular and high-flown, they clearly have more rhetorical force than logical potency. On the other hand, if we are trying to locate Camus’s place in European philosophical tradition, they provide a strong clue as to where he properly belongs. Surprisingly, the sentiment here, a commonplace of the Enlightenment and of traditional liberalism, is much closer in spirit to the exuberant secular humanism of the Italian Renaissance than to the agnostic skepticism of contemporary post-modernism.

vi. Individual vs. History and Mass Culture

A primary theme of early twentieth-century European literature and critical thought is the rise of modern mass civilization and its suffocating effects of alienation and dehumanization. This became a pervasive theme by the time Camus was establishing his literary reputation. Anxiety over the fate of Western culture, already intense, escalated to apocalyptic levels with the sudden emergence of fascism, totalitarianism, and new technologies of coercion and death. Here then was a subject ready-made for a writer of Camus’s political and humanistic views. He responded to the occasion with typical force and eloquence.

In one way or another, the themes of alienation and dehumanization as by-products of an increasingly technical and automated world enter into nearly all of Camus’s works. Even his concept of the Absurd becomes multiplied by a social and economic world in which meaningless routines and mind-numbing repetitions predominate. The drudgery of Sisyphus is mirrored and amplified in the assembly line, the business office, the government bureau, and especially in the penal colony and concentration camp.

In line with this theme, the ever-ambiguous Meursault in The Stranger can be understood as both a depressing manifestation of the newly emerging mass personality (that is, as a figure devoid of basic human feelings and passions) and, conversely, as a lone hold-out, a last remaining specimen of the old Romanticism—and hence a figure who is viewed as both dangerous and alien by the robotic majority. Similarly, The Plague can be interpreted, on at least one level, as an allegory in which humanity must be preserved from the fatal pestilence of mass culture, which converts formerly free, autonomous, independent-minded human beings into a soulless new species.

At various times in the novel, Camus’s narrator describes the plague as if it were a dull but highly capable public official or bureaucrat:

“It was, above all, a shrewd, unflagging adversary; a skilled organizer, doing his work thoroughly and well.” (180)

“But it seemed the plague had settled in for good at its most virulent, and it took its daily toll of deaths with the punctual zeal of a good civil servant.” (235)

This identification of the plague with oppressive civil bureaucracy and the routinization of charisma looks forward to the author’s play The State of Siege, where plague is used once again as a symbol for totalitarianism—only this time it is personified in an almost cartoonish way as a kind of overbearing government functionary or office manager from hell. Clad in a gaudy military uniform bedecked with ribbons and decorations, the character Plague (a satirical portrait of Generalissimo Francisco Franco—or El Caudillo as he liked to style himself) is closely attended by his personal Secretary and loyal assistant Death, depicted as a prim, officious female bureaucrat who also favors military garb and who carries an ever-present clipboard and notebook.

So Plague is a fascist dictator, and Death a solicitous commissar. Together these figures represent a system of pervasive control and micro-management that threatens the future of mass society.

In his reflections on this theme of post-industrial dehumanization, Camus differs from most other European writers (and especially from those on the Left) in viewing mass reform and revolutionary movements, including Marxism, as representing at least as great a threat to individual freedom as late-stage capitalism. Throughout his career he continued to cherish and defend old-fashioned virtues like personal courage and honor that other Left-wing intellectuals tended to view as reactionary or bourgeois.

vii. Suicide

Suicide is the central subject of The Myth of Sisyphus and serves as a background theme in Caligula and The Fall. In Caligula the mad title character, in a fit of horror and revulsion at the meaninglessness of life, would rather die—and bring the world down with him—than accept a cosmos that is indifferent to human fate or that will not submit to his individual will. In The Fall, a stranger’s act of suicide serves as the starting point for a bitter ritual of self-scrutiny and remorse on the part of the narrator.

Like Wittgenstein (who had a family history of suicide and suffered from bouts of depression), Camus considered suicide the fundamental issue for moral philosophy. However, unlike other philosophers who have written on the subject (from Cicero and Seneca to Montaigne and Schopenhauer), Camus seems uninterested in assessing the traditional motives and justifications for suicide (for instance, to avoid a long, painful, and debilitating illness or as a response to personal tragedy or scandal). Indeed, he seems interested in the problem only to the extent that it represents one possible response to the Absurd. His verdict on the matter is unqualified and clear: The only courageous and morally valid response to the Absurd is to continue living—“Suicide is not an option.”

viii. The Death Penalty

From the time he first heard the story of his father’s literal nausea and revulsion after witnessing a public execution, Camus began a vocal and lifelong opposition to the death penalty. Executions by guillotine were a common public spectacle in Algeria during his lifetime, but he refused to attend them and recoiled bitterly at their very mention.

Condemnation of capital punishment is both explicit and implicit in his writings. For example, in The Stranger Meursault’s long confinement during his trial and his eventual execution are presented as part of an elaborate, ceremonial ritual involving both public and religious authorities. The grim rationality of this process of legalized murder contrasts markedly with the sudden, irrational, almost accidental nature of his actual crime. Similarly, in The Myth of Sisyphus, the would-be suicide is contrasted with his fatal opposite, the man condemned to death, and we are continually reminded that a sentence of death is our common fate in an absurd universe.

Camus’s opposition to the death penalty is not specifically philosophical. That is, it is not based on a particular moral theory or principle (such as Cesare Beccaria’s utilitarian objection that capital punishment is wrong because it has not been proven to have a deterrent effect greater than life imprisonment). Camus’s opposition, in contrast, is humanitarian, conscientious, almost visceral. Like Victor Hugo, his great predecessor on this issue, he views the death penalty as an egregious barbarism—an act of blood riot and vengeance covered over with a thin veneer of law and civility to make it acceptable to modern sensibilities. That it is also an act of vengeance aimed primarily at the poor and oppressed, and that it is given religious sanction, makes it even more hideous and indefensible in his view.

Camus’s essay “Reflections on the Guillotine” supplies a detailed examination of the issue. An eloquent personal statement with compelling psychological and philosophical insights, it includes the author’s direct rebuttal to traditional retributionist arguments in favor of capital punishment (such as Kant’s claim that death is the legally appropriate, indeed morally required, penalty for murder). To all who argue that murder must be punished in kind, Camus replies:

Capital punishment is the most premeditated of murders, to which no criminal’s deed, however calculated, can be compared. For there to be an equivalency, the death penalty would have to punish a criminal who had warned his victim of the date on which he would inflict a horrible death on him and who, from that moment onward, had confined him at his mercy for months. Such a monster is not to be encountered in private life.

Camus concludes his essay by arguing that, at the very least, France should abolish the savage spectacle of the guillotine and replace it with a more humane procedure (such as lethal injection). But he still retains a scant hope that capital punishment will be completely abolished at some point in the time to come: “In the unified Europe of the future the solemn abolition of the death penalty ought to be the first article of the European Code we all hope for.” Camus himself did not live to see the day, but he would no doubt be gratified to know that abolition of capital punishment is now an essential prerequisite for membership in the European Union.

6. Existentialism

Camus is often classified as an existentialist writer, and it is easy to see why. Affinities with Kierkegaard and Sartre are patent. He shares with these philosophers (and with the other major writers in the existentialist tradition, from Augustine and Pascal to Dostoyevsky and Nietzsche) an habitual and intense interest in the active human psyche, in the life of conscience or spirit as it is actually experienced and lived. Like these writers, he aims at nothing less than a thorough, candid exegesis of the human condition, and like them he exhibits not just a philosophical attraction but also a personal commitment to such values as individualism, free choice, inner strength, authenticity, personal responsibility, and self-determination.

However, one troublesome fact remains: throughout his career Camus repeatedly denied that he was an existentialist. Was this an accurate and honest self-assessment? On the one hand, some critics have questioned this “denial” (using the term almost in its modern clinical sense), attributing it to the celebrated Sartre-Camus political “feud” or to a certain stubbornness or even contrariness on Camus’s part. In their view, Camus qualifies as, at minimum, a closet existentialist, and in certain respects (e.g., in his unconditional and passionate concern for the individual) as an even truer specimen of the type than Sartre.

On the other hand, besides his personal rejection of the label, there appear to be solid reasons for challenging the claim that Camus is an existentialist. For one thing, it is noteworthy that he never showed much interest in (indeed he largely avoided) metaphysical and ontological questions (the philosophical raison d’être of Heidegger and Sartre). Of course there is no rule that says an existentialist must be a metaphysician. However, Camus’s seeming aversion to technical philosophical discussion does suggest one way in which he distanced himself from contemporary existentialist thought.

Another point of divergence is that Camus seems to have regarded existentialism as a complete and systematic world-view, that is, a fully articulated doctrine. In his view, to be a true existentialist one had to commit to the entire doctrine (and not merely to bits and pieces of it), and this was apparently something he was unwilling to do.

A further point of separation, and possibly a decisive one, is that Camus actively challenged and set himself apart from the existentialist motto that existence precedes essence. Ultimately, against Sartre in particular and existentialists in general, he clings to his instinctive belief in a common human nature. In his view human existence necessarily includes an essential core element of dignity and value, and in this respect he seems surprisingly closer to the humanist tradition from Aristotle to Kant than to the modern tradition of skepticism and relativism from Nietzsche to Derrida (the latter his fellow-countryman and, at least in his commitment to human rights and opposition to the death penalty, his spiritual successor and descendant).

7. Camus, Colonialism, and Algeria

One of the main topics and even preoccupations of recent Camus studies has been the writer’s attitude, as reflected in both his fiction and his non-fiction, towards European colonialism in general and his response to the French-Algerian “problem” or “question” (as it was often termed) in particular. The first thing that can be noted in this respect is that, unlike Sartre and many other European intellectuals, Camus never delivered a formal critique of colonialism. Nor did he sign any of the frequent manifestos and declarations deploring the practice – a sin for which he was sharply criticized and even accused of moral cowardice. In 1958, partly to explain and vindicate himself, but mainly to illustrate and give voice to the painful complexities of colonial reform and decolonization, he published Algerian Chronicles, a collection of his writings on the vexing “problem” that he had personally agonized over for more than twenty years.

In addition to his perceived silence on the issue of colonialism (a silence, as Algerian Chronicles reveals, motivated by his fear that speaking out aggressively would be more likely to heighten tensions than secure the united and independent post-colonial Algeria he hoped for), Camus has also been criticized for the virtual erasure of Arab characters and culture from his fiction. The Irish writer and politician Conor Cruise O’Brien made a partial attempt to rescue Camus from this criticism by arguing that The Fall should be read as an autobiographical work in which Camus confesses his own personal failures, including his guilt at becoming a privileged citizen in a poor country. Several writers, and most prominently and forcefully Edward Said, have denounced the nearly total absence of Arab characters in Camus’s novels and stories. Moreover, the few Arab characters who do appear, these critics point out, are inevitably mute and anonymous. They are either shadow figures, including the nameless murder victim at the climactic center of The Stranger, or mere bodies, like the uncounted and unidentified native Algerians who presumably make up the major part of the death toll in The Plague but who otherwise have no speaking role or even visible presence in the novel. Along this same line of criticism, The Meursault Investigation is a fictional and metafictional riposte to Camus by the Algerian writer Kamel Daoud. A reimagining of the characters and events of The Stranger, told from the point of view of the brother of the murdered Arab, the novel represents both a corrective rebuke and a literary tribute to its famous original. In the introduction to her recent expanded edition of Algerian Chronicles, Alice Kaplan addresses these and related criticisms and cites relevant passages from Camus’s own writing in response to them.

8. Significance and Legacy

Obviously, Camus’s writings remain the primary reason for his continuing importance and the chief source of his cultural legacy, but his fame is also due to his exemplary life. He truly lived his philosophy; thus it is in his personal political stands and public statements as well as in his books that his views are clearly articulated. In short, he bequeathed not just his words but also his actions. Taken together, those words and actions embody a core set of liberal democratic values—including tolerance, justice, liberty, open-mindedness, respect for personhood, condemnation of violence, and resistance to tyranny—that can be fully approved and acted upon by the modern intellectual engagé.

On a purely literary level, one of Camus’s most original contributions to modern discourse is his distinctive prose style. Terse and hard-boiled, yet at the same time lyrical, and indeed capable of great, soaring flights of emotion and feeling, Camus’s style represents a deliberate attempt on his part to wed the famous clarity, elegance, and dry precision of the French philosophical tradition with the more sonorous and opulent manner of 19th century Romantic fiction. The result is something like a cross between Hemingway (a Camus favorite) and Melville (another favorite) or between Diderot and Hugo. For the most part when we read Camus we encounter the plain syntax, simple vocabulary, and biting aphorism typical of modern theatre or noir detective fiction. (It is worth noting that Camus was a fan of the novels of Dashiell Hammett and James M. Cain, and that his own work has influenced the style and the existentialist loner heroes of a succession of later crime writers, including John D. MacDonald and Lee Child.) This muted, laconic style frequently becomes a counterpoint or springboard for extended musings and lavish descriptions almost in the manner of Proust. Here we may note that this attempted reconciliation or union of opposing styles is not just an aesthetic gesture on the author’s part: it is also a moral and political statement. It says, in effect, that the life of reason and the life of feeling need not be opposed; that intellect and passion can, and should, operate together.

Perhaps the greatest inspiration and example that Camus provides for contemporary readers is the lesson that it is still possible for a serious thinker to face the modern world (with a full understanding of its contradictions, injustices, brutal flaws, and absurdities) with hardly a grain of hope, yet utterly without cynicism. To read Camus is to find words like justice, freedom, humanity, and dignity used plainly and openly, without apology or embarrassment, and without the pained or derisive facial expressions or invisible quotation marks that almost automatically accompany those terms in public discourse today.

At Stockholm Camus concluded his Nobel acceptance speech with a stirring reminder and challenge to modern writers: “The nobility of our craft,” he declared, “will always be rooted in two commitments, both difficult to maintain: the refusal to lie about what one knows and the resistance to oppression.” He left behind a body of work faithful to his own credo that the arts of language must always be used in the service of truth and the service of liberty.

9. References and Further Reading

a. Works by Albert Camus

  • The Stranger. Trans. Stuart Gilbert. New York: Vintage-Random House, 1946.
  • Camus’s first novel, a classic portrait of the “outsider” originally published in France as L’Étranger by Librairie Gallimard in 1942.
  • The Plague. Trans. Stuart Gilbert. New York: Vintage International, 1991.
  • Camus’s second novel, originally published in France as La Peste by Librairie Gallimard in 1947.
  • The Fall. Trans. Justin O’Brien. New York: Vintage-Random House, 1956.
  • Camus’s third novel, a confessional monologue originally published in France as La Chute by Librairie Gallimard in 1956.
  • The Myth of Sisyphus and other Essays. Trans. Justin O’Brien. New York: Vintage-Random House, 1955.
  • A philosophical meditation on suicide originally published as Le Mythe de Sisyphe by Librairie Gallimard in 1942.
  • The Rebel. Trans. Anthony Bower. New York: Vintage-Random House, 1956.
  • A philosophical essay on the ethics of rebellion and political violence originally published as L’Homme révolté by Librairie Gallimard in 1951.
  • Exile and the Kingdom. Trans. Justin O’Brien. New York: Vintage-Random House, 1958.
  • A collection of short fiction originally published as L’Exil et le Royaume by Librairie Gallimard in 1957.
  • Lyrical and Critical Essays. Ed. Philip Thody. Trans. Ellen Conroy Kennedy. New York: Vintage-Random House, 1970.
  • A selection of critical writings, including essays on Melville, Faulkner, and Sartre, plus all the early essays from Betwixt and Between and Nuptials.
  • Resistance, Rebellion, and Death. Trans. Justin O’Brien. New York: Vintage International, 1995.
  • A collection of essays on a wide variety of political topics ranging from the death penalty to the Cold War.
  • Caligula and Three Other Plays. Trans. Stuart Gilbert. New York: Vintage-Random House, 1958.
  • A collection of four of Camus’s best-known dramatic works: Caligula, The Misunderstanding, The State of Siege, and The Just Assassins, with a foreword by the author.
  • The First Man. Trans. David Hapgood. New York: Alfred Knopf, 1995.
  • A posthumous novel, partly autobiographical.
  • Camus at Combat: Writings 1944-1947. Ed. Jacqueline Lévi-Valensi. Trans. Arthur Goldhammer. Princeton, NJ: Princeton University Press, 2006.
  • A collection of articles and editorials that Camus wrote during and after WW II for the French Resistance journal Combat.
  • Algerian Chronicles. Ed. Alice Kaplan. Trans. Arthur Goldhammer. Cambridge, MA: Belknap Press, 2013.
  • A collection of Camus’s political writings on Algeria.

b. Critical and Biographical Studies

  • Barthes, Roland. Writing Degree Zero. New York: Hill and Wang, 1968.
  • Bloom, Harold, ed. Albert Camus. New York: Chelsea House, 1989.
  • Brée, Germaine. Camus. New Brunswick, NJ: Rutgers University Press, 1961.
  • Brée, Germaine, ed. Camus: A Collection of Critical Essays. Englewood Cliffs, NJ: Prentice-Hall, 1962.
  • Cruickshank, John. Albert Camus and the Literature of Revolt. London: Oxford University Press, 1959.
  • Cruickshank, John. The Novelist as Philosopher. London: Oxford University Press, 1959.
  • Foley, John. Albert Camus: From the Absurd to Revolt. Montreal: McGill-Queen’s University Press, 2008.
  • Hughes, Edward J., ed. The Cambridge Companion to Camus. Cambridge, UK: Cambridge University Press, 2007.
  • Kaufmann, Walter, ed. Religion from Tolstoy to Camus. New York: Harper, 1964.
  • Lottman, Herbert R. Albert Camus: A Biography. Corte Madera, CA: Gingko Press, 1997.
  • Malraux, André. Anti-Memoirs. New York: Holt, Rinehart, and Winston, 1968.
  • Margerrison, Christine, et al. Albert Camus in the 21st Century: A Reassessment of his Thinking at the Dawn of the New Millennium. Amsterdam, NL: Rodopi, 2008.
  • McBride, Joseph. Albert Camus: Philosopher and Littérateur. New York: St. Martin’s Press, 1992.
  • O’Brien, Conor Cruise. Camus. London: Faber and Faber, 1970.
  • Said, Edward. “Camus and the French Imperial Experience.” In Culture and Imperialism. New York: Vintage Books, 1994.
  • Sartre, Jean-Paul. “Camus’s The Outsider.” In Situations. New York: George Braziller, 1965.
  • Srigley, Ronald D. Albert Camus’ Critique of Modernity. Columbia, MO: University of Missouri Press, 2011.
  • Thody, Philip. Albert Camus, 1913-1960. London: Hamish Hamilton, 1961.
  • Todd, Olivier. Albert Camus: A Life. New York: Alfred A. Knopf, 1997.
  • Zaretsky, Robert. A Life Worth Living: Albert Camus and the Quest for Meaning. Cambridge, MA: Belknap Press of Harvard University Press, 2013.

 

Author Information

David Simpson
Email: dsimpson@depaul.edu
DePaul University
U. S. A.

Proper Functionalism

‘Proper Functionalism’ refers to a family of epistemological views according to which whether a belief (or some other doxastic state) was formed by way of properly functioning cognitive faculties plays a crucial role in whether it has a certain kind of positive epistemic status (such as being an item of knowledge, or a justified belief). Alvin Plantinga’s proper functionalist theory of knowledge has been the most prominent among these theories. Michael Bergmann’s (2006) proper functionalist theory of justification has also been the focus of much discussion. But proper functionalist theories of other epistemic properties have also been developed. Richard Otte (1987) and Alvin Plantinga (1993b: Chapter 9) offer proper functionalist theories of epistemic probability, for example. Nicholas Wolterstorff (2010) defends a proper functionalist theory of epistemic oughts. And Peter Graham (2010) develops a proper functionalist theory of epistemic entitlement. Since Plantinga’s theory of knowledge and Bergmann’s theory of justification are the most widely known and most discussed proper functionalist views, and because they share many features with other proper functionalist theories, this article focuses primarily on them—what can be said in their favor, the challenges they face, the ways in which they might be defended, and how they compare with some of their closest rivals.

Table of Contents

  1. Plantinga’s Proper Functionalist Theory of Knowledge
    1. Motivations of Plantinga’s Theory
    2. The Content of Plantinga’s Theory
    3. Swampman
    4. Gettier Cases
  2. Bergmann’s Proper Functionalist Theory of Justification
    1. Some Advantages of Bergmann’s Theory
    2. Some Objections to Bergmann’s Theory
  3. Rival Theories
    1. Proper Functionalism and Phenomenal Conservatism
    2. Proper Functionalism and Virtue Epistemology
  4. References and Further Reading

1. Plantinga’s Proper Functionalist Theory of Knowledge

This article begins with a discussion of Alvin Plantinga’s proper functionalist theory of knowledge. As Plantinga himself frames matters, he takes himself to be giving a proper functionalist theory of a property he calls “warrant,” where warrant is whatever precisely it is which makes the difference between knowledge and mere true belief.

a. Motivations of Plantinga’s Theory

A theory of warrant is subject to Gettier-style counterexamples if a belief can meet all the conditions the theory specifies as jointly sufficient for knowledge, but meet them merely by accident (in a manner that precludes that belief’s being an item of knowledge). Plantinga argues that any theory that fails to construe a proper function condition as necessary for warrant is subject to counterexamples of this sort. This is so whether the theory emphasizes the believer’s internal states as most relevant to whether her belief has warrant, external factors, or both of these.

By way of illustration, Plantinga (1993b: 31-37) adopts a scenario originally introduced by Roderick Chisholm, who attributes it to Alexius Meinong. The scenario envisions an aging forest ranger living in the mountains, with a set of wind chimes hanging from a bough. The ranger is unaware of the fact that his hearing has been degenerating of late, and it has gotten to the point where he can no longer hear the chimes. He is also unaware that he is occasionally subject to small auditory hallucinations in which he appears to hear the wind chimes. On one occasion, he is thus appeared to and comes to believe that the wind is blowing. As it happens, the wind is blowing and causing the ringing of the chimes. Even if we stipulate that all is going well with this belief from the ranger’s own internal perspective, it is clear nonetheless that his belief lacks warrant. His belief lacks warrant, Plantinga maintains, because it results from cognitive malfunction.

One might question whether this explanation is correct, however, on the ground that certain cognitively external environmental conditions are also amiss in this case. In particular, the case is one in which there is no reliable connection between the ranger’s appearing to hear the wind chimes and the wind’s blowing. And one might think that it is primarily for this reason that the ranger’s belief lacks warrant. This thought might push one toward bypassing proper functionalism and endorsing a reliabilist theory of warrant instead (that is, an account according to which a belief having warrant is primarily a matter of it being formed or sustained in a way that involves a reliable connection to the truth). But Plantinga argues that any reliabilist theory that does not incorporate a proper function condition is likewise subject to Gettier-style counterexamples.

Plantinga (1993a: 195-198, 205-207) takes this to be illustrated by The Case of the Epistemically Serendipitous Brain Lesion. Imagine Sam has a brain lesion, one that engenders cognitive processes which mostly result in false beliefs. One process the lesion engenders, however, is a process that results in the belief that one has a brain lesion. This particular process is highly reliable (it always results in one’s having a true belief). But clearly the belief that results is not a matter of knowledge. What explains why this is so, Plantinga maintains, is that the belief in question (though formed by a truth-reliable process) is not the result of cognitive proper function. Accordingly, Plantinga concludes that any reliabilist account of warrant must be augmented with a proper function condition.

Kenneth Boyce and Alvin Plantinga (2012: 127-128) have emphasized that there may be an even stronger lesson to be drawn from these cases. Once these cases are on the table, one can imagine variations of them in which different combinations of internal and external conditions (other than proper function ones) are met, but in which the belief in question lacks warrant because it ends up being true merely by accident. Furthermore, Boyce and Plantinga contend that in these cases it seems that part of what explains why these are cases in which the beliefs are true merely by accident (in a way that precludes their being items of knowledge) is that they were not formed in a manner specified by cognitive proper function; that is, the way they get at the truth is accidental from the perspective of the cognitive design plan. If that is correct, however, then (as Boyce and Plantinga point out) there is reason to believe that the notion of cognitive proper function is centrally involved in the notion of non-accidentality that any adequate analysis of warrant must capture.

b. The Content of Plantinga’s Theory

Examples of the sort discussed above are used by Plantinga to motivate the claim that cognitive proper function is necessary for warrant. Plantinga (1993b: 21-24) also maintains that the relevant notion of proper function presupposes that of a design plan—something that specifies the manner in which a thing is supposed to function in various circumstances. As Plantinga conceives of it, a design plan may be modeled as a set of ordered triples, where each triple specifies a circumstance, a response, and a purpose or function. One need not initially take this notion of a design plan to involve conscious design or purpose. The notion of a design plan at issue here is whatever notion is presupposed by talk of proper function for biological systems (as when a physician determines that a human heart is functioning the way it is supposed to on account of its pumping at 70 beats per minute). Plantinga himself gives a theistic account of this notion, but other proper functionalists, such as Ruth Millikan (1984) and Peter Graham (2012), have offered naturalistic, evolutionary accounts.
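The idea of a design plan as a set of ordered triples can be pictured with a small data structure. What follows is only an illustrative sketch, not Plantinga's own formalism: the class name `Triple`, the helper `functions_properly`, and the two toy entries are assumptions introduced here, and real circumstances and responses would of course not be simple strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One entry of a design plan: (circumstance, response, purpose)."""
    circumstance: str
    response: str
    purpose: str


# Hypothetical toy design plan. The heart entry echoes the example in the
# text; the belief entry is an invented illustration.
DESIGN_PLAN = frozenset({
    Triple("adult at rest", "heart pumps at 70 beats per minute", "circulate blood"),
    Triple("hears wind chimes ringing", "believe the wind is blowing", "form true beliefs"),
})


def functions_properly(plan, circumstance, actual_response):
    """True if the plan specifies `actual_response` for `circumstance`."""
    return any(
        t.circumstance == circumstance and t.response == actual_response
        for t in plan
    )


print(functions_properly(DESIGN_PLAN, "adult at rest",
                         "heart pumps at 70 beats per minute"))
print(functions_properly(DESIGN_PLAN, "adult at rest",
                         "heart pumps at 200 beats per minute"))
```

On this toy model, a thing functions properly in a circumstance just when its actual response is one the plan specifies for that circumstance. The model is silent on where the plan comes from, which matches the point that the notion need not initially be taken to involve conscious design.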

While Plantinga (1993b: 46) takes cognitive proper function to be necessary for warrant, he does not take it to be sufficient (or even nearly sufficient). Other conditions must also be satisfied. To a rough, first approximation, Plantinga takes a belief to be warranted if and only if it satisfies the following four conditions:

(1) The belief in question is formed by way of cognitive faculties that are properly functioning.

(2) The cognitive faculties in question are aimed at the production of true beliefs.

(3) The design plan is a good one. That is, when a belief is formed by way of truth-aimed cognitive proper function in the sort of environment for which the cognitive faculties in question were designed, there is a high objective probability that the resulting belief is true.

(4) The belief is formed in the sort of environment for which the cognitive faculties in question were designed.

While Plantinga adds various nuances, these four conditions serve to capture the main outlines of his view.
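As a rough aid, the first-approximation account can be written as a conjunction of four yes-or-no checks. The sketch below is a simplification under stated assumptions: the names `BeliefFormation` and `is_warranted` are hypothetical, each condition is reduced to a Boolean flag, and the flag values assigned to the example cases are illustrative readings of the scenarios described earlier rather than anything Plantinga specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeliefFormation:
    properly_functioning: bool      # condition (1)
    truth_aimed: bool               # condition (2)
    good_design_plan: bool          # condition (3)
    designed_for_environment: bool  # condition (4)


def is_warranted(b: BeliefFormation) -> bool:
    """First approximation: warranted if and only if all four conditions hold."""
    return (
        b.properly_functioning
        and b.truth_aimed
        and b.good_design_plan
        and b.designed_for_environment
    )


# An ordinary perceiver in ordinary conditions (assumed to meet all four).
ordinary = BeliefFormation(True, True, True, True)

# The forest ranger: the belief stems from an auditory hallucination, so
# condition (1) fails. The other flags are assumptions and do not affect
# the verdict.
ranger = BeliefFormation(False, True, True, True)

print(is_warranted(ordinary))
print(is_warranted(ranger))
```

Because the account is a conjunction, the failure of condition (1) alone settles the ranger case, whatever is assumed about the remaining three conditions.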

Many objections have been raised to Plantinga’s theory. Two of the most prominent among them are considered below. The first amounts to an objection to the claim that Plantinga’s four conditions are necessary for warrant. The second amounts to an objection to the claim that they are sufficient. For a sampling of other objections, one would do well to examine the collection of essays on Plantinga’s theory of warrant edited by Jonathan L. Kvanvig (1996).

c. Swampman

Some have argued that there are counterexamples to Plantinga’s theory involving beings who have warranted beliefs but who nevertheless fail to exhibit cognitive proper function. The most well-known version of this objection comes from Ernest Sosa (1993), who adapts a scenario originally proposed by Donald Davidson, and uses it against proper functionalism. In that scenario, Davidson is standing next to a swamp when lightning strikes a nearby dead tree, thereby obliterating Davidson. Simultaneously, by sheer accident, the lightning also causes the molecules of the tree to arrange themselves into a perfect duplicate of Davidson as he was at the time of his demise. The Davidson duplicate—this “Swampman”—leaves the swamp, acting and talking as if it were Davidson, having all the same intrinsic properties that Davidson would have had, had he left the swamp without having his unfortunate encounter. According to Sosa, “it … seems logically possible for … Swampman to have warranted beliefs not long after creation if not right away” (p. 54). Yet, not being the product of intentional design, and not having any evolutionary history, it would seem that Swampman has no design plan. And so we have what appears to be a counterexample to proper functionalism.

There are various responses to the Swampman objection. Plantinga (1993c: 206-208) and Graham (2012: 466-467) have each argued, albeit for different reasons, that it is doubtful the Swampman scenario is metaphysically possible. They have also suggested, again for different reasons, that if this scenario is possible, perhaps Swampman can acquire conditions for proper functioning without natural selection or intentional design. See Plantinga (1993c: 78) and Graham (2014). Bergmann (2006: 147-149) has argued that we are intuitively inclined to assign positive epistemic status to Swampman’s beliefs only to the extent we are inclined to think that his beliefs are fitting responses to the inputs he receives. And we are inclined to think that Swampman’s beliefs are fitting, argues Bergmann, only to the extent we are inclined to think of those responses as exhibiting cognitive proper function. Boyce and Plantinga (2012: 130-131) have suggested that since it is merely by accident that Swampman is forming his beliefs reliably, we can think of this case as a Gettier scenario (or at least as relevantly analogous to one), and thereby deny that Swampman’s beliefs have warrant. For a similar response, see McNabb (2015).

More recently, Kenneth Boyce and Andrew Moon (2015) have argued that the Swampman objection relies on a false intuition concerning the conditions under which the belief of one creature has warrant if the belief of another, similar creature does. According to them, the central intuition behind our reaction to the Swampman case may be stated as follows:

(CI) If a belief B is warranted for a subject S and another subject S* comes to hold B in the same way that S came to hold B in a relevantly similar environment to the one in which S came to hold B, then B is warranted for S*.

They argue that it is CI, in conjunction with the stipulation that Swampman forms his beliefs in the same way that an ordinary human being would (an ordinary human being to whom we would be inclined to attribute knowledge), that explains our tendency to regard Swampman as having warranted beliefs. Boyce and Moon then go on to argue that CI is subject to counterexamples, and that this undercuts the force of the Swampman objection. See Section 3b for further discussion of their argument.
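The logical shape of CI, and of what a counterexample to it must look like, can be made explicit. The rendering below is one possible regimentation, not Boyce and Moon's own notation: the predicate letters $W$ (is warranted for), $\mathrm{SameWay}$, and $\mathrm{SimEnv}$ are labels assumed here for illustration.

```latex
% W(B,S):             belief B is warranted for subject S
% SameWay(S*,S,B):    S* comes to hold B in the same way S did
% SimEnv(S*,S,B):     S* does so in an environment relevantly similar to S's
\[
\forall S\,\forall S^{*}\,\forall B\;
\bigl[\, W(B,S) \,\wedge\, \mathrm{SameWay}(S^{*},S,B) \,\wedge\, \mathrm{SimEnv}(S^{*},S,B) \,\bigr]
\;\rightarrow\; W(B,S^{*})
\]
```

Read this way, CI is a universally quantified conditional, so a single case suffices to refute it: a pair of subjects for whom all three conjuncts of the antecedent hold while $W(B,S^{*})$ fails.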

d. Gettier Cases

Plantinga has conceded that his theory, as he originally formulated it, is subject to Gettier-style counterexamples. In 2000, Plantinga formulated this counterexample:

I own a Chevrolet van, drive to Notre Dame on a football Saturday, and unthinkingly park in one of the many places reserved for the football coach. Naturally, his minions tow my van away and, as befits such lèse majesté, destroy it. By a splendid piece of good luck, however, I have won the Varsity Club’s Win-a-Chevrolet-Van contest, although I haven’t yet heard the good news. You ask me what sort of automobile I own; I reply, both honestly and truthfully, “A Chevrolet van.” My belief that I own such a van is true, but ‘just by accident’ (more accurately, it is only by accident that I happen to form a true belief); hence it does not constitute knowledge. All of the non-environmental conditions for warrant, furthermore, are met. It also looks as if the environmental condition is met: after all, isn’t the cognitive environment here on earth and in South Bend just the one for which our faculties were designed?

Clearly Plantinga’s belief (though true) is not an item of knowledge in this case and thus lacks warrant. So Plantinga’s original four conditions are not jointly sufficient for warrant. Something else must be added. But what?

According to Plantinga, what the original account requires is an addition to the environmental condition. More specifically, the problem in the above case is that while the global environment that Plantinga is in is the one for which his faculties were designed, his more local environment is epistemically misleading. So in order to deal with this counterexample, Plantinga proposes adding a resolution condition. This condition involves a distinction between two different kinds of environment, what Plantinga refers to as the “maxi-environment” and what he refers to as the “mini-environment.” The maxi-environment, Plantinga stipulates, is the kind of global environment in which we live here on earth, the kind of environment for which our cognitive faculties were designed (or to which they were adapted). The mini-environment, by contrast, is a much more specific state of affairs, one that includes, for a given exercise of one’s cognitive faculties E resulting in a belief B, all of the epistemically relevant circumstances obtaining when B is formed (though diminished with respect to whether B is true).

Letting ‘MBE’ denote the cognitive mini-environment with respect to B and E (which Plantinga says may contain as large a fragment of the actual world as one likes, up to whether B is true), Plantinga maintains that the needed resolution condition may be stated as follows:

(RC) A belief B produced by an exercise of cognitive powers has warrant sufficient for knowledge only if MBE (the mini-environment with respect to B and E) is favorable for E.

This, of course, raises the question of just what it is for a mini-environment to be “favorable.” Plantinga has, in the past, offered various proposals for what favorableness consists in that he has subsequently admitted to be unsatisfactory. A proposal is found in Boyce and Plantinga (2012: 134). For other proposals, see Crisp (2000) and Chignell (2003).

2. Bergmann’s Proper Functionalist Theory of Justification

Plantinga’s theory of warrant is not the only kind of proper functionalist theory. Proper functionalist theories of other epistemic concepts have also been developed. Noteworthy among these is Michael Bergmann’s proper functionalist theory of epistemic justification. The kind of epistemic justification that Bergmann (2006: 4-5) is interested in is doxastic justification. The having of this property is frequently (though not universally) held to be a necessary condition for a belief being an item of knowledge. In fact, it is often held that a belief having this property, in conjunction with its being non-accidentally true (in a way that rules out Gettier cases), is not only necessary, but also sufficient, for its being an item of knowledge.

A major divide in the literature occurs between those philosophers who are “externalists” about this kind of justification and those who are “internalists” about it. Just how this divide should be characterized is itself a matter of dispute. But for present purposes, we may characterize internalists about justification as being committed (at least) to the view that whether a belief is justified depends entirely on which mental states that belief is based upon (in such a way that necessarily, any two believers who are exactly alike in terms of their mental states and in terms of which of those mental states their beliefs are based upon are also alike in terms of which of their beliefs are justified). Externalists, by contrast, maintain that whether a belief is justified may depend on other factors.

It should be noted, however, that Bergmann (2006: chapter 3) divides up the territory a bit differently, though not in a way that impacts the current discussion. He takes it to be a necessary condition for a view of justification to count as “internalist” that it include an awareness requirement (that is, that it require, in order for a belief to be justified, that the believing subject is actually or potentially aware of some justification-contributor to that belief). The characterization of internalism given here, by contrast, includes no such requirement (and is similar to the characterization of a view of justification that Bergmann calls “mentalism,” one which he takes to be distinct from both externalism and internalism).

As Bergmann (2006: 3-7) points out, it is not always clear that philosophers who appear to dispute the nature of justification are actually disagreeing with one another. That is because it is plausible that epistemologists sometimes use the term ‘justification’ in different ways. He notes, for example, that some epistemologists use this term to pick out a subjective notion, one that is satisfied by a belief provided that the subject is blameless in holding it. Others, by contrast, he observes, use the term to pick out a more objective notion, one according to which a belief is justified only if it is fitting with respect to the believer’s evidence or other epistemically relevant inputs. It is this objective notion of justification in which Bergmann is interested (see also pp. 111-113). He takes it to be a conceptually open question as to whether this kind of justification is necessary for knowledge (though he thinks it is). And he also takes some disputes between self-avowed externalists (like himself) and self-avowed internalists (such as Richard Feldman and Earl Conee) to involve a genuine disagreement concerning the nature of this kind of justification.

Bergmann argues that the right way to analyze this kind of justification is in terms of proper function. More specifically, Bergmann’s (2006: 132-137) theory of epistemic justification takes the first three of Plantinga’s conditions (leaving out the fourth, environmental condition) to be necessary for a belief to be justified. Bergmann also takes the first three of Plantinga’s conditions, in conjunction with the condition that the subject does not take the relevant belief to be defeated, to be sufficient for a belief being justified. The motivations for this view are perhaps best appreciated by looking to its purported advantages.
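Since this summary states separate necessary and sufficient conditions, it may help to see how they combine. The sketch below encodes only what the summary above says: the first three of Plantinga's conditions are treated as necessary, and those three plus the absence of a believed defeater as sufficient. The names `Belief` and `justification_verdict`, and the use of `None` for cases this summary leaves undecided, are assumptions made for illustration and are not drawn from Bergmann's text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Belief:
    properly_functioning: bool   # Plantinga's condition (1)
    truth_aimed: bool            # Plantinga's condition (2)
    good_design_plan: bool       # Plantinga's condition (3)
    taken_to_be_defeated: bool   # does the subject take the belief to be defeated?


def justification_verdict(b: Belief) -> Optional[bool]:
    """Verdict under the summary above; None where the summary is silent."""
    core = b.properly_functioning and b.truth_aimed and b.good_design_plan
    if not core:
        return False  # a necessary condition fails
    if not b.taken_to_be_defeated:
        return True   # the jointly sufficient conditions are met
    return None       # necessary conditions met, sufficient ones not


# Note that no environmental flag is consulted: a properly functioning
# subject in an unfavorable environment can still come out justified.
displaced_subject = Belief(True, True, True, taken_to_be_defeated=False)
malfunctioning = Belief(False, True, True, taken_to_be_defeated=False)

print(justification_verdict(displaced_subject))
print(justification_verdict(malfunctioning))
```

The absence of any field for the fourth, environmental condition is the point of the sketch: on this summary, whether the subject is actually in the environment for which the faculties were designed makes no difference to the verdict.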

a. Some Advantages of Bergmann’s Theory

Epistemic justification of the kind Bergmann has in mind has some puzzling features. On the one hand, it involves some notion of truth-aptness. In particular, there would appear to be some important, non-trivial, connection between a belief being justified and it being objectively likely to be true. At the very least, it would be a significant cost for a theory of justification to deny this. But which ways of forming and sustaining beliefs result in a high proportion of true beliefs depends on what sort of environment one is in. Our tending to believe that occluded objects still exist, for example, results in a high proportion of true beliefs in our environment, but it is easy to imagine environments in which this would not be the case. These considerations push in the direction of regarding what makes for epistemic justification a contingent matter, one that depends on the sort of environment one inhabits.

On the other hand, justification is a normative concept, the satisfaction of which does not appear to depend on the sort of environment in which one is located. This aspect of justification is made especially vivid by “The New Evil Demon Problem”, originally put forward by Keith Lehrer and Stewart Cohen (1983) as a problem for reliabilist theories of justification. Consider a population of beings, just like ourselves, who form their beliefs in response to experience in just the ways that we do, but who (unlike us) are victims of a Cartesian demon who renders their belief-forming processes unreliable. From many reliabilist theories of justification, it follows that these beings have far fewer justified beliefs than we do (since most of their beliefs are not formed in a truth-reliable manner). But this seems false. These beings are in an epistemically bad situation, to be sure, but they are still forming their beliefs in ways that are appropriate given their experiences, and so their beliefs are at least justified.

Bergmann’s theory of epistemic justification nicely accommodates both of these puzzling features. First, it accommodates the intuition that inhabitants of a demon world, who are like us, and who form their beliefs in response to experience in the same ways we do, have the same proportion of justified beliefs as we do. For, as Bergmann (2006: 141-143) notes, his theory entails that provided these beings have a cognitive design plan comparable to ours and are properly functioning, many of their beliefs are justified, even though their ways of forming beliefs are, for the most part, unreliable. This analysis also, as Bergmann points out, accommodates the intuition that justification is importantly and non-trivially connected with truth-aptness. For, insofar as the beings living in a demon world fulfill Bergmann’s conditions for justification, the manner in which they form their beliefs would be truth-apt if they were placed in the environment for which their cognitive faculties were designed. Finally, since different design plans may be tailored to different kinds of environments, Bergmann’s theory accommodates the possibility that what makes for justification is a contingent matter, one that depends on the kind of environment for which the creatures at issue are designed.

b. Some Objections to Bergmann’s Theory

Like Plantinga’s theory, Bergmann’s faces the objection that it is subject to counterexamples involving creatures like Swampman. There is no need, though, to rehearse the various responses that might be given to this objection here (since many of them will be the same or similar to those described in Section 1c). As a theory of justification, however, Bergmann’s view also faces other objections, ones which are not (or not as obviously) applicable to a theory of warrant.

Todd R. Long (2012: 264-265) questions, for example, whether Bergmann’s theory does in fact do a better job than alternative views in handling the New Evil Demon Problem. He grants that Bergmann’s view does accommodate the intuition that demon-world victims with the same design plan as ours do in fact have justified beliefs (in the same proportions that we do). But he notes that Bergmann’s view also entails that the same cannot be said for demon-world victims who are mentally indistinguishable from ourselves but whose ways of forming beliefs run contrary to their design plan. And Long maintains that to deny that beliefs of demon-world victims in the latter situation are justified also runs contrary to our intuitions. Bergmann (2006: 150), however, anticipates an objection like this. He suggests that there is an analogy between Swampman and the demon victims in such a scenario; accordingly, he adapts his reply to the former so as to apply it to the latter.

Another kind of objection to a proper functionalist theory of justification involves cases in which the design plan specifies ways of belief formation that appear to be objectively bad in some way, in spite of the fact that this component of the design plan is successfully aimed at truth. Long (2012) and Tucker (2014b) each present variations of this objection directed specifically against Bergmann’s view. There are also precedents found among objections to Plantinga’s theory of warrant (see for example Feldman 1993: 44). There are at least two kinds of cases of this sort. The following discussion will make reference to cases described by Tucker, who provides examples of each kind.

In the first kind of case, the design plan specifies coming to hold a belief on the basis of what appears to be an objectively bad form of reasoning. Tucker (2014b: 3321-3322) presents a case, for instance, in which a design plan specifies coming to hold a certain belief on the basis of the fallacy of denying the antecedent. As Tucker points out, even though denying the antecedent is, from a logical point of view, an objectively bad form of reasoning, there are circumstances in which reasoning that way is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify forming a belief in that way, under the right conditions. Even so, it is counterintuitive to think that a belief formed by way of committing a logical fallacy could be justified (at least in the absence of having any further basis).

In the second kind of case, the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Tucker (pp. 3318-3319) presents an example, for instance, in which a person comes to believe Gödel’s incompleteness theorem solely on the basis of his belief that his students hate a particular type of beer. Since Gödel’s incompleteness theorem is a necessary truth, there is no question that this belief-forming process is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify that a belief be formed in this way. Even so, it seems wrong to say that someone could come to be justified in believing Gödel’s incompleteness theorem solely on the basis of that belief.

This is a formidable objection. But there may be things that can be said on the proper functionalist’s behalf. Consider once again the first kind of case, a case that involves coming to hold a belief on the basis of formally bad reasoning. Some things that have been said in defense of reliabilism might also be of use to the proper functionalist here. Alvin Goldman (2002: 146-153), for example, points to research on the part of cognitive psychologists (such as Amos Tversky and Daniel Kahneman) indicating that human beings tend to rely on heuristics when engaged in probabilistic reasoning. As is now well known, these heuristics make people prone to commit elementary probabilistic fallacies. The conclusion that some psychologists have drawn is that these findings indicate that human beings are terrible at probabilistic reasoning. But as Goldman notes, other psychologists have drawn a more optimistic conclusion.

Goldman points to the work of a group of evolutionary psychologists (led by Gerd Gigerenzer, Leda Cosmides, and John Tooby) who argue that, given the limited information and computational power with which organisms must contend, an inference mechanism can be advantageous if it (in Goldman’s words) “often draws accurate conclusions about real-world environments, and does so quickly and with little computational effort” (p. 152). The heuristics humans rely on in probabilistic reasoning, some of these psychologists maintain, are mechanisms of just that sort. If that is the case, then perhaps human beings often do come to hold justified beliefs by way of these mechanisms after all, in spite of the fact that they are formally suspect. And if that is so, then perhaps other kinds of beings might come to form justified beliefs on the basis of kinds of reasoning that (from a purely logical point of view) are formally suspect, but nonetheless reliable in the environments for which their cognitive faculties were designed.

Now consider the second kind of case, the case in which the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Why is it exactly (concerning Tucker’s example) that we are inclined to deny that a person’s belief that his students dislike a particular type of beer could justify the belief that Gödel’s incompleteness theorem is true? Perhaps it is because there does not appear to be any interesting logical connection between the content of the latter belief and the belief on which it is based. But a similar observation concerning the relationship between our sense experiences and the content of our perceptual beliefs is part of what motivates Bergmann’s proper functionalist theory.

As Bergmann (2006: 119) points out, “Thomas Reid emphasized that there does not seem to be any logical connection between our sense experiences and the content of the beliefs based on them.” Bergmann notes, for example, that “the tactile sensations we experience when touching a hard surface seem to have no logical relation to (nor do they resemble) the content of the hardness beliefs they prompt.” Because this is so, Bergmann argues that the evidential support relations that hold between various sensory experiences and the beliefs formed in response to them cannot be explained in terms of necessary connections. But this prompts the question as to what does explain these support relations. Bergmann (2006: 130-131) argues that proper functionalism provides a good answer to this question. The connections are to be explained by way of which belief-forming responses to sensory inputs are specified by the cognitive design plan.

To accept this motivation for proper functionalism is to accept the claim that at least some epistemic support relations hold only contingently. It is also to countenance the possibility that the epistemic support relations that hold for certain cognizers might seem utterly bizarre from the perspective of creatures like us. So, perhaps, for those who do take this motivation on board, the possibility of an agent’s coming to justifiably believe that Gödel’s incompleteness theorem is true solely on the basis of a belief concerning the beer preferences of his students no longer seems so counterintuitive. (See Bergmann (2006: 141) for a similar response to BonJour’s purported counterexamples to externalist views of justification involving reliably formed clairvoyant beliefs).

3. Rival Theories

Proper functionalist theories do not exist in a vacuum. A full appreciation of their merits or demerits requires an investigation into how well they stack up against their rivals. Two kinds of theories in particular are often put up against proper functionalism: phenomenal conservatism and virtue epistemology. It is sometimes claimed by the proponents of these theories that they satisfy many of the same motivations as proper functionalism, while having fewer costs, as well as other advantages.

a. Proper Functionalism and Phenomenal Conservatism

At least to a first approximation, a phenomenal conservative theory of doxastic justification may be characterized as the view that a belief with the content that p is justified for an agent if it seems to the agent that p, the agent appropriately bases her belief that p on that seeming, and the agent has no defeaters for that belief. (See Phenomenal Conservatism for more details). As noted in Section 2a, proper functionalists about justification point to the apparent contingency of the connection between various experiences and the beliefs they justify as a motivation for their view. Phenomenal conservatives sometimes claim that their view does just as well at accommodating this apparent contingency while preserving the claim that there is a necessary connection between the things that justify our beliefs and the beliefs they justify. For this reason, phenomenal conservatism might be thought to do a better job than proper functionalism in accommodating the New Evil Demon intuition. Some phenomenal conservatives have also contended that it does a better job in accounting for the nature of evidential support.

Tucker (2011: 58-63) presses this point in connection with his objection (discussed in Section 2b) that proper functionalism allows for inputs which intuitively fail to provide any kind of epistemic support for a belief to justify that belief. In the example previously discussed, Tucker pointed to an instance in which a belief served as such an input. But Tucker also supplies examples in which the same seems to be true of the support relations that hold between various sensory experiences and the beliefs they are purported to justify. He notes, for example, that it is counterintuitive to think that a sensory experience associated with seeing a beautiful sunset could justify the belief that Gödel’s incompleteness theorem is true. But a design plan (presumably different from ours) might well specify that this is an appropriate belief-forming process.

Here the proper functionalist might attempt once more to press the Reidian point that in general it appears true that there is no inherent connection between our sensory experiences and the contents of the beliefs based on them. But Tucker (2011: 56-58, 61-63) suggests a way the advocate of phenomenal conservatism could account for the role that sensory experience plays in justifying our beliefs that accommodates this fact. According to Tucker, sensory experience might play a role in the justification of a certain belief by triggering a seeming with the content of that belief, it being a contingent matter which sensations trigger which seemings. Andrew Cullison (2013: 34-37) makes a similar suggestion, noting that just as two different sentences from different languages might well express the same proposition, two different kinds of cognitive apparatus associated with different species might cause seemings of the same content in response to differing kinds of phenomenology. This accommodates the Reidian point while preserving the claim that there is a necessary connection between the things that justify our beliefs (that is, our seeming states) and the beliefs they justify (via the identities of their contents).

Suppose one agrees that a phenomenal conservative view of justification does better than a proper functionalist view on these counts. This of course does not commit one to agreeing that phenomenal conservatism does better than proper functionalism overall. Bergmann (2013) argues, for example, that proper functionalists can accommodate many of the intuitions that motivate phenomenal conservatism, while also doing a better job in accommodating the intuition that some belief formations, downstream from sensory experiences, are objectively fitting responses to those experiences, whereas others are not.

Bergmann notes, for instance, that proper functionalists might adopt a model according to which, for humans (though not necessarily for all cognizers), when all goes well, a belief formed in response to a sensory experience is justified via being based on an intermediate seeming (one that is appropriately caused by the experience). He argues that this model accommodates many of the intuitions to which phenomenal conservatives appeal. But it also, he points out, allows for the possibility that there is an objective mismatch between a belief formed in response to a sensory experience and the nature of that experience, one which prevents the belief in question from being justified, even when the content of that belief matches the content of the intermediate seeming.

Bergmann describes, for example, a case in which a human cognizer, suffering from brain damage, forms the belief that she is holding a hard spherical object, in response to the olfactory sensation she experiences while smelling a lilac bush. Even if it is stipulated that she bases this belief on an intermediate seeming with the same content as her belief, it can still seem that her belief is objectively unfitting (in relation to her experience) and, for that reason, unjustified. A proper functionalist can accommodate this intuition, Bergmann claims, whereas a phenomenal conservative cannot. The proper functionalist can maintain that the reason the cognizer’s belief is objectively unfitting in this case is that, even though it is based on an appropriate intermediate seeming, it is not the appropriate response to the relevant sensation; it is not the belief her design plan specifies should result.

Relatedly, one might think that proper functionalism does better than phenomenal conservatism in accounting for the relation between justification and truth-aptness. A common objection to phenomenal conservative views is that they suffer from a “cognitive penetration” problem. In certain kinds of wishful thinking cases, for example, a seeming state might be caused by a desire; and in some such cases the believer in question will be unaware of this fact, and have no defeater for the belief in question. According to phenomenal conservatives, a belief properly based on such a seeming will still be justified. But to many this seems wrong. One explanation for why this consequence seems wrong is that it threatens to radically undermine the connection between justification and truth. A proper functionalist, by contrast, might maintain that when such cognitively penetrated seemings are produced in human beings, this is due either to cognitive malfunction or to one of the non-truth-aimed facets of our cognitive design plan (either of which, according to her view, would render the belief unjustified). See Tucker (2014a), however, for an argument that proper functionalists also suffer from cognitive penetration problems.

b. Proper Functionalism and Virtue Epistemology

According to John Greco (1993: 414), “the central idea of virtue epistemology is that, Gettier problems aside, knowledge is true belief which results from one’s cognitive virtues.” Similarly, Sosa (1993: 64) characterizes it as consisting of a family of theories which may be seen as “varieties of a single more fundamental option in epistemology, one which puts the explicative emphasis on truth-conducive intellectual virtues or faculties.”

Virtue epistemology is often thought of as coming in at least two varieties. Virtue responsibilists emphasize character traits: intellectual virtues such as open-mindedness, conscientiousness, perseverance in seeking the truth, and so on. Virtue reliabilists emphasize cognitive faculties, abilities, or competencies. (See Virtue Epistemology for more details). Of these two, it is virtue reliabilism that is most akin to proper functionalism. Accordingly, virtue reliabilism serves as a closer competitor. Or rather, since Greco (1993: 414) and Sosa (1993: 64) have both classified proper functionalism as a version of virtue epistemology, perhaps it should be said that it is the non-proper-functionalist versions thereof which may be seen as close competitors. For ease of exposition, the following discussion will focus on Sosa’s development of such a version.

According to Sosa’s (2015: 10) virtue theory of knowledge, knowledge is “apt belief” where apt belief is “belief that gets it right through competence rather than luck.” More precisely, according to Sosa, an apt belief is a belief that sufficiently manifests an “epistemic competence” (that is, a competence to get at the truth) (p. 9), where “a competence is in turn understood as a disposition to succeed in a given field of aimings, these being performances with an aim, whether the aim be intentional and even conscious, or teleological and functional” (p. 2). Note the similarity to proper functionalism here. Sosa’s epistemic competences are akin to Plantinga’s truth-aimed cognitive faculties. Both involve the property of being aimed at the formation of true beliefs, and both (when all goes well) are exercised in a way that is conducive to the fulfilment of that aim.

One way in which Sosa’s epistemic competencies differ from Plantinga’s truth-aimed cognitive faculties, however, is that the former do not initially seem to presuppose any notion of a design plan. And this might make Sosa’s theory more adept at accommodating things like Swampman scenarios (see the discussion in Section 1c). Indeed, it was Sosa (1993) who made famous that objection to proper functionalism. It might also make Sosa’s view more appealing to those who are both naturalistically inclined and skeptical about the prospects for a naturalistically acceptable account of cognitive proper function.

Proper functionalists have called into question whether Sosa’s view does in fact have these advantages. Plantinga (1993c: 79) has argued, for example, that in order to handle The Case of the Epistemically Serendipitous Brain Lesion (discussed in Section 1a), Sosa’s epistemic virtues must involve competencies or faculties that are subject to proper function conditions. If that is right, then, as Plantinga (p. 81) points out, Sosa’s view (developed so as not to be subject to this counterexample) becomes a variety of proper functionalism. It should be noted, however, that virtue epistemologists may have other ways of dealing with this case. John Greco (2010: 152) has suggested, for instance, that “in the brain lesion case, the problem is not so much a lack of health as it is a lack of cognitive integration.” “The cognitive processes associated with the brain lesion,” claims Greco, “are not sufficiently integrated with other of the person’s cognitive dispositions so as to count as being part of cognitive character.” Whether this reply is successful may turn on just what is necessary for a cognitive process to exhibit the kind of cognitive integration required. Greco (2010: 152) suggests his own, non-proper-functionalist criteria. But it is open to proper functionalists to argue that part of what is required is incorporation into one’s cognitive design plan.

More recently, Boyce and Moon (2015) have argued that there are other kinds of cases that pose a challenge to the claim that a true belief manifesting a competence is sufficient for its being an item of knowledge. As noted in Section 1c, Boyce and Moon propose a counterexample to what they regard as the central intuition underlying the Swampman Objection to proper functionalism. Their counterexample employs some of the cognitive science literature on initial knowledge, which supports the claim that human beings sometimes come to know things by way of innate, unlearned cognitive responses (see for example Spelke, 1994). Drawing from this literature (as well as from Bergmann, 2006: 116-121), Boyce and Moon argue that some of these innate responses are merely contingently appropriate ways of forming beliefs (where the appropriateness at issue is of a kind necessary for warrant). They argue that while these responses are appropriate for human beings, given the kinds of environments to which humans are adapted, the same need not have been true for other kinds of beings.

Boyce and Moon then go on to argue that these facts entail there are possible cases involving two cognitive agents, who are members of different species, coming to hold the same belief, in the same way, in the same environment, but in which that belief is warranted for one of them (on account of its resulting from a way of forming beliefs that is appropriate for members of that species) but not the other. They further argue that not only do these cases furnish counterexamples to the central intuition motivating the Swampman objection to proper functionalism, but that they also provide a challenge to alternative theories. Boyce and Moon suggest, for instance, that they afford potential counterexamples to Sosa’s theory, at least insofar as it does not recognize factors such as proper function conditions or species membership as relevant to competence possession.

Proper functionalists point to the kinds of cases alluded to above as lending support to the view that a belief having arisen by way of cognitive proper function is necessary for it to count as an item of knowledge. It should be acknowledged, however, that virtue epistemologists have pointed to other kinds of cases in which the opposite seems true. John Greco (2010: 151-153) has noted, for example, that there appear to be “cases of improper function that actually increase a person’s capacity to know.” Greco cites various cases documented by the neurologist Oliver Sacks (1970) in order to illustrate this point. “An obvious example,” says Greco, “is the story of autistic twins, who enjoyed incredible mathematical abilities associated with their autism.” Another case is that of “a man whose illness resulted in an increase in detail and vividness regarding childhood memory.” So much so, Greco notes, that when “these memories were put to use in accurate and detailed paintings of the man’s hometown in Italy…the man came to be considered an expert on the layout and appearance of that town, even though he had not visited there in decades.” Greco claims that these are cases in which “dysfunction gave rise to knowledge.”

What might a proper functionalist say in response to these scenarios? A couple of strategies are suggested by Plantinga (1993c: 74-75) in a reply to Richard Feldman (1993: 48-49). Feldman also points to these kinds of cases as creating difficulties for proper functionalism; in particular, Feldman cites the case of the autistic twins described above. As Feldman notes, these twins had the ability to “just see” (apparently without counting) that the number of matches that had fallen out of a box was 111. In his reply, Plantinga further notes that these same twins could also “just see,” it seems, whether a given six or eight digit number was prime. The first strategy Plantinga suggests for dealing with these cases is to call into question whether the individuals involved really do acquire knowledge in the scenarios described. The second is to concede (at least for the sake of argument) that they do, but argue that this is consistent with proper functionalism.

Regarding the first strategy, Plantinga notes that while the twins mentioned above can in fact reliably identify prime numbers, they lack, according to Sacks, the concept of multiplication. But if the twins lack the concept of multiplication, Plantinga argues, it is not clear that they genuinely grasp the concept of a prime number; so it is not clear that they have the relevant beliefs. Plantinga concedes, however, that this is a less plausible thing to say regarding the twins’ ability to discern the number of matches that had fallen out of a box. Here Plantinga turns to the second strategy. He concedes that while the twins’ “faculties obviously seem to malfunction in some ways,” it is doubtful that they are malfunctioning in producing the belief that there are 111 matches on the floor. Plantinga suggests that, perhaps, the twins have a different design plan than that of other human beings, and that this belief-forming tendency of theirs is subject to proper function conditions. In support of this claim, he notes that it seems possible that this remarkable ability of theirs might become damaged (in such a way that it is no longer reliable); in that case, he contends, we would be inclined to say that this ability had malfunctioned.

Another possibility open to the proper functionalist is to concede that these are cases in which cognitive malfunction enables the acquisition of knowledge, but only by way of truth-aimed proper function. If a typical human being, as a result of cognitive malfunction, suddenly found it seeming to her that she could just see that 111 matches had fallen out of a box, we might doubt that she really knows there are 111 matches. We might think that this belief, formed by way of this new-found tendency of hers, fails to count as knowledge, unless or until she has independent confirmation that the tendency is reliable. Once she does have such confirmation, we might concede that the resulting beliefs do count as knowledge, but only because she learned this to be a reliable way of getting at the truth. So perhaps, in at least some of the cases at issue, the individuals in question do acquire knowledge via belief-forming tendencies resulting from cognitive malfunction, but only by way of having learned those tendencies to be reliable. And if this learning occurs by way of cognitive processes that are in accord with proper function, these cases pose no difficulties for a proper functionalist theory.

This is perhaps not a plausible thing to say regarding all of these cases, however. It is not as plausible a thing to say regarding the individual whose illness caused him to form detailed memorial beliefs pertaining to his hometown in Italy, for example. One reason this is a less plausible thing to say concerning that case is that the person in question is (presumably) forming these beliefs in response to memory phenomenology, which is an epistemically appropriate way for human beings to form beliefs downstream from experience. We would be much less likely to judge this person as having knowledge if these same beliefs arose, say, in response to the kind of phenomenology associated with a vivid daydream, unaccompanied by memorial seemings, even if the resulting beliefs should turn out to be reliably formed. So, the proper functionalist might say, if cognitive malfunction is somehow enabling the acquisition of knowledge in this case, it is not by virtue of causing the subject to respond deviantly to his experience (since, in that regard, he is responding as proper function dictates). It must, rather, be by virtue of its causing some deviation upstream from experience (that is, by virtue of its producing an abnormality in the manner in which the subject’s memorial experiences are produced). Whether this creates a significant problem for proper functionalism, furthermore, may depend on just how the malfunction in question enables knowledge.

However exactly memory information is processed, stored, and retrieved so as to generate belief-producing memorial experiences, it is plausible that the cognitive system responsible (or set of systems responsible) has different functions associated with it. One of these functions is to generate experiences that reliably produce true beliefs. But no doubt there are other functions associated with this system that do not pertain to that goal (indeed, some may even be in tension with it). It is plausible, for instance, that some of those functions pertain to filtering information as it comes in, either by preventing some of that information from being stored in the first place, discarding some of that information after it has been stored, or preventing some of it from being encoded in the relevant experiences. The purpose of this filtration process might not be to secure the production of true beliefs, but to prevent various kinds of information overload, or to highlight important items of information at the expense of discarding others. Plausibly, what occurs in the case at issue is that a malfunction results in the suppression of these kinds of functions, leaving various other truth-aimed functions associated with the production of the relevant memorial experiences intact.

This consideration suggests yet another possible strategy the proper functionalist might have for dealing with these kinds of cases. Yes, she might grant, some of these are cases in which cognitive malfunction enables knowledge, but not by way of interfering with truth-aimed cognitive proper function (at least not with respect to the process that issued in the relevant beliefs). In at least some of these cases, the malfunction enables knowledge by preventing various non-truth-aimed aspects of cognitive proper function from interfering with or dampening various truth-aimed aspects (or perhaps by preventing some truth-aimed aspects of cognitive proper function from interfering with or dampening various other truth-aimed aspects). The consequence is that certain truth-aimed aspects of cognitive proper function result in various items of knowledge they would not have otherwise produced. So even though these are cases in which cognitive malfunction enables knowledge, the proper functionalist might say, they are not counterexamples to the claim that knowledge itself must come by way of truth-aimed cognitive proper function. Or, she might insist, to the extent to which it is unclear that these purported items of knowledge come by way of truth-aimed cognitive proper function, it is also unclear that we should count them as genuine items of knowledge.

It should be pointed out that many virtue theories of knowledge also quite naturally lend themselves to virtue theories of justification. As Sosa (2007: 22-23) points out, for instance, an agent can manifest skill in a performance even when that performance fails to achieve its aim or achieves it merely by luck. An archer might take a skillful shot (to use one of Sosa’s frequent analogies), for instance, while still missing the target (or hitting it only by luck) on account of erratic wind conditions. Similarly, a believer might manifest her skill at coming to hold true beliefs while nonetheless getting it wrong (or getting it right only by luck) on account of being in an epistemically bad environment. Under these circumstances, the belief in question may be said (in Sosa’s terminology) to be “adroit” but not “apt” (p. 23). A belief that is adroit, according to Sosa, may be said to be justified (in one good sense at least) even if it is not an item of knowledge (BonJour and Sosa 2003: 157).

As Sosa (2015: 26-27) himself is well aware, the having of a skill presupposes something like a normal environment. As Sosa points out, we do not say that a person lacks driving skill merely because she is disposed to perform poorly on an icy road in the midst of a snowstorm. What matters is whether she is disposed to perform well under ordinary driving conditions. Similarly, what matters for whether an agent is skilled at coming to hold true beliefs is whether she is capable of doing so in a certain kind of environment. But which sort of environment is the relevant one? According to Bergmann, this question points to an area in which a proper functionalist theory of justification has the advantage.

As Bergmann (2006: 142-143) notes, in a 1991 work Sosa takes justification to be relativized to an environment. The person in the demon world has justified beliefs relative to our environment, according to this view, but not relative to her own. Similarly, alien cognizers who have radically different methods of belief formation than we do (ones that are adapted to their own environment) may have beliefs that are justified relative to their environment but not relative to ours. Bergmann argues, however, that our ordinary concept of justification does not appear to be relativized in this way.

In later work from 2003, as Bergmann also points out, Sosa holds that there are two different senses in which a belief might be said to be justified. A belief is “adroit-justified” if the method by which it is formed is reliable in the actual world, and “apt-justified” if the method by which it is formed is reliable in the subject’s world. As Bergmann notes, however, this view does not account for our intuition that there is a single sense in which our beliefs, the alien cognizers’ beliefs, and the demon victims’ beliefs are all justified.

Proper functionalism, by contrast, Bergmann maintains, has no difficulty accommodating these intuitions, since it holds that the relevant environment is the one specified by the design plan (which is the same between us and the demon victims but different for the alien cognizers). Whether Bergmann points to a genuine advantage of his theory over Sosa’s in this regard has, however, been disputed. Markie (2009: 374-377) argues, for example, that Bergmann’s own theory faces disadvantages akin to those he attributes to Sosa’s.

As with most disputed views, the extent to which one is drawn to proper functionalist theories will depend in large measure on which intuitions one has, the relative weight one assigns to them, one’s assessment of how well the theories in question accommodate those intuitions, and whether their rivals do any better. And here one’s mileage may vary. But it is a safe bet that proper functionalist theories will continue to serve as serious contenders for the foreseeable future.

4. References and Further Reading

  • Bergmann, Michael. 2006. Justification Without Awareness: A Defense of Epistemic Externalism (Oxford: Oxford UP).
  • Bergmann, Michael. 2013. “Externalist Justification and the Role of Seemings” Philosophical Studies 166: 163-184.
  • BonJour, Laurence and Sosa, Ernest. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues (Malden, MA: Blackwell Publishing).
  • Boyce, Kenneth and Plantinga, Alvin. 2012. “Proper Functionalism” The Continuum Companion to Epistemology, ed. Andrew Cullison (London: Continuum International Publishing Group).
  • Boyce, Kenneth and Moon, Andrew. 2015. “In Defense of Proper Functionalism: Cognitive Science Takes on Swampman,” Synthese Online First: DOI 10.1007/s11229-015-0899-6. http://link.springer.com/article/10.1007/s11229-015-0899-6.
  • Crisp, Thomas M. “Gettier and Plantinga’s Revised Account of Warrant” Analysis 60: 42-50.
  • Chignell, Andrew. 2003. “Accidentally True Belief and Warrant” Synthese 137: 445-458.
  • Cullison, Andrew. 2013. “Seemings and Semantics” Seemings and Justification, ed. Chris Tucker (Oxford: Oxford UP).
  • Feldman, Richard. 1993. “Proper Functionalism” Nous 27: 34-50.
  • Goldman, Alvin “The Sciences and Epistemology” The Oxford Handbook of Epistemology (Oxford: Oxford UP).
  • Graham, Peter. 2012. “Epistemic Entitlement” Nous 46: 449-482.
  • Graham, Peter. 2014. “Warrant, Functions, History” Naturalizing Epistemic Virtue, eds. Abrol Fairweather and Owen Flanagan (Cambridge: Cambridge University Press).
  • Greco, John. 1993. “Virtues and Vices of Virtue Epistemology” Canadian Journal of Philosophy 23: 413-432.
  • Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity (Cambridge: Cambridge University Press).
  • Kvanvig, Jonathan L. (ed.). 1996. Warrant in Contemporary Epistemology: Essays in Honor of Plantinga’s Theory of Knowledge (London: Rowman & Littlefield Publishers).
  • Lehrer, Keith, and Cohen, Stewart. 1983. “Justification, Truth, and Coherence” Synthese 55: 191-207.
  • Long, Todd R. 2012. “Mentalist Evidentialism Vindicated (and a Super-Blooper Epistemic Design Problem for Proper Function justification)” Philosophical Studies 157: 251-266.
  • Markie, Peter. 2009. “Justification and Awareness” Philosophical Studies 146: 361-377.
  • McNabb, Tyler Dalton. 2015. “Warranted Religion: Answering Objections to Alvin Plantinga’s Epistemology” Religious Studies 51: 477-495.
  • Millikan, Ruth. 1984. “Naturalist Reflections on Knowledge” Pacific Philosophical Quarterly, 4: 315-334.
  • Otte, Richard. 1987. “A Theistic Conception of Probability” Faith and Philosophy 4: 427-447.
  • Plantinga, Alvin. 1993a. Warrant: The Current Debate (Oxford: Oxford UP).
  • Plantinga, Alvin. 1993b. Warrant and Proper Function (Oxford: Oxford UP).
  • Plantinga, Alvin. 1993c. “Why We Need Proper Function” Nous 27: 66-82.
  • Spelke, Elizabeth. 1994. “Initial Knowledge: Six Suggestions” Cognition 50: 431-445.
  • Sacks, Oliver. 1970. The Man Who Mistook His Wife for a Hat (New York: HarperCollins).
  • Sosa, Ernest. 1993. “Proper Functionalism and Virtue Epistemology” Nous 27: 51-65.
  • Sosa, Ernest. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Vol. 1 (Oxford: Oxford UP).
  • Sosa, Ernest. 2015. Judgement and Agency (Oxford: Oxford UP).
  • Tucker, Chris. 2011. “Phenomenal Conservatism and Evidentialism” Evidence and Religious Belief, eds. Kelly James Clark and Raymond J. VanArragon (Oxford: Oxford UP).
  • Tucker, Chris. 2014a. “If Dogmatists Have a Problem with Cognitive Penetration, You Do Too” Dialectica 68: 35-62.
  • Tucker, Chris. 2014b. “On What Inferentially Justifies What: The Vices of Reliabilism and Proper Functionalism,” Synthese 191: 3311-3328.
  • Wolterstorff, Nicholas. 2010. “Ought to Believe—Two Concepts” Practices of Belief: Selected Essays, Vol. 2, ed. Terence Cuneo (Cambridge: Cambridge University Press).

 

Author Information

Kenneth Boyce
Email: boyceka@missouri.edu
University of Missouri
U. S. A.

Dynamic Epistemic Logic

This article tells the story of the rise of dynamic epistemic logic. The rise began in the 1960s with the creation and development of epistemic logic, the logic of knowledge. Then in the late 1980s came dynamic epistemic logic, the logic of change of knowledge. Much of it was motivated by puzzles and paradoxes.

The number of active researchers in these logics grows significantly every year because there are so many relations and applications to computer science, to multi-agent systems, to philosophy, and to cognitive science.

The modal knowledge operators in epistemic logic are formally interpreted by employing binary accessibility relations in multi-agent Kripke models (relational structures), where these relations should be equivalence relations to respect the properties of knowledge. The operators for change of knowledge correspond to another sort of modality, more akin to a dynamic modality. A peculiarity of this dynamic modality is that it is interpreted by transforming the Kripke structures used to interpret knowledge, and not, at least not at first sight, by an accessibility relation given with a Kripke model. Although called “dynamic epistemic logic,” this two-sorted modal logic applies to more general settings than the logic of merely S5 knowledge.

The present article discusses in depth the early history of dynamic epistemic logic. It then mentions briefly a number of more recent developments involving factual change, one (of several) standard translations to temporal epistemic logic, and a relation to situation calculus (a well-known framework in artificial intelligence to represent change). Special attention is then given to the relevance of dynamic epistemic logic for belief revision, for speech act theory, and for philosophical logic. The part on philosophical logic pays attention to Moore sentences, the Fitch paradox, and the Surprise Examination.

Table of Contents

  1. Introduction
  2. An Example Scenario
  3. A History of DEL
    1. Announcements
    2. Other Informative Events
  4. DEL and Belief Revision
  5. DEL and Language
  6. DEL and Philosophy
  7. References and Further Reading

1. Introduction

In this overview we tell the story of the rise of dynamic epistemic logic. It is a bit presumptuous to call it a rise, but we can only observe this rather peculiar phenomenon. The number of active researchers in these logics grows every year because there are so many relations to computer science, to multi-agent systems, to philosophy, and to cognitive science. It all began with the logic of knowledge in the 1960s, and much of it was motivated by puzzles and paradoxes.

Dynamic epistemic logic is the logic of changing knowledge. The starting point of dynamic epistemic logic (DEL) is therefore the logic of knowledge. A founding publication is [42]. We refer to [41] for an overview of epistemic logic and references. A key feature of epistemic logic is that the information state of several agents can be represented by a Kripke model. Given a set of agents and a set of propositional variables, a Kripke model consists of a set of states, a set of accessibility relations (each one a binary relation on the domain), namely one for each agent, and a valuation (that tells which propositional variables are true in which states). In epistemic logic the set of states of a Kripke model is interpreted as a set of epistemic alternatives. The information state of an agent consists of those epistemic alternatives that are possible according to the agent, which is represented by the binary accessibility relation Rα. An agent α knows that a proposition φ is true in a state a of a Kripke model M (written M, a ⊨ Kαφ) if and only if that proposition φ is true in all the states that agent α considers possible in that state (that is, which are Rα-accessible from a). A proposition known by agent α may itself pertain to the knowledge of some agent (for instance if one considers the formula KαKβψ). In this way, a Kripke model with accessibility relations for all the agents represents the (higher-order) information of all relevant agents simultaneously.
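Written out as a formula, the truth condition just described takes the following shape. This is a standard rendering rather than a quotation from the sources cited, with the turnstile standing for truth at a state; the second line unfolds the nested example.

```latex
M, a \models K_\alpha \varphi
  \quad\text{iff}\quad
  \text{for all } b \text{ with } a \, R_\alpha \, b :\ M, b \models \varphi

M, a \models K_\alpha K_\beta \psi
  \quad\text{iff}\quad
  \text{for all } b, c \text{ with } a \, R_\alpha \, b \text{ and } b \, R_\beta \, c :\ M, c \models \psi
```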

In DEL, information change is modeled by transforming Kripke models. Since DEL is mostly about information change due to communication, the model transformations usually do not involve factual change. The bare physical facts of the world remain unchanged, but the agents’ information about the world changes. In terms of Kripke models that means that the accessibility relations of the agents have to change (and consequently the set of states of the model might change as well). Modal operators in dynamic epistemic languages denote these model transformations. The accessibility relation associated with these operators is not one within the Kripke model, but pertains to the transformation relation between the Kripke models, as the example in the next section will show.

In Section 2 an example scenario is presented which can be captured by DEL. In Section 3 an historical overview of the main approaches in DEL is presented, with details on their modelling techniques. Section 4 discusses how to model belief revision in DEL. Section 5 connects ideas between speech act theory and DEL. Finally, Section 6 is on the relation between DEL and philosophy: it deals with Moore-sentences, the Fitch-paradox, and the Surprise Examination.

2. An Example Scenario


Figure 1: A Kripke model for the situation in which two agents are each given a red or a white card.

Consider the following scenario: Ann and Bob are each given a card that is either red or white on one side (the face side) and nondescript on the other side (the back side). They only see their own card, and so they are ignorant about the other agent’s card. There are four possibilities: both have white cards, both have red cards, Ann has a white card and Bob has a red card, or the other way round. These are the states of the model, and are represented by informative names such as rw, meaning Ann was dealt a red card (r) and Bob was dealt a white card (w). Let us assume that both have red cards, that is, let the actual state be rr. This is indicated by the double lines around state rr in Figure 1. The states of the Kripke model are connected by lines, which are labeled (α or β, denoting Ann or Bob respectively) to indicate that the agents cannot distinguish the states thus connected. (To be complete it should also be indicated that no state can ever be distinguished from itself. For readability these “reflexive lines” are not drawn, but indeed the accessibility relations Rα and Rβ are equivalence relations, since epistemic indistinguishability is reflexive, symmetric and transitive.) In the model of Figure 1 there are no α-lines between those states where Ann has different cards, that is, she can distinguish states at the top, where she has a red card, from those at the bottom, where she has a white one. Likewise, Bob is able to distinguish the left half from the right half of the model. This represents the circumstance that Ann and Bob each know the colour of their own card but not the colour of the other’s card. In the Kripke model of Figure 1 we also see that the higher-order information is represented correctly. Both agents know that the other agent knows the colour of his or her card, and they know that they know this, and so on. It is remarkable that a single Kripke model can represent the information of both agents simultaneously.
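The model of Figure 1 is small enough to be written down and checked mechanically. The following is a minimal sketch in Python, not part of any DEL tool: the names `STATES`, `accessible`, `knows`, and `knows_whether` are illustrative assumptions, and propositions are represented simply as predicates on state names.

```python
from itertools import product

# States are named by the cards dealt: first letter is Ann's card,
# second letter is Bob's card ("r" = red, "w" = white).
STATES = ["".join(pair) for pair in product("rw", repeat=2)]  # rr, rw, wr, ww

# Each agent sees only her own card (hypothetical encoding: the position
# in the state name that the agent can observe).
POSITION = {"ann": 0, "bob": 1}

def accessible(agent, s, t):
    """True if `agent` cannot distinguish state s from state t."""
    i = POSITION[agent]
    return s[i] == t[i]

def knows(agent, prop, state):
    """K_agent prop holds at `state` iff prop holds at every accessible state."""
    return all(prop(t) for t in STATES if accessible(agent, state, t))

def ann_red(state):
    return state[0] == "r"

def bob_red(state):
    return state[1] == "r"

def knows_whether(agent, prop):
    """The proposition that `agent` knows whether `prop` is the case."""
    return lambda s: knows(agent, prop, s) or knows(agent, lambda t: not prop(t), s)

actual = "rr"
print(knows("ann", ann_red, actual))                        # True
print(knows("ann", bob_red, actual))                        # False
print(knows("ann", knows_whether("bob", bob_red), actual))  # True
```

The last line is an instance of the higher-order information mentioned above: Ann knows that Bob knows the colour of his own card, even though she does not know that colour herself.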

Figure 2: A Kripke model for the situation after Ann tells Bob she has a red card.

Figure 3: A Kripke model for the situation after Ann might have looked at Bob’s card.

Suppose that after picking up their cards, Ann truthfully says to Bob “I have a red card”. The Kripke model representing the resulting situation is displayed in Figure 2. Now both agents know that Ann has a red card, and they know that they know she has a red card, and so on: it is common knowledge among them. (A formula φ is common knowledge among a group of agents if everyone in the group knows that φ, everyone knows that everyone knows that φ, and so on.) Hence there is no need anymore for states where Ann has a white card, so those do not appear in the Kripke model. Note that in the new Kripke model there are no longer any lines labeled β. No matter how the cards were dealt, Bob only considers one state to be possible: the actual one. Indeed, Bob is now fully informed.
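Common knowledge can be checked on the two-state model of Figure 2 by following chains of accessibility steps of any length. The sketch below assumes the relations as described in the text (Ann cannot tell rr from rw; Bob can tell every state apart); the function names are illustrative only.

```python
# The two states that survive Ann's announcement.
STATES = ("rr", "rw")

# Indistinguishability relations as sets of pairs, reflexive pairs included.
REL = {
    "ann": {(s, t) for s in STATES for t in STATES},  # Ann cannot tell rr from rw
    "bob": {(s, s) for s in STATES},                  # Bob is fully informed
}

def ann_red(state):
    return state[0] == "r"

def bob_red(state):
    return state[1] == "r"

def reachable(group, start):
    """States reachable from `start` by any finite chain of group members' relations."""
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for agent in group:
            for (u, v) in REL[agent]:
                if u == s and v not in seen:
                    seen.add(v)
                    frontier.append(v)
    return seen

def everyone_knows(group, prop, state):
    """Every agent in the group knows prop at `state`."""
    return all(prop(t) for agent in group for (s, t) in REL[agent] if s == state)

def common_knowledge(group, prop, state):
    """prop holds at every state reachable through the group's relations."""
    return all(prop(t) for t in reachable(group, state))

group = ("ann", "bob")
print(common_knowledge(group, ann_red, "rr"))   # True
print(everyone_knows(("bob",), bob_red, "rr"))  # True
print(common_knowledge(group, bob_red, "rr"))   # False
```

The last check fails because Ann still considers rw possible, so the colour of Bob's card is known to Bob but is not common knowledge.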

Now that Bob knows the colour of Ann’s card, Bob puts his card face down on the table, and leaves the room for a moment. When he returns he considers it possible that Ann took a look at his card, but also that she didn’t. Assuming she did not look, the Kripke model representing the resulting situation is the one displayed in Figure 3. In contrast to the previous model, there are in this model lines for Bob again. This is because he is no longer completely informed about the situation. He does not know whether Ann knows the colour of his card, yet he still knows that both Ann and he have a red card. Only his higher-order information has changed. Ann on the other hand knows whether she has looked at Bob’s card and also knows whether she knows the colour of Bob’s card. She also knows that Bob considers it possible that she knows the colour of his card. In the model of Figure 3 we see that two states representing the same factual information can differ by virtue of the lines connecting them to other states: the state rr on the top and rr on the bottom only differ in higher-order information.

In this section, we have seen two ways in which information change can occur. Going from the first model to the second, the information change was public, in the sense that all agents received the same information. Going from the second to the third model involved information change where not all agents had the same information, because Bob did not know whether Ann looked at his card while he was away. The task of DEL is to provide a logic with which to describe these kinds of information change.

3. A History of DEL

DEL did not arise in a scientific vacuum. The “dynamic turn” in logic and semantics ([72], [34] and [60]) very much inspired DEL, and DEL itself can also be seen as a part of the dynamic turn. The formal apparatus of DEL is a lot like propositional dynamic logic (PDL) [40] and quantified dynamic logic (QDL) [39]. There is also a relation to update semantics (US) [36, 93], although in DEL not all formulas are interpreted dynamically, as they are there; rather, formulas and updates are clearly distinguished.

The study of epistemic logic within computer science and AI led to the development of epistemic temporal logic (ETL) in order to model information change in multi-agent systems (see [25] and [55]). Rather than model change by modal operators that transform the model, change is modeled by the progression of time in these approaches. Yet the kinds of phenomena studied by ETL and DEL largely overlap.

After this brief sketch of the context in which DEL was developed, the remainder of the section focuses on the development of its two main approaches. The first is public announcement logic, which is presented in Section 3.1. The second, presented in Section 3.2, is the dominant approach in DEL (sometimes identified with DEL).

a. Announcements

The original publication: Plaza. The first dynamic epistemic logic, called public announcement logic (PAL), was developed by Plaza in [61], published in 1989. The example where Ann says to Bob that she has a red card is an example of a public announcement. A public announcement is a communicative event where all agents receive the same information and it is common knowledge among them that this is so. The language of PAL is given by the following Backus-Naur Form:
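In a standard presentation, given here as a reconstruction under the assumption that p ranges over the propositional variables and α over the agents, the grammar reads:

```latex
\varphi \; ::= \; p \mid \neg\varphi \mid (\varphi \wedge \varphi) \mid K_\alpha \varphi \mid [\varphi]\varphi
```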

In addition to the usual propositional connectives, the language contains Kαφ, read as “agent α knows that φ,” and [φ]ψ, read as “after φ is announced, ψ is the case.” In the example above, we could for instance translate “After it is announced that Ann has a red card, Bob knows that Ann has a red card” as [rα]Kβrα, where rα stands for the proposition that Ann has a red card.

An announcement is modeled by removing the states where the announcement is false, that is, by going to a submodel. This model transformation is the main feature of PAL’s semantics.
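The truth conditions can be laid out in five clauses. The following is a reconstruction in the usual textbook form, assuming V is the valuation, R_α the accessibility relation of agent α, and M|φ the restriction of M to its φ-states; the numbering is chosen so that the announcement clause is clause (v).

```latex
\begin{array}{lll}
\text{(i)}   & M, a \models p                      & \text{iff } a \in V(p) \\
\text{(ii)}  & M, a \models \neg\varphi            & \text{iff } M, a \not\models \varphi \\
\text{(iii)} & M, a \models \varphi \wedge \psi    & \text{iff } M, a \models \varphi \text{ and } M, a \models \psi \\
\text{(iv)}  & M, a \models K_\alpha \varphi       & \text{iff } M, b \models \varphi \text{ for all } b \text{ with } a \, R_\alpha \, b \\
\text{(v)}   & M, a \models [\varphi]\psi          & \text{iff } M, a \models \varphi \text{ implies } M|\varphi, a \models \psi
\end{array}
```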

 


In clause (v) the condition that the announced formula be true at the actual state entails that only truthful announcements can take place. The model M|φ is the model obtained from M by removing all non-φ states. The new set of states consists of the φ-states of M. Consequently, the accessibility relations as well as the valuation are restricted to these states. The propositional letters true at a state remain true after an announcement. This reflects the idea that communication can only bring about information change, not factual change.
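The restriction to a submodel is easy to carry out by hand or in code. The sketch below is one possible encoding, not Plaza's own formulation: a model is a plain dictionary, propositions are functions of a model and a state, and the helper names (`atom`, `knows`, `restrict`, `after_announcement`) are assumptions made for illustration.

```python
def atom(name):
    """Proposition true exactly at states whose valuation contains `name`."""
    return lambda model, state: name in model["val"][state]

def knows(agent, prop):
    """K_agent prop: prop holds at every state the agent cannot rule out."""
    return lambda model, state: all(
        prop(model, t) for (s, t) in model["rel"][agent] if s == state
    )

def restrict(model, prop):
    """The submodel M|prop: keep only the states where prop holds in M."""
    keep = {s for s in model["states"] if prop(model, s)}
    return {
        "states": keep,
        "rel": {ag: {(s, t) for (s, t) in pairs if s in keep and t in keep}
                for ag, pairs in model["rel"].items()},
        "val": {s: model["val"][s] for s in keep},  # facts are left untouched
    }

def after_announcement(phi, psi):
    """[phi]psi: if phi is true here, then psi holds here in M|phi."""
    return lambda model, state: (
        not phi(model, state) or psi(restrict(model, phi), state)
    )

# The four-state card model: first letter Ann's card, second letter Bob's.
states = {"rr", "rw", "wr", "ww"}
cards = {
    "states": states,
    "rel": {
        "ann": {(s, t) for s in states for t in states if s[0] == t[0]},
        "bob": {(s, t) for s in states for t in states if s[1] == t[1]},
    },
    "val": {s: {name for name, holds in (("ann_red", s[0] == "r"),
                                         ("bob_red", s[1] == "r")) if holds}
            for s in states},
}

ann_red = atom("ann_red")
bob_knows_ann_red = knows("bob", ann_red)

print(bob_knows_ann_red(cards, "rr"))                               # False
print(after_announcement(ann_red, bob_knows_ann_red)(cards, "rr"))  # True
print(sorted(restrict(cards, ann_red)["states"]))                   # ['rr', 'rw']
```

Note that `after_announcement` is vacuously true at states where the announced formula is false, which is how this sketch reflects the requirement that only truthful announcements take place.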

Gerbrandy and Groeneveld’s approach. A logic similar to PAL was developed independently by Gerbrandy and Groeneveld in [32], which is more extensively treated in Gerbrandy’s PhD thesis [30]. There are three main differences between this approach and Plaza’s approach. First of all, Gerbrandy and Groeneveld do not use Kripke models in the semantics of their language. Instead, they use structures called possibilities which are defined by means of non-wellfounded set theory [1], a branch of set theory where the foundation axiom is replaced by another axiom. Possibilities and Kripke models are closely linked: possibilities correspond to bisimulation classes of Kripke models [18]. Later, Gerbrandy provided semantics without using non-wellfounded set theory for a simplified version of his public announcement logic [31].

The second difference is that Gerbrandy and Groeneveld also consider announcements that are not truthful. In their view, a logic for announcements should model what happens when new information is taken to be true by the agents. Hence, according to them, what happens to be true deserves no special status. This is more akin to the notion of update in US. In terms of Kripke models this means that by updating, agents may no longer consider the actual state to be possible, that is, Rα may no longer be reflexive. In a sense it would therefore be more accurate to call this logic a dynamic doxastic logic (a dynamic logic of belief) rather than a dynamic epistemic logic, since according to most theories, knowledge implies truth, whereas beliefs need not be true.

Thirdly, their logic is more general in the sense that subgroup announcements are treated (where only a subgroup of the group of all agents acquires new information); and especially private announcements are considered, where only one agent gets information. These announcements are modeled in such a way that the agents who do not receive information do not even consider it possible that anyone has learned anything. In terms of Kripke models, this is another way in which Rα may lose reflexivity.

Adding common knowledge. A semantics for public, group and private announcements using Kripke models was proposed by Baltag, Moss, and Solecki in [14]. This semantics is equivalent to Gerbrandy’s semantics (as was shown in [58]). The main contribution in [14] to PAL was that their approach also covered common knowledge, which is an important concept when one is interested in higher-order information and plays an important role in social interaction (cf. [92]). The inclusion of common knowledge poses a number of technical problems.

b. Other Informative Events

Groeneveld and Gerbrandy’s approach. In addition to a logic for announcements, Gerbrandy also developed a system for more general information change involving many agents, each of whom may have a different perspective. This is for instance the case when Ann may look at Bob’s card.

In order to model this information change it is important to realize that distinct levels of information are not distinctly represented in a Kripke model. For instance what Ann actually knows about the cards depends on Rα, but what Bob knows about what Ann knows about the cards depends on Rα as well. Therefore changing something in the Kripke model, such as cutting a line, changes the information on many levels. In order to come to grips with this issue it really pays to use non-wellfounded semantics. One of the ways to think about the possibilities defined by Gerbrandy and Groeneveld is as infinite trees. In such a tree, distinct levels of information are represented by certain paths in the tree. By manipulating the appropriate part of the tree, one can change the agents’ information at the appropriate level. This insight stems from Groeneveld [37] and was also used by Renardel de Lavalette in [62], who introduces treelike lean modal structures using ordinary set theory in the semantics of a dynamic epistemic logic.

Van Ditmarsch’s approach. Inspired by Gerbrandy and Groeneveld’s work, Van Ditmarsch developed a dynamic epistemic logic for modeling information change in knowledge games, where the goal of the players is to obtain knowledge of some aspect of the game. Clue and Battleships are typical examples of knowledge games. Players are never deceived in such games, and therefore the dynamic epistemic logic of Gerbrandy and Groeneveld, in which reflexivity might be lost, seems unsuitable. In Van Ditmarsch’s Ph.D. thesis [86], a logic is presented where all model transformations are from Kripke models with equivalence relations to Kripke models with equivalence relations, which is thus tailored to information change involving knowledge. This approach was further streamlined by Van Ditmarsch in [87] and later extended to include concurrent actions (when two or more events occur at the same time) in [90]. One of the open problems of these logics is that a completeness proof for the axiom systems has not been obtained.

The dominant approach: Baltag, Moss and Solecki. Another way of modeling complex informative events was developed by Baltag, Moss, and Solecki in [14], and it has become the dominant approach in DEL. Their approach is highly intuitive and lies at the basis of many papers in the field: indeed, many refer to this approach simply as DEL. Their key insight was that information changing events can be modeled in the same way as situations involving information. Given a situation, such as when Ann and Bob each have a card, one can easily provide a Kripke model for such a situation. One simply considers which states might occur and which of those states the agents cannot distinguish. One can do the same with events involving information. Given a scenario, such as Ann possibly looking at Bob’s card, one can determine which events might occur: either she looks and sees it is red (she learns that rβ), or she sees that it is white (she learns that wβ), or she does not look at the card (she learns nothing new, indicated by the tautology ⊤). It is clear that Ann can distinguish these particular events, but Bob cannot. Such models are called action models or event models.

An event model A is a triple consisting of a set of events E, a binary relation over E for each agent, and a precondition function which assigns a formula to each event. This precondition determines under what circumstances the event can actually occur. Ann can only truthfully say that she has a red card if in fact she does have a red card. The event model for the event where Ann might have looked at Bob’s card is given in Figure 4, where each event is represented by its precondition.

Figure 4: An event model for when Ann might look at Bob’s card.

Figure 5: The product update for the models of Figure 2 and Figure 4.

The Kripke model of the situation following the event is constructed with a procedure called a product update. For each state in the original Kripke model one determines which events could take place in that state (that is, one determines whether the precondition of the event is true at that state). The set of states of the new model consists of those pairs of states and events (a, e), which represent the result of event e occurring in state a. The new accessibility relation is now easy to determine. If two states were indistinguishable to an agent and two events were also indistinguishable to that agent, then the results of those events taking place in those states should also be indistinguishable. This implication also holds the other way round: if the results of two events happening in two states are indistinguishable, then the original states and events should be indistinguishable as well. (Van Benthem [73] characterizes product update as having perfect recall, no miracles, and uniformity.) The basic facts about the world do not change due to a merely communicative event. And so the valuation in (a, e) simply follows the old valuation in a.

Figure 6: The product update for the models of Figure 1 and Figure 4.

The model in Figure 5 is the result of a product update of the model in Figure 2 and the event model of Figure 4. One can see that this is the same as the model in Figure 3 (except for the names of the states), which indicates that product update yields the intuitively right result.
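The product update procedure can be sketched in a few lines. The encoding below is an assumption made for illustration and is not the formulation in [14]: models and event models are dictionaries, preconditions are functions of a model and a state, and the event names `sees_red`, `sees_white`, and `no_look` are hypothetical labels for the three events described above.

```python
def atom(name):
    """Proposition true exactly at states whose valuation contains `name`."""
    return lambda model, state: name in model["val"][state]

def knows(agent, prop):
    """K_agent prop: prop holds at every state the agent cannot rule out."""
    return lambda model, state: all(
        prop(model, t) for (s, t) in model["rel"][agent] if s == state
    )

def product_update(model, event_model):
    """Pair each state with each event whose precondition holds there."""
    new_states = {(s, e) for s in model["states"] for e in event_model["events"]
                  if event_model["pre"][e](model, s)}
    # (s, e) and (t, f) are indistinguishable iff both s, t and e, f are.
    new_rel = {
        ag: {((s, e), (t, f)) for (s, e) in new_states for (t, f) in new_states
             if (s, t) in model["rel"][ag] and (e, f) in event_model["rel"][ag]}
        for ag in model["rel"]
    }
    # Facts do not change: the valuation at (s, e) is the old valuation at s.
    new_val = {(s, e): model["val"][s] for (s, e) in new_states}
    return {"states": new_states, "rel": new_rel, "val": new_val}

bob_red = atom("bob_red")

# The situation after Ann's announcement (two states; Bob fully informed).
fig2 = {
    "states": {"rr", "rw"},
    "rel": {"ann": {(s, t) for s in ("rr", "rw") for t in ("rr", "rw")},
            "bob": {("rr", "rr"), ("rw", "rw")}},
    "val": {"rr": {"ann_red", "bob_red"}, "rw": {"ann_red"}},
}

# Ann might look at Bob's card: she can tell the events apart, Bob cannot.
events = ("sees_red", "sees_white", "no_look")
fig4 = {
    "events": set(events),
    "rel": {"ann": {(e, e) for e in events},
            "bob": {(e, f) for e in events for f in events}},
    "pre": {"sees_red": bob_red,
            "sees_white": lambda m, s: not bob_red(m, s),
            "no_look": lambda m, s: True},
}

fig5 = product_update(fig2, fig4)
actual = ("rr", "no_look")
ann_knows = knows("ann", bob_red)

print(len(fig5["states"]))                                          # 4
print(knows("bob", bob_red)(fig5, actual))                          # True
print(ann_knows(fig5, actual))                                      # False
print(knows("bob", ann_knows)(fig5, actual))                        # False
print(knows("bob", lambda m, s: not ann_knows(m, s))(fig5, actual)) # False
```

The last two checks show that Bob neither knows that Ann knows the colour of his card nor knows that she does not, while his own factual knowledge is unchanged.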

One may wonder whether the model in Figure 4 represents the event accurately. According to the event model Bob considers it possible that Ann looks at his card and sees that it is white. Bob, however, already knows that the card is red, and therefore should not consider this event possible. This criticism is justified and one could construct an event model that takes this into account, but the beauty of the event model is precisely that it is detached from the agents’ information about the world in such a way that it provides an accurate model of just the information the agents have about the event. This means that product update yields the right outcome regardless of the Kripke model of the situation in which the event occurred. For instance, taking the product update with the model of Figure 1 yields the Kripke model depicted in Figure 6, which represents the situation where Ann might look at Bob’s card immediately after the cards were dealt. The resulting model also represents that situation correctly. This indicates that in DEL static information and dynamic information can be separated.

In the logical language of DEL these event models appear as modalities [A, e], where e is taken to be the event that actually occurs. The language is given by the following Backus-Naur Form
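One way to write out the grammar and the clauses that extend those of PAL is shown here. This is a hedged reconstruction from the description of the clauses, not a verbatim copy: the letter π for composite events, the notation pre(e) for the precondition of e, the symbol ⊗ for product update, and the double brackets for the transition relation between pointed models are all assumptions of this sketch.

```latex
\begin{array}{l}
\varphi \; ::= \; p \mid \neg\varphi \mid (\varphi \wedge \varphi) \mid K_\alpha \varphi \mid C_\Gamma \varphi \mid [\pi]\varphi \\
\pi \; ::= \; (A, e) \mid (\pi \cup \pi) \mid (\pi \, ; \pi) \\[1ex]
\text{(v)} \quad M, a \models C_\Gamma \varphi \ \text{ iff } \ M, b \models \varphi \text{ for all } b \text{ with } a \, R_\Gamma^{*} \, b,
  \quad R_\Gamma^{*} = \Bigl(\textstyle\bigcup_{\alpha \in \Gamma} R_\alpha\Bigr)^{*} \\
\text{(vi)} \quad M, a \models [\pi]\varphi \ \text{ iff } \ M', a' \models \varphi \text{ for all } (M', a') \text{ with } (M, a) \, [\![\pi]\!] \, (M', a') \\
\text{(vii)} \quad (M, a) \, [\![A, e]\!] \, (M', a') \ \text{ iff } \ M, a \models \mathrm{pre}(e) \text{ and } (M', a') = (M \otimes A, (a, e)) \\
\text{(viii)} \quad [\![\pi \cup \pi']\!] = [\![\pi]\!] \cup [\![\pi']\!] \\
\text{(ix)} \quad [\![\pi \, ; \pi']\!] = [\![\pi]\!] \circ [\![\pi']\!]
\end{array}
```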

Clauses (i)–(iv) are the same as for PAL. In clause (v), the relation used to interpret common knowledge among a group Γ is the reflexive transitive closure of the union of the accessibility relations of the members of Γ. Clause (vi) is a standard clause for dynamic modalities, except that the accessibility relation for dynamic modalities is a relation on the class of all Kripke models. In clause (vii) it is required that the precondition of the event model is true in the actual state, thus ensuring that (a, e), the new actual state, exists in the product update. Clauses (viii) and (ix) are the usual semantics for non-deterministic choice and sequential composition.

DEL can model not only informative events on which different agents have different perspectives; public announcements, too, can be thought of in terms of event models. A public announcement can be modeled by an event model containing just one event: the announcement. All agents know this is the actual event, so it is the only event considered possible. Indeed, DEL is a generalization of PAL.

Criticism, alternatives, and extensions. Many people feel somewhat uncomfortable with having models as syntactical objects. Baltag and Moss have tried to accommodate this by proposing different languages while maintaining an underlying semantics using event models [10, 13]. This issue is extensively discussed in [91, Section 6.1]. There are alternatives using hybrid logic [70], and algebraic logic ([11], [12]). Most papers just use event models in the language.

DEL has been extended in various ways. Operators for factual change [85, 81] and past operators from temporal logic have been added [64, 5]. DEL has been combined with probability [45] and justification logic [63], and extended such that belief revision is also within its grasp. Connections have been made between DEL and various other logics. Its relation to PDL, ETL [80], AGM belief revision, and situation calculus [83] has been studied. DEL has been applied to a number of puzzles and paradoxes from recreational mathematics and philosophy. It has also been applied to problems in game theory (see [79] for a very detailed survey), as well as issues in computer security [94]. The complexity and succinctness of DEL have been investigated in [54, 69, 6]. Two recent overviews of DEL are [17, 78]. In the next section we pay attention to DEL and belief revision.

4. DEL and Belief Revision

Something you cannot model in DEL is changing your mind. Once you know a fact, you know it forever; that is, once K_a p is true, it remains true after every update. Even when we have weaker constraints on the accessibility relations (for belief, or even general accessibility), this remains the case. But sometimes, when you believe a fact, you change your mind, and you may come to believe the opposite. This is not shocking: it might have been that you merely did not believe it firmly. This means a change of K_a p into K_a ¬p or, using the better suited belief modality B_a for that, a change of B_a p into B_a ¬p. In a different community, that of (AGM) belief revision, this is the most natural operation around, and it is indeed called ‘belief revision’. In this section we briefly survey interactions between such AGM belief revision and dynamic epistemic logic.

Belief revision has been studied from the perspective of structural properties of reasoning about changing beliefs [29], from the perspective of changing, growing and shrinking knowledge bases, and from the perspective of models and other structures of belief change wherein such knowledge bases may be interpreted, or that satisfy assumed properties of reasoning about beliefs. A typical approach involves preferential orders to express increasing or decreasing degrees of belief [48, 56], where these works refer to the ‘systems of spheres’ in [51, 38]. Within this tradition multi-agent belief revision has also been investigated, for example, belief merging [46]. Belief operators are normally not explicit in the logical language, so that higher-order beliefs (I know that you are ignorant of a certain proposition) cannot be formalized. Iterated belief revision may also be problematic.

The link between belief revision and modal logic, that is, explicit belief modalities and belief change modalities in the logical language, was made in a strand of research known as dynamic doxastic logic. This was proposed and investigated by Segerberg and collaborators in works such as [68, 52, 67, 22]. These works are distinct from other approaches to belief revision in modal logics, without dynamic modal operators, such as [19, 50, 20], that also influenced the development of dynamic logics combining knowledge and belief change. In dynamic doxastic logics belief operators are in the logical language, and belief revision operators are dynamic modalities. Higher-order belief change, that is, revising one’s beliefs about one’s own or other agents’ beliefs and ignorance, is considered problematic in dynamic doxastic logic, see [52]. In [68, 67] belief revision is restricted to propositional formulas (factual revision). There are dynamic doxastic logics wherein [*φ] merely means belief revision with φ according to some externally defined strategy, as in AGM style (this is the general setup in [68], not unlike the nonepistemic/doxastic modal setup in [71]), but there are also dynamic doxastic logics, such as [67], wherein [*φ] is a recipe operating on a semantic structure and outputting a novel structure, the standard approach in dynamic epistemic logic.

Belief revision in dynamic epistemic logic was initiated in [4, 88, 77, 15]. Of these, [4, 88] propose a treatment involving degrees of belief and based on degrees of plausibility among states in structures interpreting such logics, so-called quantitative dynamic belief revision; whereas [77, 15] propose a treatment involving comparative statements about plausibilities (a binary relation between states denoting more/less plausible), so-called qualitative dynamic belief revision. The latter is clearly more suitable for logics of belief revision, and for notions such as conditional belief. The analogue of the AGM postulate of success must be given up when one incorporates higher-order belief change as in dynamic epistemic logic, where again the prime movers are Moore sentences of the form ‘proposition p is true but you don’t know it’, which you cannot believe after accepting them. Many more works on dynamic belief revision have appeared since, for example, [33, 53, 24]. A prior, independent strand to model belief revision was in temporal epistemic logic, and was initiated in the mid 1990s in [28]. Its integrated treatment of belief, knowledge, plausibilities, and change is similar to the more recent developments to model belief revision in dynamic epistemic logic, and the relation between the two approaches is incompletely understood.

For an example of belief revision in dynamic epistemic logic, consider one agent and a proposition p that the agent is uncertain about. The agent could be Ann, who is uncertain whether Bob has a red card, as in the proposition r_b before. We get a Kripke model depicted in Figure 7, not dissimilar from that in Figure 2. There are two states of the world, one where p is false and another one where p is true. Let us suggestively call them 0 and 1, respectively. The agent has epistemic preferences among these states. Namely, she considers it most plausible that 1 is the actual state, that is, that p is true, and less plausible that 0 is the actual state. We write 1 < 0 where, as is common in the area, the minimal element in the order is the most plausible state (and not, as might be expected, the least plausible state). Let us further assume that p is false.

Figure 7: Ann believes p but considers ¬p epistemically possible.

The agent believes a proposition when it holds in the most plausible states. For example, she believes that p is true. This is formalized as

B_a p

We write B_a (for belief) instead of K_a (for knowledge) as beliefs may be mistaken. Indeed, the agent believes that p but in fact p is false! But we also distinguish a modality for knowledge.

The agent knows a proposition when it holds in all plausible states. These are her strongest beliefs, or knowledge. In the case of this example her factual knowledge only involves tautologies such as p ∨ ¬p. This is described as

K_a (p ∨ ¬p)
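The two definitions above can be checked mechanically on the two-state example. The following is a minimal sketch, not the authors' formalization: it assumes a single agent, encodes the plausibility order as integer ranks (lower is more plausible), and all names (`rank`, `believes`, `knows`) are chosen here for illustration.

```python
# States are named 0 and 1 as in the text; p is false in 0 and true in 1.
valuation = {0: {"p": False}, 1: {"p": True}}

# Plausibility as ranks: a lower rank means more plausible, so "1 < 0"
# becomes rank[1] < rank[0]. (Integer ranks are an illustrative encoding.)
rank = {1: 0, 0: 1}

actual_state = 0  # the text assumes that p is in fact false


def most_plausible(rank):
    """Return the set of states with minimal rank."""
    best = min(rank.values())
    return {s for s, r in rank.items() if r == best}


def believes(formula, rank, valuation):
    """B_a: the formula holds in all most plausible states."""
    return all(formula(valuation[s]) for s in most_plausible(rank))


def knows(formula, rank, valuation):
    """K_a: the formula holds in all states the agent considers possible."""
    return all(formula(valuation[s]) for s in rank)


def p(v):
    return v["p"]


def p_or_not_p(v):
    return v["p"] or not v["p"]


belief_in_p = believes(p, rank, valuation)                   # B_a p
knowledge_of_p = knows(p, rank, valuation)                   # K_a p
knowledge_of_tautology = knows(p_or_not_p, rank, valuation)  # K_a (p ∨ ¬p)
belief_is_mistaken = belief_in_p and not p(valuation[actual_state])
```

Under this encoding the agent believes p without knowing it, knows the tautology, and her belief is mistaken at the actual state 0.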

Now imagine that the agent wants to revise her current beliefs. She believes that p is true, but has been given sufficient reason to be willing to revise her beliefs with ¬p instead. We can accomplish that when we allow a model transformation that makes the 0 state more plausible than the 1 state. There are various ways to do that. In this simple example we can simply observe that it suffices to make the state satisfying the revision formula ¬p, that is, 0, more plausible than the other state, 1. See Figure 8. As a consequence of that, the agent now believes ¬p: B_a ¬p is true. Therefore, the revision was successful. This can already be expressed in the initial situation by using a dynamic modal operator [*¬p] for the relation induced by the program “belief revision with ¬p”, followed by what should hold after that program is executed. In this dynamic modal setting we can then write that

¬p ∧ B_a p ∧ [*¬p] B_a ¬p

was already true at the outset.
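The revision step can be sketched as a transformation of the plausibility order. The text notes that there are various ways to do this; the sketch below picks one of them as an assumption (every state satisfying the revision formula becomes more plausible than every state that does not, keeping the old order within each group). The integer-rank encoding and the names `revise` and `believes` are illustrative only.

```python
valuation = {0: {"p": False}, 1: {"p": True}}
rank = {1: 0, 0: 1}   # lower rank = more plausible; initially 1 < 0
actual_state = 0      # p is in fact false


def most_plausible(rank):
    """Return the set of states with minimal rank."""
    best = min(rank.values())
    return {s for s, r in rank.items() if r == best}


def believes(formula, rank, valuation):
    """B_a: the formula holds in all most plausible states."""
    return all(formula(valuation[s]) for s in most_plausible(rank))


def revise(formula, rank, valuation):
    """One possible reading of [*formula]: push every state that falsifies
    the formula below all states that satisfy it (an assumed policy)."""
    offset = max(rank.values()) + 1
    return {
        s: (r if formula(valuation[s]) else r + offset)
        for s, r in rank.items()
    }


def p(v):
    return v["p"]


def not_p(v):
    return not v["p"]


revised_rank = revise(not_p, rank, valuation)

# ¬p ∧ B_a p ∧ [*¬p] B_a ¬p, evaluated in the initial situation.
holds_at_outset = (
    not_p(valuation[actual_state])
    and believes(p, rank, valuation)
    and believes(not_p, revised_rank, valuation)
)
```

After the revision the most plausible state is 0, so the agent believes ¬p, and the conjunction holds at the outset.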

Figure 8: Ann revises her belief with ¬p.

In dynamic epistemic logic, unlike in the original AGM or the subsequent DDL setting, beliefs and knowledge can also be about modal formulas. For example, we not only have that B_a p, but we also have that B_a ¬(K_a p ∨ K_a ¬p): the agent believes that she does not know whether p. We might say: Ann is aware that her belief in p is not very strong, that it is defeasible.

5. DEL and Language

Consider the connection between DEL and speech act theory. Speech act theory started with the work of Austin [7], who argued that language is used to perform all sorts of actions; we make promises, we ask questions, we issue commands, and so forth. An example of a speech act is a bartender who says “The bar will be closed in five minutes” [8]. Austin distinguishes three kinds of acts that are performed by the bartender: (i) the locutionary act of uttering the words, (ii) the illocutionary act of informing his clientele that the bar will close in five minutes, and (iii) the perlocutionary act of getting the clientele to order one last drink and leave.

Truth conditions, which determine whether an indicative sentence is true or false, are generalized to success conditions, which determine whether a speech act is successful or not. In speech act theory there are several distinctions when it comes to the ways in which something can be wrong with a speech act [7, p. 18]. Here we do not make such distinctions and simply speak of success conditions. Searle gives in [66, p. 66] the following success conditions, among others, for an assertion that p by speaker S to hearer H:

  • S has evidence (reasons, and so forth) for the truth of p.
  • It is not obvious to both S and H that H knows (does not need to be reminded of, and so forth) p.
  • S believes p.

Speech act theory has been embraced by the multi-agent systems community, for example, by the Foundation for Intelligent Physical Agents (FIPA). FIPA is an IEEE Computer Society standards organization that promotes agent-based technology and the interoperability of its standards with other technologies. It published a Communicative Act Library Specification [26] that includes a specification of the inform action, which is similar to Searle’s analysis of assertions.

It is worthwhile to join this analysis of assertions to the analysis of public announcements in PAL. It is clear from the list of success conditions that one usually only announces what one believes (or knows) to be true. So, an extra precondition for an announcement that φ by an agent a should be that K_a φ. Public announcements are indeed modeled in this way in [61].

As an example, consider the case when Ann tells Bob she has a red card: it is more appropriate to model this as an announcement that K_a r_a rather than the announcement that r_a. Fortunately, these formulas were equivalent in the model under consideration. Suppose that Ann had said “We do not both have white cards”. When this is modeled as an announcement that ¬(w_a ∧ w_b), we obtain the model in Figure 9(a). However, Ann only knows this statement to be true when she in fact has a red card herself. Indeed, when we look at the result of the announcement that K_a ¬(w_a ∧ w_b) we obtain the model in Figure 9(b). We see that the result of this announcement is the same as when Ann says that she has a red card (see Figure 2). By making presuppositions part of the announcement, we are in a way accommodating the precondition (see also [44]).

Figure 9: An illustration of the difference between the effect of the announcement that φ and the announcement that K_a φ, and an announcement that only changes the agents’ higher-order information.
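The difference between announcing a statement and announcing that Ann knows it can be sketched by treating a public announcement as a restriction of the model to the worlds where the announced formula holds. The model below is an assumed reconstruction of the card scenario, not taken from the figures: a world is a pair (Ann's card, Bob's card), each card is red or white, and Ann sees only her own card. All function names are illustrative.

```python
from itertools import product

# Assumed model: worlds are (Ann's card, Bob's card), with "r" = red, "w" = white.
worlds = set(product("rw", repeat=2))


def ann_cannot_distinguish(u, v):
    """Ann sees only her own card, so worlds with the same first card look alike."""
    return u[0] == v[0]


def ann_knows(formula, world, model):
    """K_a: the formula holds in every world Ann cannot distinguish from this one."""
    return all(
        formula(v, model) for v in model if ann_cannot_distinguish(world, v)
    )


def announce(formula, model):
    """Public announcement: keep exactly the worlds where the formula holds."""
    return {w for w in model if formula(w, model)}


def not_both_white(world, model):
    return world != ("w", "w")


def ann_knows_not_both_white(world, model):
    return ann_knows(not_both_white, world, model)


def ann_has_red(world, model):
    return world[0] == "r"


after_plain = announce(not_both_white, worlds)
after_known = announce(ann_knows_not_both_white, worlds)
after_red = announce(ann_has_red, worlds)
```

In this sketch the plain announcement removes only the world where both cards are white, whereas announcing that Ann knows the statement leaves exactly the worlds where Ann has a red card, the same result as announcing that she has a red card.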

The second success condition in Searle’s analysis conveys that an announcement ought to provide the hearer with new information. In the light of DEL, one can revise this second success condition by saying that p is not common knowledge, thus taking higher-order information into account. It seems natural to assume that a speaker wants to achieve common knowledge of p, since that plays an important role in coordinating social actions; and so lack of common knowledge of p is a condition for the success of announcing p.

Consider the situation where Ann did look at Bob’s card when he was away and found out that he has a red card (Figure 9(c)). Suppose that upon Bob’s return Ann tells him “I do not know that you have a white card”. Both Ann and Bob already know this, and they also both know that they both know it. Therefore Searle’s second condition is not fulfilled, and so according to his analysis there is something wrong with Ann’s assertion. The result of this announcement is given in Figure 9(d). We see that the information of the agents has changed. Now Bob no longer considers it possible that Ann considers it possible that Bob considers it possible that Ann knows that Bob has a white card. And so the announcement is informative. One can give more and more involved examples to show that change of common knowledge is indeed a more natural requirement for announcements than Searle’s second condition, especially in multi-agent scenarios.

Van Benthem [76] analyzes question and answer episodes using DEL. One of the success conditions of questions as speech acts is that the speaker does not know the answer [66, p. 66]. Therefore posing a question can reveal crucial information to the hearer in such a way that the hearer only knows the answer after the question has been posed ([74],[91, p. 61],[82]).

Professor a is program chair of a conference on Changing Beliefs. It is not allowed to submit more than one paper to this conference, a rule all authors of papers abided by (although the belief that this rule makes sense is gradually changing, but this is beside the point here). Our program chair a likes to have all decisions about submitted papers out of the way before the weekend, since on Saturday he is due to travel to attend a workshop on Applying Belief Change. Fortunately, although there appears not to be enough time to notify all authors, just before he leaves for the workshop, his reliable secretary assures him that she has informed all authors of rejected papers, by personally giving them a call and informing them about the sad news concerning their paper.

Freed from this burden, Professor a is just in time for the opening reception of the workshop, where he meets the brilliant Dr. b. The program chair remembers that b submitted a paper to Changing Beliefs, but to his own embarrassment he must admit that he honestly cannot remember whether it was accepted or not. Fortunately, he does not have to demonstrate his ignorance to b, because b’s question ‘Do you know whether my paper has been accepted?’ makes a reason as follows: a is sure that, had b’s paper been rejected, b would have had that information, in which case b would not have shown his ignorance to a. So, instantaneously, a updates his belief with the fact that b’s paper is accepted, and he can now answer truthfully with respect to this new revised belief set.

This phenomenon shows that when a question is regarded as a request [49], the success condition that the hearer is able to grant the request, that is, provide the answer to the question, must be fulfilled after the request has been made, and not before. (However, it is not commonly agreed upon in the literature that questions can be regarded as requests; cf. [35, Section 3].) This analysis of questions in DEL fits well within the broad interest in questions in dynamic semantics [3]. Recent work on DEL and questions includes [2, 59, 23].

6. DEL and Philosophy

The role of public announcements as typical informative speech acts has focused attention on a number of situations wherein that form of success cannot be achieved. This has been investigated mainly within philosophical logic, under the heading of ‘Moore sentences’ and the ‘Fitch paradox’. The ‘Moore sentence’ was introduced by Moore in [57] and his original analysis is that p ∧ ¬Kp (p is true and I don’t know/believe it) cannot sincerely be uttered. As this is an informative speech act, you are supposed to believe your beliefs. It seems incoherent, and maybe even paradoxical, to believe a proposition stating that you do not believe it. In the DEL setting we can give this a dynamic interpretation. It is then no longer paradoxical.

If I tell you “You don’t know that I play cello”, this has the conversational implicature “You don’t know that I play cello and it is true that I play cello”. This has the form p ∧ ¬Kp. Suppose I were to tell you again “You don’t know that I play cello.” Then you can respond: “You’re lying. You just told me that you play cello.” We can analyze what is going on here in modal logic. We model your uncertainty, for which a single epistemic modality suffices. Initially, there are two possible worlds, one in which p is true and another one in which p is false, which you cannot distinguish from one another. Although in fact p is true, you don’t know that: p ∧ ¬Kp. The announcement of p ∧ ¬Kp results in a restriction of these two possibilities to those where the announcement is true: in the p-world, p ∧ ¬Kp is true, but in the ¬p-world, p ∧ ¬Kp is false.

In the model restriction consisting of the single world where p is true, p is known: Kp. Given that Kp is true, so is ¬p ∨ Kp, and ¬p ∨ Kp is equivalent to ¬(p ∧ ¬Kp), the negation of the announced formula. So, announcement of p ∧ ¬Kp makes it false! Gerbrandy [30, 31] calls this phenomenon an unsuccessful update; the matter is also taken up in [89, 43, 84].
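The unsuccessful update can be replayed on the two-world model just described. This is a minimal sketch under the stated assumptions (one agent who cannot distinguish any of the worlds in the current model); the world names and function names are illustrative.

```python
# Two worlds the hearer cannot tell apart: p is true in "w_p", false in "w_not_p".
model = {"w_p": True, "w_not_p": False}


def knows_p(model):
    """Kp for a single agent who considers every world in the model possible."""
    return all(model.values())


def moore(world, model):
    """The Moore sentence p ∧ ¬Kp, evaluated at a world of the model."""
    return model[world] and not knows_p(model)


def announce(formula, model):
    """Public announcement: keep exactly the worlds where the formula holds."""
    return {w: v for w, v in model.items() if formula(w, model)}


true_before = moore("w_p", model)      # p ∧ ¬Kp holds in the actual world
updated = announce(moore, model)       # only the p-world survives
known_after = knows_p(updated)         # now Kp
true_after = moore("w_p", updated)     # the announced sentence has become false
```

The sentence is true when announced, yet false in the restricted model, which is exactly what makes the update unsuccessful.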

We continue with some words on the Fitch paradox [27]. A standard analysis of the Fitch paradox is as follows (see the excellent review of the literature on Fitch’s paradox in the Stanford Encyclopedia of Philosophy [21], and the volume dedicated to knowability [65]). The existence of unknown truths is formalized as ∃p (p ∧ ¬Kp). The requirement that all truths are knowable is formalized as ∀p (p → ◊Kp), where ◊ formalizes the existence of some process after which p is known, or an accessible world in which p is known. Fitch’s paradox is that the existence of unknown truths is inconsistent with the requirement that all truths are knowable.

The Moore sentence p ∧ ¬Kp witnesses the existential statement ∃p (p ∧ ¬Kp). Assume that it is true. From ∀p (p → ◊Kp) follows the truth of its instance (p ∧ ¬Kp) → ◊K(p ∧ ¬Kp), and from that and p ∧ ¬Kp follows ◊K(p ∧ ¬Kp). Whatever the interpretation of ◊, it results in having to evaluate K(p ∧ ¬Kp). But this is inconsistent for knowledge and belief.

We now get to the relation between knowability and DEL. The suggestion to interpret ‘knowable’ as ‘known after an announcement’ was made by van Benthem in [75], and [9] proposes a logic where ‘φ is knowable’ is interpreted in that way. In this setting, ◊p stands for ‘there is an announcement after which p (is true)’, so that ◊Kp stands for ‘there is an announcement after which p is known’, which is a form of ‘proposition p is knowable’.

For example, consider the proposition p for ‘it rains in Liverpool’. Suppose you are ignorant about p: ¬(Kp ∨ K¬p). First, suppose that p is true. I can announce to you here and now that it is raining in Liverpool (according to your expectations, maybe…), after which you know that: 〈p〉Kp stands for ‘p is true and after announcing p, p is known’ (〈φ〉 is the dual of [φ], that is, 〈φ〉ψ is defined by abbreviation as ¬[φ]¬ψ). Now, suppose that p is false. In a similar way, after I announce that, you know that; so that we have 〈¬p〉K¬p. If you already knew whether p, having its value announced does not have any informative consequence for you. Therefore, 〈p〉Kp ∨ 〈¬p〉K¬p is a validity. Therefore we also have 〈p〉(Kp ∨ K¬p) ∨ 〈¬p〉(Kp ∨ K¬p). We can generalize the statement ‘there is a proposition p such that after its announcement, p is known’, to ‘there exists a proposition q, such that after its announcement, p is known’, where q is not necessarily the same as p. Then we have informally captured the meaning of ◊Kp. In other words, this operator is a quantification over announcements. But we have then just proved that ◊(Kp ∨ K¬p) is a validity. For more on such matters, see [9, 84].
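The argument that some truthful announcement always settles whether p can be illustrated by brute force on small models. The sketch below is only an illustration under strong assumptions: a single agent who considers every world of the model possible, a single atom p, and the two candidate announcements p and ¬p as witnesses for the quantifier. The names are hypothetical.

```python
def knows(value, model):
    """Kp (value=True) or K¬p (value=False): p has that value in every world."""
    return all(v == value for v in model.values())


def announce(value, model):
    """Announce p (value=True) or ¬p (value=False): keep the matching worlds."""
    return {w: v for w, v in model.items() if v == value}


def diamond_then_know(value, actual, model):
    """〈p〉Kp (value=True) or 〈¬p〉K¬p (value=False) at the actual world."""
    if model[actual] != value:
        return False  # only formulas true at the actual world can be announced
    return knows(value, announce(value, model))


# All single-agent models over at most one p-world and one ¬p-world.
models = [
    {"w1": True, "w0": False},  # the agent is ignorant about p
    {"w1": True},               # the agent already knows p
    {"w0": False},              # the agent already knows ¬p
]

ignorant_in_first = not knows(True, models[0]) and not knows(False, models[0])

# 〈p〉Kp ∨ 〈¬p〉K¬p at every world of every model considered.
validity_holds = all(
    diamond_then_know(True, actual, model)
    or diamond_then_know(False, actual, model)
    for model in models
    for actual in model
)
```

Checking three models is of course no proof of validity; it only makes concrete how announcing the actual value of p always leads to Kp ∨ K¬p.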

Another paradox in philosophical logic that has been analyzed with DEL methods (and that has similar ‘Moore sentence’-like symptoms) is the Surprise Examination. This has been investigated in works such as [30, 31, 89], and more recently by Baltag and Smets using plausibility epistemic structures, along the lines of [16].

Parts of the materials for this overview have been taken from [88, 47, 84], and subsequently revised to make it into a single comprehensive text.

7. References and Further Reading

  • [1] P. Aczel. Non-Well-Founded Sets. CSLI Publications, Stanford, CA, 1988. CSLI Lecture Notes 14.
  • [2] T. Ågotnes, J. van Benthem, H. van Ditmarsch, and S. Minica. Question-Answer games, 2011.
  • [3] M. Aloni, A. Butler, and P. Dekker, editors. Questions in Dynamic Semantics. Elsevier, Amsterdam, 2007.
  • [4] G. Aucher. A combined system for update logic and belief revision. In Proc. of 7th PRIMA, pages 1–17. Springer, 2005. LNAI 3371.
  • [5] G. Aucher and A. Herzig. From DEL to EDL : Exploring the power of converse events. In K. Mellouli, editor, Proc. of ECSQARU, LNCS 4724, pages 199–209. Springer, 2007.
  • [6] G. Aucher and F. Schwarzentruber. On the complexity of dynamic epistemic logic. In Proc. of 14th TARK, 2013.
  • [7] J. L. Austin. How to Do Things with Words. Clarendon Press, Oxford, 1962.
  • [8] K. Bach. Speech acts. In E. Craig, editor, Routledge Encyclopedia of Philosophy, volume 8, pages 81–87. Routledge, London, 1998.
  • [9] P. Balbiani, A. Baltag, H. van Ditmarsch, A. Herzig, T. Hoshi, and T. De Lima. ‘Knowable’ as ‘known after an announcement’. Review of Symbolic Logic, 1(3):305–334, 2008.
  • [10] A. Baltag. A logic for suspicious players: epistemic actions and belief-updates in games. Bulletin of Economic Research, 54(1):1–45, 2002.
  • [11] A. Baltag, B. Coecke, and M. Sadrzadeh. Algebra and sequent calculus for epistemic actions. Electronic Notes in Theoretical Computer Science, 126:27–52, 2005.
  • [12] A. Baltag, B. Coecke, and M. Sadrzadeh. Epistemic actions as resources. J. of Logic Computat., 17(3):555–585, 2007.
  • [13] A. Baltag and L. S. Moss. Logics for epistemic programs. Synthese, 139:165–224, 2004.
  • [14] A. Baltag, L. S. Moss, and S. Solecki. The logic of public announcements, common knowledge, and private suspicions. In I. Gilboa, editor, Proceedings of TARK 98, pages 43–56, 1998.
  • [15] A. Baltag and S. Smets. A qualitative theory of dynamic interactive belief revision. In Proc. of 7th LOFT, Texts in Logic and Games 3, pages 13–60. Amsterdam University Press, 2008.
  • [16] A. Baltag and S. Smets. Group belief dynamics under iterated revision: fixed points and cycles of joint upgrades. In Proc. of 12th TARK, pages 41–50, 2009.
  • [17] A. Baltag, H. van Ditmarsch, and L.S. Moss. Epistemic logic and information update. In J. van Benthem and P. Adriaans, editors, Handbook on the Philosophy of Information, pages 361–456, Amsterdam, 2008. Elsevier.
  • [18] P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, Cambridge, 2001. Cambridge Tracts in Theoretical Computer Science 53.
  • [19] O. Board. Dynamic interactive epistemology. Games and Economic Behaviour, 49:49–80, 2004.
  • [20] G. Bonanno. A simple modal logic for belief revision. Synthese (Knowledge, Rationality & Action), 147(2):193–228, 2005.
  • [21] B. Brogaard and J. Salerno. Fitch’s paradox of knowability, 2004. http://plato.stanford.edu/archives/sum2004/entries/fitch-paradox/.
  • [22] J. Cantwell. Some logics of iterated belief change. Studia Logica, 63(1):49–84, 1999.
  • [23] I. Ciardelli and F. Roelofsen. Inquisitive dynamic epistemic logic. Manuscript, 2013.
  • [24] C. Dégremont. The Temporal Mind. Observations on the logic of belief change in interactive systems. PhD thesis, University of Amsterdam, 2011. ILLC Dissertation Series DS-2010-03.
  • [25] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. MIT, Cambridge, Massachusetts, 1995.
  • [26] FIPA. FIPA communicative act library specification, 2002. http://www.fipa.org/.
  • [27] F.B. Fitch. A logical analysis of some value concepts. The Journal of Symbolic Logic, 28(2):135–142, 1963.
  • [28] N. Friedman and J.Y. Halpern. A knowledge-based framework for belief change – part i: Foundations. In Proc. of 5th TARK, pages 44–64. Morgan Kaufmann, 1994.
  • [29] P. Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. Bradford Books, MIT Press, Cambridge, MA, 1988.
  • [30] J. Gerbrandy. Bisimulations on Planet Kripke. PhD thesis, University of Amsterdam, 1998. ILLC Dissertation Series DS-1999-01.
  • [31] J. Gerbrandy. The surprise examination in dynamic epistemic logic. Synthese, 155(1):21–33, 2007.
  • [32] J. Gerbrandy and W. Groeneveld. Reasoning about information change. J. Logic, Lang., Inform., 6:147–169, 1997.
  • [33] P. Girard. Modal logic for belief and preference change. PhD thesis, Stanford University, 2008. ILLC Dissertation Series DS-2008-04.
  • [34] P. Gochet. The dynamic turn in twentieth century logic. Synthese, 130(2):175–184, 2002.
  • [35] J. Groenendijk and M. Stokhof. Questions. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 1055–1124. Elsevier, Amsterdam, 1997.
  • [36] J. Groenendijk, M. Stokhof, and F. Veltman. Coreference and modality. In S. Lappin, editor, The Handbook of Contemporary Semantic Theory, pages 179–213. Blackwell, Oxford, 1996.
  • [37] W. Groeneveld. Logical Investigations into Dynamic Semantics. PhD thesis, University of Amsterdam, 1995. ILLC Dissertation Series DS-1995-18.
  • [38] A. Grove. Two modellings for theory change. Journal of Philosophical Logic, 17:157–170, 1988.
  • [39] D. Harel. First-Order Dynamic Logic. LNCS 68. Springer, 1979.
  • [40] D. Harel. Dynamic logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume II, pages 497–604, Dordrecht, 1984. Kluwer Academic Publishers.
  • [41] Vincent Hendricks and John Symons. Epistemic logic. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Spring 2006.
  • [42] J. Hintikka. Knowledge and Belief. Cornell University Press, Ithaca, NY, 1962.
  • [43] W. Holliday and T. Icard. Moorean phenomena in epistemic logic. In L. Beklemishev, V. Goranko, and V. Shehtman, editors, Advances in Modal Logic 8, pages 178–199. College Publications, 2010.
  • [44] J. Hulstijn. Presupposition accommodation in a constructive update semantics. In G. Durieux, W. Daelemans, and S. Gillis, editors, Proceedings of CLIN VI, 1996.
  • [45] J. van Benthem, J. Gerbrandy, and B. Kooi. Dynamic update with probabilities. Studia Logica, 93(1):67–96, 2009.
  • [46] S. Konieczny and R. Pino Pérez. Merging information under constraints: A logical framework. Journal of Logic and Computation, 12(5):773–808, 2002.
  • [47] B.P. Kooi. Dynamic epistemic logic. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 671–690. Elsevier, 2011. Second edition.
  • [48] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44:167–207, 1990.
  • [49] R. Lang. Questions as epistemic requests. In H. Hiż, editor, Questions, pages 301–318. Reidel, Dordrecht, 1978.
  • [50] N. Laverny. Révision, mises à jour et planification en logique doxastique graduelle. PhD thesis, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France, 2006.
  • [51] D.K. Lewis. Counterfactuals. Harvard University Press, Cambridge (MA), 1973.
  • [52] S. Lindström and W. Rabinowicz. DDL unlimited: dynamic doxastic logic for introspective agents. Erkenntnis, 50:353–385, 1999.
  • [53] F. Liu. Changing for the Better: Preference Dynamics and Agent Diversity. PhD thesis, University of Amsterdam, 2008. ILLC Dissertation Series DS-2008-02.
  • [54] C. Lutz. Complexity and succinctness of public announcement logic. In Proceedings AAMAS 06, Hakodate, Japan, 2006.
  • [55] J.-J. Ch. Meyer and W. van der Hoek. Epistemic Logic for AI and Computer Science. Cambridge University Press, Cambridge, 1995.
  • [56] T.A. Meyer, W.A. Labuschagne, and J. Heidema. Refined epistemic entrenchment. Journal of Logic, Language, and Information, 9:237–259, 2000.
  • [57] G.E. Moore. A reply to my critics. In P.A. Schilpp, editor, The Philosophy of G.E. Moore, pages 535–677. Northwestern University, Evanston IL, 1942. The Library of Living Philosophers (volume 4).
  • [58] L. S. Moss. From hypersets to Kripke models in logics of announcements. In J. Gerbrandy, M. Marx, M. de Rijke, and Y. Venema, editors, JFAK. Essays Dedicated to Johan van Benthem on the Occasion of his 50th Birthday, Amsterdam, 1999. Amsterdam University Press.
  • [59] Michal Peliš and Ondrej Majer. Logic of questions and public announcements. In Nick Bezhanishvili, Sebastian Löbner, Kerstin Schwabe, and Luca Spada, editors, Logic, Language, and Computation, pages 145–157. Springer, 2011. LNCS 6618.
  • [60] J. Peregrin, editor. Meaning: the dynamic turn. Elsevier, Amsterdam, 2003.
  • [61] J. Plaza. Logics of public communications. Synthese, 158(2):165–179, 2007. This paper was originally published as Plaza, J. A. (1989). Logics of public communications. In M. L. Emrich, M. S. Pfeifer, M. Hadzikadic, and Z.W. Ras (Eds.), Proceedings of ISMIS: Poster session program (pp. 201–216). Publisher: Oak Ridge National Laboratory, ORNL/DSRD-24.
  • [62] G. R. Renardel de Lavalette. Changing modalities. J. Logic and Comput., 14(2):253–278, 2004.
  • [63] B. Renne. A survey of dynamic epistemic logic. manuscript, 2008.
  • [64] J. Sack. Adding Temporal Logic to Dynamic Epistemic Logic. PhD thesis, Indiana University, Bloomington, USA, 2007.
  • [65] J. Salerno, editor. New Essays on the Knowability Paradox. Oxford University Press, Oxford, UK, 2009.
  • [66] J. R. Searle. Speech Acts, An Essay in the Philosophy of Language. Cambridge University Press, Cambridge, 1969.
  • [67] K. Segerberg. Irrevocable belief revision in dynamic doxastic logic. Notre Dame Journal of Formal Logic, 39(3):287–306, 1998.
  • [68] K. Segerberg. Two traditions in the logic of belief: bringing them together. In H. J. Ohlbach and U. Reyle, editors, Logic, Language, and Reasoning, pages 135–147, Dordrecht, 1999. Kluwer.
  • [69] T. French, W. van der Hoek, P. Iliev, and B. Kooi. On the succinctness of some modal logics. Artificial Intelligence, 197:56–85, 2013.
  • [70] B. D. ten Cate. Internalizing epistemic actions. In M. Martinez, editor, Proceedings of the NASSLLI 2002 student session, pages 109 – 123, Stanford University, 2002.
  • [71] J. van Benthem. Semantic parallels in natural language and computation. In Logic Colloquium ’87, Amsterdam, 1989. North-Holland.
  • [72] J. van Benthem. Exploring Logical Dynamics. CSLI Publications, Stanford, 1996.
  • [73] J. van Benthem. Games in dynamic-epistemic logic. Bulletin of Economic Research, 53(4):219–248, 2001.
  • [74] J. van Benthem. Logics for information update. In J. van Benthem, editor, Proceedings of TARK 2001, pages 51–67, San Francisco, 2001. Morgan Kaufmann.
  • [75] J. van Benthem. What one may come to know. Analysis, 64(2):95–105, 2004.
  • [76] J. van Benthem. ‘One is a lonely number’: on the logic of communication. In Z. Chatzidakis, P. Koepke, and W. Pohlers, editors, Logic Colloquium ’02. ASL, Poughkeepsie, 2006.
  • [77] J. van Benthem. Dynamic logic of belief revision. Journal of Applied Non-Classical Logics, 17(2):129–155, 2007.
  • [78] J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
  • [79] J. van Benthem. Logic in Games. MIT Press, 2013. To appear.
  • [80] J. van Benthem, J.D. Gerbrandy, T. Hoshi, and E. Pacuit. Merging frameworks for interaction. Journal of Philosophical Logic, 38:491–526, 2009.
  • [81] J. van Benthem, J. van Eijck, and B. Kooi. Logics of communication and change. Information and Computation, 204(11):1620–1662, 2006.
  • [82] W. van der Hoek and R. Verbrugge. Epistemic logic: a survey. In L.A. Petrosjan and V.V. Mazalov, editors, Game theory and Applications, volume 8, pages 53–94, 2002.
  • [83] H. van Ditmarsch, A. Herzig, and T. De Lima. From situation calculus to dynamic epistemic logic. Journal of Logic and Computation, 21(2):179–204, 2011.
  • [84] H. van Ditmarsch, W. van der Hoek, and P. Iliev. Everything is knowable – how to get to know whether a proposition is true. Theoria, 78(2):93–114, 2012.
  • [85] H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic epistemic logic with assignment. In Proc. of 4th AAMAS, pages 141–148. ACM, 2005.
  • [86] H. P. van Ditmarsch. Knowledge games. PhD thesis, University of Groningen, 2000. ILLC Dissertation Series DS-2000-06.
  • [87] H. P. van Ditmarsch. Descriptions of game actions. J. Logic, Lang., Inform., 11:349–365, 2002.
  • [88] H. P. van Ditmarsch. Prolegomena to dynamic logic for belief revision. Synthese, 147:229–275, 2005.
  • [89] H. P. van Ditmarsch and B. Kooi. The secret of my success. Synthese, 151(2):201–232, 2006.
  • [90] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Concurrent dynamic epistemic logic. In V. F. Hendricks, K. F. Jørgensen, and S. A. Pedersen, editors, Knowledge Contributors, pages 45–82. Kluwer, Dordrecht, 2003.
  • [91] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic. Springer, Berlin, 2007.
  • [92] Peter Vanderschraaf and Giacomo Sillari. Common knowledge. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Fall 2007.
  • [93] F. Veltman. Defaults in update semantics. Journal of Philosophical Logic, 25:221–261, 1996.
  • [94] Y. Wang, L. Kuppusamy, and J. van Eijck. Verifying epistemic protocols under common knowledge. In Proc. of 12th TARK, pages 257–266. ACM, 2009.

 

Author Information

Hans van Ditmarsch
Email: hans.van-ditmarsch@loria.fr
University of Lorraine
France

and

Wiebe van der Hoek
Email: wiebe@csc.liv.ac.uk
The University of Liverpool
United Kingdom

and

Barteld Kooi
Email: B.P.Kooi@rug.nl
University of Groningen
Netherlands

Classification

One of the main topics of scientific research is classification. Classification is the operation of distributing objects into classes or groups, which are in general fewer in number than the objects themselves. Its long history has developed over four periods: (1) Antiquity, where its lineaments may be found in the writings of Plato and Aristotle; (2) the Classical Age, with natural scientists from Linnaeus to Lavoisier; (3) the 19th century, with the growth of chemistry and information science; and (4) the 20th century, with the arrival of mathematical models and computer science. Since that time, and from an extensional viewpoint, mathematics, specifically the theory of orders and the theory of graphs or hypergraphs, has facilitated the precise study of strong and weak forms of order in the world, and the computation of all the possible partitions, chains of partitions, covers, hypergraphs or systems of classes that we can construct on a domain. With the development of computer science, Artificial Intelligence, and new kinds of languages such as object-oriented languages, an intensional approach has complemented the extensional one. Ancient discussions between Aristotle and Plato, Ramus and Pascal, Jevons and Joseph found a kind of revival in object-oriented modeling and programming, since most object-oriented languages are concerned with hierarchies, or partial orders: these structures reflect the relations between classes in those languages, which generally admit single or multiple inheritance. In spite of these advances, most classifications are still based on the evaluation of resemblances between the objects that constitute the empirical data. This resemblance is almost always computed by means of some notion of distance and some algorithm for aggregating classes. So all these classifications remain, for technical and epistemological reasons that are detailed below, very unstable.
A real algebra of classifications, which could explain their properties and the relations existing between them, is lacking. Though a general theory of classifications may well be wishful thinking, a recent conjecture gives hope that a metaclassification (or classification of all classification schemes) is possible.

Table of Contents

  1. General Introduction: Classification Problems
  2. A Brief History of Classifications
    1. From Antiquity to the Renaissance
    2. From Classical Age to Victorian Taxonomy
    3. The Beginning of Modernity
  3. The Problem of Information Storage and Retrieval
  4. Ranganathan and the PMEST Scheme
  5. Order and Mathematical Models
    1. Extensional Structures
    2. A Glance at an Intensional Approach
  6. The Idea of a General Theory of Classifications
  7. References and Further Readings

1. General Introduction: Classification Problems

Classification problems are one of the basic topics of scientific research. For example, mathematics, physics, natural sciences, social sciences and, of course, library and information sciences all make use of taxonomies.  Classification is a very useful tool for ordering and organization. It has increased knowledge and helped to facilitate information retrieval.

Roughly speaking, ‘classification’ is the operation of sharing, distributing or allocating objects into classes or groups which are, in general, fewer in number than the objects themselves. Commonly, classifications are defined on finite sets. However, if the objects are, for example, mathematical structures, there can be infinite classifications. In this case, the previous requirement must of course be weakened: we may only want the (infinite) cardinal of the classification to be less than or equal to the (infinite) cardinal of the set of objects to be classified. What we call ‘classification’ is also the result of this operation. We want this result to be, as far as possible, constant; that is, we want the classification itself to remain stable under small transformations of the data (of course, the sense of this requirement will have to be made clearer). Various situations may arise: the classes may intersect or not, be finite or infinite, formal or fuzzy, hierarchically ordered or not, and so on.

The basic operation of grouping elements into classes, which simplifies the world, is a very powerful operation, but it also raises many questions. In particular, a number of philosophers, from Socrates to Diderot and even post-modern philosophers, have criticized such an operation (see, for instance, Foucault 1967). Nevertheless, this operation has multiple benefits. First is the substitution of a rational and regular order for chaotic and muddled multiplicities. Second is the reduction of the size of sets, so that, once we have constituted equivalence classes, we can work with these classes and no longer with the elements. Third, and finally, to make a partition of a set means locating in it a symmetry that decreases the complexity of the problem and so simplifies the world. We can say with Dagognet (1984, 1990) that “less is more”: to compress the data really brings an intellectual gain.

Having outlined the main reasons for classifications, let us see how these classifications have developed and what forms they have taken over the course of time.

2. A Brief History of Classifications

The history of classifications (Dahlberg 1976) develops in four periods. From Plato and Aristotle to the 18th century, classifications are hierarchical, finite, and generally based on one single criterion. During the 18th century, new classifications appear which are multicriteria (a domain can be co-divided in many ways, as Kant said in his Logic; see Kant 1988) and indefinite or virtually infinite (Kant believed that we could endlessly subdivide the extension of a concept). From the end of the 18th century and through the 19th century, with the chemical classifications of Lavoisier and then of Mendeleyev, one discovers combinatorial classifications or multiple crossed orders, like the table of chemical elements, which correspond to a new concept of classification. In the 20th century, through the progress of mathematical order theory, factorial analysis of correspondence, and automatic classification, formal models begin to develop.

a. From Antiquity to the Renaissance

R. Joly, a French commentator on the Greek philosophers, said that a typical trend of the Greek spirit was to reduce a multiple and complex reality to a few categories which satisfy reason, both by their restricted number and by the clear and precise sense attached to each of them. Indeed, Plato and Aristotle are among the great classifiers of these ancient times.

In all of Plato’s Dialogues, and especially in the latest ones (Parmenides, Sophist, Politicus, Philebus), Plato classified a great many things (ways of life, political constitutions, pleasures, arts, jobs, kinds of knowledge, and so forth). Generally, for Plato, things were classified according to the distance that separates them from their archetypal forms, which yields some order (or pre-order) on them. Plato’s classifications are finite, hierarchical, dichotomous, and based on a single criterion. For example, in Gorgias (465c), the set of all practices is divided into two classes, the practices concerning the body and the practices concerning the soul, each of them being then divided into two others: gymnastics and medicine, on the one hand, and legislation and justice, on the other. In the same way, in Republic (510a), the whole universe, viewed as the set of all real things, is divided into the visible world and the invisible world, each class being subdivided in turn: into images and objects or living beings, on the one hand, and mathematical objects and ideas, on the other.

According to Plato, the rules of classification are very simple. First, we have to make symmetric divisions in order to get well-balanced classes. For example, if we classify peoples, we have to avoid setting the Greeks against all the other peoples, because one of the classes will be overfull while the other will have only one element (Politicus, 262a). Second, like a good cook carving an animal (the metaphor is in the Phaedrus), we must cut at the right joints or articulations. For example, in the field of numbers, it would be senseless to set 1000 against 999 other numbers. In contrast, the opposition even/odd, or prime/not prime, is a real one. Third, we must in general avoid using negative determinations. For example, we have to avoid determinations like not-A: because it is impossible for non-being to have sorts or species, these determinations block the development of thought.

Plato did not observe these wise rules, thus incurring Aristotle’s criticisms. Against Plato’s theory, Aristotle argues that the method of division is not a powerful tool because it is non-conclusive: it does not produce syllogisms (Prior Analytics, I, 31). In another text (Posterior Analytics, II, 5), Aristotle insists on the contingency of the passage from one predicate to another; that is, in the Platonic division, for every new attribute, we can wonder why it is this attribute rather than another. The differences introduced by dichotomies can also be purely negative and thus do not necessarily define a real being. Moreover, binary divisions presuppose that the number of the primitive species is a power of 2. In a division, a predicate can belong to different primitive species; for example, “bipedalism” can apply to both birds and humans. But, according to Aristotle, the application of this term is not the same in both cases. Finally, the Platonic division confuses extensional and intensional views. It can identify the triangle, which is a kind, with one of its properties, for example, the equality of the sum of its angles to two right angles.

The previous questions get no answer in Plato’s theory. Aristotle rejected Plato’s method of division. But Aristotle also rejected the Platonic doctrine of forms. According to Aristotle (Metaphysics, I, 9), Plato’s forms fail to explain how there could be permanence and order in the world. Moreover, he argued, Plato’s theory of forms cannot explain anything at all in our material world. The properties that the forms have (according to Plato the forms are eternal, unchanging, transcendent, and so forth) are not compatible with material objects, and the metaphor of participation or imitation breaks down in a number of cases. For instance, it is unclear what it means for a white object to participate in, or to copy, the form of whiteness; that is, it is hard to understand the relationship between the form of whiteness and white objects themselves.

For all these reasons, Aristotle develops his own concepts and his own logic of classifications. In the Topics (I, chap. 1), Aristotle introduces the notions of kind, species, and property, and a whole theory of basic predication that was subsequently developed in the work of Porphyry and Boethius. This theory is based on the opposition between essence, that is, all of the characters that define a thing, and accident, the qualities whose presence or absence does not modify the thing’s essence. A commentator on the Aristotelian system, Porphyry (234-305), puts these distinctions to good use and tries to specify the hierarchy of the kinds and the species as defined by Aristotle. The famous Porphyrian Tree is the first abstract tree outlining these distinctions, and it illustrates the subordination existing between them (see Figure 1).

Figure 1: The Porphyrian Tree

In a passage of his Commentary on Aristotle’s Categories (2014), Porphyry asked the questions that gave rise to a hotly debated controversy over whether universals are physical or immaterial substances, that is, over whether universals are separated from sensible things or are involved in them, finding their consistency therein. In opposition to the traditional views (Platonic and Aristotelian or scholastic realisms), other solutions appeared. For example, Nominalism (Roscelin, 11th c.) claimed that universals are but words and that nothing corresponds to them in nature, which knows only the singular. Against that was Conceptualism (Abélard, 12th c., and Ockham, 14th c.), the view that kinds exist as predicates of subjects that, themselves, are real. In the last centuries of the Middle Ages and in the Renaissance, we also find great scholars who worked on classification, in particular Francis Bacon (1561-1626), whose work on the classification of knowledge inspired the great librarians of the 19th century. But the logic of classifications, which at this time remains the Aristotelian logic, receives practically no new development until the 18th century.

b. From Classical Age to Victorian Taxonomy

In the Classical Age, taxonomy as a fully-fledged discipline began to develop for several reasons. One important reason emerges from the birth of natural science and the need to organize floras and faunas in connection with the growth of the human population on Earth, in the context of the beginning of agronomy (Dagognet, 1970). In this period, naturalists like Tournefort (1656-1708), Linnaeus (1707-1778), De Jussieu (1748-1836), Desfontaines (1750-1833) and Cuvier (1769-1832) tried to classify plants and animals all around the world.

When classifying things or beings, one needs a criterion or an index in order to form classes and to separate varieties inside the classes. The naturalists of this period differ precisely on the criteria of their classifications. For example, concerning the classification of plants, Tournefort chose the corolla, while Linnaeus chose the sexual organs of the plant. Concerning animals, the classification of Cuvier violates Aristotle’s recommendations by opposing vertebrates and invertebrates, a negative division which, as it happens, corresponds to something real. At the end of the 18th century, Kant summarizes, in his Logic (1800), the main part of the knowledge about classifications in this period, by specifying the definitions of a certain number of terms and operations that the naturalists of the time use empirically. Kant was only interested in the forms of classifications. In his Logic he defines a logical division of a concept as “the division of all the possible contained in it”. The rules of this division are the following: 1) members of the division are mutually exclusive, 2) their union restores the sphere of the divided concept, 3) each member of the division can be itself divided (the division of such divided members is a subdivision). Rules (1) and (2) seem to indicate that Kant was approaching our concept of a partition. But (3) shows that he does not have the concept of a chain of partitions, since he does not see that the subdivisions of the same level together form one and the same partition.

These problems were also discussed during the 19th century in Anglo-Saxon countries, even after Darwin’s theory of evolution. One may think that Darwin’s belief in branching evolution was based upon his familiarity with the taxonomy of his day, of which he was well aware. There were great taxonomists in England in the Victorian age and some of them (for instance, the paleontologist H. Alleyne Nicholson, a specialist in British Stromatoporoids) were prodigious and wrote monographs that are still authoritative today (Woodward 1903). At approximately the same time, H. Agassiz (Agassiz 1957), a scholar in classification theory, wrote about taxonomic concepts like categories, divisions, forms, homologies, analogies, and so on. The taxonomic systems mentioned in his Essay on Classification include the classical systems of Leuckart, Vogt, Linnaeus, Cuvier, Lamarck, de Blainville, Burmeister, Owen, Ehrenberg, Milne-Edwards, von Siebold, Stannius, Oken, Fitzinger, MacLeay, von Baer, van Beneden, and van der Hoeven. In The Origin of Species, Darwin himself said that it was a

truly wonderful fact…that all animals and all plants throughout all time and space should be related to each other in group subordinate to group, in the manner which we everywhere behold−namely, varieties of the same species most closely related together, species of the same genus less closely and unequally related together, forming sections and sub-genera, species of distinct genera much less closely related, and genera related in different degrees, forming sub-families, families, orders, subclasses, and classes. (1859, 128)

But what he called the “principle of divergence” (namely, the fact that during the modification of the descendants of any one species, and during the incessant struggle of all species to increase in numbers, the more diversified these descendants become, the better will be their chance of succeeding in the battle of life) was illustrated by his famous tree-like diagram, sketched in 1837 in the notebook in which he first posited evolution. From this time, tree-like structures, which were also of great use in chemistry and would be formalized at the end of the century by the mathematician Arthur Cayley, tended to replace classifications.

c. The Beginning of Modernity

A new kind of classification appeared at the end of the 18th century with the development of chemistry, namely, combinatorial classifications or multiple crossed orders. This kind of classification is either the crossing of two or more divisions, or the crossing of two or more hierarchies of divisions. In such a structure, as Granger (1967) said, “elements are distributed according to two or several dimensions, giving rise to a multiplication table”. In a combinatorial classification, the elements themselves are not necessarily distributed into classes. Only the components of these elements are classified. For Granger, this model refers to the Cartesian plane and to the ordinal principle on which it is based. The Cartesian plane results from the intention to order a certain distribution of points in space, by ordering the points in every row and then by ordering the rows themselves. The virtue of multiple orders is to place what is classified at the intersection of a row and a column. So, as Dagognet (1969) has shown, when an element is absent or there is an empty compartment, it can be defined by its surroundings. This is what happened in the Mendeleyev table. This table has two main advantages. First, the table is creative, so the mass of a chemical element can be calculated from those which surround it (see Figure 2); hence chemical elements which did not exist in Nature, and were synthesized in laboratories only 30 years later, had already been accounted for by Mendeleyev. Second, the classification is not a purely spatial picture of the world. Temporality, in particular the future, is already present in it.

Figure 2: The mass of an unknown element in the Mendeleyev Table
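
The idea that an empty compartment "can be defined by its surroundings" can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the gap is the cell now occupied by germanium, that its four neighbours in the table are silicon, tin, gallium and arsenic, and that a plain arithmetic mean is an adequate stand-in for the interpolation; the masses are rounded modern values, and neither the choice of element nor the averaging rule is taken from Figure 2.

```python
# Rounded modern atomic masses of the four cells surrounding the gap
# (assumed input data, not values taken from the article or its figure).
NEIGHBOURS = {"Si": 28.09, "Sn": 118.71, "Ga": 69.72, "As": 74.92}

# Rounded modern atomic mass of germanium, used only for comparison.
GERMANIUM = 72.63


def interpolate_mass(masses):
    """Estimate the mass of a missing cell as the mean of its neighbours."""
    masses = list(masses)
    return sum(masses) / len(masses)


estimate = interpolate_mass(NEIGHBOURS.values())
print(f"estimated mass: {estimate:.2f}, modern value: {GERMANIUM}")
```

Under these assumptions the estimate falls within about a quarter of a mass unit of the modern value, which is the sense in which the table can be called creative.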

3. The Problem of Information Storage and Retrieval

At the end of the 19th century, the development of scientific research, which raised the question of information storage and retrieval, encouraged the constitution of voluminous library catalogues. These included Dewey’s Decimal Classification, Otlet and La Fontaine’s Universal Decimal Classification, and the Library of Congress Classification. The aim of these kinds of classifications was to account for the whole of knowledge in the world. But many problems arose from this attempt by the library sciences to organize the whole of knowledge. Three rules were commonly respected in more natural classifications: 1) everything classified must appear in the catalogue (which must be, in principle, finite and complete), 2) there is no empty class, 3) nothing can belong to more than one class. Generally, these rules are not respected in library classifications. To face the extraordinary challenge of cataloguing knowledge that grows indefinitely over the course of time, the big library classifications designed at the end of the 19th century adopted the principle of decimalization. This system was used because decimal numbers, used as notations, permit indefinite extensions of classifications. Suppose you start with 10 main classes, from 0 to 9. If you add a digit to each number, you get the possibility of forming 100 classes (from 00 to 99), and if you go on, you can obtain 1000 classes (from 000 to 999). Then you can also put a comma or a point, and define items like 150.234. After the point, the sequence of digits is potentially infinite and you can go as far as is needed. Another difference is that library classifications can sometimes allow for vacant classes in their hierarchy, and can also allow a classified subject to be entered in several places. Vacant classes are used because a librarian must reserve some room for new documents that are still temporarily unclassified.
Multiple entries are also used because readers, who sometimes do not know exactly what they are looking for, need to have broad-ranging access to knowledge. This made way for the existence of entries like author, subject, time, place, and so forth. The previous requirement of decimalization is obvious in the Dewey Decimal Classification (DDC) proposed by Melvil Dewey in 1876 (Béthery 1982). This classification is made up of ten main classes or categories, each of them being divided into ten secondary classes or subcategories. These last ones contain in turn ten subdivisions. The partition of the ten main classes thus gives successively 100 divisions and 1000 sections.
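
The arithmetic of decimalization can be made concrete with a minimal sketch. The two helper names below, `decimal_classes` and `broader_classes`, are hypothetical and introduced only for illustration; the second assumes the three-digits-then-point layout of a notation such as 150.234 and is not a description of how any real library system stores its schedules.

```python
def decimal_classes(depth):
    """All notations with `depth` digits: depth=1 -> '0'..'9', depth=2 -> '00'..'99'."""
    return [str(n).zfill(depth) for n in range(10 ** depth)]


def broader_classes(notation):
    """List the chain of ever narrower classes leading to a decimal notation.

    Assumes three digits before the point, e.g. '150.234'.
    """
    digits = notation.replace(".", "")
    chain = []
    for i in range(1, len(digits) + 1):
        prefix = digits[:i]
        if i <= 3:
            item = prefix.ljust(3, "0")           # '1' -> '100', '15' -> '150'
        else:
            item = prefix[:3] + "." + prefix[3:]  # '1502' -> '150.2'
        if not chain or chain[-1] != item:        # skip repeats such as '150', '150'
            chain.append(item)
    return chain


print(len(decimal_classes(1)), len(decimal_classes(2)), len(decimal_classes(3)))
print(broader_classes("150.234"))
```

Each extra digit multiplies the number of available classes by ten, and every digit added after the point opens ten further subdivisions, which is why such a notation can be extended as far as needed.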

DDC — main sections

  • 000 – Computer Science, Information and General Works
  • 100 – Philosophy and Psychology
  • 200 – Religion
  • 300 – Social Sciences
  • 400 – Language
  • 500 – Science (including Mathematics)
  • 600 – Technology and Applied Science
  • 700 – Arts and Recreation
  • 800 – Literature
  • 900 – History, Geography and Biography

In the same way, the Universal Decimal Classification (UDC) of Otlet and La Fontaine broadly presents the same hierarchical organization, except for the fourth nodal class, which is left empty (thus applying the previous principle of vacant classes).

As librarians rapidly observed, one undesirable consequence of such decimal schemes is the increasing fragmentation of subjects as the taxonomist’s work proceeds. For example, the Dewey Classification, though it has the useful advantage of being infinitely extendible, rapidly turns into a list or a nomenclature. This is also the case of the UDC of Otlet and La Fontaine, and of all classifications of the same type. A first attempt to make up for this disadvantage consisted of allowing some junctions between categories in the classification. A second is the possibility of using tables (7 in the DDC) to aid in the search for a complex object, which may be located in different places. For instance, a book of poetry written by various poets from around the world would appear in several classes, indexed thanks to the tables. In general, DDC combines elements from different parts of the structure in order to construct a number representing the subject content. Such a number often combines two or more subject elements with linking numbers and geographical and temporal elements. The method consists of forming a new item rather than drawing upon a list containing each class and its meaning. For example, 330 (for Economics) + 9 (for Geographic Treatment) + 04 (for Europe) and the use of ‘/’ gives 330/94 (European Economy). Another example is the following: 973 (for United States) + 05 (division for periodicals) and the use of the point ‘.’ gives 973.05 (periodicals concerning the United States generally).

Other specific features occur in library classifications, which tend to make them very different from classical scientific taxonomies. One spectacular difference from hierarchical classifications in Zoology or Botany is, as we have already seen, that it is possible for subjects to appear in more than one class. For example, in DDC, a book on Mathematics could appear in the 372.7 section or in the 510 section, depending on whether the book is a monograph instructing teachers on how to teach mathematics or a mathematics textbook for children. Another difference is the relative flexibility of library classifications.

Despite improvements, UDC and DDC, like most of the classifications constructed at the same time (see Bliss 1929), are based on a perception of knowledge, and of the relationships between academic disciplines, that prevailed from 1890 to 1910. Moreover, though updated regularly, UDC and DDC, as decimal systems, are less hospitable to the addition of new subjects. These kinds of classification are based on fixed and historically dated categories. One may observe, for example, that none of the main concepts of our present library science (digital library, knowledge organization, automatic indexing, information retrieval, and so forth) were included in the index of the 2005 UDC edition, and that technical taxonomies generally require more complex features (Dobrowolski 1964).

4. Ranganathan and the PMEST Scheme

There have been many attempts to solve the aforementioned library problems. Some of them have been well known since the middle of the 20th century. In the course of the 20th century, new modes of indexing and original classification schedules appeared in library science with the Indian librarian Shiyali Ramamrita Ranganathan (1933, 2006) and his faceted classification, also called “Colon classification” (CC) because of its use of the colon to indicate relations between subjects in the former edition.

Ranganathan was at first a mathematician and knew little about libraries. But he took charge of the Madras University Library, and was then deputed by his University to study Library Science in London. There, he attended the School of Librarianship at University College and discovered, as he said later, the “charm of classifications”, and also its problems. He saw very quickly that Decimal Classifications did not satisfy users. By contrast, he had the vision of a Meccano set where, instead of having ready-made rigid toys, one can construct them from a few fundamental components. This made him think of a new kind of classification.

It appeared to Ranganathan that the new theory might be organized at the highest level in 5 fundamental categories (FC) called facets: Personality, Matter, Energy, Space and Time, or PMEST for short. Each isolate facet of a Compound Subject is deemed to be a manifestation of one (and only one) of the five fundamental categories. There are also subfacets, so that the facet scheme PMEST, and the subfacets we may form from it, are then used to sort subclasses in the main classes of the classification.

The difference with previous classifications is in the way one defines ‘subfacets’. Rather than simply dividing the main classes into a series of subordinate classes, one subdivides each main class by particular characteristics into facets. Facets, labeled by Arabic numbers, are then combined to make subordinate classes as needed. For example, Literature may be divided by the characteristic of language into the facet of Language, including English, German, and French. It may also be divided by form, which yields the facet of Form, including poetry, drama and fiction. So CC contains both basic subjects and their facets, which contain isolates. A basic subject stands alone, for example: Literature in the subject English Literature, while an isolate, in contrast, is a term that modifies a basic subject, for example, the term ‘English’. Every isolate in every facet must be a manifestation of one of the five fundamental categories in the PMEST scheme.
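
The way facets are "combined to make subordinate classes as needed" can be sketched as follows. The facet tables, the numeric codes, and the colon-joined notation are all hypothetical choices made for illustration; they mimic the spirit of the Literature example above and are not actual Colon Classification schedules.

```python
from itertools import product

# Hypothetical facet tables for the basic subject "Literature".
# Keys are made-up Arabic-number codes; values are the isolates.
FACETS = {
    "Language": {"1": "English", "2": "German", "3": "French"},
    "Form": {"1": "poetry", "2": "drama", "3": "fiction"},
}


def synthesize(basic_subject, choices):
    """Build a (notation, label) pair from a basic subject and chosen isolates.

    `choices` maps a facet name to the code of one isolate in that facet.
    Facets are always cited in the fixed order of the FACETS table.
    """
    codes, terms = [], []
    for facet, table in FACETS.items():
        if facet in choices:
            code = choices[facet]
            codes.append(code)
            terms.append(table[code])
    notation = ":".join([basic_subject] + codes)
    label = f"{basic_subject} ({', '.join(terms)})"
    return notation, label


# One subordinate class, built only when it is needed.
print(synthesize("Literature", {"Language": "1", "Form": "2"}))

# Every class that the two facets could generate (3 x 3 combinations).
all_classes = [
    synthesize("Literature", dict(zip(FACETS, combo)))
    for combo in product(*FACETS.values())
]
print(len(all_classes))
```

The point of the design is that the nine possible classes never need to be listed in advance: two short tables of three isolates each are enough to generate any of them on demand.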

The advantages of the CC are numerous. The first is a greater flexibility in determining new subjects and subject numbers. A second is the concept of phases, which allows taxonomists to readily combine most of the main classes in a subject. Consider, for example, a subject like Mathematics for biologists. In this case, single class number enumerative systems, such as those predominating in US libraries, tend to force classifiers to choose either Mathematics or Biology as the main subject. However, CC supplies a specific notation to indicate this bi-phased condition.

Nevertheless, some problems remain unsolved. In CC, facets, that is, small components of larger entities or units, are similar to the flat faces of a diamond, which reflect the underlying symmetry of the crystal structure, so that the general structure of Ranganathan’s classification, like that of a faceted classification in general, is a kind of permutohedron. In principle, all descriptions may be given, whatever the order of their terms. For example, if we have to classify a paper on seasonal variations of the concentration of noradrenaline in the tissue of the rat, we must get the same access whether we have the direct sequence: (1) Seasonal, variations, concentration, noradrenaline, tissue, rat, or the reversed one: (2) Rat, tissue, noradrenaline, concentration, variations, seasonal. In mathematical terms, this means that the underlying structure that makes this transformation possible must be a commutative group. But this is not always the case, and for some dihedral groups, this structure is even forbidden. Another potential worry is that the PMEST scheme, which certainly has some connections with Indian thought, is far from being universally accepted (see De Grolier 1962) and has not been implemented very often in libraries, even in India.
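
The requirement that the direct and the reversed sequences give "the same access" can be illustrated with a small sketch. It assumes, as a deliberate simplification, that a document is indexed under the unordered set of its descriptors; the function names and the document identifier are hypothetical. This sidesteps, rather than settles, the algebraic question raised above, since it simply discards the order of the terms.

```python
from itertools import permutations


def access_key(descriptors):
    """Order-independent key: the set of descriptors, ignoring case."""
    return frozenset(term.lower() for term in descriptors)


def add_document(index, descriptors, doc_id):
    """Store a document identifier under its order-independent key."""
    index.setdefault(access_key(descriptors), []).append(doc_id)


def lookup(index, descriptors):
    """Return the documents filed under these descriptors, in any order."""
    return index.get(access_key(descriptors), [])


index = {}
direct = ["Seasonal", "variations", "concentration", "noradrenaline", "tissue", "rat"]
add_document(index, direct, "paper-001")  # hypothetical identifier

reversed_sequence = list(reversed(direct))
print(lookup(index, reversed_sequence))

# All 6! = 720 orderings of the six descriptors collapse to a single key.
keys = {access_key(p) for p in permutations(direct)}
print(len(keys))
```

A set-based key makes every permutation equivalent by construction; a scheme in which the order of facets carries meaning would need a richer structure than this.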

So, in spite of all the improvements they have received over the course of time, library classifications have faced a lot of problems. In particular, they were strongly challenged in the 20th century by the proliferating development of knowledge. First, the ceaseless flux of new documents forbids a rigid topology for classifications. The problem, then, is to know how to construct evolutionary structures. Second, the successive orderings of knowledge (groupings and revisions, and not only ramifications) have called for powerful, relational and automated documentary languages. Classifications still remain necessary, because documentary languages cannot do everything. So the problem is still open. But, with the great development of mathematics in the last century, this general problem, which is the great problem of order, has to be investigated by means of mathematical structures.

5. Order and Mathematical Models

The first attempts to study orders in mathematics began at the end of the 19th century with Peano, Dedekind and Cantor (especially with his theory of ordinals, which are linearly ordered sets). They continued with Peirce (1880) and Schröder (1890) and their works on the question of an algebra of logic. Then, in the first part of the 20th century, comes the notion of partial order, with an article of MacNeille (1937) and the famous work of G. Birkhoff (1967), who introduced the notion of lattice, algebraically developed later in the great book of Rasiowa and Sikorski (1970). During the same period, mathematical models of hierarchical classifications, which were investigated in the USA by Sokal and Sneath (1963, 1973) and in England by Jardine and Sibson (1971), were developed in France in the works of Barbut and Monjardet (1970), Lerman (1970, 1981), and Benzécri (1973). All these works presupposed the major advances of the last century in mathematical order theory: especially the papers of Birkhoff (1935), Dubreil-Jacotin (1939), Ore (1942, 1943), Krasner (1953-1954) and Riordan (1958). The Belgian logician Leo Apostel (1963) and the Polish mathematicians Luszczewska-Romahnowa and Batog (1965a, 1965b) also published important articles on the subject. The increasingly important use of computers in the search for automatic classifications was also, in those years, a reason for researchers to become interested in mathematical models.

As there are many forms of classifications in the world of knowledge (we can find them, as we have seen, in mathematics, natural sciences, library and information science, and so forth) there are also many possible mathematical models for classifications. We begin with the study of extensional structures.

a. Extensional Structures

In order to clarify the situation, we start with the weakest of these structures and move to stronger ones. Mathematics allows us to begin with very few axioms, which usually define weak general structures; afterwards, by adding new conditions, one obtains further properties and stronger models. In our case, the weakest structure is just a hypergraph H = (X,P) in the sense of Berge (1970), with X a set of vertices and P a set of nonempty subsets of X called edges (See Figure 3).

Figure 3: A Hypergraph

In this case, the set of edges P does not necessarily cover the set X, and some vertices (those of degree zero) may belong to no edge. Assume the following conditions:

(C0)    X ∈ P,

(C1)    For all x ∈ X, {x} ∈ P,

Accordingly, we have a system of classes (in the sense of Brucker-Barthélemy 2007).

Add now the following new conditions, for all Pi, Pj ∈ P with i ≠ j:

(C2)      Pi ∩ Pj = Ø,

(C3)      ∪ Pi = X (the union being taken over all Pi ∈ P).

Then P is a partition of X and the Pi are the blocks of the partition P.
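Conditions (C2) and (C3) are easy to test mechanically. The following is a minimal Python sketch, not part of the cited works; the function name is_partition and the representation of blocks as Python sets are assumptions made for illustration only.

```python
from itertools import combinations

def is_partition(blocks, universe):
    """Check (C2) pairwise disjointness and (C3) covering for a family of blocks."""
    blocks = [frozenset(b) for b in blocks]
    universe = frozenset(universe)
    if any(len(b) == 0 for b in blocks):
        return False  # edges of the hypergraph are nonempty by definition
    disjoint = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))  # (C2)
    covering = frozenset().union(*blocks) == universe                    # (C3)
    return disjoint and covering

X = {1, 2, 3, 4}
print(is_partition([{1, 2}, {3}, {4}], X))     # blocks are disjoint and cover X
print(is_partition([{1, 2}, {2, 3}, {4}], X))  # (C2) fails: 2 is in two blocks
print(is_partition([{1, 2}, {4}], X))          # (C3) fails: 3 is not covered
```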

Let now P(X) be the set of partitions of a nonempty finite set X. We may define on P(X) a partial order relation ≤ (reflexive, antisymmetric and transitive), the refinement order, such that (P(X), ≤) is a lattice in the sense of Birkhoff (1967), that is, a partial order in which every pair of elements has a least upper bound and a greatest lower bound. Then, one can prove that all the chains (all the linearly ordered sequences of partitions) of this lattice are equivalent to hierarchical classifications. So, the set C(X) of all these chains is exactly the set of all hierarchical classifications on X. This set C(X) has itself a mathematical structure: it is a semilattice for set intersection. This model allows us to get all the possible partitions of P(X) and all the possible chains of C(X) (See Figure 4).

Figure 4: The lattice of partitions of a 4-element set.
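To make the link between chains of partitions and hierarchical classifications concrete, here is a small Python sketch. It assumes that the order on partitions is refinement (each block of the finer partition lies inside a block of the coarser one); the helper names refines, is_chain and P are hypothetical and chosen for this illustration only.

```python
def refines(p, q):
    """True if partition p is finer than (or equal to) partition q."""
    return all(any(block <= other for other in q) for block in p)

def is_chain(partitions):
    """True if the sequence is linearly ordered by refinement, finest first.

    Refinement is transitive, so comparing consecutive partitions is enough.
    """
    return all(refines(p, q) for p, q in zip(partitions, partitions[1:]))

def P(*blocks):
    """Build a partition from strings, e.g. P("ab", "c") is {{a, b}, {c}}."""
    return frozenset(frozenset(b) for b in blocks)

# A hierarchical classification of {a, b, c, d}: a chain from singletons to the whole set.
hierarchy = [P("a", "b", "c", "d"), P("ab", "c", "d"), P("ab", "cd"), P("abcd")]
# Two partitions that are incomparable in the lattice, hence not a chain.
not_a_chain = [P("ab", "c", "d"), P("ac", "b", "d")]

print(is_chain(hierarchy))
print(is_chain(not_a_chain))
```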

A first problem is that such partitions are very numerous. For |X| = 9, for example, there are already 21147 partitions. So, when we want to classify some domain of objects (plants, animals, books, and so forth), it is not easy to determine which classification is the best one among, say, several thousand of them.
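The number of partitions of an n-element set is the Bell number B(n). The following Python sketch computes it with the Bell triangle; the function name bell is an assumption of this illustration.

```python
def bell(n):
    """Number of partitions of an n-element set, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry to its left and the one above that.
        nxt = [row[-1]]
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[0]

for n in (4, 9):
    print(n, bell(n))
```

For n = 4 this gives the 15 partitions of the lattice in Figure 4, and for n = 9 the 21147 partitions mentioned above.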

A second problem is that the world is not made of chains of partitions. If it were, of course, the game would be over: everything could be inserted in some hierarchical classification. But the real world has no reason to present itself as a hierarchical classification. In the real world, we generally have to deal with quite chaotic entities, complicated fuzzy classes and poorly structured objects, all of which form what we can call ‘rough data’. So when we want to get a clear order, we have to construct it by extracting it from the complicated data. For that, we have to compare objects and know the degree to which they are similar, and to do so we need a notion of ‘similarity’. In order to make empirical classifications, we must evaluate the similarities or dissimilarities between the elements to be classified. In the history of taxonomic science, Buffon (1749) and Adanson (1757) tried to understand the meaning of this evaluation in the following way. First, they claim, we have to measure the distance between the objects by means of some index, so that we can build classes. Afterwards, we have to measure the distance between the classes themselves, so that we can group some classes into classes of classes, and so replace the initial set of objects with an ordered set of classes that is less numerous.

What the old taxonomists did on the sole basis of observation can now be carried out with the help of mathematics, using a modern notion of distance. Lerman (1970) and Benzécri (1973) showed that a hierarchical classification, that is, a chain of partitions, is nothing but a particular kind of distance or a particular kind of dissimilarity (Van Cutsem 1994). It is an ultrametric distance, that is, a distance d satisfying d(x, z) ≤ max(d(x, y), d(y, z)) for all x, y, z. Such a distance gives tree representations (Barthélemy and Guénoche 1988) and also has the special property of corresponding exactly to the chain, so that, when considering all the chains, the set of their corresponding distance matrices forms a semiring (R, +, ×) when we interpret the lattice operations min and max in an unusual but clever manner (+ for min, × for max) (Gondran 1976). Problems arise when the distance between the objects classified is not ultrametric. In such cases, we have to choose the closest ultrametric smaller than the given distance (the so-called subdominant ultrametric), and so obtain the best hierarchical classification available, the one closest to the data. However, this kind of approach leads, in general, to relatively unstable classifications.
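As an illustration, the following Python sketch tests the ultrametric inequality and computes the largest ultrametric lying below a given distance matrix. It uses a min-max variant of the Floyd-Warshall algorithm, which is one standard way to obtain this ultrametric and is not claimed to be the method of the authors cited; the function names and the small matrix d are assumptions of the example.

```python
from itertools import permutations

def is_ultrametric(d):
    """Check d[i][k] <= max(d[i][j], d[j][k]) for all triples of distinct points."""
    n = len(d)
    return all(d[i][k] <= max(d[i][j], d[j][k])
               for i, j, k in permutations(range(n), 3))

def subdominant_ultrametric(d):
    """Largest ultrametric below d: u[i][j] is the smallest possible value,
    over all paths from i to j, of the longest step on the path."""
    n = len(d)
    u = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

# A symmetric distance on three objects that is not ultrametric (4 > max(1, 2)).
d = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]
u = subdominant_ultrametric(d)
print(is_ultrametric(d), is_ultrametric(u))
print(u)
```

In this example only the entry between the first and third objects is lowered (from 4 to 2); the resulting matrix corresponds to the hierarchy that first merges objects 0 and 1 at level 1, then adds object 2 at level 2.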

Indeed, there are two kinds of instability for classifications. The first, intrinsic instability, is associated with the plurality of methods (distances, algorithms and so forth) that can be used to classify objects. The second, extrinsic instability, is connected to the fact that our knowledge changes with time, so that the definitions of objects (or of the attributes of the objects) evolve.

An answer to the question of intrinsic instability is a theorem of Lerman (1970), which says that if the number of attributes (or properties) possessed by the objects of a set X is constant, the associated quasi-order given by any natural metric is the same. But this result has two limits. First, when the sample variance of the number of attributes is large, the stability is of course lost. Second, if we classify the attributes instead of classifying the objects, the reverse is not true.
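The stability result can be illustrated on a toy example. The sketch below assumes that the Hamming and Jaccard indices count among the "natural metrics" in question, which is an assumption of this illustration and not a statement of Lerman's theorem; the data and function names are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of attributes possessed by exactly one of the two objects."""
    return len(a ^ b)

def jaccard(a, b):
    """One minus the ratio of shared attributes to all attributes involved."""
    return 1 - len(a & b) / len(a | b)

def same_preorder(objects, d1, d2):
    """True if d1 and d2 rank all pairs of objects in the same way (ties included)."""
    sign = lambda x: (x > 0) - (x < 0)
    pairs = list(combinations(sorted(objects), 2))
    for (a, b), (c, e) in combinations(pairs, 2):
        s1 = sign(d1(objects[a], objects[b]) - d1(objects[c], objects[e]))
        s2 = sign(d2(objects[a], objects[b]) - d2(objects[c], objects[e]))
        if s1 != s2:
            return False
    return True

# Every object has exactly three attributes: both indices give the same quasi-order.
constant = {"x": {1, 2, 3}, "y": {1, 2, 4}, "z": {1, 5, 6}, "w": {4, 5, 6}}
# The number of attributes varies strongly: the two quasi-orders disagree.
varying = {"a": {1}, "c": {2}, "b": {1, 2, 3, 4, 5}, "e": {1, 2, 3, 4, 6, 7}}

print(same_preorder(constant, hamming, jaccard))
print(same_preorder(varying, hamming, jaccard))
```

When every object has the same number k of attributes, both indices are monotone functions of the number of shared attributes, which is why the two orderings coincide in the first data set.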

For extrinsic instability the answers are more difficult to find. We may appeal to methods used in library decimal classifications (UDC, Dewey, and so forth), which make infinitely ramified extensions possible, but these classifications, as we have seen, tend to assume that higher levels are invariant and also have the disadvantage of being enumerative and of degenerating rapidly into simple lists. We may also appeal to pseudo-complemented structures (Hilman 1965), which admit some kinds of waiting boxes (or compartments) for indexing things that are not yet classified. There are, as well, structures whose transformations obey certain rules that have been fixed in advance. That is the case of Hopcroft 3-2 trees (Aho, Hopcroft, Ullman 1983), for instance, or of structures close to these (Larson and Walden, 1979). In recent years, new models for making classifications have come from conceptual formal analysis (Barwise and Seligman, 2003), computer science, or views using non-classical logics in the domain of formal ontologies (Smith 1997, 2003). In computer science, for example, the concept of Abstract Data Type (ADT), related to the concept of data abstraction that is important in object-oriented programming, may be viewed as a generalization of mathematical structures. An ADT is a mathematical model for data types, where a data type is defined by its behavior from the point of view of a user of the data. More formally, an ADT may be defined as a “class of objects whose logical behavior is defined by a set of values and a set of operations” (Dale-Walker 1996), which is strictly analogous to algebraic structures in mathematics.

So, if we are not satisfied by a rough classification of ADTs into collections, streams and iterators (which support loops accessing data items), and relational data structures (which capture relationships between data items), we must admit that ADTs can also be regarded as a generalized approach to a number of algebraic structures, such as lattices, groups, and rings (Lidi 2004). Hence, classifications of ADTs turn into classifications in algebraic specifications of ADTs (Veglioni 1996). In this context, computer science adds nothing to mathematics, and the problem is now that a classification of mathematical structures using, for instance, category theory, as Pierce (1970) attempted, does not bring a sufficient answer, because a category may exist while its objects are not necessarily constructible (Parrochia-Neuville 2013).
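The analogy between ADTs and algebraic structures can be sketched in Python. In the hypothetical example below, a lattice is specified only by its two operations, and two unrelated representations (positive integers under divisibility, sets under inclusion) implement the same abstract type; all class and function names are assumptions made for this illustration.

```python
from abc import ABC, abstractmethod
from math import gcd

class Lattice(ABC):
    """An abstract data type: two operations, with no commitment to a representation."""

    @abstractmethod
    def meet(self, a, b): ...

    @abstractmethod
    def join(self, a, b): ...

    def leq(self, a, b):
        # The order is derived from the operations: a <= b iff meet(a, b) == a.
        return self.meet(a, b) == a

class DivisibilityLattice(Lattice):
    """Positive integers ordered by divisibility: meet is gcd, join is lcm."""
    def meet(self, a, b):
        return gcd(a, b)
    def join(self, a, b):
        return a * b // gcd(a, b)

class SubsetLattice(Lattice):
    """Sets ordered by inclusion: meet is intersection, join is union."""
    def meet(self, a, b):
        return a & b
    def join(self, a, b):
        return a | b

def satisfies_absorption(lattice, samples):
    """Check the absorption laws, part of the algebraic specification of a lattice."""
    return all(lattice.join(a, lattice.meet(a, b)) == a and
               lattice.meet(a, lattice.join(a, b)) == a
               for a in samples for b in samples)

print(satisfies_absorption(DivisibilityLattice(), [1, 2, 3, 4, 6, 12]))
print(satisfies_absorption(SubsetLattice(), [frozenset(), frozenset("a"), frozenset("ab")]))
```

A user of either class sees only the operations and the laws they satisfy, which is the sense in which the ADT is defined by its behavior.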

So, none of the previous approaches is very convincing for solving the basic problem, which always remains the same. We lack a general theory of classifications, which alone would be able to study and, in the best case, solve some of the main problems of classification.

b. A Glance at an Intensional Approach

Instead of making partitions by dividing a set of entities, so that the classes obtained in this way are extensional classes, as we saw in the previous section, we can proceed by associating a description to a set of entities. In this case, the classes are called intensional classes. Aristotle himself mixed the two points of view in his logic, but Leibniz was the first to propose a purely intensional interpretation of classes. For a long time, that view remained a minority one, and it never won unanimous support among philosophers and logicians (as the numerous discussions between Aristotle and Plato, Ramus and Pascal, Jevons and Joseph demonstrate). However, the development of computer science brought this view back, since for declarative languages, and particularly object-oriented languages, purely extensional classes or sets are rather uncommon. In this approach, the intension can be given either a priori, for example by a human actor from his knowledge of the domain, or a posteriori, when it is deduced from the analysis of a set of objects. In object-oriented modeling and programming, classes are traditionally defined a priori, with their extension mostly derived at run time. This is usually done manually (the intension being represented by logical predicates or tags), but techniques for a posteriori class discovery and organization also exist. In the context of programming languages, they deal with local modifications of the class hierarchy by adding intermediate classes, and use similarity-based clustering techniques or the Galois lattice approach (Wille 1996).
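The Galois lattice approach can be sketched on a very small example. In the Python code below, which is only an illustration under stated assumptions, each class is a pair (extent, intent) in which the extent is exactly the set of objects sharing all attributes of the intent, and the intent is exactly the set of attributes common to the extent. The toy context and the function name formal_concepts are hypothetical, and the brute-force enumeration is exponential in the number of attributes.

```python
from itertools import combinations

def formal_concepts(context):
    """Return all (extent, intent) pairs of a context mapping objects to attribute sets."""
    objects = set(context)
    attributes = set().union(*context.values())

    def extent(attrs):
        # Objects possessing every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        if not objs:
            return frozenset(attributes)
        return frozenset(set.intersection(*(set(context[o]) for o in objs)))

    concepts = set()
    for size in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            e = extent(set(attrs))
            concepts.add((e, intent(e)))  # closing the attribute set yields a concept
    return concepts

context = {"duck": {"flies", "swims"}, "eagle": {"flies"}, "trout": {"swims"}}
for ext, inten in sorted(formal_concepts(context), key=lambda c: -len(c[0])):
    print(sorted(ext), sorted(inten))
```

Here the classes are discovered a posteriori from the descriptions of the objects, and they are ordered by inclusion of their extents into a lattice rather than a tree: "duck" sits below both the flying and the swimming class.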

When there is an unrelated collection of sets, which is the case in artifact-based software classification, one issue is whether to compare and organize these sets simply by inclusion or to apply conceptual clustering techniques. However, most object-oriented languages are concerned with hierarchies, whose structure may be a tree, a lattice, or any partial order. The reason is that such structures reflect the variety of languages, some of them admitting multiple inheritance (C++, Eiffel), others only single inheritance (Smalltalk). Java has a special policy on this point: it admits two kinds of concepts, classes and interfaces, with single inheritance for classes and multiple inheritance for interfaces.

The viewpoint of Aristotle was the following: the division must be exhaustive, with mutually exclusive parts, and an indirect consequence of Aristotle’s principles is that only the leaves of the hierarchy should have instances. Furthermore, the divisions must be based on a common concern, whose modern name is the ‘discriminator’ in the Unified Modeling Language (UML). But usual programming practices do not necessarily satisfy those principles. Multiple inheritance, for example, contradicts the assumption of mutually exclusive parts, and instances may in general be created directly from all (non-abstract) classes. Direct subclasses of a class can be derived according to different needs with different discriminators, but there is no evidence that this approach leads to relevant classifications. Object-oriented approaches, which transgress Aristotelian principles, are almost always practical storage modes but do not satisfy the main requisites of good classifications.
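Python, like C++ and Eiffel, admits multiple inheritance, so the two departures from Aristotelian principles described above can be shown in a few lines. The class names below are hypothetical and serve only as an illustration.

```python
from abc import ABC, abstractmethod

class Vehicle(ABC):
    """Abstract root of the hierarchy: it cannot have direct instances."""
    @abstractmethod
    def medium(self): ...

class LandVehicle(Vehicle):
    def medium(self):
        return "land"

class WaterVehicle(Vehicle):
    def medium(self):
        return "water"

class AmphibiousVehicle(LandVehicle, WaterVehicle):
    """Multiple inheritance: the two parts of the division are no longer exclusive."""
    def medium(self):
        return "land and water"

class SportsCar(LandVehicle):
    """A subclass of LandVehicle, so LandVehicle is not a leaf of the hierarchy."""

amphibious = AmphibiousVehicle()
print(isinstance(amphibious, LandVehicle), isinstance(amphibious, WaterVehicle))

# A non-leaf, non-abstract class can still be instantiated directly.
print(LandVehicle().medium())
```

The first output shows one object falling under both parts of the division of Vehicle; the second shows an instance attached to an inner node of the hierarchy rather than to a leaf.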

Some main principles that yield good classifications can be described from the intensional perspective. We first give, following Apostel (1963), some basic definitions.

From an intensional viewpoint, a division (or partition) is a closed formula F, which contains some assertion of the type (P ⊃ (Q1 ∨ Q2 ∨…∨ Qn)). So, a classification is a sequence of implicative-disjunctive propositions which takes the following form: everything which has the property P has also one of the n properties Q1 … Qn. Everything which has the property Qr  has also the property S, and so on (Apostel 1963, 188).

A division is essential if the individuals having the property P, and only these individuals, may also have one of the properties Qi. So, we can see that there are degrees of essentiality, according to whether the number of individuals having the Q’s without having the P’s is greater or smaller. At every level, a classification may be probably or necessarily essential, or exhaustive, or exclusive.
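These definitions can be tested on a finite set of described individuals. The Python sketch below is one possible reading of them, not Apostel's own formalism: a division is taken to be exhaustive if every individual with P has some Qi, exclusive if it has at most one, and essential if no individual has a Qi without having P. The function name check_division and the toy data are hypothetical.

```python
def check_division(individuals, P, Qs):
    """individuals maps each name to the set of properties it possesses."""
    Qs = set(Qs)
    have_P = {x for x, props in individuals.items() if P in props}
    have_some_Q = {x for x, props in individuals.items() if props & Qs}
    return {
        # P implies (Q1 or ... or Qn): every P-individual falls in some class.
        "exhaustive": have_P <= have_some_Q,
        # No P-individual falls in two classes at once.
        "exclusive": all(len(individuals[x] & Qs) <= 1 for x in have_P),
        # Only P-individuals have one of the Q's.
        "essential": have_some_Q <= have_P,
    }

individuals = {
    "oak": {"plant", "woody"},
    "fern": {"plant", "herbaceous"},
    "table": {"woody"},  # has a Q without having P
}
result = check_division(individuals, "plant", ["woody", "herbaceous"])
print(result)
```

In this toy domain the division of plants into woody and herbaceous ones is exhaustive and exclusive but not essential, because "table" is woody without being a plant; removing it makes the division essential.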

We call the intensional weight w(P) of a property P the set of disjunctions implied by this property (with necessity, factuality or probability). Properties defining classes at the same level may have extremely variable intensional weights. The basis of a division is the constant relation R, if any, between the properties of two different classes of this division.

A basis of division is (partially or totally) exhausted in some level insofar as, for this level, we do not find, in any case, true disjunctive propositions that are implied by the properties of this level and whose terms are connected by this very relation R.

A division is said to follow another one immediately (or to be immediately subsequent) if, for all P properties of the first, and for all Q properties of the second that are disjunctively implied by the P’s, there exists no sequence of R properties disjunctively implied by the P’s and disjunctively implying the Q’s.

The form of a property defining a class is the logical form of this property (conjunction of properties, disjunction of properties, negation of properties, single property).

For Apostel, an optimal classification should satisfy the following requisites:

  1. Every level needs a basis for division;
  2. No new basis for division shall be introduced before the previous one is exhausted;
  3. Every division is essential;
  4. Intensional weights of classes in a given level are comparable and relations between intensional weights of subsequent division properties in the classification must be constant;
  5. Properties used to define classes are conjunctive ones, and not negative ones;
  6. From the intensional viewpoint, divisions must be immediately subsequent.

In real domains, these requirements, or some of them, fail to hold. Levels are often extensionally equivalent but intensionally, the basis of division, the intensional weight, and so forth may change or not.

A natural classification is such that the definition of the domain classified determines in one and the same way the choice of the criteria of classification. It means that the fundamental set may be divided such that the division in the first level of the classification is an essential and subsequent one.

Intensional and extensional classifications are intimately related. Gathering entities in sets to produce extensional classes implies tagging these entities by their membership of these classes. But intensional classes, built according to these descriptions, have an extension, which may be different from the initial extensional classes. So, in fact, the two perspectives are not totally isomorphic, and from Peirce (Hulswit 1997) to Quine (1969) and up to the present, the question of natural classes has remained an open and somewhat controversial one.

6. The Idea of a General Theory of Classifications

The idea of a general theory of classifications is not new. Such a project was anticipated by Kant’s logic at the end of the 18th century. It was followed by many attempts to classify the sciences at the beginning of the 19th century (Kedrov 1977), and it was posed by Auguste Comte in his Cours de philosophie positive (Comte 1975) as a general theory based on the study of symmetries in nature. Comte was inspired by the mathematician Gaspard Monge and his classification of surfaces in geometry. However, in the work of Comte this remains wishful thinking. In the same way, the French naturalist Augustin-Pyramus de Candolle published in 1813 an Elementary Theory of Botany, a book in which he introduced the term ‘taxonomia’, used there for the first time (de Candolle 1813). De Candolle showed that botany had to abandon artificial methods for natural ones, in order to obtain a method independent of the nature of the objects. Unfortunately, nothing very concrete or precise followed his remarks. Moreover, the previous projects were concerned only with finite classifications, particularly biological ones. A higher and more general view came to light in the 1960s with the Belgian logician Leo Apostel. Apostel (1963) wanted to write a concrete version of set theory and, in order to do that, needed axioms that would allow him to include in the theory only the classes actually existing in the world. As such, Apostel was led to question some of the well-known axioms of Zermelo-Fraenkel set theory. He did not reject the whole ZF axiomatics, but he was suspicious of axioms such as the pairing axiom, the axiom of separation and the power set axiom. He also left the axiom of infinity optional and had a rather negative opinion of the axiom of choice. This project was revived in the recent book of Parrochia-Neuville (2013).

The difficulty of solving the problem of the instability of classifications has motivated the search for clear composition laws defined on the set of classifications over a set and, if possible, for a true algebra of classifications. This is very difficult, because such an algebra would have to be, in principle, commutative and non-associative. The search is all the more crucial since a theorem proved by Kleinberg (2002) shows that one cannot hope to find a classifying function that is at once scale-invariant, rich enough and consistent. This result means that we cannot find stable empirical classifications by using traditional clustering methods.

In the past, some attempts have been made to formalize non-commutative parenthesized products: Comtet (1970) and, in the 1980s, Neuville used Lukasiewicz’s Reverse Polish Notation (RPN), also called postfix notation, whose advantage is not only to make brackets or parentheses superfluous, but also to perform calculations on trees in the required order. But a general algebra of classifications on a set is not known, even if some new models are good candidates (for example, Loday’s dendriform algebras, which work very well for trees; see Dzhumadil’daev-Löfwall 2002). In any event, we are invited to look for it, for two reasons. First, the world is not completely chaotic and our knowledge evolves according to some laws. Second, there exist quasi-invariant classifications in physics (the classification of elementary particles), chemistry (Mendeleev’s table of the elements), and crystallography (the 230 crystallographic space groups), among others. Most of these good classifications are based on some mathematical structures (Lie groups, discrete groups, and so forth). To address questions concerning classification theory, and to clarify its different domains, one may propose this final view (See Figure 6):

  • When our mathematical tools apply only to sense data, we get phenomenal classifications (by clustering methods): these are generally quite unstable.
  • When our mathematical tools deal with crystallographic or quantum structures, we get what we call, using a Kantian concept, noumenal classifications (for instance, by invariance of discrete groups or Lie groups). These are generally more stable.
  • When we search for a general theory of classifications (including infinite ones), we are in the domain of pure mathematics. In this field, ordering and articulating the infinite set of classifications amounts to constructing the continuum.

Figure 6: Metaclassification

This problem is far from being solved, because there are many unstable theories (Shelah 1978, 1988). However, the recent work of Parrochia-Neuville (2013) puts forward the conjecture that a metaclassification, that is, a classification of all mathematical schemes of classifications, does exist. The reason is that all these forms may be expressed as ellipsoids of an n-dimensional space (Jambu 1983) that must necessarily converge on a point, the index of the classification. If a proof is found, it will give a theorem of existence for such a structure, from which a number of important results could follow.

7. References and Further Readings

  • Adanson, M. 1757. Histoire naturelle du Sénégal. Paris: Claude-Jean-Baptiste Bauche.
  • Aho, A.V., Hopcroft, J.E., Ullman, J.D. 1983. Data Structures and Algorithms. Reading (Mass.): Addison-Wesley Publishing Company.
  • Agassiz, L. 1962. Essay on Classification (1857), reprint. Cambridge: Harvard University Press.
  • Apostel, L. 1963. Le problème formel des classifications empiriques. La Classification dans les Sciences. Gembloux: Duculot.
  • Aristotle, 1984. The Complete Works. Princeton: Princeton University Press.
  • Barbut M., Monjardet, B. 1970. Ordre et classifications, 2 vol. Paris: Hachette.
  • Barthélemy, J.-P., A. Guénoche. 1988. Les arbres et les représentations des proximités. Paris: Masson.
  • Barwise, J., Seligman, J. 2003. The logic of distributed systems. Cambridge: Cambridge University Press.
  • Béthery, A. 1982. Abrégé de la classification décimale de Dewey. Paris: Cercle de la librairie.
  • Bliss, H. E. 1929. The organization of knowledge and the system of the sciences. New York: H. Holt and Company.
  • Benzécri, J.-P., et alii. 1973. L’analyse des données, 1, La taxinomie, 2 Correspondances. Paris: Dunod.
  • Birkhoff, G. 1935. On the structure of abstract algebras. Proc. Camb. Philos. Soc. 31, 433-454.
  • Birkhoff, G. 1967. Lattice theory (1940), 3rd ed. Providence: A.M.S.
  • Brucker F., Barthélemy, J.-P. 2007. Eléments de Classification, aspects combinatoires et algorithmiques. Paris: Hermès-Lavoisier.
  • Buffon, G. L. Leclerc de, 1749. Histoire naturelle générale et particulière (vol. 1). Paris: Imprimerie royale.
  • Candolle (de), A. P. 1813. Théorie élémentaire de la Botanique ou exposition des principes de la classification naturelle et de l’art d’écrire et d’étudier les végétaux, first edition. Paris: Deterville.
  • Comte, A. 1975. Philosophie Première, Cours de Philosophie Positive (1830), Leçons 1-45. Paris: Hermann.
  • Comtet, L. 1970. Analyse combinatoire. Paris: P.U.F.
  • Dagognet, F. 2002. Tableaux et Langages de la Chimie (1967). Seyssel: Champ Vallon.
  • Dagognet, F. 1970. Le Catalogue de la Vie. Paris: P.U.F.
  • Dagognet, F. 1984. Le Nombre et le lieu. Paris: Vrin.
  • Dagognet, F. 1990. Corps réfléchis. Paris: Odile Jacob.
  • Dahlberg, I., 1976. Classification theory, yesterday and today. International Classification 3 n°2, pp. 85-90.
  • Dale, N., Walker, H. M. 1996. Abstract Data Types: Specifications, Implementations, and Applications. Lexington, Massachusetts: D.C. Heath and Company.
  • Darwin, C.R., 1964. On the Origin of Species (1859), reprint. Cambridge: Harvard University Press.
  • De Grolier, E. 1962. Etude sur les catégories générales applicables aux classifications documentaires, Unesco.
  • Dobrowolski, Z. 1964. Etude sur la construction des systèmes de classification. Paris, Gauthier-Villars.
  • Dubreil, P., Jacotin, M.-L. 1939. Théorie algébrique des relations d’équivalence. J. Math. 18, pp. 63-95.
  • Dzhumadil’daev, A., Löfwall, C. 2002. Trees, free right-symmetric algebras, free Novikov algebras and identities. Homology, Homotopy and Applications, vol. 4(2), pp. 165-190.
  • Foucault, M. 1967. Les Mots et les Choses. Paris: Gallimard.
  • Gondran, M. 1976. La structure algébrique des classifications hiérarchiques. Annales de l’Insee, pp. 22-23.
  • Granger, G.-G. 1980. Pensée formelle et Science de l’Homme (1967). Paris: Aubier-Montaigne.
  • Hilman, D.J. 1965. Mathematical classification techniques for non static document collections, with particular reference to the problem of relevance. Classification Research, Elsinore Conference Proceedings, Munksgaard, Copenhagen, pp. 177-209.
  • Huchard, M., Godin, R., Napoli, A. 2003. Objects and Classification. ECOOP 2000 Workshop reader, J. Malenfant, S. Moisan, A. Moreira (Eds), LNCS 1964. Berlin-Heidelberg-New York: Springer-Verlag, pp. 123-137.
  • Hulswit, M. 1997. Peirce’s Teleological Approach to Natural Classes. Transactions of the Charles S. Peirce Society, pp. 722-772.
  • Jambu, M. 1983. Classification automatique pour l’analyse des données, 2 vol. Paris: Dunod.
  • Jardine N., Sibson, R. 1971. Mathematical Taxonomy. New York: Wiley.
  • Joly, R. 1956. Le thème philosophique des genres de vie dans l’Antiquité grecque. Bruxelles: Mémoires de l’Académie royale de Belgique, classe des Lettres et des Sciences mor. et pol., tome Ll, fasc. 3.
  • Kant, E. 1988. Logic. New York: Dover Publications.
  • Kedrov, B. 1977. La Classification des Sciences (vol. 2). Moscou: Editions du Progrès.
  • Kleinberg, J. 2002. An impossibility theorem for Clustering. Advances in Neural Information Processing Systems (NIPS), 15, pp. 463-470.
  • Krasner M. 1953-1954. Espaces ultramétriques et ultramatroïdes. Paris: Séminaire, Faculté des Sciences de Paris.
  • Larson, J.A., Walden, W.E. 1979. Comparing insertion schemes used to update 3-2 trees. Information Systems, vol. 4, pp. 127-136.
  • Lerman, I.C. 1970. Les bases de la classification automatique. Paris: Gauthier-Villars.
  • Lerman, I.C. 1981. Classification et analyse ordinale des données. Paris: Dunod.
  • Lidi R., 2004. Abstract Algebra. Berlin-Heidelberg-New York: Springer-Verlag.
  • Luszczewska-Romahnowa S., Batog T. 1965a. A generalized classification theory I. Stud. Log., tom XVI, pp. 53-70.
  • Luszczewska-Romahnowa S., Batog T. 1965b. A generalized classification theory II. Stud. Log., tom XVII, pp. 7-30.
  • MacNeille, H. M. 1937. Partially ordered sets. Transactions Amer. Math. Soc., vol. 42, pp. 416-460.
  • Ore O. 1942. Theory of equivalence relations. Duke Math. J. 9, pp. 573-627.
  • Ore O. 1943. Some studies on closure relations. Duke Math. J. 10, pp. 761-785.
  • Parrochia, D., Neuville, P. 2013. Towards a general theory of classifications. Basel: Birkhäuser.
  • Peirce C. S. 1880. On the Algebra of Logic. American Journal of Mathematics 3, pp. 15-57.
  • Pierce, R.S. 1970. Classification problems. Mathematical System theory, vol. 4, n°1, March, pp. 65-80.
  • Plato, 1997. The Complete Works. Cambridge: Hackett Publishing Company.
  • Porphyry, 2014. On Aristotle’s Categories. London, New York: Bloomsbury Publishing Plc.
  • Quine, W.V.O. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Ranganathan, S. R. 1933. Colon Classification. Madras: Madras Library Association.
  • Ranganathan, S. R. 2006. Prolegomena to Library Classification (1937), Reprint. New Delhi: Ess Pub.
  • Rasiowa H., Sikorski, R. 1970. The Mathematics of Metamathematics. Cracovia: Drukarnia Uniwersytetu Jagiellonskiego.
  • Riordan, J. 1958. Introduction to combinatorial analysis. New York: Wiley.
  • Roux, M. 1985. Algorithmes de classification. Paris: Masson.
  • Shelah, S. 1988. Classification Theory (1978). Amsterdam: North Holland.
  • Schröder, E. 1890. Vier Kombinatorische Probleme. Z. Math. Phys. 15, pp. 361-376.
  • Smith, B. 1997. Boundaries: An Essay in Mereotopology. L. Hahn (ed.), The Philosophy of Roderick Chisholm. La Salle, Open Court: Library of Living Philosophers, pp. 534-561.
  • Smith, B. 2003. Groups, sets and wholes. Rivista di estetica, NS (P. Bozzi Festschrift), 24-3, 1209-130.
  • Sokal R. R., Sneath, P.H. 1963. Principles of numerical taxonomy. San Francisco: W. H. Freeman.
  • Sokal, R. R., and Sneath, P. H. 1973. Numerical Taxonomy, the principles and practice of numerical classifications. San Francisco: W. H. Freeman.
  • Van Cutsem B. (ed.) 1994. Classification and dissimilarity analysis. New York-Berlin-Heidelberg: Springer Verlag.
  • Veglioni, S. 1996. Classifications in Algebraic specifications of Abstract Data Types. CiteSeerX
  • Winsor, M. P. 2009. Taxonomy was the foundation of Darwin’s evolution. Taxon 58, 1, pp. 43-49.
  • Wille, R. 1996. Restructuring lattice theory: an approach based on hierarchy of concepts. Rival, I (ed.) Ordered Sets. Boston: Reidel, pp. 445-470.
  • Woodward, H. 1903. Memorial to Henry Alleyne Nicholson. M.D., D.Sc., F.R.S. Geological Magazine, 10, pp. 451-452.

 

Author Information

Daniel Parrochia
Email: daniel.parrochia@wanadoo.fr
Université Jean Moulin – Lyon III
France

The Aim of Belief

It is often said that belief has an aim. This aim has been traditionally identified with truth and, since the late 1990s, with knowledge. With this claim, philosophers designate a feature of belief according to which believing a proposition carries with it some sort of commitment or teleological directedness toward the truth (or knowledge) of that proposition. This feature is taken to be constitutive of belief (that is, it is part of what a belief is that it is an attitude having this aim) and individuative of that type of mental state (that is, it is sufficient for distinguishing beliefs from other types of mental attitude like desire and imagining). Philosophers appeal to belief’s aim mainly for explanatory purposes: the aim is supposed to explain a number of other features of belief, such as the impossibility of believing at will, the infelicity of asserting Moorean sentences (for example, “I believe that it is raining, but it is not raining”), and the normative force of evidential considerations in the processes of belief-formation and revision.

Though many tend to agree on the above aspects of the aim, there are major disagreements over two further issues: (1) how to interpret the claim that belief has an aim, and (2) what this aim is. With respect to (1), the claim has received very different interpretations. Some have interpreted it literally, taking the aim as an intentional purpose of believers or a functional goal of beliefs; others have interpreted it metaphorically, as some kind of commitment or norm governing beliefs and their regulation (formation, maintenance, and revision); still others deny that beliefs aim at truth in a substantive sense and endorse minimalist accounts of belief’s truth-directedness. With respect to (2), there is an ongoing debate on whether the aim of belief is truth, knowledge, or some other condition such as epistemic justification.

Table of Contents

  1. The Truth-Directedness of Belief
    1. The Aim as Constitutive and Individuative of Belief
    2. Differences between the Aim and Other Properties of Belief
    3. The Explanatory Role of the Aim
  2. Interpretations of the Aim
    1. Teleological Interpretations
    2. Normative Interpretations
    3. Minimalist Interpretations
  3. What Does Belief Aim At?
  4. Relevance of the Topic
  5. References and Further Reading

1. The Truth-Directedness of Belief

The expression “belief aims at truth” was coined by Bernard Williams (1973) to designate a set of properties of beliefs, namely (1) that truth and falsehood are dimensions of assessment of beliefs as opposed to other psychological states and dispositions; (2) that to believe that p is to believe that p is true; and (3) that to say “I believe that p” carries, in general, a claim that p is true; that is, it is a qualified way of asserting that p is true (Williams, 1973, p. 137).

Since Williams, many have taken up the claim that belief aims at truth. However, with such an expression, these philosophers do not refer to a set of properties as Williams did, but to a unique feature of belief (sometimes also called truth-directedness). This feature (that is, aiming at truth) is supposed to capture the specific relation of belief with truth. This relation seems to be peculiar to belief, and to play an important role in the characterization of this type of attitude. No other attitude seems to entertain such a special relation with truth. As with belief, the content of attitudes like (propositional) desires, imaginings, and mere thoughts can be true or false. But unlike these attitudes, beliefs are considered defective if their content is false, and correct if it is true: if I imagine that snow is black, there is nothing defective in my imagination; but if I believe that snow is black, there is something wrong with my belief. Also, we can arbitrarily decide to form or revise attitudes like imagining and assuming regardless of whether we take their contents to be true or false, but this seems not to be possible for beliefs. In short, these attitudes are not sensitive to truth-regarding considerations in the way beliefs are (in both normative and descriptive ways). The relation of belief with truth also differs from that of factive attitudes like knowledge and regret. Unlike beliefs, these attitudes imply the truth of their content. If I know that it is raining now in Paris, then it is true that it is raining now in Paris. But if I believe that, the content of my belief may be false. The relation of belief to truth is thus neither as weak as that of other attitudes like imagining, nor as strong as that of knowledge. This is why it is often conceived as an aim or a commitment toward the truth (or knowledge) of the believed proposition: beliefs may fail to be true (that is, fail to achieve that aim), but they cannot fail to aim at truth.

That granted, a further question is how to interpret the claim that beliefs aim at truth. Philosophers conceive of truth-directedness in very different ways: as an intentional aim of the believer to accept a proposition if and only if it is true; as a function regulating our cognitive processes; as a norm requiring one to believe a proposition only if true; as a value attached to believing truly. In this section I remain neutral on the specific interpretations of the aim, postponing a discussion of these interpretations to §2. The objective of the present section is to introduce some properties commonly attributed to truth-directedness, independent of its specific interpretation. For ease of exposition, it will also be assumed that truth is the aim of belief until §3, where alternative candidates are considered.

Section 1.a introduces two properties commonly attributed to truth-directedness: (1) that it is a constitutive or essential feature of belief, and (2) that it is individuative of belief with respect to other mental attitudes. Section 1.b considers the differences between truth-directedness and other truth-related properties of belief such as the direction of fit and the value of having true beliefs. The truth-aim is usually attributed to belief in order to explain a number of characteristics of this attitude concerning its relation with truth. Section 1.c lists the main features that truth-directedness is supposed to explain.

a. The Aim as Constitutive and Individuative of Belief 

When philosophers attribute an aim to belief, they conceive of this property as constitutive of this type of attitude. This means, roughly, that it is part of what a belief is (that is, part of the essence or the concept of belief) that it is a mental attitude directed at the truth. Let us label this the constitutivity thesis. Depending on how we conceive truth-directedness, there will be different ways of working up to this thesis. If, for example, we interpret truth-directedness as a goal of the agent (compare §2.a), we can conceive of beliefs as analogous to acts like concealing (Steglich-Petersen, 2006, p. 512). Part of what it is to conceal an object X is that it is a type of act involving the goal that someone will not find X. It is in virtue of this goal that an action counts as an instance of concealing. Similarly, a way of stating the constitutivity thesis for belief is that it is part of what S’s believing that p is that S has an aim or goal (or that it is a function of S’s cognitive system) to retain that attitude only if it is true. It is in virtue of this aim of the agent who believes (or this function of her cognitive system) that that attitude counts as belief.

Alternatively, if one interprets truth-directedness as a norm to believe only the truth (compare §2.b), the constitutivity thesis amounts to understanding this norm by analogy to rules constitutive of practices like games (Wedgwood, 2002, p. 268). A practice is constituted by a set of rules if and only if it is part of what that practice is that this set of rules is in force for agents engaged in that practice (Glüer & Pagin, 1998). Consider a specific example: chess is a game constituted by a set of rules stating which moves are legal or permissible in the game. If one plays chess, one is thereby committed by the rules of the game to perform only legal moves. The performance of a particular act does not count as a chess-move if it cannot be assessed (justified, criticized…) according to the constitutive rules of the game. Similarly, if it is part of what a belief is that it is an attitude governed by a norm to believe only the truth, a mental attitude does not count as a belief if it cannot be assessed (criticized, justified…) on the basis of this norm, as right or correct if true and wrong or incorrect if false. One can also conceive of the constitutivity thesis by analogy to other types of entity essentially constituted by norms or values. For example, it is constitutive of what it is to be a citizen to be subject to certain rights and commitments, and it is constitutive of murder to be an act of killing in a wicked, inhumane, or barbarous way (for the latter example, see Dretske, 2000, pp. 243-245).

The claim that truth-directedness is constitutive of belief can be conceived of in at least two ways, as relative to the concept of belief or to its nature. According to the conceptual interpretation, it is a condition of understanding the concept of belief that we conceive of beliefs as mental attitudes directed toward truth (Boghossian, 2003; Engel, 2004; Shah, 2003). A proper understanding of the concept of bachelor implies conceiving of a bachelor as an unmarried man. Analogously, if one has a correct grasp of the concept of belief and conceives of a mental attitude as a belief, she understands it as one that, in some sense to be specified, is directed toward truth.

Other philosophers consider truth-directedness as constitutive of the nature or essence of belief (Brandom, 2001; Railton, 1994; Velleman, 2000a; Wedgwood, 2002, 2007). The relation between belief and truth-directedness is here conceived of as one of metaphysical dependence of the former on the latter: as it is essential to water that it has a certain chemical composition (H2O), it is essential to belief that it is an attitude involving a commitment to or an aim at truth. A mental attitude counts as a belief at least partially in virtue of aiming at the truth. It is simply impossible for an attitude to be a belief if it lacks this property.

It is usually held that the essentialist interpretation of the thesis does not entail the conceptual one (for example, Wedgwood, 2007, ch. 6). It is part of the essence of water, but not of its concept, that water is H2O—we can understand the concept of water without conceiving water as having that specific chemical composition. Similarly the truth-aim may be constitutive of the essence of belief but not of its concept (see Zangwill, 2005 for a similar view). Also, some philosophers have argued that the conceptual interpretation does not entail the essentialist one (Papineau, 2013; Shah, 2003, fn. 41; Shah & Velleman, 2005, fn. 43; Wedgwood, 2007).

The second property commonly attributed to truth-directedness is the individuativity of belief: the aim is the feature that individuates belief as that type of mental state and distinguishes beliefs from other mental attitudes (Engel, 2004; Lynch, 2009; Railton, 1994; Velleman, 2000a; Wedgwood, 2002). Though many other attitudes entertain relations with truth (compare §1.b), it is claimed that belief is the only attitude aiming at truth. The truth-aim plays a fundamental role in sorting out beliefs from other mental attitudes, being the distinctive feature of beliefs with respect to other types of attitude like thoughts, suppositions, desires, and imaginings.

Philosophers usually appeal to the individuativity of truth-directedness for belief for two main reasons: (1) singling out the aim as a peculiarly distinctive property of belief helps to achieve a better grasp of what truth-directedness is and to distinguish this property from other properties of belief (a philosopher who assumes individuativity in order to define truth-directedness is Velleman, 2000a, pp. 247-252); and (2) individuativity provides an argument to the best explanation for the claim that belief aims at truth: as the argument goes, without assuming that belief’s truth-directedness has this peculiar individuative role, one cannot account for the difference between beliefs and other attitudes (Engel, 2004; Railton, 1994).

It has also been suggested that if truth-directedness is the distinctive feature of belief with respect to other mental attitudes, this would provide an argument for the claim that this property is also constitutive of belief (Lynch, 2009b, p. 81; McHugh & Whiting, 2014; Velleman, 2000a; Wedgwood, 2002). Here is a way in which this argument may proceed: if the truth-aim were not a necessary and constitutive feature of belief, it would be possible for a belief not to aim at truth. But then, assuming that the aim is the only feature distinguishing beliefs from other mental attitudes, it would be impossible to classify that attitude as a belief rather than as a different type of attitude. Thus, the truth-aim must be a feature that beliefs possess necessarily and essentially. The argument from individuativity is not the only one supporting the constitutivity of truth-directedness for belief. Since other arguments partially depend on normativist interpretations of the aim, they will be considered in §2.b.

A number of critics have pointed out that it is possible to distinguish beliefs from other types of attitude without stipulating that belief involves a constitutive aim at truth. These philosophers identify the attitude of believing a proposition with that of merely holding it true or accepting it (Glüer & Wikforss, 2013; Vahid, 2009), or they take other dispositional or motivational properties of belief as distinctive of this type of attitude. For a discussion of some of these views see §2.c.

b. Differences between the Aim and Other Properties of Belief

According to many philosophers engaged in the present debate (in particular those endorsing teleological and normative interpretations), truth-directedness is supposed to characterize and distinguish belief from other types of mental attitude. This property is conceived of as unique to belief, not possessed by any other attitude. These philosophers are careful to distinguish it from other properties relating belief to truth that other attitudes also possess. In this subsection I will introduce some of these properties and explain in which respects they are supposed to differ from the aim of belief. Mentioning these other properties will provide a rough idea of what truth-directedness is not. However, before considering these properties, it is worth mentioning that some philosophers endorsing minimalist conceptions of truth-directedness tend to identify the aim with some of these properties; these alternative interpretations of the aim will be briefly mentioned in this subsection and considered in more detail in §2.c.

An obvious truth-related feature of belief is the fact that believing something is believing it to be true (Velleman, 2000a). In other words, beliefs have propositions as content, and propositions can be true or false. This property is obviously not individuative of belief, and thus cannot be identified with truth-directedness. All propositional attitudes share it with beliefs. For instance, believing that p is believing that p is true, hoping that p is hoping that p is true, imagining that p is imagining that p is true, and so on (Engel, 2004; Velleman, 2000a).

It is also commonly held that beliefs involve specific causal, functional, and dispositional-motivational roles with respect to action and behavior. Some of these roles determine another aspect under which beliefs are related to truth. Using Ramsey’s (1931) metaphor, beliefs are like maps by which we steer in the world and upon which we are disposed to act. Belief is an attitude involving dispositions to act and behave as if its content were true and to use it as premise in reasoning (Armstrong, 1973; Stalnaker, 1984). Some have argued that belief’s aim at truth can be identified with the possession of similar dispositional and functional properties. In response to this challenge, it has been argued that these properties are not sufficient to set belief apart from other mental attitudes, and thus to capture the distinctive relationship between belief and truth (Engel, 2004; Velleman, 2000a). Other types of attitude seem to possess these very same properties. For instance, attitudes like acceptance and pretense all seem to dispose the subject to act as if their content were true and have the same motivational role.

Another property commonly attributed to belief, and concerning the way it is related to truth, is its mind-to-world direction of fit. On the one hand, some attitudes, like desires, have a world-to-mind direction of fit: if what is desired is not the case, the world should be changed in order to fit what is desired, and not vice versa. On the other hand, other attitudes, like beliefs, have a mind-to-world direction of fit: if what is believed is not the case (that is, it does not fit what it is supposed to represent), the belief’s contents should be revised to fit the world, and not vice versa. This is only one way of fleshing out the distinction (see Frost, 2014 and Humberstone, 1992 for overviews of the distinction). Another popular way is to distinguish between cognitive and conative states, where cognitive states are such that the proposition in their content is regarded as something that is true, while conative states are such that they involve regarding the proposition in the content as something to be made true (Velleman, 2000a). It is difficult to evaluate the relation of the truth-directedness of belief with direction of fit, since this depends on which account of direction of fit one accepts, and there is no unique and undisputed account. Some philosophers seem to identify belief’s direction of fit with its aim at truth (Humberstone, 1992; Platts, 1979). Others (Engel, 2004; Shah & Velleman, 2005; Velleman, 2000a) distinguish the two features, arguing that other mental attitudes such as suppositions, assumptions, and imagining possess the same direction of fit as beliefs, and thus this property cannot be identified with truth-directedness, which is distinctive of beliefs. Notice that the persuasiveness of this argument depends on whether one endorses an account of direction of fit according to which other attitudes would have the same direction of fit as belief.

It is also important to distinguish the truth-directedness of belief from the value of possessing true beliefs. It has been argued that having true beliefs is something valuable (David, 2005; Horwich, 2006; Kvanvig, 2003; Lynch, 2004). We naturally prefer to have true rather than false beliefs, and tend to attribute some sort of value to true beliefs and disvalue to false ones. It seems to be a platitude that true beliefs are at least extrinsically and instrumentally valuable. For example, we might prefer true beliefs to false ones because the former are more conducive to the satisfaction of one’s desires and the avoidance of dangers. Some philosophers have argued that true beliefs also have epistemic value. For example, it has been argued that believing the truth is an intrinsically valuable cognitive success. Though one might expect there to be important connections between the two topics, the issue of whether true beliefs are valuable must be distinguished from the further issue of whether truth is the aim of belief. While the former is a matter of aims, goals, and evaluations extrinsic to the notion of belief (for example, the goal of believing truths and not believing falsehoods), the latter concerns a property intrinsic to and constitutive of such a mental state (Vahid, 2006, 2009, p. 19). Another respect in which the two features must be distinguished is that the value of true beliefs is hardly individuative of beliefs: other types of mental state such as guesses, hypotheses, and conjectures are evaluable according to their being true or false. In spite of these important differences, some philosophers have suggested that the value of true beliefs can be at least in part related to and explained by the constitutive aim of belief, even if not identified with it (Engel, 2004; Lynch, 2004; Railton, 1994; Williams, 2002).

c. The Explanatory Role of the Aim

The hypothesis that beliefs involve an aim at truth has been used to explain a number of features specific to this mental attitude. Before considering such features, it is important to stress that not everyone who endorses some version of this hypothesis thinks that it can explain all of these features. The main features supposed to be explained by truth-directedness are the following:

  • The difficulty or impossibility of believing at will,
  • The infelicity of asserting Moorean sentences and the absurdity of having Moorean beliefs,
  • The normativity of mental content,
  • The motivational force of evidential considerations in deliberative contexts,
  • The nature of epistemic normativity and the norms governing belief and theoretical reasoning, and
  • The correctness standard of belief.

(1) As famously argued by Williams (1973), belief’s truth-aim would enable one to explain the difficulty of believing at will (see also Velleman, 2000a). Believing a proposition p at will would entail believing it without regard to whether p is true. However, if beliefs constitutively involve aiming at truth, the only considerations relevant to forming and maintaining a belief would be those in conformity to its constitutive aim; that is, truth-relevant considerations. Believing at will would thus be either impossible or very difficult. This line of argument has been widely discussed in the literature. For critical discussions see, for example, Frankish (2007); Hieronymi (2006); Setiya (2008); and Yamada (2012).

(2) Belief’s truth-directedness could also explain the infelicity of asserting Moorean sentences and the absurdity of thinking Moorean thoughts, that is, sentences and thoughts of the form “p, but I do not believe that p” or “I believe that p, but not p” (for example, Baldwin, 2007; Littlejohn, 2010; Millar, 2009; Moran, 1997; Railton, 1994). Though these sentences are not self-contradictory, if asserted, they sound odd and infelicitous. As Moore (1942, p. 543) observes, this feature of belief-ascription seems to show that self-ascribing a belief in the first person carries with it an implied claim to the truth of the believed proposition. Similar ascriptions relative to many other mental states involve either no infelicity (there is no paradox in asserting “I assume that p but it is false that p”) or a contradiction (it is contradictory to assert “I know that p but it is false that p”). The infelicity of asserting Moorean sentences can be explained as follows: on the one hand, an assertion is an act by which the speaker commits herself to the truth of what she says; on the other hand, a belief is a mental state involving an aim at the truth of the believed proposition. We can also think of this aim as a sort of commitment (Baldwin, 2007; Millar, 2009; see §2.b for normative interpretations of the aim). The infelicity would thus be due to a conflict between the respective constitutive commitments or aims of assertion and belief. By asserting a Moorean sentence like “p and I do not believe that p,” a speaker would both endorse a commitment to the truth of p and deny such a commitment at the same time. This explanation can be easily extended to an explanation of the unreasonableness of Moorean thoughts and judgments, since a judgment, like an assertion, can be considered an act involving a commitment to the truth of what is judged.

(3) Many philosophers argue that mental content is normative (for an overview and references, see Glüer & Wikforss, 2010). This thesis is often interpreted as the claim that there are norms governing the correct use of concepts in the content of propositional mental attitudes. An example of such norms is that the concept white is correctly applied to an object x if and only if x is white. Some have suggested that the aim of belief can provide an explanation of the normativity of mental content. In particular, Velleman (2000a) has suggested that the normativity of content can be entirely reduced to the truth-directedness of belief: if there is a norm governing mental content, this norm applies only to the contents of attitudes that aim at truth; that is, to beliefs. Boghossian (2003) has provided an argument according to which the normativity of mental content would derive from that of belief. First, he argues that the truth-directedness of belief has to be conceived as a norm constitutive of the concept of belief. Second, he argues that there is a constitutive connection between the notions of content and belief: our grasp of the concept of content depends on the grasp of the concept of belief. The normativity of content would thus be inherited from the normativity of belief. This argument has been the target of several criticisms; see, in particular, Glüer & Wikforss (2009) and Miller (2008).

(4) Belief’s truth-directedness has also been invoked to explain certain aspects of doxastic deliberation (namely deliberation concerning what to believe). One such aspect is the motivational force of evidential considerations in deliberative contexts. In particular, Shah (2003) and Shah and Velleman (2005) have argued that truth-directedness can explain doxastic transparency, the phenomenon according to which, in the context of doxastic deliberation, the question whether to believe that p is invariably settled by the answer to the further question whether p is true. Roughly, the idea is that when an agent engages in deliberation whether to believe a given proposition, only evidential (truth-regarding) considerations can be treated as reasons for believing. Other types of considerations (for example, practical) have no motivational force in the deliberation. This can be explained by the hypothesis that the concept of belief is constitutively governed by a norm to believe p only if p is true, and that in doxastic deliberation, the agent deploying that concept in the question whether to believe that p is motivated by the truth-norm to form a belief only if it is true. This in turn explains why only truth-relevant considerations matter in answering the question. Other philosophers have provided similar explanations of doxastic transparency—and more generally of the central role of evidence in deliberative belief-formation processes—compatible with non-normative interpretations of the truth-aim (for example, Steglich-Petersen, 2006, §5). It is worth noting here that similar explanations of the impossibility of believing in response to non-evidential considerations can also be used to explain the impossibility of believing at will (see (1) above).

(5) Belief’s aim has also been invoked to explain the various norms governing belief and theoretical reasoning, and to shed light on the nature of epistemic normativity in general. For example, according to Velleman, belief’s truth-directedness accounts for the justificatory force of theoretical reasoning. Theoretical reasoning justifies a belief by adducing considerations that indicate it to be true (2000a, p. 246). This is the case because being true is what satisfies the aim of belief. Other philosophers have argued that belief’s aim helps to explain norms of rationality and justification governing beliefs, and, more generally, the nature of epistemic normativity (Boghossian, 2003; Millar, 2004, 2009; Shah & Velleman, 2005; Sosa, 2007; Wedgwood, 2002, 2013). A common explanation takes these norms as instrumentally conducive to the satisfaction of the constitutive truth-aim of belief. This approach to epistemic normativity is not new in the literature. Many philosophers of the past have argued that epistemic standards of justification and rationality would be derivable from the fundamental goal of believing truly and avoiding falsehoods (for an overview see Alston, 2005, chs. 1 and 2). Criticisms of this type of approach to epistemic normativity typically mirror arguments against similar approaches in the practical domain. See, for example, Berker (2013); Firth (1981); Kelly (2003); and Maitzen (1995).

The various attempts to reduce or explain epistemic normativity in terms of a fundamental aim or norm of truth governing belief are considered by some philosophers as part of a wider project directed at providing analogous accounts for other normative domains. In particular, some have argued that practical normativity can be traced back to constitutive norms of action and agency, which in turn would determine derivative norms of practical rationality and justification (Korsgaard, 1996; Shah, 2008; Velleman, 2000a; Wedgwood, 2007).

(6) According to some philosophers, the aim at truth would also explain why a belief is correct if and only if it is true, that is, the so-called correctness standard of belief (Steglich-Petersen, 2006, 2009; Velleman, 2000a). Philosophers endorsing teleological interpretations of the aim hold that the standard would be an instrumental assessment indicating the measure of success that a belief must attain in order to achieve its constitutive aim. However, this thesis is the subject of major disagreements. Philosophers giving normative interpretations of truth-directedness either identify the correctness standard with the constitutive aim of belief (Engel, 2007; Wedgwood, 2002), or argue for the independence of the two (Shah & Velleman, 2005).

2. Interpretations of the Aim

In the contemporary debate, there is wide disagreement on how to interpret the claim that belief aims at truth. There are two main interpretations of the aim: teleological and normativist. According to teleological accounts, the aim of belief is an intentional purpose of subjects holding beliefs, or a functional goal of cognitive systems regulating the formation, maintenance, and revision of beliefs. Normativist accounts hold that the claim that beliefs have an aim must be interpreted metaphorically. According to normativists, truth-directedness is better understood as a commitment, a norm governing the regulation of beliefs (their formation, maintenance, and revision). Other philosophers have endorsed minimalist accounts of truth-directedness, denying that beliefs aim at truth in a substantive sense.

a. Teleological Interpretations

Teleological (also called “teleologist”) interpretations hold that beliefs are literally directed at truth as an aim, an end, or a goal (telos in Greek). This aim would be realized in truth-conducive processes and practices of belief-regulation, whose role is the formation, maintenance, and revision of beliefs. An attitude would count as a belief only if it is formed and regulated by these processes and practices. An advantage commonly attributed to teleological interpretations is that these interpretations seem more compatible with a naturalistic account of belief than rival interpretations (in particular, normativist ones). The thought is that intentions, goals, or functions can be accounted for in naturalistic terms. Furthermore, this interpretation would naturally fit with broadly instrumentalist, naturalistically unproblematic conceptions of epistemic normativity and epistemic rationality (note however that these conceptions have been the target of many criticisms; for example, Berker, 2013; Kelly, 2003).

Teleological interpretations differ with respect to how they conceive the aim at truth. Some teleologists interpret the aim of belief as an intentional goal of the subject, like an interest to accept a proposition only if it is true. For example, according to Steglich-Petersen (2006), believing is accepting a proposition with the purpose of getting its truth-value right. According to such an interpretation, the aim is realized through deliberative practices like judgments, in which an agent accepts a proposition only if she has evidence in support of its truth, and maintains that acceptance in the absence of contrary evidence. Steglich-Petersen recognizes that many of our beliefs are regulated in entirely sub-intentional ways. However, he argues that only beliefs considered at an intentional level are connected to a literal aim:

cognitive states and processes that are not connected with any literal aim or intention of a believer can nevertheless count as ‘beliefs’ in virtue of […] being to some degree conducive to the hypothetical aim of someone intending to form a belief in the primary strong sense. (2006, p. 515)

Other philosophers have advanced sub-intentional interpretations of the aim, conceiving it as a functional goal of the attitude or the psychological system to form true beliefs and revise false ones. This function would be regulated at a sub-personal, often unconscious level. A similar approach has been defended by Bird (2007) and McHugh (2012b). Some authors also interpret certain functionalist accounts such as those of Burge (2003), Millikan (1984), and Plantinga (1993) as teleological in this sense (see, for example, McHugh, 2012a, fn. 6, 2012b, fn. 49).

The most popular interpretation of the aim is a “mixed” one, according to which truth-directedness would be constituted by both intentional and sub-intentional processes. In particular, Velleman (2000a) maintains that there is a broad spectrum of ways in which the aim can be regulated. While sometimes it is realized in the intentional aim of a subject in an act of judgment about a certain matter, at other times there are cognitive systems in charge of the regulation of belief designed to ensure the truth of such mental states. Such systems would carry out this function more or less automatically, not relying on the subject’s intentions. Other philosophers who distinguish between intentional and sub-personal levels of regulation of the aim are Millar (2004, ch. 2) and Sosa (2007, 2009).

A well-known objection to teleological accounts, provided by Owens (2003), is specifically directed at intentional and mixed interpretations of the aim (for similar objections see Kelly, 2003). Owens observes that if beliefs aim at truth as argued by teleologists, believing would be similar to guessing. Guesses are mental acts aiming at truth, in the sense that when one guesses, one strives to give the true answer to a question. As Owens writes,

a guesser intends to guess truly. The aim of a guess is to get it right: a successful guess is a true guess and a false guess is a failure as a guess. Someone who does not intend to guess truly is not really guessing. (2003, p. 290)

According to a teleologist perspective, similar considerations are valid for belief, which is a mental state held with the purpose of holding it only if true. But there are at least two important disanalogies between the intentional aim involved in guessing and the aim of belief.

First, the aim of belief does not interact with other aims of the subject the way the truth-aim of guesses does. The aim of guessing (as well as that of other goal-directed activities) can interact with other goals and objectives of the subject, it can conflict with these other goals, and it can be weighed with them. In particular, when we guess, we integrate the truth-aim constitutive of guessing with other purposes, such as the practical relevance of guessing, and we consider guessing that p reasonable when aiming at the truth by means of a guess that p would maximize expected utility (Owens, 2003, p. 292). If beliefs, like guesses, constitutively involve an aim at truth, then we should expect that, on at least some occasions, we would weigh the aim of belief against other aims. For example, when engaged in deliberation about whether to believe a given proposition, our pursuit of the truth-goal may be constrained by other goals and purposes of the subject in the usual way. But belief’s aim does not work like this. A large reward to believe that today it is not sunny gives me a reason to try to believe it, but, in deliberation about what to believe, these considerations do not interact and cannot be weighed with the truth-aim of belief in the way they do with other aims and purposes of the subject. In this respect, belief appears to be “insulated” from all but one aim, in a way that aim-directed behaviors in general are not (McHugh, 2012a, p. 430).

The second disanalogy suggested by Owens is that, in guessing, we can exercise a kind of voluntary control that is not possible in the case of belief. The guesser can compare different considerations and then decide whether to terminate her inquiry and guess. Nothing similar happens in deliberation about whether to believe a given proposition, where one cannot decide when to conclude one’s inquiry and start believing a proposition. The deliberation is concluded more or less automatically and cannot be controlled by reflection on how best to achieve the aim. Given these disanalogies, Owens concludes that while a guess is an attitude regulated by an intentional aim at truth, belief is not.

Teleologists have provided some replies to Owens’ argument. In particular, it has been argued that the aim of belief does in fact interact and can be weighed with other aims (Steglich-Petersen, 2009); it has been denied that evidential considerations play the exclusively prominent role in belief-formation suggested by Owens’ argument (McHugh, 2012a; for a similar point, though not directly related to Owens’ argument, see Frankish, 2007); and it has been argued that the direct form of control we have on the formation of guesses, but not of beliefs, can be explained by the fact that belief is a mental state, while guessing is a mental act (Shah & Velleman, 2005).

A related problem for a teleological interpretation of the aim is that sometimes we are completely indifferent to certain matters, and sometimes we even prefer (have the goal or aim) not to have any belief on certain matters. Nevertheless, evidence for these truths constitutes reasons for us to believe them, and if presented with such evidence in normal circumstances, we cannot refrain from forming beliefs on these matters. This seems to show that truth-directedness, and more generally epistemic rationality, cannot be reduced to aim-directed activities in the common sense of the term (Kelly, 2003).

Another very popular argument against teleological accounts of truth-directedness is Shah’s (2003) “teleologist’s dilemma”. The dilemma relies on the following observation: on the one hand, in practices of doxastic deliberation—deliberation directed at forming a belief about a certain matter—considerations concerning the evidence in support of the truth of a given proposition are the only ones that are relevant in order to answer the question whether to believe that proposition (this is what Shah calls the phenomenon of doxastic transparency; compare §1.c). On the other hand, some belief-formation processes can be influenced by non-evidential factors (for example, cases of wishful thinking). In an attempt to explain these two types of belief formation, the teleologist is pushed in two incompatible directions: she can consider the truth-aim as a disposition so weak as to allow cases in which beliefs are caused by non-evidential processes, in which case she cannot account for the exclusive influence of evidential considerations in deliberative contexts of belief-formation; alternatively, in order to account for the exclusive role of evidence in doxastic deliberation, she can strengthen the disposition that constitutes aiming at truth so that it excludes the influence of non-truth-regarding considerations from such kinds of reasoning—but then she cannot accommodate non-deliberative cases in which non-evidential factors influence belief-formation. In either case, the teleologist cannot explain the truth-regulation of belief in both deliberative and non-deliberative contexts. Therefore, a teleologist interpretation of the aim is not sufficient alone to provide an explanation for the truth-directedness of beliefs in all processes of belief formation.

In order to address this problem, Shah & Velleman (2005) argue that belief is regulated by two levels of truth-directedness: a sub-intentional teleological mechanism responsible for weak regulation in non-deliberative contexts, and one conceived in normative terms, able to explain the strong truth-regulation in deliberative contexts (see §2.b). For accounts of the dilemma compatible with a teleologist perspective, see, for example, Steglich-Petersen (2006) and Toribio (2013). For other objections to the teleological account, see Engel (2004) and Zalabardo (2010).

b. Normative Interpretations

Another way of interpreting belief’s truth-directedness has been through normative terms. According to normativist accounts of the aim of belief, the claim that “belief aims at truth” is just a metaphorical way of expressing the thought that beliefs are constitutively governed by a norm prescribing (or permitting) one to believe the truth (or only the truth). For example, if Mary forms the belief that it is now 12 a.m., she does what the norm requires (that is, she possesses a right belief) if that proposition is true, and she violates the norm if that proposition is false. Many normativists identify the norm of belief with a standard of correctness:

(C) a belief is correct if and only if the believed proposition is true

These philosophers take this standard to be constitutive of the essence or the concept of belief: belief would be a mental state characterized by the fact of being correct if and only if it is true (see §1.a for more details on normativist interpretations of the constitutivity thesis). This interpretation of the aim is probably the most popular in the early 21st century. It has been defended by, among others, Boghossian (2003); Engel (2004, 2013); Gibbard (2005); Millar (2004); Shah (2003); Wedgwood (2002, 2007, 2013).

Let us here clarify a common confusion about the claim that belief is constitutively governed by a norm: that a truth-norm constitutively governs belief does not mean that all beliefs necessarily satisfy that norm. What is constitutive of belief is not the satisfaction of the norm (as a matter of fact, many beliefs happen to be false, and thus incorrect), but that the norm is in force and believers and their beliefs can be assessed and criticized according to it—as correct if the belief is true, and incorrect if it is false.

One of the best known arguments for a normative interpretation of truth-directedness, suggested by Shah (2003), is the argument to the best explanation of doxastic transparency. As mentioned in §2.a, transparency is the (alleged) phenomenon according to which the deliberative question whether to believe a given proposition p is invariably settled by answering the further question whether it is true that p. The two questions are answerable to the same set of considerations; that is, considerations concerning the evidence for or against the truth of p. This phenomenon is specific to deliberative contexts in which an agent explicitly considers whether to believe a given proposition. In such contexts, only evidential (truth-relevant) considerations can influence belief-formation. In contexts in which a subject forms a belief without passing through a deliberative process, on the contrary, non-evidential considerations could influence the belief-formation.

According to Shah, only a normative interpretation of the aim of belief can explain these facts—doxastic transparency, why this phenomenon is specific to doxastic deliberation, and the exclusive role of evidential considerations in deliberative contexts. The explanation is the following: let us assume that it is constitutive of the concept of belief that a belief is correct if and only if it is true. This is interpreted as the claim that someone believing a proposition p is under a normative commitment to believe p only if it is true. When a subject engages in doxastic deliberation and asks herself whether to believe a given proposition, she deploys the concept of belief. Assuming she understands this concept and is aware of its application conditions, she interprets this question as whether to form a mental attitude that she should have only if the proposition is true. This in turn determines a disposition to be moved only by considerations relevant to the truth of p. This explains transparency and the exclusive role of evidential considerations in deliberative contexts. In contrast, in non-deliberative contexts where belief-formation works at a sub-intentional level, the subject does not explicitly consider the question whether to believe p, does not deploy the concept of belief, and is not thereby motivated by the norm to regard only truth-relevant considerations as relevant in the process of belief-formation. For this reason, non-evidential factors can influence belief-formation in these contexts. In sum, Shah’s normativist account allows him to explain both the strong role of truth in the belief regulation in deliberative contexts, and its weak role in non-deliberative ones.

An objection to Shah’s argument is that it assumes an implausibly strong form of motivational internalism according to which the norm of belief necessarily and immediately motivates the agent when she recognizes and accepts it. This contrasts with the ways in which, in general, norms tend to motivate agents (McHugh, 2013; Steglich-Petersen, 2006).

Another argument for a normativist account of truth-directedness, suggested by Wedgwood (2002), is composed of two steps. First, it is argued that the correctness standard of belief expresses a relation of strong supervenience (correctness of a belief strongly supervenes on the truth of that belief’s content). The standard thereby articulates a necessary feature of belief: necessarily, all true beliefs are correct and all false beliefs are incorrect. Second, since the standard articulates a necessary feature of belief, it is an essential feature of beliefs. Both steps of the argument have been criticized (for example, Steglich-Petersen, 2008, pp. 277-278). Against the second step, one cannot infer from a thing necessarily possessing a certain property to the property being essential to the thing it is a property of—using a well-known example of Fine (1994, pp. 4-5), one cannot infer from the necessary claim that Socrates is the only member of the singleton having as its only member Socrates to the claim that it is essential to Socrates that he is the only member of that singleton. Against the first step, it has been argued that it relies on contentious assumptions about normative supervenience: it is an error to deduce from the supervenience of a normative property N over a non-normative property G the necessity of the claim that every object having property G also has property N. The most one can conclude is that, necessarily, if some object has property N in virtue of having property G, then anything with property G also has property N (where necessity here takes a wide scope on the conditional). For similar considerations on normative relations of supervenience, see Blackburn (1993, p. 132); and Steglich-Petersen (2008).
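The scope distinction at issue in the objection to the first step can be displayed schematically. The following is only an illustrative sketch: the symbols are assumptions introduced here and are not drawn from Wedgwood or Steglich-Petersen. G stands for a non-normative property (for a belief, having a true content), N for a normative property (being correct), the box for necessity, and the reversed double arrow for "in virtue of". The display assumes the amsmath package.

```latex
% Notation assumed for illustration only:
%   G(x): x has the non-normative base property (e.g., x is a true belief)
%   N(x): x has the normative property (e.g., x is a correct belief)
%   N(x) \Leftarrow G(x): x has N in virtue of having G
\begin{align*}
\text{Contested conclusion:}\quad
  & \Box\,\forall x\,\bigl(G(x) \rightarrow N(x)\bigr) \\
\text{What supervenience licenses:}\quad
  & \Box\,\Bigl[\exists x\,\bigl(N(x) \Leftarrow G(x)\bigr)
      \rightarrow \forall y\,\bigl(G(y) \rightarrow N(y)\bigr)\Bigr]
\end{align*}
```

On this rendering, the second formula is a necessitated conditional: it leaves open worlds in which nothing has N in virtue of G, and so it does not by itself entail the first formula.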

Other arguments often used in support of the normativist interpretation do not clearly favor this interpretation over alternative substantive conceptions of truth-directedness, such as teleological ones. For example, it has been argued that unless one assumes that belief is constitutively governed by a truth-norm, one is not in a position to distinguish beliefs from other cognitive propositional attitudes, such as assuming, thinking, or imagining. The assumption that belief is constitutively governed by a truth-norm has also been used in arguments to the best explanation of a number of features of belief such as (1) the infelicity of asserting Moorean sentences; (2) the disposition to rely on a believed proposition as a reason for action and a premise in practical reasoning (Baldwin, 2007, p. 83); and (3) the relation between belief, assertion, evidence, and action (Griffiths, 1962). See §1.c for discussion of some of these arguments. These various arguments have received formulations both in normativist and teleological terms (for teleological formulations, simply replace occurrences of “truth-norm” with “truth-aim”); for this reason, they do not favor either interpretation. It is also worth mentioning that the claim that belief is constitutively normative has received indirect support from views that, for independent reasons, hold that intentional attitudes in general are constitutively normative (Brandom, 1994; Millar, 2004; Wedgwood, 2007).

Though the normativist interpretation has been the most popular in the last two decades, it has also been the target of several criticisms. According to the No Guidance Argument, a truth-norm is incapable of guiding an agent in the formation and revision of her beliefs. One can conform one’s beliefs to a norm requiring one to believe only true propositions only by first forming beliefs about whether these propositions are true. The only way to follow this norm will thus be continuing to believe what one already believes. Such a norm would not provide any guidance as to what a subject should do in order to comply with it. More precisely, this norm would have no guiding role in processes of belief regulation (formation, maintenance, and revision). Versions of this argument have been given by Glüer & Wikforss (2009, pp. 44-45); Horwich (2006, p. 354); and Mayo (1963, p. 141). A reply to this argument consists in arguing that even if the truth-norm does not provide any direct guidance, it can guide belief regulation indirectly, via some other derived principle like norms of evidence and rationality (Boghossian, 2003; Wedgwood, 2002); or it could guide in specific contexts, such as in doxastic deliberation where an agent explicitly considers her evidence for a given proposition p with the aim of making up her mind about whether p (Shah & Velleman, 2005). For a further defense of the argument see Glüer & Wikforss (2010b, 2013).

Another criticism of the normative interpretation is that, in general, an agent subject to a norm should have some form of intentional control over the actions necessary to satisfy it and be free to choose whether to conform to the norm or not. These conditions on control and freedom to comply seem to be constraints on norms in general. However, belief formation is (at least often) an involuntary process and is realized at an automatic, non-inferential level. It is thus unclear how a truth-norm governing belief can satisfy the above constraints. For versions of this objection, see Glüer & Wikforss (2009); and Steglich-Petersen (2006).

Another problem for normative interpretations of truth-directedness concerns the formulation of the alleged norm of belief. If beliefs are constitutively governed by a truth-norm, it should be possible to state this norm in terms of some duty, prescription, or permission. However, all the suggested formulations seem to be affected by some problem. Bykvist and Hattiangadi (2007), in particular, consider several possible formulations of the norm and conclude that none of them is free from problems. The best known formulations are the following:

(1) For any S, p: S ought to (believe that p) iff p

(2) For any S, p: if S ought to (believe that p), then p

(3) For any S, p: S ought to (believe that p iff p)

All these formulations are flawed in some way. (1) implies that one ought to believe every true proposition, including trivial and uninteresting ones (see also Sosa, 2003, for a similar point). Furthermore, provided there are true propositions that it is impossible to believe (for example, it is raining and nobody believes that it is raining), (1) violates the commonly accepted rule according to which “ought” implies “can.” (2) seems not to be normatively interesting because it is unable to place any requirement on believers: if p is true, nothing follows from it about what S ought to believe; and if p is false, it only follows that it is not the case that S ought to believe that p; it does not follow that S ought not to believe that p. (3) is problematic for it does not allow one to derive claims about what one ought or ought not to believe. For example, from (3) and the falsity of p, one cannot derive that one ought not to believe p. Furthermore, (3) seems to be subject to the same objections raised against (1).
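The differences among the three formulations can be made explicit in a simple deontic notation. This is a hedged sketch rather than Bykvist and Hattiangadi's own formalism: the operator O ("it ought to be that") and the abbreviation B_S p ("S believes that p") are assumptions introduced here for illustration. The display assumes the amsmath package.

```latex
% Notation assumed for illustration only:
%   O(\varphi): it ought to be that \varphi
%   B_S p: subject S believes that p
\begin{align*}
(1)\quad & O(B_S\,p) \leftrightarrow p
  && \text{narrow scope, biconditional} \\
(2)\quad & O(B_S\,p) \rightarrow p
  && \text{narrow scope, conditional} \\
(3)\quad & O(B_S\,p \leftrightarrow p)
  && \text{wide scope}
\end{align*}

% What follows when p is false:
\begin{align*}
\text{From (2) and } \neg p:\quad & \neg O(B_S\,p)
  && \text{(no obligation to believe } p\text{)} \\
\text{Not derivable from (2) or (3):}\quad & O(\neg B_S\,p)
  && \text{(an obligation not to believe } p\text{)}
\end{align*}
```

The contrast between the last two lines is the point pressed against (2): the absence of an obligation to believe p is weaker than an obligation not to believe p. In the case of (3), the "ought" governs the whole biconditional, so the falsity of p alone does not detach any obligation concerning the belief that p.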

Bykvist and Hattiangadi raise similar objections to other formulations of the truth-norm. From this, they conclude that this general failure could be considered a clue that belief is not at all a normative concept, at least not in the way suggested by normativists. Many have considered this conclusion too hasty. First, even if all the available formulations of the norm are wrong, this does not mean that it is impossible to formulate the norm of belief in “ought” terms; it could simply mean that the right formulation is yet to come. Second, some have suggested alternative formulations that seem to avoid the above problems. For example, Whiting (2010) has suggested that interpreting the truth-norm as a norm of permissibility could avoid most of the problems. Other formulations avoiding these problems have been suggested by Littlejohn (2010), Fassio (2011), and Raleigh (2013). For a discussion, see Bykvist and Hattiangadi (2013). A third way of avoiding these objections is to deny that the norm of belief is a truth-norm (see §3).

A reply to the various considered objections to normative interpretations consists in abandoning a deontic conception of the truth-norm, according to which the norm would be like a prescription, directive, or permission. Adopting alternative non-deontic interpretations of the norm would allow one to avoid the various objections considered above. Some have suggested interpreting the normativity of belief in evaluative terms; that is, in terms of what it is good (in a certain sense of “good” to be specified) to believe (Fassio, 2011; McHugh, 2012b). Others have interpreted the norm of belief as involving a type of normativity sui generis (McHugh, 2014; Rosen, 2001, p. 621), as an ideal of reason (Engel, 2013), or as an “ought to be” in the Sellarsian sense, not requiring addressees of the norm to be capable of voluntarily following it (Chrisman, 2008).

For other criticisms of normativist interpretations of truth-directedness, see also Davidson (2001); Dretske (2000); Horwich (2006, 2013); and Papineau (1999, 2013). The common factor in these criticisms is the defense of the thesis that if there are norms governing beliefs, these are practical, contingent, and not constitutive of belief. Replies to some of the above objections are in Engel (2007, 2013); Shah & Velleman (2005); and Wedgwood (2013).

c. Minimalist Interpretations

The label “minimalist interpretations” is used here for a range of different views. The common factor of these views is that they deny that there is such a property as a truth-aim of belief, at least if one identifies it with some feature different from those considered in §1.b. Minimalists hold that the features supposedly explained by the aim of belief (see §1.c) can be explained by other properties commonly ascribed to these mental states, such as their causal, dispositional, functional, or motivational roles, or their direction of fit (Davidson, 2001; Dretske, 2000; Papineau, 1999).

Given the present characterization, one may wonder whether there is a clear-cut dividing line between teleological and minimalist views; in particular, between sub-intentional teleological views, identifying the aim at truth with functional mechanisms of the cognitive system, and dispositionalist and functionalist minimalist accounts. A way of distinguishing these two approaches is by looking at the dispositional or functional role distinctive of belief (compare McHugh, 2012a, fn. 6): while functionalist accounts congenial to teleological approaches to the truth-aim focus on the input side of belief’s functional role, and exclusively identify this role with a truth-directed goal (for example, forming true beliefs), minimalists think that the role that individuates belief is at least partially on the output side, and they are mainly concerned with practical roles of belief, such as satisfying the subject’s desires or providing reasons for action.

Some philosophers have endorsed accounts of belief according to which causal, dispositional, and/or functional roles of beliefs with respect to action and behavior would be sufficient to characterize and individuate this type of mental attitude. Some have argued that the specific relation between belief and truth can be fully explained by the functional role of beliefs of providing reasons for action. Others have argued that all that is necessary for an attitude to qualify as a belief is that it dispose the subject to behave in ways that would promote the satisfaction of his desires if its content were true. For similar views see, for example, Stalnaker (1984). Armstrong (1973) argues that an essential function of beliefs is moving a subject to action given the presence of suitable dominant desires and purposes, and locates in this causal role the peculiar difference between belief and other mental attitudes such as mere thoughts. Still others have identified the aim at truth of belief with its direction of fit (Humberstone, 1992; Platts, 1979).

A “deflationary” interpretation of truth-directedness has been defended by Vahid (2006, 2009). Vahid first considers the feature of accepting-as-true, introduced by Velleman (2000a), as common to all cognitive states (beliefs, assumptions, conjectures, imaginations…). He suggests that to capture the truth-directedness of belief, one should not add any further (teleological or normative) property to the fact that belief is an attitude of regarding-as-true. Rather, what is distinctive of belief according to Vahid is the specific way in which one regards-as-true a given proposition. While other attitudes involve regarding a proposition as true for the sake of something else, in order to reach certain specific goals (for example, assuming is regarding-as-true for the sake of argument, imagining involves regarding a proposition as true for motivational purposes), believing is regarding a proposition as true for its own sake, as an end in itself.

The main criticism directed at minimalist interpretations is that other mental states such as suppositions and assumptions possess these same properties (same causal, functional, dispositional, and motivational roles; same direction of fit) and, thus, that these properties are not sufficient alone to individuate the peculiar truth-directedness of belief, to explain the special features of belief listed in §1.c, and to distinguish beliefs from other types of mental attitude (Engel, 2004; Velleman, 2000a). For a reply, see, for instance, Zalabardo (2010, §10), who challenges the claim that a purely motivational conception of belief would not be sufficient to distinguish beliefs from other mental attitudes. See also Glüer & Wikforss (2009, p. 42).

3. What Does Belief Aim At?

There is debate concerning whether the aim of belief is truth, as has been traditionally argued, or some other property. Since the late 1990s, an increasing number of philosophers have defended the claim that knowledge is the fundamental aim or norm of belief. Upholders of this view are, among others, Adler (2002); Bird (2007); Huemer (2007); Littlejohn (2013); Peacocke (1999); Sutton (2007); and Williamson (2000). The best known defender of the thesis that belief aims at knowledge is Williamson (2000). Williamson’s main motivations to hold this thesis derive from his view about the nature of knowledge and its relation to belief. Williamson criticizes the idea that it is possible to provide an analysis of knowledge in terms of other more fundamental notions. Rather, other epistemic notions such as belief and justification should be understood as derivative from the more fundamental notion of knowledge (this is the so-called Knowledge First approach in epistemology). In particular, Williamson suggests that belief be considered roughly as the attitude of treating a proposition as if one knew it. Knowledge would thus fix the standard of appropriateness or the success condition for a belief, and merely believing p without knowing it would be a sort of “botched knowing” (2000, p. 47). In this sense, belief would not aim merely at truth but at knowledge.

A well-known argument for the knowledge aim is based on a parallel between assertion and belief. On the one hand, many have argued that assertion is constitutively governed by a knowledge norm (Adler, 2002; Bird, 2007; Sutton, 2007; Williamson, 2000):

(KNA) one should assert p only if one knows p.

On the other hand, some philosophers have suggested that occurrent belief is the inner analogue of assertion (for example, Williamson, 2000, pp. 255-256). More precisely, the idea is that (flat-out) assertion is the verbal counterpart of a judgment, and a judgment is a form of occurrent (outright) belief. If so, it is plausible that assertion and belief are governed by the same norm, and knowledge would be the norm of belief too:

(KNB) one should believe p only if one knows p.

Similar arguments have been suggested by Adler (2002); Bird (2007); McHugh (2011); Sosa (2010, p. 48); Sutton (2007); and Williamson (2000). To this line of argument, it may be objected that knowledge is not the norm of assertion. Some philosophers have suggested counterexamples to this thesis (Brown, 2008; Lackey, 2007). Others have argued that assertion is governed by other norms such as truth or justification (Douven, 2006; Weiner, 2005). Another way to criticize this argument consists in challenging the similarity between belief and assertion, arguing in particular that they would not be governed by the same norms. Whiting (2013, pp. 187-188) provides some reasons why one should expect standards of belief and assertion to diverge: since assertion is an “external” act, involving a social dimension, in evaluating an assertion one might have to take into account the expectations and needs of interlocutors and the role of speech acts in the unfolding conversation. Furthermore, assertion is a potential source of testimony. In asserting, one takes on responsibility for others’ beliefs. All these considerations are extraneous to belief, which is a “private” state of mind. It would thus not be surprising if assertions were governed by more demanding epistemic standards than belief due to their social character and their communicative role. Brown (2012, pp. 137-144) provides another argument against the claim that assertion and belief share the same epistemic standard: she first argues that whether an assertion or belief is epistemically appropriate partially depends on its consequences (for example, the epistemic propriety of asserting varies with the stakes), and second, that the consequences of asserting p may differ from those of believing p. It follows that there can be cases in which it is epistemically appropriate for a subject to believe that p, but not to assert that p, and vice versa.

Considerations about versions of Moore’s paradox with “know” in place of “believe” provide another argument for the claim that knowledge is the aim or norm of belief (Adler, 2002; Gibbons, 2013; Huemer, 2007; Sutton, 2007). Just as it sounds absurd or infelicitous to assert sentences like “it is raining but I do not know it is raining,” it seems incoherent to believe that it is raining and at the same time that one does not know that it is raining. An explanation of this fact could be that knowledge is the aim or norm of belief. A subject believing that it is raining but that she does not know it would violate (KNB). This type of argument has been the target of two objections: first, some have argued that a weaker standard, like a truth-norm, would be sufficient to explain the absurdity of this sort of Moorean belief (see, for example, the explanation considered in §1.c). Second, it has been argued that while asserting Moorean propositions of the form “it is raining but I do not know it is raining” sounds absurd, there is no such absurdity in believing these same propositions. It seems both reasonable and appropriate to believe something even while believing that one does not know it (McGlynn, 2013; Whiting, 2013, pp. 188-189). Using an example in McGlynn (2013, p. 387), there is nothing unreasonable or incongruous in Jane’s believing that her ticket will lose, that this belief is justified, and that nonetheless this belief fails to count as knowledge. A reply to the latter criticism consists in distinguishing between outright (or full) belief and partial belief. Only outright belief would be subject to a knowledge norm. For a reply, see McGlynn (2013, §3) and Whiting (2013, p. 189).

Another argument for the knowledge aim/norm of belief is provided by the way in which we tend to assess (justify and criticize) our beliefs. Williamson (2005, p. 109) provides the following case: John is at the zoo and sees what appears to him to be a zebra in a cage. The animal in the cage is really a zebra. However, unbeknownst to John, to save money, most of the other animals in the zoo have been replaced by cleverly disguised farm animals. In this scenario, John’s belief is true and fully reasonable (after all, he has no reason to believe that the animal in the cage could not be a zebra). Still, John does not know it is a zebra. Intuitively, John needs an excuse for believing that the animal is a zebra. He can excuse himself by pointing out that he is in no position to distinguish his state from one of knowing. The need for an excuse indicates that it is wrong for John to believe that the animal is a zebra. Despite the fact that John’s belief is both reasonable and true, it is somewhat defective. This contrasts with knowing that it is a zebra, which provides a full justification for believing it, not a mere excuse. A reply to this argument is that John is not wrong in believing that the animal is a zebra (and thus does not need an excuse for this). Rather, if John stands in need of correction, this is due to the false background beliefs he has—for example, the belief that animals in this area are all zebras (Littlejohn, 2010; Whiting, 2013).

Some philosophers have argued that knowledge is the aim or norm of belief on the ground that knowledge has more value than mere true belief (Bird, 2007). However, on the one hand, there is no general agreement on whether knowledge is more valuable than true belief. On the other hand, from the fact that knowledge is more valuable than true belief, it does not follow that belief is governed by a norm or involves a constitutive aim at knowledge.

For other arguments in support of the claim that the aim or norm of belief is knowledge, see Bird (2007); Engel (2004); McHugh (2011); and Sutton (2007). For other criticisms of the claim, see Littlejohn (2010); McGlynn (2013); and Whiting (2013).

Though truth and knowledge are widely identified as the main candidates for being the aim or norm of belief, some philosophers have suggested other properties. Another available option is that the aim of belief is reasonability or justification. For a defense of the claim that non-factive justification is the condition of epistemic success for belief, see, for example, Feldman (2002, pp. 378-379). It should be noted that this view has not found many proponents in the 21st century literature, at least if we exclude philosophers for whom justified belief is factive or requires knowledge (for example, Gibbons, 2013; Littlejohn, 2012; Sutton, 2007). An original approach is that of Smithies (2012), who argues that the fundamental norm of belief is that one be in a position to know what one believes, where “[o]ne is in a position to know a proposition just in case one satisfies all the epistemic, as opposed to psychological, conditions for knowledge, such as having ungettiered justification to believe a true proposition” (2012, p. 4).

Some philosophers have identified the aim of belief with some specific kind of epistemic virtue one could manifest in the possession of belief (Zagzebski, 2004) or with understanding (Kvanvig, 2003). According to other philosophers, the fundamental aim of belief consists in the satisfaction of practical goals such as survival, utility, or the satisfaction of desires and wants. Similar views are particularly popular among philosophers favoring naturalistic approaches in the philosophy of mind. For example, Millikan (1984) argues that beliefs are integrated in a naturally selected cognitive system having the function of tracking features of the world in order to help in the satisfaction of biological needs such as survival and reproduction. A similar view has been more or less explicitly endorsed by, for example, Horwich (2006); Kornblith (1993); and Papineau (1999, 2013).

Another option is that belief possesses multiple aims or norms equally fundamental and irreducible to each other. For example, Weiner (2014) endorses pluralism about epistemic norms, arguing that there are many different epistemic norms, each valid from a different standpoint, and that no one of these standpoints need be better than another. In a virtue-theoretic framework, Wright (2014) argues that there are two fundamental epistemic aims: believing in accordance with the intellectual virtues (such as intellectual courage, carefulness, and open-mindedness), and believing the truth and avoiding falsehoods.

4. Relevance of the Topic

The topic discussed in the present article is relevant to several more general philosophical debates. It is clearly relevant in those areas of philosophy directly concerned with the notion of belief, such as the philosophy of mind and, in particular, the ontology of mental attitudes. One of the issues that have traditionally most concerned philosophers is that of identifying what distinguishes belief from other mental attitudes such as trusting, mere thinking, imagining, guessing, and so on. As David Hume admits (1739, book I, part III, §7), the distinction between belief and mere thought was the first philosophical problem he posed himself, and also one of the hardest he found to solve (for a discussion, see Armstrong, 1973, part I, §5). Whereas the difference between belief and other types of mental attitude seems to reside in the specific relationship that belief bears to truth (or knowledge), it has been extremely difficult to grasp the peculiar nature of that relationship. The 21st-century debate on the aim of belief promises to provide an answer to this problem. In answering questions about the nature of the aim (see §2), it also promises to shed some light on the issue of whether belief is a normative attitude or whether it can be characterized by a fully naturalistic account.

Progress in the debate about the aim or norm of belief also contributes substantially to the study of the norms and aims of other attitudes such as desires, emotions, and intentions. The aim of such studies is to provide a unified and coherent representation of the various norms and aims of mental attitudes and of their reciprocal relations (for example, Millar, 2009; Railton, 1997; Shah, 2008; Velleman, 2000b; Wedgwood, 2007). For instance, Wedgwood (2007) interprets the aims of attitudes as norms of correctness and argues that such norms are constitutive and individuative of all intentional attitudes. Similarly, Shah (2008) applies an argument analogous to the transparency argument for belief (considered in §2.b) to other attitudes. In particular, he argues that the hypothesis that the concept of intention is governed by a constitutive norm would best explain the presumed fact that, in order to conclude deliberation on whether to intend to A, one must answer the question whether to A.

The debate on the aim of belief is also particularly relevant to certain views in the philosophy of normativity. According to a prominent view in meta-ethics (so-called Constitutivism), normative facts can be grounded in facts about the constitution of action or agency. According to this view, agency is constitutively governed by practical norms (for example, Korsgaard, 1996; Velleman, 2000b). Some philosophers have tried to extend the view to other normative domains such as epistemology and aesthetics. The epistemological analogue of Ethical Constitutivism holds that epistemic normativity can be grounded in the constitutive aim or norm of belief. For a criticism of constitutivist approaches, see Enoch (2006).

The view that epistemic normativity is grounded in a fundamental truth-aim of belief has deep roots in the 20th-century history of epistemology. Many philosophers have argued that there is a strict dependence relation between the fundamental aim or norm of belief (sometimes presented as a conjunction of the two goals of believing truly and avoiding falsehoods) and other, derivative normative epistemic standards such as justification and rationality. Versions of this view have been defended by many well-known epistemologists, including Chisholm, Goldman, Lehrer, Plantinga, Alston, Foley, and Sosa (see Alston, 2005, ch. 1, for an overview). Accounts of this relation differ depending on the notions of justification and rationality adopted, and on how the relation between the truth-aim and the derived normative properties is conceived (for example, in consequentialist, deontological, or virtue-based terms).

Another approach in epistemology concerned with the topic of the aim or norm of belief is the so-called Knowledge First approach introduced in §3. According to this view, knowledge has a prominent role among epistemic notions and constitutes the fundamental epistemic standard of assertion, belief, action, practical reasoning, and disagreement (compare “Knowledge Norms”). This approach has generated a comparative study of these standards (for example, Smithies, 2012). An example is the reformulation of various arguments for the knowledge norm of assertion in order to defend other knowledge norms, such as those of belief and action (compare §3). In this perspective, the debate on the aim of belief can help in understanding important aspects of epistemic norms of assertion, action, practical reasoning, and disagreement, and in turn can receive important contributions from advances in the debates about norms governing these other practices.

5. References and Further Reading

  • Adler, J. (2002). Belief’s own ethics (Vol. 112). MIT Press.
  • Alston, W. (2005). Beyond “justification”: Dimensions of epistemic evaluation (Vol. 81). Ithaca: Cornell University Press.
  • Armstrong, D. M. (1973). Belief, truth and knowledge (Vol. 24). London: Cambridge University Press.
  • Baldwin, T. (2007). The normative character of belief. In M. S. Green & J. N. Williams (Eds.), Moore’s paradox: New essays on belief, rationality, and the first person (pp. 76–89). Oxford: Oxford University Press.
  • Berker, S. (2013). The rejection of epistemic consequentialism. Philosophical Issues, 23(1), 363–387.
  • Bird, A. (2007). Justified judging. Philosophy and Phenomenological Research, 74(1), 81–110.
  • Blackburn, S. (1993). Essays in quasi-realism. New York, NY: Oxford University Press.
  • Boghossian, P. A. (2003). The normativity of content. Philosophical Issues, 13(1), 31–45.
  • Brandom, R. B. (1994). Making it explicit: Reasoning, representing, and discursive commitment. Cambridge, MA: Harvard University Press.
  • Brandom, R. B. (2001). Modality, normativity, and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
  • Brown, J. (2008). The knowledge norm for assertion. Philosophical Issues, 18(1), 89–103.
  • Brown, J. (2012). Assertion and practical reasoning: Common or divergent epistemic standards? Philosophy and Phenomenological Research, 84(1), 123–157.
  • Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548.
  • Bykvist, K., & Hattiangadi, A. (2007). Does thought imply ought? Analysis, 67(296), 277–285.
  • Bykvist, K., & Hattiangadi, A. (2013). Belief, truth, and blindspots. In T. Chan (Ed.), The aim of belief (pp. 100–122). New York, NY: Oxford University Press.
  • Chrisman, M. (2008). Ought to believe. Journal of Philosophy, 105(7), 346–370.
  • David, M. (2005). Truth as the primary epistemic goal: A working hypothesis. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 296–312). Oxford: Blackwell.
  • Davidson, D. (2001). Comments on Karlovy Vary papers. In P. Kotatko (Ed.), Interpreting Davidson (pp. 285–308). Stanford, CA: CSLI Publications.
  • Douven, I. (2006). Assertion, knowledge, and rational credibility. Philosophical Review, 115(4), 449–485.
  • Dretske, F. (2000). Norms, history and the constitution of the mental. In Perception, knowledge and belief (pp. 242–258). Cambridge: Cambridge University Press.
  • Engel, P. (2004). Truth and the aim of belief. In D. Gillies (Ed.), Laws and models in science (pp. 77–97). London: King’s College Publications.
  • Engel, P. (2007). Belief and normativity. Disputatio, 2(23), 179–203.
  • Engel, P. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 199–216.
  • Enoch, D. (2006). Agency, shmagency: Why normativity won’t come from what is constitutive of action. Philosophical Review, 115(2), 169–198.
  • Fassio, D. (2011). Belief, correctness and normativity. Logique Et Analyse, 54(216), 471–486.
  • Feldman, R. (2002). Epistemological duties. In P. Moser (Ed.), The Oxford handbook of epistemology (pp. 362–384). Oxford: Oxford University Press.
  • Fine, K. (1994). Essence and modality. Philosophical Perspectives, 8, 1–16.
  • Firth, R. (1981). Epistemic merit, intrinsic and instrumental. Proceedings and Addresses of the American Philosophical Association, 55(1), 5–23.
  • Frankish, K. (2007). Deciding to believe again. Mind, 116(463), 523–547.
  • Frost, K. (2014). On the very idea of direction of fit. Philosophical Review, 123(4), 429–484.
  • Gibbard, A. (2005). Truth and correct belief. Philosophical Issues, 15(1), 338–350.
  • Gibbons, J. (2013). The norm of belief. Oxford: Oxford University Press.
  • Glüer, K., & Pagin, P. (1998). Rules of meaning and practical reasoning. Synthese, 117(2), 207–227.
  • Glüer, K., & Wikforss, Å. (2009). Against content normativity. Mind, 118(469), 31–70.
  • Glüer, K., & Wikforss, Å. (2010a). The normativity of meaning and content. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Glüer, K., & Wikforss, Å. (2010b). The truth norm and guidance: A reply to Steglich-Petersen. Mind, 119(475), 757–761.
  • Glüer, K., & Wikforss, Å. (2013). Against belief normativity. In T. Chan (Ed.), The aim of belief. New York, NY: Oxford University Press.
  • Griffiths, A. P. (1962). On belief. Proceedings of the Aristotelian Society, 63, 167–186.
  • Hieronymi, P. (2006). Controlling attitudes. Pacific Philosophical Quarterly, 87(1), 45–74.
  • Horwich, P. (2006). The value of truth. Noûs, 40(2), 347–360.
  • Horwich, P. (2013). Belief-truth norms. In T. Chan (Ed.), The aim of belief (pp. 17–31). New York, NY: Oxford University Press.
  • Huemer, M. (2007). Moore’s paradox and the norm of belief. In S. Nuccetelli & G. Seay (Eds.), Themes from G.E. Moore (pp. 142–57). Oxford: Oxford University Press.
  • Humberstone, I. L. (1992). Direction of fit. Mind, 101(401), 59–83.
  • Hume, D. (1739). A treatise of human nature. Oxford: Oxford University Press.
  • Kelly, T. (2003). Epistemic rationality as instrumental rationality: A critique. Philosophy and Phenomenological Research, 66(3), 612–640.
  • Kornblith, H. (1993). Epistemic normativity. Synthese, 94(3), 357–376.
  • Korsgaard, C. M. (1996). The sources of normativity (Vol. 110). Cambridge: Cambridge University Press.
  • Kvanvig, J. L. (2003). The value of knowledge and the pursuit of understanding (Vol. 113). Cambridge: Cambridge University Press.
  • Lackey, J. (2007). Norms of assertion. Noûs, 41(4), 594–626.
  • Littlejohn, C. (2010). Moore’s paradox and epistemic norms. Australasian Journal of Philosophy, 88(1), 79–100.
  • Littlejohn, C. (2012). Justification and the truth-connection. Cambridge: Cambridge University Press.
  • Littlejohn, C. (2013). The Russellian retreat. Proceedings of the Aristotelian Society, 113, 293–320.
  • Lynch, M. (2004). True to life: Why truth matters. Cambridge, MA: MIT Press.
  • Lynch, M. (2009a). The value of truth and the truth of values. In A. Haddock, A. Millar, & D. Pritchard (Eds.), Epistemic value. Oxford: Oxford University Press.
  • Lynch, M. (2009b). Truth, value and epistemic expressivism. Philosophy and Phenomenological Research, 79(1), 76–97.
  • Maitzen, S. (1995). Our errant epistemic aim. Philosophy and Phenomenological Research, 55(4), 869–876.
  • Mayo, B. (1963). Belief and constraint. Proceedings of the Aristotelian Society, 64, 139–156.
  • McGlynn, A. (2013). Believing things unknown. Noûs, 47(2), 385–407.
  • McHugh, C. (2011). What do we aim at when we believe? Dialectica, 65(3), 369–392.
  • McHugh, C. (2012a). Belief and aims. Philosophical Studies, 160(3), 425–439.
  • McHugh, C. (2012b). The truth norm of belief. Pacific Philosophical Quarterly, 93(1), 8–30.
  • McHugh, C. (2013). Normativism and doxastic deliberation. Analytic Philosophy, 54(4), 447–465.
  • McHugh, C. (2014). Fitting belief. Proceedings of the Aristotelian Society, 114(2pt2), 167–187.
  • McHugh, C., & Whiting, D. (2014). The normativity of belief. Analysis, 74(4), 698–713.
  • Millar, A. (2004). Understanding people: Normativity and rationalizing explanation. New York, NY: Oxford University Press.
  • Millar, A. (2009). How reasons for action differ from reasons for belief. In S. Robertson (Ed.), Spheres of reason (pp. 140–163). New York, NY: Oxford University Press.
  • Miller, A. (2008). Thoughts, oughts and the conceptual primacy of belief. Analysis, 68(299), 234–238.
  • Millikan, R. G. (1984). Language, thought and other biological categories. Cambridge, MA: MIT Press.
  • Moore, G. E. (1942). A reply to my critics. In P. A. Schilpp (Ed.), The philosophy of G. E. Moore. Chicago, IL: Open Court.
  • Moran, R. A. (1997). Self-knowledge: Discovery, resolution, and undoing. European Journal of Philosophy, 5(2), 141–61.
  • Owens, D. J. (2003). Does belief have an aim? Philosophical Studies, 115(3), 283–305.
  • Papineau, D. (1999). Normativity and judgment. Proceedings of the Aristotelian Society, 73(73), 16–43.
  • Papineau, D. (2013). There are no norms of belief. In T. Chan (Ed.), The aim of belief (pp. 64–79). New York, NY: Oxford University Press.
  • Peacocke, C. (1999). Being known. Oxford: Oxford University Press.
  • Plantinga, A. (1993). Warrant and proper function. New York, NY: Oxford University Press.
  • Platts, M. B. (1979). Ways of meaning: An introduction to a philosophy of language. London: Routledge & K. Paul.
  • Railton, P. (1994). Truth, reason, and the regulation of belief. Philosophical Issues, 5, 71–93.
  • Railton, P. (1997). On the hypothetical and non-hypothetical in reasoning about belief and action. In G. Cullity & B. N. Gaut (Eds.), Ethics and practical reason (pp. 53–79). New York, NY: Oxford University Press.
  • Raleigh, T. (2013). Belief norms and blindspots. Southern Journal of Philosophy, 51(2), 243–269.
  • Ramsey, F. P. (1931). Foundations of mathematics and other logical essays. London: Routledge.
  • Rosen, G. (2001). Brandom on modality, normativity and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
  • Setiya, K. (2008). Believing at will. Midwest Studies in Philosophy, 32(1), 36–52.
  • Shah, N. (2003). How truth governs belief. Philosophical Review, 112(4), 447–482.
  • Shah, N. (2008). How action governs intention. Philosophers’ Imprint, 8(5), 1–19.
  • Shah, N., & Velleman, J. D. (2005). Doxastic deliberation. Philosophical Review, 114(4), 497–534.
  • Smithies, D. (2012). The normative role of knowledge. Noûs, 46(2), 265–288.
  • Sosa, E. (2003). The place of truth in epistemology. In L. Zagzebski & M. DePaul (Eds.), Intellectual virtue: Perspectives from ethics and epistemology (pp. 155–180). New York, NY: Oxford University Press.
  • Sosa, E. (2007). A virtue epistemology: Apt belief and reflective knowledge, Volume I. Oxford: Oxford University Press.
  • Sosa, E. (2009). Knowing full well: The normativity of beliefs as performances. Philosophical Studies, 142(1), 5–15.
  • Sosa, E. (2010). Knowing full well. Princeton, NJ: Princeton University Press.
  • Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
  • Steglich-Petersen, A. (2006). No norm needed: On the aim of belief. Philosophical Quarterly, 56(225), 499–516.
  • Steglich-Petersen, A. (2008). Against essential normativity of the mental. Philosophical Studies, 140(2), 263–283.
  • Steglich-Petersen, A. (2009). Weighing the aim of belief. Philosophical Studies, 145(3), 395–405.
  • Sutton, J. (2007). Without justification. Cambridge, MA: MIT Press.
  • Toribio, J. (2013). Is there an “ought” in belief? Teorema: Revista Internacional de Filosofía, 32(3), 75–90.
  • Vahid, H. (2006). Aiming at truth: Doxastic vs. epistemic goals. Philosophical Studies, 131(2), 303–335.
  • Vahid, H. (2009). The epistemology of belief. London: Palgrave Macmillan.
  • Velleman, D. (2000a). On the aim of belief. In D. Velleman (Ed.), The possibility of practical reason (pp. 244–281). New York, NY: Oxford University Press.
  • Velleman, D. (2000b). The possibility of practical reason (Vol. 106). New York, NY: Oxford University Press.
  • Wedgwood, R. (2002). The aim of belief. Philosophical Perspectives, 16(s16), 267–97.
  • Wedgwood, R. (2007). The nature of normativity. New York, NY: Oxford University Press.
  • Wedgwood, R. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 217–234.
  • Weiner, M. (2005). Must we know what we say? Philosophical Review, 114(2), 227–251.
  • Weiner, M. (2014). The spectra of epistemic norms. In J. Turri & C. Littlejohn (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 201–218). Oxford: Oxford University Press.
  • Whiting, D. (2010). Should I believe the truth? Dialectica, 64(2), 213–224.
  • Whiting, D. (2013). Nothing but the truth: On the norms and aims of belief. In T. Chan (Ed.), The aim of belief. New York, NY: Oxford University Press.
  • Williams, B. (1973). Deciding to believe. In B. Williams (Ed.), Problems of the self (pp. 136–51). Cambridge: Cambridge University Press.
  • Williams, B. (2002). Truth and truthfulness: An essay in genealogy. Princeton, NJ: Princeton University Press.
  • Williamson, T. (2000). Knowledge and its limits. Oxford: Oxford University Press.
  • Williamson, T. (2005). Knowledge, context, and the agent’s point of view. In G. Preyer & G. Peter (Eds.), Contextualism in philosophy: Knowledge, meaning, and truth (pp. 91–114). New York, NY: Oxford University Press.
  • Wright, S. (2014). The dual-aspect norms of belief and assertion: A virtue approach to epistemic norms. In C. Littlejohn & J. Turri (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 239–258). New York, NY: Oxford University Press.
  • Yamada, M. (2012). Taking aim at the truth. Philosophical Studies, 157(1), 47–59.
  • Zagzebski, L. (2004). Epistemic value and the primacy of what we care about. Philosophical Papers, 33(3), 353–377.
  • Zalabardo, J. L. (2010). Why believe the truth? Shah and Velleman on the aim of belief. Philosophical Explorations, 13(1), 1–21.
  • Zangwill, N. (2005). The normativity of the mental. Philosophical Explorations, 8(1), 1–19.

 

Author Information

Davide Fassio
Email: davide.fassio@unige.ch
University of Geneva
Switzerland