African Philosophical Perspectives on the Meaning of Life

The question of life’s meaning is a perennial one. It can be claimed that all other questions, whether philosophical, scientific, or religious, are attempts to offer some glimpse into the meaning—in this sense, the purpose—of human existence. In philosophical circles, the question of life’s meaning has received intense attention, from the works of Qoheleth, the supposed writer of the Biblical book of Ecclesiastes, to the works of pessimists such as Schopenhauer, to the philosophies of existentialist thinkers, especially Albert Camus and Søren Kierkegaard, and to twenty-first-century writers on the topic such as John Cottingham and Thaddeus Metz. African scholars are not left out, and this article provides a brief overview of some of the major theories of meaning that African scholars have proposed. It does so by tying together ideas from African philosophical literature in a bid to present a brief systematic summary of African views about meaningfulness. From these ideas, one can identify seven theories of meaning in African philosophical literature: the African God-purpose theory of meaning, the vital force theory of meaning, the communal normative theory of meaning, the love theory of meaning, the (Yoruba) cultural cluster theory of meaning, the personhood-based theory of meaningfulness, and the conversational theory of meaning. Examining these theories begins with an explanation of the meaning of “meaning” and the distinction between meaning in life and the meaning of life.

Table of Contents

  1. Explaining Some Important Concepts
  2. African Philosophy and the Meaning of Life
    1. The African God-Purpose Theory of Meaning
    2. The Vital Force Theory of Meaning
    3. The Communal Normative Theory of Meaning
    4. The Love Theory of Meaning
    5. The Yoruba Cluster Theory of Meaning
    6. Personhood and a Meaningful Life
    7. The Conversational Theory
  3. Conclusion
  4. References and Further Reading

1. Explaining Some Important Concepts

What is meant by the terms meaning and meaningfulness? The concept of meaning is whatever all competing conceptions of meaning are conceptions of. In the literature, meaning is thought of in terms of purpose (received or determined teleological ends that are worth pursuing for their own sake), transcendence (rising above our animal nature), normative reasons for action, and so forth. These singular or monistic views, while interesting, have their flaws: they barely capture all and only the competing intuitions about meaning. This is why Thaddeus Metz proposed a pluralistic account of meaning, on which meaning consists of:

roughly, a cluster of ideas that overlap with one another. To ask about meaning … is to pose questions such as: which ends, besides one’s own pleasure as such are most worth pursuing for their own sake; how to transcend one’s animal nature; and what in life merits great esteem or admiration (Metz, 2013, p. 34).

It is easy to agree with Metz’s family resemblance theory, since the pluralism he employs accommodates most theories or conceptions of meaning while rejecting peripheral ideas – like pleasure or happiness – that do not, on their own, possess the quality of ‘final value’ needed in the usual understanding of what meaning entails. Even so, subjective accounts of meaning appear to be missing from Metz’s concept. In addition, what about the meaning of life? These questions led Attoe (2021) to add two extra variables to Metz’s family of values. The first is the subjective pursuit of those ends that an individual finds worth pursuing (insofar as the individual considers those things/values as ends in themselves). A cursory glance at Metz’s approach shows that, although it is tenable, it is more objectivist than it is all-encompassing. Inserting subjectivity into the approach immediately accommodates subjectivist views about meaning, which are often held to be incompatible with objectivist ideas about meaning (see Nagel, 1979, p. 16). The second variable that Attoe (2021) proposes is coherence. By coherence, he means the identification of a narrative that ties together an individual’s life – perhaps those moments of meaningfulness or those actions that dot her life – such that the individual can adjudge her whole life as meaningful. This feature bears more on ideas about the meaning of life.

With this in mind, it is expedient to distinguish further between meaning in life and the meaning of life – as these concepts mean two different things and shall be used in different ways later in this article. By meaning in life, what is meant are those instances of meaning that may dot an individual’s life. Thus, a marriage to a loved one or the successful completion of a degree may subsist as a meaningful act or a moment of meaningfulness. With regard to the meaning of life, there are some, like Richard Taylor, who describe it as involving the meaning of existence, especially with regard to cosmic life, biological life, or human life/existence specifically. One can also use the term ‘meaning of’ in a narrower sense, where it delineates a judgment on what makes the life of a human person, considered as a whole (especially within the confines of an individual’s lifetime), meaningful (Attoe, 2021). This understanding is similar to Nagel’s understanding of the meaning of life. The distinction is important because instances of meaning do not always pre-judge the meaning of the entire life of an individual. So, obtaining a degree can be a moment of meaningfulness (meaning in), but it would be hard to consider an individual’s life as a whole to be ultimately meaningful (meaning of) if that individual spent his time killing others for no reason, despite gaining a degree or marrying a loved one.

2. African Philosophy and the Meaning of Life

Having made these expedient distinctions, the following section delves into what could be considered African conceptions of the meaning of life. To fully understand the ideas that shall be put forth, a short detour is necessary to describe the metaphysics undergirding African thought, as this affords the reader the proper lens through which to see the African view(s). Those who are new to African metaphysics may imagine that it amounts to unsophisticated talk about fantastic religious myths and, perhaps, some witchcraft or voodoo. In fact, African metaphysics involves something deeper, and it is this metaphysics that usually guides the traditional African worldview.

African metaphysics is grounded in an interesting version of empiricism that allows a monistic-cum-harmonious relationship between the material and spiritual aspects of reality. It is empirical because most African thinkers are willing to grant the possibility of both material and spiritual manifestations in everyday life. Indeed, it is because one can point to certain acts as manifestations of spiritual realities that metaphysicians of this type are quick to pronounce the existence of such realities and to recognise said acts as spiritual. In other words, knowledge of the spiritual develops from certain manifestations that are verifiable by the senses. That something is spoken of as spiritual does not diminish its empirical worth, since empiricism holds that knowledge – whatever type it is – derives from experience.

Unlike much of Western metaphysics, which, as Innocent Asouzu states, is inundated with all sorts of bifurcations, disjunctions, and essentialisations, African metaphysics considers the fragmentations we see in reality as evidence of a harmonious, complementary relationship between and among realities. There is a tacit acknowledgment of the interplay between matter and spirit, or between realities of any spectrum, such that each facet of reality is seen as equally important and the supposedly artificial divide between material and spiritual objects as non-existent. Ifeanyi Menkiti captures this idea:

[T]he looseness or ambiguity regarding what constitutes the domain of the physical, and what the domain of the mental, does not necessarily stem from a kind of an ingrown limitation of the village mind, a crudeness or ignorance, unschooled, regarding the necessity of properly differentiating things, one from the other, but is rather an attitude that is well considered given the ambiguous nature of the physical universe, especially that part of it which is the domain of sentient biological organisms, within which include persons described as constituted by their bodies, their minds, and whatever else the post-Cartesian elucidators believe persons are made of or can ultimately be reduced to. My view on the matter is that the looseness or ambiguity in question is not necessarily a sign of indifference to applicable distinctions demanded by an epistemology, but is itself an epistemic stance, namely: do not make distinctions when the situation does not call for the distinctions that you make. (Menkiti, 2004b, pp. 124-125)

Somehow, this messaging trickles down into the African socio-ethical space where, for the most part, achieving the common good or attaining one’s humanity generally involves communal living or a deep form of mutual coexistence – one that has a metaphysical backing. It is no wonder, then, that Africa is known for, and has provided the world with, a series of philosophies that reflect harmonious coexistence – from Ubuntu (Ramose, 1999; Metz, 2017) to Ukama (Murove, 2007) to Ibuanyidanda philosophy (Asouzu, 2004; Asouzu, 2007) to harmonious monism (Ijiomah, 2014) to integrative humanism (Ozumba & Chimakonam, 2014); the list goes on.

What this slight but important detour seeks to show is that within the traditional African metaphysical space, most thinkers are inclined to believe that spiritual entities are, indeed, existent realities. It is also speculated that these spiritual realities can and do relate with other aspects of reality and that spiritual realities are not removed from our everyday reality – at least not in the way Descartes divided mind from matter – but are an important part of our understanding of reality as a whole – even more so than Spinoza’s parallelism. These ideas should be kept in mind as they help guide any exploration of African views of the meaning of life.

a. The African God-Purpose Theory of Meaning

To answer the question of what constitutes African conceptions of the meaning of life, one can give a few answers. The first is the African God-purpose theory. Although the God-purpose theory is neither new—scholars from other philosophical traditions have written about it (see: Metz, 2007; Metz, 2013; Mulgan, 2015; Poettcker, 2015; Metz, 2019)—nor uniquely African, the arguments contained in the view possess salient features that are African.

For some philosophers, a belief in the existence of God is often considered unnecessary when discussing the God-purpose theory. This, it is argued, is because what is at issue is not whether or not God exists but rather what conditions are necessary for a God-purpose theory to subsist as a viable theory of meaning (Metz, 2013, p. 80). While this is a much-appreciated argument, it is hard to agree with its logic, for if we were to presume that the idea or the existence of a God was inconceivable, then we would be forced to admit that a theory of meaning based on an inconceivable God is itself inconceivable – indeed, one can imagine it to be nothing more than wishful thinking. If, for instance, logical arguments were made that capturing something as inconceivable as a three-winged leopard would grant one meaning, it would be odd to take such a theory as a plausible theory of meaning, since three-winged leopards do not exist. The same argument can be made with regard to the God-purpose view. One must allow for some rational belief in a conceivable God before one can make claims about a God-purpose theory of meaning. For most traditional Africans, this is precisely the case. The African God-purpose theory begins with an all-pervading belief in God or the Supreme Being (the two terms will be used interchangeably). The belief in the Supreme Being features in the everyday life of the traditional African, and Pantaleon Iroegbu makes this point clear:

So far, nobody to our knowledge, has disputed the claim that in African traditional societies, there were no atheists. The existence of God is not taught to children, the saying goes. This means that the existence of God is not learnt, for it is innate and obvious to all. God is ubiquitously involved in the life and practices of the people. (Iroegbu, 1995, p. 359)

This belief is not far-fetched, and the ideas that govern it are plausible enough to grant the African God the mantle of conceivability. The reason is simple: what immediately stands out from African philosophical literature is that, for most traditional Africans, nothingness is impossible. What is suggested instead is the idea of being-alone (the African metaphysical equivalent of nothingness) and being-with-others as the full expression of reality (one immediately sees the communal metaphysics at play here). The African rejection of nothingness in favour of being-alone comes from the African understanding of God as necessarily eternal (at least regressively speaking) and as the progenitor of the universe. Thus, the term being-alone not only encapsulates a necessarily eternal God, but also underscores a God that necessarily existed without the universe.

However, being-alone also implies an unattractive mode of living, one that does not tally with the more attractive communal ontology and/or mode of living. It is for this reason that one can plausibly speculate that the existence of the universe presupposes a supreme rejection of Its (the term “It” is used as a pronoun to denote a genderless God) being-alone in exchange for a more communal relationship with the other (the universe, understood as encapsulating all other existent realities) – one that legitimises Its existence. And so the first overarching purpose of the universe is encountered – the legitimisation of God’s existence via a communal relationship with the universe (that is, created reality). Since God – the source from which the universe and other realities presumably sprang – existed prior to other forms of existence, being-alone must have been a reality at some point. In the African view, the Cartesian cogito, acknowledging one’s own existence, is not enough. Existence ought to be expressed via a relationship with another. It is in this way that other realities, which emerge from God, legitimise God’s existence as a being-with-others.

With this in mind, deciphering God’s purpose for humankind, and how that translates to meaningfulness, becomes a much easier affair. With the ultimate goal being to sustain the harmony which preserves the universe and, in turn, legitimises the existence of God, living a meaningful life would involve living a life that ensures harmony. Perhaps this is another reason why complementarity is widespread in most communities in traditional Africa. But as far as living a meaningful life by doing God’s will is concerned, traditional African thinkers would agree that two methods stand out – fulfilling one’s destiny and obeying divine law.

With regard to the destiny view, what is immediately clear is that the Supreme Being is responsible for the creation of destiny, as Segun Gbadegesin tells us. Whether such a destiny is chosen by the individual or imposed on the individual is unclear, but what is important is that such a destiny emanates from God. It must be reiterated that destiny, as understood here, should be distinguished from what may be termed “fate(fulness)”. That one receives one’s destiny from God does not imply that the individual’s life follows so hard a (pre)deterministic path that whatever role one plays is devoid of free will or the ability to control the trajectory of one’s life. One can still choose to pursue a certain destiny, choose to alter a given destiny, or choose not to pursue one’s destiny. Destiny is then thought of as an end that is specific to each individual and which the individual can choose whether or not to pursue.

Normative progression suggests that one attains personhood as time progresses and as the individual continues to gain moral experience. In this way, the older and more morally experienced an individual is, the closer the individual is to becoming a moral genius and a person. It is therefore plausible to assume (even though there is no consensus on the matter) that destinies are handed to the individual by God, since it is hard to imagine – if one considers the African view of the normative progression of personhood – that an “it”, or a yet-to-be-developed human person, possesses the raw rational capacity to choose something as complex as its destiny. Gbadegesin reminds us that good destinies exist just as bad destinies do, and that destinies are alterable, since one can choose to ignore a bad destiny and do good instead (Gbadegesin, 2004, p. 316). Yet one can argue that ignoring a bad destiny to do good is no different from ignoring a good destiny to do bad things. In both cases, what is being discussed is not the alteration of one’s destiny but simply the neglect of it. One can go as far as to assume that by ignoring one’s assigned destiny in this manner, what is expressed is an inability to understand how one’s destiny ties into God’s overarching purpose and/or a willingness to live a meaningless life. Thus, the pursuit of even a bad destiny (as assigned by God) can lead to meaningfulness – much as the Christian gospel of salvation is predicated on the betrayal of Jesus by Judas Iscariot. Hence, within the African God-purpose view, meaningfulness readily involves the pursuit and/or fulfilment of one’s God-assigned destiny. Such a pursuit or fulfilment would be meaningful because it is a source of great admiration and esteem, both for the individual who has fulfilled the destiny and for the members of his/her society who recognise that fulfilment; conversely, life would be meaningless if one failed to pursue one’s destiny.

Another way in which the God-purpose theory can be thought to confer meaning is through divine laws. Divine laws are made known to the individual via different conduits that serve as representatives or messengers of the Supreme Being – usually lesser gods, spirits, ancestors, or priests (Idowu, 2005, pp. 186-187). What these laws are varies from culture to culture, but the general idea is that one must avoid certain taboos or acts that sow discord in the community, and that one must engage in certain rites, customs, or rituals in order to flourish as an individual and obtain meaning. Indeed, as Mbiti reminds us, failing to adhere to divine law not only ensures meaninglessness; it also undermines the grand purpose of sustaining the harmony that holds the universe together. This is why acts of reparation – commensurate with the offence – are often advised once such discord is noticed.

It can be immediately noticed that the African God-purpose theory bears on both aspects of meaningfulness – i.e. meaning in life and the meaning of life. In the first instance, it is apparent that insofar as the individual performs those actions that are directly tied to his/her destiny, then those acts constitute for him/her a moment of meaningfulness. With regards to the latter, the narrative that ties the individual’s actions together and gives it its coherence is his/her destiny – or at least the pursuit of it. It is this narrative that allows one to sit back and adjudge a whole life as meaningful or meaningless.

While the African God-purpose theory of meaning offers an interesting approach to the question of meaning, it is bound to face two major criticisms: the instrumentality it assigns to human life, and the narrowness of the view. These criticisms can be levelled against most God-purpose theories – especially those of the extreme kind. By locating meaning in what God wants the individual to do, one inadvertently admits that the individual only plays a functional role in the grand scheme of things. The imposition of God’s will – through destiny and/or divine law – disregards individual autonomy (whether or not one has the free will to choose) since meaning (especially in extreme versions of the God-purpose theory) resides only in doing God’s will. The second criticism lies in the fact that the African God-purpose theory fails to capture those instances of meaning that spring neither from one’s destiny nor from divine law. Thus, the individual who strives to become a musical virtuoso would fail to achieve meaningfulness if that achievement or pursuit does not tally with his assigned destiny. Yet it can be intuited that such a pursuit counts as a moment of meaningfulness.

b. The Vital Force Theory of Meaning

The second theory of meaning that can be gleaned from African philosophical literature is the vital force theory of meaning. To understand what this theory entails, it is important to first understand what is meant by “vital force”. Vital force or life force can be described as some sort of ethereal/spiritual force emanating from God and present in all created realities. Wilfred Lajul, in explaining Maduabuchi Dukor’s views, expresses these claims quite succinctly:

Africans believe that behind every human being or object there is a vital power or soul (1989: 369). Africans personify nature because they believe that there is a spiritual force residing in every object of nature. (2017, p. 28)

This is why African religious practices, feasts, and ceremonies cannot simply be equated with magical and idolatrous practices or fetishism. Within the hierarchy of being, the vital force expresses itself in different ways, with the forces in humans and ancestors possessing an animating and rational character – unlike those in plants (which are supposedly inanimate and without rationality) and those in animals (which possess animation without the sort of rationality found in human beings). Indeed, Deogratias Bikopo and Louis-Jacques van Bogaert opine that:

All beings are endowed with varying levels of energy. The highest levels characterise the Supreme Being (God), the ‘Strong One’; the muntu (person, intelligent being), participates in God’s force, and so do the non-human animals but to a lesser degree…Life has its origin in Ashé, power, the creative source of all that is. This power gives vitality to life and dynamism to being. Ashé is the creative word, the logos; it is: ‘A rational and spiritual principle that confers identity and destiny to humans.’…What subsists after death is the ‘self’ that was hidden behind the body during life. The process of dying is not static; it goes through progressive stages of energy loss. To be dead means to have a diminished life because of a reduced level of energy. When the level of energy falls to zero, one is completely dead. (Bikopo & van Bogaert, 2010, pp. 44-45)

If one takes the idea of a vital force seriously, then it must be admitted that the vital force forms an important part of the individual, and that it can be either diminished or augmented in several ways. Illness, suffering, depression, fatigue, disappointment, injustice, failure, or any negative occurrence contributes to the diminution of one’s vital force. Conversely, good health, certain rituals, justice, happiness, engaging positively with others, and so forth contribute to the augmentation and fortification of vital force. These ideas lead us to vitalism as a theory of meaning.

Regarding what can constitute a vital force theory of meaning, it should be kept in mind that great importance is placed on augmenting one’s vital force as opposed to diminishing it. Being of paramount importance, the vital force is something the individual should continually fortify. This is done by engaging in certain rituals and prayers and by immersing oneself in morally uplifting acts and positive, harmonious relations with one’s community and environment. Thus, meaningfulness is obtained through the continuous augmentation of one’s vital force and/or those of others via the processes outlined above. Indeed, the much-criticised Tempels alludes to this when he states that ‘Supreme happiness, the only kind of blessing, is, to the Bantu, to possess the greatest vital force: the worst misfortune and, in very truth, the only misfortune, is, he thinks, the diminution of this power’ (Tempels, 1959, p. 22). On the other hand, meaninglessness, and indeed death, would involve the inability to augment one’s life force and/or actively engaging in acts that diminish one’s vital force or those of others. This theory of meaning focuses on a transcendent goal whose mode of achievement usually involves acts that garner much esteem and admiration. Thus, by enhancing her vital force, the individual engages in something that is inherently meaningful and valuable.

Beyond this traditional view of vitalism, some scholars of African philosophy have also put forward a more naturalistic account of meaning that avoids the problems (mainly of proof) associated with theories that posit spiritual entities (see: Dzobo, 1992; Kasenene, 1994; Mkhize, 2008; Metz, 2012). On this naturalistic understanding, what is referred to as vital force is wellbeing and creative power, rather than the spiritual force of Tempels’ Bantu ontology. Meaningfulness would then involve engaging in those acts that constantly improve one’s wellbeing and exercising one’s creative power freely.

Some criticisms can be levelled against the vital force theory. The first is the more obvious denial of the existence of spiritual essences within the human body – especially since the brain and the nervous system are thought to be responsible for animating the human body and for the cognitive abilities of a human person (see: Chimakonam et al., 2019). The second criticism focuses on the naturalist account and argues that ideas about wellbeing and creative power need not bear the moniker of vitalism to make sense. Indeed, one can refer to the pursuit of wellbeing or the expression of creative genius as separate paths to meaningfulness that need not be seen as vitalist.

c. The Communal Normative Theory of Meaning

The third African theory of meaning has been termed “the communal normative function theory” (Attoe, 2020). This theory of meaning is based on one of the most widespread views in African philosophy – communalism. The idea has been discussed, in various guises, by African philosophers such as Mbiti, Khoza, Ramose, Menkiti, Asouzu, Murove, Ozumba, Chimakonam, and Metz, and with reference to several branches of African philosophy, from African metaphysics, logic, and ethics down to African socio-political philosophy. An understanding of communalism is necessary to see how this view speaks to conceptions of meaning. Communalism is founded on a metaphysics that understands various realities as missing links of an interconnected complementary whole (Asouzu, 2004). This ontology then flows down to human communities and social relationships, where the attainment of the common good and of personhood is invariably tied to how well one expresses oneself as such a missing link. Within this framework, interconnectedness – as encapsulated in ideas such as harmony, solidarity, inclusivity, welfarism, and familyhood – plays a prominent role. This is why sayings like Mbiti’s famous dictum ‘I am because we are and since we are, therefore I am’ (Mbiti, 1990, p. 106), the Ubuntu mantra “a person is a person through other persons”, or the expression “one finger cannot pick up a grain” (Khoza, 1994, p. 3) are commonplace in explanations of communalism. Scholars like Menkiti have therefore gone on to tie individual personhood to how well the individual lives communally and engages with the community.

From this understanding of communalism, a theory of meaning emerges on which meaning involves engaging harmoniously with others. By engaging positively with others, the individual seeks to acquire humanity in its most potent form, and it is by acquiring and enhancing this humanity or personhood that the individual also acquires meaning. By engaging harmoniously with others, the individual sheds petty animal desires, especially those that spring from selfishness, and instead focuses on moral/normative goals that transcend the individual and centre on communal flourishing. Thus, within this framework, the lives of individuals such as Nelson Mandela or Mother Teresa would count as meaningful because of their constant striving to ensure harmony and uplift the lives of others. While meaningfulness is gained by performing one’s communal normative function, meaninglessness would subsist in either not engaging positively with others or performing acts that breed disharmony or discord – which, in turn, leads to the loss of one’s humanity.

While it is an attractive and plausible theory of meaningfulness from the African space, the major shortcoming of the communal normative function theory is that it does not accommodate meaningful acts that are not designed for, or may not contribute to, communal upliftment. Thus, if our music enthusiast were to pursue virtuoso status without seeking to engage others with her music, that achievement would not count as a meaningful act.

d. The Love Theory of Meaning

In a paper titled “On Pursuit of the Purpose of Life: The Shona Metaphysical Perspective”, Munyaradzi Mawere understands love (which, according to him, is similar to the Greek concept of agape) as the “unconditional affection to do and promote goodness for oneself and others, even to strangers” (Mawere, 2010, p. 280).

A few things can be noted from the above. The first is that love is an emotion from which the desire to do good emanates. As an emotion, one can speculate that love is available to all human beings in the same way that capacities such as rationality, or emotions such as anger and happiness, are available to every human being. This point is easily countered by the various heinous acts that human beings have perpetrated throughout history; genocides of all kinds portray a hateful instinct rather than a loving one. One response would be that love is not the only emotion the human being is born with, hence the expression of other, unpleasant emotions. A second response would be that love is a capacity that is nurtured. Mawere points to this fact:

However, one may wonder why some human beings do not love if love is a natural gift and the sole purpose of life. It is the contention of this paper that the virtuous quality of love though natural is nurtured by free will. (Mawere, 2010, p. 281)

Mawere is vague about what he means by “free will”, and one can only speculate. The preferable route to describing how love/agape is nurtured would be to think of it in terms of the deliberate cultivation and/or expression of love. When one actively seeks to promote goodness for oneself and others, one nurtures a habit that takes advantage of our presumably innate capacity to love. By ridding oneself of the blockades to unconditional love – such as self-interest, nepotistic attitudes, and unforgiveness – the individual begins to express love in the way Mawere envisions, viz. “unconditional affection to do and promote goodness for oneself and others, even to strangers” (Mawere, 2010, p. 280).

It is from this framework that Yolanda Mlungwana (2020) draws her notion of love. Mlungwana tells us that Mawere’s theory of rudo (love) differs from Susan Wolf’s version of the love theory. According to her, “Insofar as people are the only objects of love for Mawere, his sense of ‘love’ differs from Susan Wolf’s influential account, according to which it is logically possible to love certain activities, things or ideals”. So, while the love theory of meaning is, for Mawere, people-centred or anthropocentric, the love view for Wolf is more encompassing and may feature a variety of objects that are not human. One can show love to the environment by advocating for, and trying to perpetuate, a greener planet Earth; it could also be love for abandoned animals or an endangered species. One can, of course, question the narrowness of the traditional African love theory of meaning here, and that is a valid critique. But the point is that the scope of the African love view encapsulates only human beings.

African love theorists agree that love is the sole purpose of human existence. While this might seem a strange claim (since one can think of certain acts that are prima facie meaningful – say, becoming a musical virtuoso – without necessarily being acts of love), one must first understand what the claim means before settling on any conclusions. According to Mawere, love permeates all aspects of human relations with others and society:

The Shona consider Agape as the basis of all good relations in society, and therefore as the purpose of everyone’s life. In fact, for the Shonas, all other duties of man on earth such as reproducing, sharing, promoting peace, respecting others (including the ancestors and God), among others have their roots in love. Had it not been love which is the basis of all relationships, it was impossible to promote peace, respect others. In fact, meaningful life on earth would have been impossible. (Mawere, 2010, p. 280)

This passage immediately demonstrates that the idea of love is not as one-dimensional as one might think – simply doing good to oneself and others. In our everyday lives and practices, insofar as we relate with others in some way, such relations can be rooted in love. Love, in this sense, is not solely a direct show of altruistic behaviour towards a person, community, or thing; it is manifested in many differing activities, such as judgments and reparation, business dealings, governance/leadership, teaching/learning, sportsmanship, self-improvement, and so forth. In this way, our musician, who aims to become a virtuoso (whether or not he plays for an audience), does so because he wishes to improve himself – a manifestation of self-love.

When human beings fail to nurture love and begin to manifest hate, problems arise. Mlungwana alludes to this point when she states, “In the absence of love, which is the foundation of all relationships, there is no encouragement of peace, respect, etc.” Since love is the purpose of human existence (for theorists like Mawere), an existence that shows an absence of love is simply meaningless. This meaninglessness further degenerates into anti-meaning when the individual not only fails to show love but actively pursues hate.

It is hard to fault the love view, but one point stands out: suppose one’s attempt at love causes harm to some other person; could such an act be judged as meaningful? For instance, suppose a person trains to become a brilliant special forces soldier (self-love) in order to serve his/her country (love towards his/her society). Let us further suppose that, in service to his/her country, this individual is responsible for the death and destruction of other communities. While his/her dedicated service can be seen as an act of love, the destruction of lives and communities constitutes an act of hate (at least from the viewpoint of the communities he/she has negatively affected). This problem is compounded by the fact that, for Mawere, this love must be unconditional. One response to this problem would be that meaningfulness is not as objective a value as one might like to think. By extension, since meaningfulness is subjective and love/hate is context-dependent, the individual’s life is both meaningful and meaningless, depending on the context involved. While this point might seem problematic within a two-valued logical system, such a view is well captured by the trivalent logical systems dominant in African philosophy, such as Chimakonam’s Ezumezu logic.

e. The Yoruba Cluster Theory of Meaning

Another theory of meaning emanating from African philosophy is what may be described as the “Yoruba Cluster View” (YCV). It is so called because the cultural values that cluster to form this particular view emanate from the dominant views found in traditional Yoruba thought. This view was first systematically articulated by Oladele Balogun (2020) and, to some extent, Benjamin Olujohungbe (2020).

The Yoruba conception of meaningfulness is anchored on what Balogun refers to as holism. This holism represents a theory of meaningfulness that is not based on single, isolated paths to meaningfulness (call that a monism) but is instead based on a conglomeration of different, complementary, “interwoven and harmonious accounts of a meaningful life considered as a whole and not in isolation” (Balogun, 2020, p. 171). Metz had made a similar, pluralist move when defining the concept “meaning”, but Balogun’s pluralism (or holism, as he prefers) differs in that these paths to meaningfulness are, for Balogun, necessarily dependent on each other; he is quick to conclude that “isolating one condition from the other alters the constitutive whole” (Balogun, 2020, p. 171).

These interwoven paths to meaningfulness are drawn from a series of normative values referred to as “life goods”. These life goods mainly encompass certain normative values that reflect the spiritual, social, ethical, and epistemological experiences of the Yoruba. According to Balogun:

The term “life goods” refers to material comfort symbolised with monetary possession, a long healthy life, children, a peaceful spouse and victory over the vicissitudes of existence. The fulfilment of such “life goods” at any stage of human existence is accompanied by the remarks “X has lived a meaningful life” or “X is living a meaningful life”, where X is a relational agent in a social network. The “life goods”, though materialistic and humanistic, are factual goals providing reasons for how the Yorùbá ought to act in daily life. To the extent that such “life goods” ground and guide human actions, and humans are urged to strive towards them in deeds and acts; they are normative prescriptions in the Yorùbá cultural milieu. (Balogun, 2020, p. 171)

These life goods are positive values, and in desiring to acquire this cluster of values, the individual ipso facto desires meaningfulness. What this means is that acquiring meaningfulness involves a subjective pursuit of these seemingly objective normative values. On the other hand, acquiring only one of these values would not, on this view, translate to living a meaningful life, since the values are thought of as means (not ends in themselves) to meaningfulness, and it is the acquisition of all the values in the cluster that gives life meaning.

Drawing from William Bascom, Balogun identifies this cluster of values, ranked in order of importance, as: long life or “not death” (aiku), money (aje, owo), marriage or wives (aya, iyawo), children (omo), and victory (isegun) over life’s vicissitudes. These values can serve as a yardstick by which one can measure whether a life is meaningful. Furthermore, the judgment about whether a life is meaningful is not merely tied to subjective valuing; it is also subject to external valuing. In this way, even when one is dead, that individual’s life can still be adjudged meaningful by those external to the individual (that is, other living persons). What this means is that the view that death undercuts meaningfulness in some way does not hold for friends of the Yoruba cluster view. This is because death is merely a transition to more life, either as an ancestor or as another person via reincarnation, and because, even in death, individuals external to the deceased can still make judgments about the meaning of his/her life.

The firm belief in life after death also allows ancestors to attempt to find meaningfulness themselves, since they are very much alive. It is safe to assume that the cluster view does not apply in this particular instance, since aje, aya/iyawo and omo are not achievable goals for ancestors. However, ancestors are expected to intervene in the lives of the living by “[protecting] the clan and sanctioning of moral norms among the living” (Balogun, 2020, p. 172).

The cluster view fully expresses itself as a theory of meaning when one realises that the values that make up the cluster intertwine with, and complement, each other. At the base is aiku, which undergirds any claim to meaningfulness. This is because a short-lived life is essentially deprived of the opportunity to pursue those means that allow for a meaningful life. Menkiti’s suggestion that children cannot attain personhood is instructive here. So, taking good care of one’s health and living a long life provides the individual with the time to achieve the other values believed to constitute a meaningful life. Aje/owo offers the sort of material comfort that enables the individual to lead a comfortable life and to take care of others. It is by possessing aje that the individual acquires the financial capacity to marry a wife/wives and take care of children. Without an iyawo, on the other hand, children are not possible, except if one decides to bear children outside wedlock (which is frowned upon). As Balogun puts it, “A life without a marital relationship and children is culturally held among the Yorùbá as meaningless. Given the pro-communitarian attitude of the Yorùbá, procreation within a network of peaceful spousal relationships is considered necessary for expanding the family lineage and clan.” All this, combined with isegun, comes together to form a life that can be looked at and branded as meaningful.

Balogun goes further to augment these cluster values with morality. For the Yoruba, according to Balogun, “a meaningful life is a moral life. Within the Yorùbá cultural milieu, there is no clear demarcation between living a meaningful life and a moral life, for both are associated with each other” (Balogun, 2020, p. 173). Olujohungbe points to this fact when he concludes that a life filled with quality (one can say, a life that has achieved the cluster values Balogun alludes to) must also be a morally good life:

A virtuous character thus trumps all other values such as long life, health, wealth, children and those other attributes making up the purported elements of well-being. In this connection, a distinction is thus made between quantity and quality of life. For a vicious person who dies at a “ripe” old age leaving behind children, wealth and other material resources, society often (though clandestinely) says akutunku e l’ona orun – which literally means “may you die severally on your way to heaven” and actually implies good riddance to bad rubbish. (Olujohungbe, 2020, p. 225)

So, when one directs the cluster values towards positively engaging with others, such an individual is living a meaningful life. Thus, when one purposes to alleviate the poverty of others with the aje/wealth that s/he has acquired, s/he is living a meaningful life. If one procreates and guides one’s children into becoming good members of society, that individual is living a meaningful life. The same applies to all the other cluster values, used in tandem with each other.

f. Personhood and a Meaningful Life

Flowing from the views of scholars like Ifeanyi Menkiti and Kwame Gyekye, and systematised into a theory of meaningfulness by Motsamai Molefe, is the idea that attaining personhood can lead to a meaningful life. Personhood, in African philosophical thought, is tied to more than mere existence. In other words, merely existing as a human being is not a sufficient condition for personhood. One must exhibit personhood in one’s relations with others to be called a person.

There has been some debate in African philosophical thought, mainly between Menkiti and Gyekye, about the status of babies and young children with regard to personhood. Menkiti takes the more radical stand that children possess no personhood and so cannot be persons. Gyekye, on the other hand, supposes that children are born with some level of personhood and that this personhood, being in its nascent form, can be augmented by one’s level of normative function. What they both agree on, however, is that personhood is achievable and that one can strive to attain the highest form of personhood through positive relationships with the people in one’s community and with one’s culture.

Based on this general framework, Motsamai Molefe provides a systematic account of meaning grounded in the African idea of personhood. For him, meaningfulness begins when the individual develops those capacities and virtues that allow him/her to become the best of his/her kind. Life becomes meaningful with the acquisition of these virtues and “the conversion of these raw capacities to be bearers of moral excellence” (Molefe, 2020, p. 202).

So, because these virtues are bearers of moral excellence, Molefe further informs us that a meaningful life would, according to this theory, be construed in terms of the agent achieving the moral end of moral perfection or excellence. Moral excellence is not automatic; as Menkiti opines, unlike other types of geniuses, moral geniuses (those who have acquired moral excellence) only acquire that status after a long period of time. Indeed, the passage of time only serves to enhance one’s moral experience.

Molefe also ties his idea of personhood and theory of meaning to dignity. Echoing Gyekye’s ideas, Molefe asserts that every individual has the capacity to be virtuous and that every individual ought to build that capacity to a reasonable level. In his words:

Remember, on the ethics of personhood, we have status or intrinsic dignity because we possess the capacity for moral virtue. The agent’s development of the capacity for virtue translates to moral perfection, which we can also think of in terms of a dignified existence. This kind of dignity is the one that we achieve relative to our efforts to attain moral perfection – achievement dignity. As such, a meaningful life is a function of a dignified human existence qua the development of the distinctive human capacity for virtue. I also emphasise that the agent is not required to live the best possible human life. The requirement is that the agent ought to reach satisfactory levels of moral excellence for her life to count as meaningful. (2020, p. 202)

The requirement of satisfactory levels of moral excellence does not preclude the individual from aiming to live the best possible life; rather, it stands as a bare minimum for one’s life to be considered meaningful. This allows us to think about meaningfulness in terms of degrees. In this way, if I live a meaningful life by sufficiently exhibiting some moral virtues, my life would be meaningful, but not to the degree that one may consider Nelson Mandela’s life meaningful.

g. The Conversational Theory

According to Chimakonam (2021), Conversationalism is a theory of meaning-making that strives to improve two main significists – the nwa-nsa and the nwa-nju – through a process of intellectual/creative struggle (a process that conversationalists call “arumaristics”), anchored by the construction, deconstruction, and reconstruction of seemingly contrary viewpoints. While Conversationalism is focused on conceptual forms of meaning-making, its implications for life are also apparent.

Meaning-making is a matter of conversations within one’s self and between one’s self and the objective values of the various contexts that s/he encounters in life. Within one’s self, meaning lies in self-improvement, achieved through the interrogation of one’s okwu (values, viewpoints, prejudices, and so forth), as Chimakonam (2019) calls it. It is not just that this okwu, which forms the content of the individual’s life, is improved; the ability to ask new questions in a life-long dialogue is also improved. By questioning himself/herself and finding answers to those questions, the individual improves his/her okwu, each time at a higher level of sophistication. This positive augmentation of one’s okwu is precisely what makes life meaningful for the conversationalist, at least from a subjective point of view.

But the individual also exists within a community, and, for the conversationalists, the ideas often called objective are mainly the intellectual contributions of subjective individuals who belong to a particular communal context. What, then, counts as objective meaning? Objective meaning, in Conversationalism, would involve the individual’s ability to either imbibe (as the nwa-nsa) or interrogate (as the nwa-nju) the ideas, actions or values that a communal context considers worthwhile. By imbibing those values or performing those actions that one’s communal context considers valuable, the individual pursues ends that are worthy for their own sake, merits esteem and admiration, and transcends his/her animal nature – s/he identifies with ends that are beyond him/her. By assuming the role of nwa-nju (or questioner), s/he makes his/her life meaningful by enabling the improvement of the very values on which this communal context relies as purveyors of meaningfulness. In this way, the individual’s life becomes meaningful through becoming a meaning-maker or meaning-curator.

3. Conclusion

What has been presented above are seven plausible theories of life’s meaning that can be gleaned from traditional African philosophical thought. While these seven accounts of meaning may not account for all the possible theories of meaning that can be hewn from the African worldview, they are a good start, one that invites contributions and critical engagement from philosophers interested in this subject matter. How attractive are these theories of meaningfulness? Are there contemporary African alternatives to the more traditional views of meaningfulness? Are there more pessimistic accounts within the corpus of African thought that embrace a more nihilistic approach to meaningfulness?

4. References and Further Reading

  • Agada, A., 2020. The African vital force theory of meaning in life. South African Journal of Philosophy, 39(2), pp. 100-112.
    • The article articulates and discusses the African vital force theory.
  • Asouzu, I., 2004. Methods and Principles of Complementary Reflection in and Beyond African Philosophy. Calabar: University of Calabar Press.
    • In this book, Innocent Asouzu develops his idea of complementary reflection as a full-fledged philosophical system.
  • Asouzu, I., 2007. Ibuanyidanda: New Complementary Ontology Beyond World Immanentism, Ethnocentric Reduction and Impositions. Zurich: LIT VERLAG GmbH.
    • This book exposes some of the problems in Aristotelian metaphysics and builds a new complementary ontology.
  • Attoe, A., 2020. Guest Editor’s Introduction: African Perspectives to the question of Life’s Meaning. South African Journal of Philosophy, 39(2), pp. 93-99.
    • This article offers an introductory overview to the discussions about life’s meaning from an African perspective.
  • Attoe, A., 2020. A Systematic Account of African Conceptions of the Meaning of/in Life. South African Journal of Philosophy, 39(2), pp. 127-139.
    • This article curates from available clues, three African conceptions of meaning, namely, the African God’s purpose theory, the vital force theory and the communal normative function theory.
  • Attoe, A. & Chimakonam, J., 2020. The Covid-19 Pandemic and Meaning in life. Phronimon, 21, pp. 1-12.
    • This article considers the impact of the COVID-19 pandemic on the question of life’s meaning.
  • Balogun, O., 2007. The Concepts of Ori and Human Destiny in Traditional Yoruba Thought: A Soft-Deterministic Interpretation. Nordic Journal of African Studies, 16(1), pp. 116-130.
    • This article looks at the concept of destiny in traditional Yoruba thought.
  • Balogun, O., 2020. The Traditional Yoruba Conception of a Meaningful Life. South African Journal of Philosophy, 39(2), pp. 166-178.
    • This article examines the account of what makes life meaningful in traditional Yoruba thought.
  • Bikopo, D. & van Bogaert, L.-J., 2010. Reflection on Euthanasia: Western and African Ntomba Perspectives on the Death of a King. Developing World Bioethics, 10(1), pp. 42-48.
    • The focus of this article is the Ntomba belief about vitality and ritual euthanasia and its implications for the idea of euthanasia in African thought.
  • Chimakonam, J., Uti, E., Segun, S. & Attoe, A., 2019. New Conversations on the Problems of Identity, Consciousness and Mind. Cham: Springer Nature.
    • This book attempts to provide answers to the age-old problem of identity, mind-body problem, qualia, and so forth.
  • Chimakonam, J., 2019. Ezumezu: A System of Logic for African Philosophy and Studies. Cham: Springer Nature.
    • This book is a novel attempt at curating and systematising African logic.
  • Chimakonam, J., 2021. On the System of Conversational Thinking: An Overview. Arumaruka: Journal of Conversational Thinking, 1(1), pp. 1-46.
    • This article presents a survey of the concept of conversationalism.
  • Descartes, R., 1641. Meditations on first philosophy. Cambridge: Cambridge University Press (1996).
    • This famous book discusses issues like the existence of God, the existence of the soul/self, and so forth.
  • Dzobo, N., 1992. Values in a Changing Society: Man, Ancestors and God. In: K. Wiredu & K. Gyekye, eds. Person and Community: Ghanaian Philosophical Studies. Washington: Center for Research in Values and Philosophy, pp. 223-240.
    • In this chapter, Noah Dzobo discusses some African values, the sanctity/value of life, ancestorhood and the idea of vital force.
  • Gbadegesin, S., 2004. Towards A Theory of Destiny. In: K. Wiredu, ed. A Companion to African Philosophy. Oxford: Blackwell Publishing, pp. 313 – 323.
    • This article provides an in-depth discussion of the idea of destiny in African (Yoruba) thought.
  • Gyekye, K., 1992. Person and Community in Akan Thought. In: K. Wiredu & K. Gyekye, eds. Person and Community. Washington D.C.: The Council for Research in Values and Philosophy, pp. 101-122.
    • Gyekye, in this chapter, challenges Ifeanyi Menkiti’s radical communitarianism and discusses the idea of moderate communitarianism from the Akan perspective.
  • Idowu, W., 2005. Law, Morality and the African Cultural Heritage: The Jurisprudential Significance of the Ogboni Institution. Nordic Journal of African Studies, 14(2), pp. 175-192.
    • This paper examines the nature of the concepts of law and morality from the Yoruba (Ogboni group) perspective.
  • Ijiomah, C., 2014. Harmonious Monism: A philosophical Logic of Explanation for Ontological Issues in Supernaturalism in African Thought. Calabar: Jochrisam Publishers.
    • This book provides the first real look into the logic and ontology of harmonious monism.
  • Iroegbu, P., 1995. Metaphysics: The Kpim of Philosophy. Owerri: International University Press.
    • This book provides an overview of metaphysics, and introduces its own novel uwa ontology, which is based on the African view.
  • Khoza, R., 1994. Ubuntu, African Humanism. Diepkloof: Ekhaya Promotions.
    • This book provides a critical exposition of the Southern African notion of Ubuntu.
  • Lajul, W., 2017. African Metaphysics: Traditional and Modern Discussions. In: I. Ukpokolo, ed. Themes, Issues and Problems in African Philosophy. Cham: Palgrave Macmillian, pp. 19-48.
    • This chapter provides a brief survey of some of the issues discussed in African Metaphysics.
  • Mawere, M., 2010. On Pursuit of the Purpose of Life: The Shona Metaphysical Perspective. The Journal of Pan African Studies, 3(6), pp. 269-284.
    • This article seeks to establish the idea of “love” as the purpose of human existence.
  • Mbiti, J., 1990. African Religion and Philosophy. London: Heinemann.
    • This famous book provides an overview of some of the religious and philosophical beliefs of some societies in Africa.
  • Mbiti, J., 2012. Concepts of God in Africa. Nairobi: Acton Press.
    • This book provides an overview of the various ideas about God in some African societies.
  • Mbiti, J., 2015. Introduction to African Religion. 2nd ed. Illinois: Waveland Press.
    • This book introduces readers to traditional African religious philosophy.
  • Menkiti, I., 2004a. On the Normative Conception of a Person. In: K. Wiredu, ed. A Companion to African Philosophy. Oxford: Blackwell Publishing.
    • This chapter outlines Menkiti’s radical views about personhood in African thought.
  • Menkiti, I., 2004b. Physical and Metaphysical Understanding: Nature, Agency, and Causation in African Traditional Thought. In: L. Brown, ed. African Philosophy: New and Traditional Perspectives. Oxford: Oxford University Press, pp. 107-135.
    • This article focuses on the idea of causation in African metaphysics.
  • Metz, T., 2007. God’s Purpose as Irrelevant to Life’s Meaning: Reply to Affolter. Religious Studies, Volume 43, pp. 457-464.
    • In this article, Metz responds to Jacob Affolter’s claim that an extensionless God could ground/grant the type of purpose that makes life meaningful.
  • Metz, T., 2012. African Conceptions of Human Dignity: Vitality and Community as the Ground of Human Rights. Human Rights Review, 13(1), pp. 19-37.
    • In this article, Metz argues for a more naturalistic account of vitality, based on creativity and wellbeing, that could ground human dignity.
  • Metz, T., 2013a. Meaning in Life. Oxford: Oxford University Press.
    • In this book, Metz provides an analytic discussion of the question of meaning in life.
  • Metz, T., 2017. Towards an African Moral Theory (Revised Edition). In: I. Ukpokolo, ed. Themes, Issues and Problems in African Philosophy. Cham: Palgrave Macmillian, pp. 97-119.
    • In this article, Metz provides a slightly revised version of an earlier article (with the same name), outlining his account of African moral theory.
  • Metz, T., 2019. God, Soul and the Meaning of Life. Cambridge: Cambridge University Press.
    • This short book provides an overview of supernaturalistic accounts of life’s meaning.
  • Metz, T., 2020. African Theories of Meaning in Life: A Critical Assessment. South African Journal of Philosophy, 39(2), pp. 113-126.
    • In this article, Metz discusses vitalist and communalistic accounts of meaning.
  • Mlungwana, Yolanda., 2020. An African Approach to the Meaning of Life. South African Journal of Philosophy, 39(2), pp. 153-165.
    • In this article, Yolanda Mlungwana provides an examination of some African accounts of meaning such as the life, love and destiny theories of meaning.
  • Mlungwana, Yoliswa., 2020. An African Response to Absurdism. South African Journal of Philosophy, 39(2), pp. 140-152.
    • In this article, Yoliswa Mlungwana revisits Albert Camus’ absurdism in the light of African religions and philosophy.
  • Molefe, M., 2020. Personhood and a Meaningful Life in African Philosophy. South African Journal of Philosophy, 39(2), pp. 194-207.
    • This article provides an account of meaning that is based on African views on personhood.
  • Mulgan, T., 2015. Purpose in the Universe: The Moral and Metaphysical Case for Ananthropocentric Purposivism. Oxford: Oxford University Press.
    • The book argues for a cosmic purpose, but one for which human beings are irrelevant.
  • Murove, M., 2007. The Shona Ethic of Ukama with Reference to the Immortality of Values. The Mankind Quarterly, Volume XLVIII, pp. 179-189.
    • This article examines the Shona relational ethics of Ukama.
  • Nagel, T., 1979. Mortal Questions. Cambridge: Cambridge University Press.
    • The book explores issues related to the question of life’s meaning, nature, and so forth.
  • Nagel, T., 1987. What Does It All Mean? A Very Short Introduction to Philosophy. Oxford: Oxford University Press.
    • This book discusses some of the central problems in Western philosophy.
  • Nalwamba, K. & Buitendag, J., 2017. Vital Force as a Triangulated Concept of Nature and s(S)pirit. HTS Teologiese Studies/Theological Studies, 73(3), pp. 1-10.
    • This article examines the concept of vitality in African thought.
  • Okolie, C., 2020. Living as a Person until Death: An African Ethical Perspective on Meaning in Life. South African Journal of Philosophy, 39(2), pp. 208-218.
    • This article discusses attaining personhood as a possible route to meaningfulness.
  • Olujohungbe, B., 2020. Situational Ambivalence of the Meaning of Life in Yorùbá Thought. South African Journal of Philosophy, 39(2), pp. 219-227.
    • This article provides an account of Yoruba conception of meaning.
  • Ozumba, G. & Chimakonam, J., 2014. Njikoka Amaka Further Discussions on the Philosophy of Integrative Humanism: A Contribution to African and Intercultural Philosophy. Calabar: 3rd Logic Option.
    • This book introduces the idea of Njikoka Amaka or integrative humanism.
  • Poettcker, J., 2015. Defending the Purpose Theory of Meaning in Life. Journal of Philosophy of Life, 5, pp. 180-207.
    • The article provides an interesting defence of the purpose theory.
  • Ramose, M., 1999. African Philosophy through Ubuntu. Harare: Mond Books.
    • In this book, Mogobe Ramose provides a systematic account of Ubuntu and the ontology that undergirds it.
  • Taylor, R., 1970. The Meaning of Life. In: R. Taylor, ed. Good and Evil: A New Direction. New York: Macmillian, pp. 319-334.
    • This chapter is an honest discussion on the reality of meaninglessness in relation to the question of life’s meaning.
  • Tempels, P., 1959. Bantu Philosophy. Paris: Presence Africaine.
    • This book provides a Westerner’s account of African views about vitality.


Author Information

Aribiah David Attoe
Email: aribiahdavidattoe@gmail.com
University of Fort Hare
South Africa

Propositional Attitudes

Sentences such as “Galileo believes that the earth moves” and “Pia hopes that it will rain” are used to report what philosophers, psychologists, and other cognitive scientists call propositional attitudes—for example, the belief that the earth moves and the hope that it will rain. Just what propositional attitudes are is a matter of controversy. In fact, there is some controversy as to whether there are any propositional attitudes. But it is at least widely accepted that there are propositional attitudes, that they are mental phenomena of some kind, and that they figure centrally in our everyday practice of explaining, predicting, and rationalizing one another and ourselves.

For example, if you believe that Jay desires to avoid Sally and has just heard that she will be at the party this evening, you may infer that he has formed the belief that she will be at the party and so will act in light of this belief so as to satisfy his desire to avoid Sally. That is, you will predict that he will not attend the party. Similarly, if I believe that you have these beliefs and that you wish to keep tabs on Jay’s whereabouts, I may predict that you will have made the prediction that he will not attend the party. We effortlessly engage in this sort of reasoning, and we do it all the time.

If we take our social practices at face value, it is difficult to overstate the importance of the attitudes. It would seem that, without the attitudes and our capacity to recognize and ascribe them, as Daniel Dennett colorfully puts it, “we could have no interpersonal projects or relations at all; human activity would be just so much Brownian motion; we would be baffling ciphers to each other and to ourselves—we could not even conceptualize our own flailings”.

In fact, if we follow this line of thought, it seems right to say that we would not even be baffled. Nor would we have selves to call our own. So central, it seems, are the attitudes to our self-conception and so effortlessly do we recognize and ascribe them that one could be forgiven for not realizing that there are any philosophical issues here. Still, there are many. They concern not just the propositional attitudes themselves but, relatedly: propositional attitude reports, propositions, folk psychology, and the place of the propositional attitudes in the cognitive sciences. Although the main focus of this article is the propositional attitudes themselves, these other topics must also be addressed.

The article is organized as follows. Section 1 provides a general characterization of the propositional attitudes. Section 2 describes three influential views of the propositional attitudes. Section 3 describes the primary method deployed in theorizing about the propositional attitudes. Section 4 describes several views of the nature of folk psychology and the question of whether there are in fact any propositional attitudes. Section 5 briefly surveys work on a range of particular mental phenomena traditionally classified as propositional attitudes that might raise difficulties for the general characterization of propositional attitudes provided in Section 1.

Table of Contents

  1. General Characterization of the Propositional Attitudes
    1. Intentionality and Direction of Fit
    2. Conscious and Unconscious Attitudes
    3. Reasons and Causes
  2. Three Influential Views of the Propositional Attitudes
    1. The Classical View
    2. Dispositionalism
    3. Computational-Representationalism
  3. Propositions, Propositional Attitude Reports, and the Method of Truth in Metaphysics
    1. Reading Off the Metaphysics of Propositional Attitudes
    2. Reading Off the Metaphysics of Propositions
    3. A Challenge to the Received View of the Logical Form of Attitude Reports
  4. Folk Psychology and the Realism/Eliminativism Debate
    1. Folk Psychology as a Theory (Theory-Theory)
    2. Realism vs. Eliminativism
    3. Alternatives to Theory-Theory
  5. More on Particular Propositional Attitudes, or on Related Phenomena
    1. Imagining
    2. Judging
    3. Knowing
    4. Perceiving
    5. Intending
    6. Non-Propositional Attitudes
    7. Delusions
    8. Implicit Bias
  6. References and Further Reading

1. General Characterization of the Propositional Attitudes

This section provides a general characterization of the propositional attitudes.

a. Intentionality and Direction of Fit

The propositional attitudes are often thought to include not only believing, hoping, desiring, predicting, and wishing, but also fearing, loving, suspecting, expecting, and many other attitudes besides. For example: fearing that you will die alone, loving that your favorite director has a new movie coming out, suspecting that foreign powers are meddling in the election, and expecting that another recession is on the horizon. Generally, these and the rest of the propositional attitudes are thought to divide into two broad camps: the belief-like and the desire-like, or the cognitive ones and the conative ones. Among the cognitive ones are included believing, suspecting, and expecting; among the conative ones are included desiring, wishing, and hoping.

It is common to distinguish these two camps by their direction of fit, whether mind-to-world or world-to-mind (Anscombe 1957[1963], Searle 1983, 2001, Humberstone 1992). If an attitude has a mind-to-world direction of fit, it is supposed to fit or conform to the world; whereas if it has a world-to-mind direction of fit, it is the world that is supposed to conform to the attitude. The distinction can be put in terms of truth conditions or satisfaction conditions. A belief is true if and only if the world is the way it is believed to be, and is otherwise false; a desire is satisfied if and only if the world comes to be the way it is desired to be, and is otherwise unsatisfied. (In this respect, beliefs are akin to assertions or declarative sentences and desires to commands or imperative sentences.) In both cases, the attitude is in some sense directed at the world.

Accordingly, the propositional attitudes are said to be intentional states, that is, mental states which are directed at or about something. Take belief. If you believe that the earth moves, you have a belief about the earth, to the effect that it moves. More generally, if you believe that a is F, you have a belief about a, to the effect that it is F. The state of affairs a being F might also be construed as what one’s belief is about. That a is F is then the content of one’s belief, to wit, the proposition that a is F. One’s belief is true if and only if a is indeed F, that is, if the state of affairs a being F obtains. On common usage, an obtaining state of affairs is a fact.

(On various deflationary theories of truth, either there cannot be or there is no need for a substantive theory of facts. However, all that is required here is a very thin sense of fact. To admit the existence of facts, in this relevant thin sense, one needs only to accept that which propositions are true depends on what the world is like. To be sure, the recognition of true modal, mathematical, and moral claims, among others, raises many vexing questions for any attempt to provide a substantive theory of facts; but we can set these aside. No particular metaphysics of facts is here required. For further discussion, see the articles on truth, truthmaker theory, and the prosentential theory of truth.)

Of course, not every proposition is about some particular object. Some are instead general: for example, the proposition that whales are mammals, or the proposition that at least one person is mortal. All the same, there are some conditions that must obtain if these general propositions are to be true. If these conditions do not obtain, the propositions are false. If one believes that whales are mammals, one’s belief is true if and only if it is a fact that whales are mammals; and if one believes that at least one person is mortal, one’s belief is true if and only if it is a fact that at least one person is mortal.

According to many theorists, it is constitutive of belief that one intends to form true beliefs and does not hold a belief unless one takes it to be true (Shah and Velleman 2005). That is, on such a view, if one’s mental state is not, so to speak, governed or regulated by the norm of truth, it is not a belief state. Indeed, the idea is sometimes put in explicitly normative terms: one ought to form true beliefs, and one ought not to hold a belief unless it is true (Wedgwood 2002). It is thus sometimes said that belief aims at truth.

Desire is often said to work similarly. If you desire, for example, that you be recognized by your teammates for your contributions to the team, this desire will go unsatisfied unless and until the world becomes such that you are recognized by your teammates for your contributions to the team. Often, if not always, one desires what one perceives or believes to be good. Thus, it is sometimes said that while belief aims at what is (believed to be) true, desire aims at what is (perceived or believed to be) good. In one form or another, this view has been held by major figures like Plato, Aristotle, and Immanuel Kant.

b. Conscious and Unconscious Attitudes

Famously, Franz Brentano characterized intentionality as the “mark of the mental,” that is, as a necessary and sufficient condition for mentality. On some views, all intentional states are propositional attitudes (Crane 2001). Putting these two views together, it would follow that all mental states are propositional attitudes (Sterelny 1990). Other philosophers hold that there are intentional states that are not propositional attitudes (see Section 5). On still other views, there are non-intentional, qualitative mental states. Candidates include sensations, bodily feels, moods, emotions, and so forth. What is distinctive of these latter mental states is that there is something it is like to be in them, a property widely considered as characteristic of phenomenally conscious states (Nagel 1974). Most theorists have written as if the propositional attitudes do not have such qualitative properties. But others claim that the attitudes, when conscious, have a qualitative character, or a phenomenology—to wit, a cognitive phenomenology.

Some theorists have claimed that there is a constitutive connection between consciousness and mentality: mental states must be at least potentially conscious (Searle 1992). Other theorists, including those working in computational psychology, allow that some mental states might never be conscious. For example, the sequences of mental states involved in processing visual information or producing sentences may not be consciously accessible, even if the end products are. Perhaps, similarly, some propositional attitudes (possessed by some subject) are—and will always be—unconscious.

For example, if linguistic competence requires knowledge of the grammar of the language in question, and knowledge is (at least) true belief, then linguistic competence involves certain propositional attitudes. (This conditional is controversial, but it will still serve as an illustration. See Chomsky 1980.) Manifestly, however, being linguistically competent does not require conscious knowledge of the grammar of one’s language; otherwise, linguistics would be much easier than it is. Consider, for another example, the kind of desires postulated by Freudian psychoanalysis. Suppose, at least for the sake of argument, that this theory is approximately correct. Then, some attitudes might never be conscious without the help of a therapist.

Other attitudes might sometimes be conscious, other times not. For example, for some period of months before acting on your desire, you might desire to propose to your partner. You have the desire during this time, even if you are not always conscious of it. Similarly, you might for most of your life believe that the thrice-great grandson of Georg Wilhelm Friedrich Hegel was born in Louisville, KY, even if the circumstances in which this belief plays any role in your mental life are few and far between.

With these observations in mind, some theorists distinguish between standing or offline attitudes and occurrent or online attitudes. When, for example, your desire to propose to your partner and your belief that now is a good time to do it conspire in your decision to propose to your partner now, they are both online. Occurrent or online attitudes might often be conscious, but not always. Sometimes others are in a better position to recognize your own attitudes than you are. This seems especially likely to be the case when it would embarrass or pain us to realize what attitudes we have.

c. Reasons and Causes

Talk of combinations of beliefs and desires leading to action or behavior might suggest that propositional attitudes are causes of behavior; and this is, in fact, the dominant view in the philosophy of mind and action at the beginning of the 21st century. One common way of construing the notion of online attitudes is in causal terms: to say that an attitude is online is to say that it is constraining, controlling, directing, or in some other way exerting a causal influence on one’s behavior and other mental states. But standing attitudes might also be construed as causes of a kind. For example, Fred Dretske (1988, 1989, 1993) speaks of attitudes generally as “structuring causes”, in contrast to “triggering causes”. I might, for example, have a desire, presumably innately wired into me, to quench my thirst when thirsty. This, alongside a belief about how to go about quenching my thirst, might serve as a structuring cause which, when thirsty (a triggering cause), causes me to go about quenching my thirst.

Of course, sometimes when I am thirsty, have a desire to quench my thirst, and have a belief about how I might go about doing that, I remain fixed to the couch. In this case, barring some physical impediment, it is likely that I have some other desire stronger than the desire to quench my thirst. Just how much influence an attitude has on one’s behavior and other mental states may vary with its strength. Someone might, for another example, desire to lose weight, but not as much as they desire to eat ice cream. In this case, when presented with the opportunity to eat ice cream, they will, all else being equal, be more likely to engage in ice-cream eating than not. If we have information about the relative strengths of their desires, our predictions will reflect this. If our predictions prove correct, we have reason to think that we have in hand an explanation of their behavior—to wit, a causal explanation.
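As a crude illustration of how relative desire strength might enter into prediction, one could model competing desires numerically. The function and the weights below are entirely hypothetical, offered only to make the comparative structure vivid:

```python
# Hypothetical toy model: the predicted behavior is the one favored by
# the strongest desire, all else being equal. The numeric strengths are
# arbitrary illustrations, not a serious psychological measure.

def predicted_action(desire_strengths: dict) -> str:
    """Return the action associated with the strongest desire."""
    return max(desire_strengths, key=desire_strengths.get)

# The desire to eat ice cream outweighs the desire to lose weight, so,
# presented with the opportunity, eating ice cream is predicted.
strengths = {"eat ice cream": 0.8, "forgo ice cream to lose weight": 0.5}
print(predicted_action(strengths))  # eat ice cream
```

If such predictions prove correct, the causalist takes that as evidence that the ascribed desires, with their relative strengths, are causally responsible for the behavior.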

At least, this is how a causalist will put it. But belief-desire combinations are also said to constitute reasons for action, and on some views, reasons are not causes. Suppose, for example, that Mr. Xi starts to go to the fridge for a drink. We ask why. The reply: Because I am thirsty and there is just the fix in the fridge. It is generally agreed that what Xi supplies is a reason for doing what was done. He cites a desire to quench his thirst and a belief about how he might go about doing that. The causalist claims that this reason is also the cause of the behavior. But the anti-causalists deny that this reason is a cause, and for this reason they deny that rationalization is a species of causal explanation. Instead, rationalizations are for making sense of or justifying behaviors. Although this view was widely held in the first half of the 20th century, largely under the influence of Ludwig Wittgenstein and Elizabeth Anscombe, the dominant view at the beginning of the 21st century—owing largely to Donald Davidson—is that rationalizations are a species of causal explanation. Where one sides in this dispute may depend in part on the position one takes on folk psychology, which supplies the framework for rationalizations. We return to this in Section 4.

2. Three Influential Views of the Propositional Attitudes

This section describes three influential views of the propositional attitudes.

a. The Classical View

Gottlob Frege and Bertrand Russell did the most to put the propositional attitudes on the map in analytic philosophy. In fact, Russell is often credited with coining the term, and they both articulate what we might call the Classical View of propositional attitudes (although Russell, whose views on these matters changed many times, does not everywhere endorse it). (See, for example, Frege 1892 [1997], 1918 [1997], Russell 1903.) On this view, attitudes are mental states in which a subject is related to a proposition. They are, therefore, psychological relations between subjects and propositions. It follows from this that propositions are objects of attitudes, that is, they are what one believes, desires, and so on.

Propositions are also the contents of one’s attitudes. For example, when Galileo asserts that the earth moves, he expresses a belief (assuming, of course, that he is sincere, understands what he says, and so forth), namely the belief that the earth moves. What Galileo believes is that the earth moves. So, it is said that the content of Galileo’s belief, which may be true or false, is that the earth moves. This is precisely the proposition that the earth moves, reference to which we secure not only with the expression “the proposition that the earth moves” but also with “that the earth moves”.

Propositions, on this view, are the primary truth-bearers. If a sentence or belief is true, this is because the sentence expresses (or is used to express) a true proposition or because the belief has as its object and content a true proposition. It is in virtue of being related to a proposition that one’s belief can be true or false, as the case may be. As they may be true or false, propositions have truth conditions. They specify the conditions in which they are true, that is, the conditions or states of affairs that must obtain if the proposition is to be true—or, to put it in still another way, what the facts must be.

Propositions have their truth conditions absolutely, in the sense that their truth conditions are not relativized to the sentences used to express them. In using the sentence “the earth moves” to assert that the earth moves, we express the proposition that the earth moves and thus our belief that the earth moves. Galileo expressed this belief, too. Thus, we and Galileo believe the same thing and may therefore be said to have the same belief. Presumably, however, Galileo did not express his belief in English. He might have instead used the Italian sentence, “La terra si muove”. There are, of course, indefinitely many sentences, both within and across languages, that may be used to express one and the same proposition. (It might be said that sentences are inter-translatable if and only if they express the same proposition.) No matter which sentence is used to express a proposition, its truth conditions remain the same.

Propositions also have their truth conditions essentially, in the sense that they have them necessarily. Necessarily, the proposition that the earth moves (which can, again, be expressed with indefinitely many sentences) is true if and only if the earth moves. The proposition does not have these truth conditions contingently or accidentally; it is not the case that it might have been true if and only if, say, snow is white. (On some views, the sentence “the earth moves” might have expressed the proposition that snow is white, or some other proposition; but that is a different matter.) In the language of possible worlds, often employed in the discussion of modal notions like necessity and possibility: there is no possible world in which the proposition that the earth moves has truth conditions other than those it has in the actual world.

(Incidentally, though this is not something discussed by Frege and Russell, propositions are often said to be the primary bearers of modal properties, too. It is said, for example, that if it is necessary that 7 + 5 = 12, the proposition that 7 + 5 = 12 is necessary. It is, in other words, a necessary truth, where a truth is understood to be a true proposition. In the language of possible worlds: it is true in every possible world.)

We can, as already mentioned, share beliefs, and this means, on the going view, that one and the same proposition may be the object of our individual beliefs. This raises the question of what propositions could be, such that individuals as spatiotemporally separated as we and Galileo could be said to stand in relation to one and the same proposition. Frege’s answer, as well as Russell’s in some places, is that they must be mind- and language-independent abstract objects residing in what Frege called the “third realm”, that is, neither a psychological realm nor the physical realm (the realm of space and time). In other words, Frege (and again, Russell in some places) adopted a form of Platonism about propositions, or thoughts (Gedanken) as Frege called them.

To be sure, this view invites difficult questions about how we could come into contact with or know anything about these objects. (Being outside space, it is not even clear in what sense propositions could be objects.) It is generally assumed that whatever can have causal effects must be concrete, that is, non-abstract. It follows from this assumption that propositions, as abstract objects, are not just imperceptible but causally inefficacious. That is, they can themselves have no causal effects, whether on material objects or minds (even if minds are non-physical, as Frege thinks). Frege acknowledges this last observation but insists that, somehow, we do in some sense grasp or apprehend propositions.

In the philosophy of mathematics, serious worries have been raised about how we might gain knowledge of mathematical objects if they are as the Platonist conceives of them (see, for example, Benacerraf 1973), and the same would seem to go for propositions. The difficulty is compounded if, following Frege, we conceive of the mind as non-physical or non-concrete. The concrete is generally conceptualized as the spatiotemporally located. Thus to define the ‘abstract’ as the non-concrete is to define the abstract as the non-spatiotemporally located. On Frege’s view, the mind is not spatiotemporally located, concrete, or physical. And yet, presumably, it is not abstract. What’s more, there seem to be mental causes and effects, for one idea leads to another. In fact, as Frege acknowledges, ideas can bring about the acceleration of masses.

Frege nowhere presents a detailed view of these matters, so it is not clear that he had one. In general, it is not clear what a satisfactory view of propositions as abstract objects would look like. So perhaps it can be understood why, as early as 1918, Russell (despite his important role in developing the theory of propositions) would take the position that “obviously propositions are nothing”, adding that no one with a “vivid sense of reality” can think that there exists “these curious shadowy things going about”. Nevertheless, it is not clear that Russell manages to do without them. In fact, few have so much as attempted to do without them—the nature of propositions being a lively area of research. We return to a discussion of propositions in 3b.

b. Dispositionalism

Dispositionalism, most broadly construed, is the view that having an attitude, for example the belief that it is raining, is nothing more than having a certain disposition, or set of dispositions, or dispositional property or properties. On the simplest dispositionalist view—held by many philosophers when behaviorism was the dominant paradigm in psychology and logical positivism was the reigning philosophy of science (roughly, the first half of the 20th century)—the relevant dispositions are dispositions to overt, observable behavior (Carnap 1959). On this view, to lay stress on the point, to believe that it is raining just is to have certain behavioral dispositions, as exhibited in certain patterns of overt, observable behavior.

Dispositionalists and non-dispositionalists alike agree that patterns of behavior, being manifestations of behavioral dispositions, are evidence for particular beliefs and other attitudes an agent has. However, dispositionalists claim that there is nothing more to the phenomenon: if one were to have exhaustively specified the behavioral dispositions associated with the ascription of the belief that it is raining, one would have said everything there is to say about this belief. In this sense, dispositionalism is a superficial view: in ascribing an attitude, we do not commit ourselves to the existence of any particular internal state of the agent in possession of the attitude—whether a state of the mind or brain. Having an attitude is a surface phenomenon, a matter of how one conducts oneself in the world (Schwitzgebel 2013, Quilty-Dunn and Mandelbaum 2018).

Notoriously, it is very difficult to provide any informative general dispositional characterization of an attitude, such as the belief that it is raining. To take a stock example, we might say that if one believes that it is raining, one will be disposed to carry an umbrella when one leaves the house. Evidently, this will require not only that one has an umbrella on hand but that one desires not to get wet, remembers where the umbrella is located, believes that it will help one to stay dry, and so on. In general, as many theorists have observed, it seems that the behavioral dispositions associated with a particular attitude are not specifiable except by reference to other attitudes (Chisholm 1957, Geach 1957). This is sometimes referred to as the holism of the mental.

This is a problem for the simple dispositional accounts which seek to reduce or analyze away all mental talk into behavioral talk. Such a reductive project was pursued, or at least sketched, by the logical behaviorists, who wished—by analyzing talk of the mind into talk of behavior—to pave the way toward reducing all mental descriptions to physical descriptions. The view was thus a form of physicalism, according to which the mental is physical. Thus, it was sometimes said that mental state attributions are really but shorthand for descriptions of behavioral patterns and dispositions. According to the logical behaviorists, logical analysis was to reveal this. However, it is generally agreed that this project, articulated most influentially by Carl Hempel (1949) and Rudolf Carnap (1959), failed—and precisely on account of the holism of the mental. (There are no prominent logical behaviorists in the 21st century. Carnap and Hempel themselves abandoned the project after having rejected the verificationist criterion of meaning at the heart of logical positivism. According to this criterion, all meaningful empirical statements are verifiable by observation. In the case of psychological statements, the thought was that they should be verifiable by observation of overt behavior. For further discussion, see the articles linked to in the preceding paragraphs of this subsection.)

The holism of the mental is not, however, a problem for every simple dispositionalist account of the propositional attitudes, for not every such account has reductive ambitions. For some, it is enough that we can provide a dispositional characterization of each mental state attribution, albeit one involving reference to other mental states. For example, one might be said to remember where the umbrella is located if one is able to locate it—say, when one wants the umbrella (because, we might add, one desires not to get wet and believes that the umbrella will help one to stay dry). Similarly, one might be said to desire to stay dry if, when one believes that it is raining, one is disposed to adopt some rain-avoiding behavior: say, not leaving the house, or not leaving the house without an umbrella (if one believes that it will help one to stay dry). Despite the difficulty of providing informative general dispositional characterizations of attitudes, everyone semantically competent with the relevant stretches of language is adept at recognizing and ascribing attitudes on the basis of overt, observable behavioral patterns, including (in the case of linguistic beings) patterns of linguistic behavior. The simple dispositionalist view is again just that there is nothing more to know about the attitudes: they are but dispositions to the observed behavioral patterns.

For many dispositionalists, the appeal of dispositionalism is precisely its superficial character. Our everyday practice of ascribing attitudes, so of explaining, predicting, and rationalizing one another and ourselves with reference to the attitudes, appears to be insensitive to whatever is going on, so to speak, under the hood. In fact, many think that the practice has remained more or less unchanged for millennia, even if there have been indefinitely many changes in views of what the mind is and where it is located, if it has a location. In historical terms, it is only quite recently that we have suspected minds to be brains and their locations to therefore be the interior of the skull. Even still, facts about the brain, specified in cognitive neuroscientific or computational psychological terms, never enter everyday considerations when ascribing attitudes.

Indeed, if an alien being or some cognitively sophisticated descendent of existing non-human animals or some future AI were to seamlessly integrate into human (or post-human) society, forming what are to all appearances nuanced beliefs about, say, the shortcomings of the American constitution, where to invest next year, and how to appease the in-laws this holiday without compromising one’s values, then most of us would be at least strongly inclined to accept this being as a true believer, as really in possession of the attitudes they seem to have—any differences in their physical makeup notwithstanding (see Schwitzgebel 2013, as well as Sehon 1997 for similar examples). Dispositionalism accords well with this.

However, it does seem that one could have an attitude without any associated overt, observable behavioral dispositions—just as one might experience pain without exhibiting or even being disposed to exhibit any pain-related behavior (yelping, wincing, cursing, stamping about, crying, and so forth) (see Putnam 1963 on “super-spartans” or “super-stoics”). A locked-in patient, for example, has beliefs, though no ability to behaviorally exhibit them. (Incidentally, this highlights the implausibility of behaviorism as applied to the mental generally, not just to the attitudes.) If they have the relevant dispositions, this is only in a very attenuated sense.

In addition, it is not clear that the affective or phenomenological should be excluded. For example, if you believe that the earth is flat, might you not be disposed to, say, feel surprised when you see a picture of the earth from space? It is not clear why this should not be among the dispositions characteristic of your belief. What is more, it seems that one might have the associated behavioral dispositions without the attitude. A sycophant to a president, for example, might be disposed to behave as if she thought the president were good and wise, even if she believes the contrary.

Recognizing the force of these and related observations but appreciating the appeal of a superficial account of belief, other more liberal dispositionalists have allowed that the relevant dispositions may include not just dispositions to overt, observable behavior but also dispositions to cognition and affect. Despite his usual mischaracterization as a logical behaviorist, Gilbert Ryle (1949) is a prime example of a dispositionalist of this latter sort. Eric Schwitzgebel (2002, 2010, 2013) is an example of a contemporary theorist who adopts a view similar to Ryle’s.

Like Ryle, Schwitzgebel does not attempt to provide a reductive account. He allows that, when providing a dispositional specification of a particular attitude, we must inevitably make reference to other attitudes. In other words, he countenances the holism of the mental. His view is also like Ryle’s in that he allows that the relevant dispositions are not just dispositions to overt, observable behavior. Unlike Ryle, however, Schwitzgebel makes it a point to emphasize this aspect of his view. He also emphasizes the fact that, when we consider whether someone’s dispositional profile matches “the dispositional stereotype for believing that P”, “what respects and degrees of match are to count as ‘appropriate’ will vary contextually and so must be left to the ascriber’s judgment” (2002, p. 253). He emphasizes, in other words, the vagueness and context-dependency of our ascriptions. Finally, also unlike Ryle, but in line with the dominant view at the beginning of the 21st century, Schwitzgebel is at least inclined to the view that attitudes are causes and belief-desire explanations thus causal explanations.

Combining dispositionalism and a causal picture of the attitudes poses some difficulties. On most views of dispositions developed in the second half of the 20th century and the beginning of the 21st century, dispositions must have categorical bases. Consider, for example, a new rubber band. It has the property of elasticity, and this is a dispositional property: it is disposed to stretch when pulled and to return to its prior shape when released. We can intelligibly ask why, and the answer will tell us something about the categorical basis of this property—something about the material constitution of the rubber band. Similarly, we explain the brittleness of glass and solubility of sugar in terms of their material constitutions (perhaps, a little more specifically, their microphysical structures). In the case of attitudes, construed as dispositions, the most plausible categorical bases would be states of the brain. Now the question arises whether we should identify these dispositions with their categorical bases or not. A dispositionalist attracted to the position for its superficial character is not likely to make this identification. (That attitudes are brain states would seem to be a deep view.) However, without this identification, more work would need to be done to explain how dispositions can be causes.

c. Computational-Representationalism

On the classical picture, described in Section 2a, to believe that the earth moves, Galileo must grasp the proposition that the earth moves in the way appropriate to belief—where, as discussed, the proposition is an abstract mind- and language-independent object with essential and absolute truth-conditions, residing somewhere in the so-called third realm (neither the physical nor the mental realm, but somewhere else altogether—Plato’s Heaven perhaps). As discussed, one trouble with this view is that, if an object is in neither time nor space, there is no clear sense in which it is anywhere, let alone an object. But even supposing we can make sense of this, it remains to explain how we can grasp this object, as well as the nature of the grasping relation. Frege and Russell are not of much help here.

Many philosophers—and Jerry Fodor is the primary architect here—essentially take the classical picture and psychologize it. Thus, the proposition grasped becomes a mental representation (expressing, meaning, or having this proposition as content)—the mental representation being a physical thing, literally located in one’s head—and the grasping relation becomes a functional role or, still more exactly, a computational role. That is, according to Fodor: if Galileo believes that the earth moves, there is in Galileo’s brain a mental representation that means that the earth moves and that plays the functional (or computational) role appropriate to belief. The details are complex (see the articles just linked to), but the basic idea is simple: it is in virtue of having a certain object moving around in your head in a certain way that you bear a certain relation to it and so may be said to have the attitude you have. So put, Fodor’s computational-representational view, unlike the dispositional views discussed above, is a deep view.

In the vocabulary of computational psychology, the object of an attitude is a computational data structure, over which attitude-appropriate computations are performed and so definable. At the level of description at which this structure is so identified, it is multiply-realizable, meaning here that algorithmic and implementation details may vary (Marr 1982). A schematic picture might help: a tree diagram with the computational level at the top, branching into alternative algorithmic levels, each branching in turn into alternative implementation levels. Roughly, the top level (the computational level) provides a formal specification of some function a mechanism might compute; the middle level (the algorithmic level) says how it is computed; and the bottom level (the implementation level) says how all this is physically realized in the brain. As depicted, there is more than one way to execute a function, and more than one way to physically realize its execution. In the end, the idea goes, we can account for the aspects of the mind thus theorized in purely physical terms (thus making the view another form of physicalism). Bridging these levels is not easy, but Fodor thinks that commonsense psychology can help in limning the computational structure of the mind-brain, or our cognitive architecture.
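Marr’s point about multiple realizability at the computational level can be illustrated with a toy example. The choice of function and algorithms here is purely illustrative (nothing in the literature ties the point to sorting in particular): one and the same computational-level function can be computed by different algorithms, which could in turn be physically realized in different ways.

```python
# Illustrative sketch of Marr's computational vs. algorithmic levels.
# Computational level: the function to be computed -- here, sorting.
# Algorithmic level: *how* it is computed -- two different algorithms
# realize the very same input-output function.

def insertion_sort(xs):
    """One algorithm computing the sorting function."""
    result = []
    for x in xs:
        i = 0
        while i < len(result) and result[i] <= x:
            i += 1
        result.insert(i, x)
    return result

def merge_sort(xs):
    """A different algorithm computing the same function."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged = []
    while left and right:
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

data = [3, 1, 4, 1, 5, 9, 2, 6]
# At the computational level the two algorithms are indistinguishable:
assert insertion_sort(data) == merge_sort(data) == sorted(data)
```

Described at the computational level, the two procedures are the same function; they differ only at the algorithmic level, and each could be implemented in indefinitely many physical media.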

On this view, again, the proposition of the classical view becomes a mental representation to which the subject stands in a particular attitude relation in virtue of the fact that the mental representation plays a certain computational role in the cognitive architecture of the subject: a belief-role or a desire-role, as the case may be. On Fodor’s view, since the content of this mental representation is propositional, and so is a proposition, the representation must have a linguistic form, that is, it must be syntactically structured—and thus, he reasons, a mental sentence, to wit, a sentence of Mentalese, our language of thought. Since he conceptualizes mental representations as objects, he speaks of their syntactic shapes (see, for example, his 1987). In fact, it is supposed to be in virtue of their shapes that mental representations and so the attitudes with which they are associated have causal powers—that is, the ability to make other things move, including the subject with the mental representation in her head.

Fodor (1987) thinks, in fact, that if you look at attitude reports and our practice of ascribing attitudes (in other words, at commonsense belief-desire psychology, or folk psychology), what you will find is that attitudes have at least the following essential properties: “(i) They are semantically evaluable. (ii) They have causal powers. (iii) The implicit generalizations of common-sense belief/desire psychology are largely true of them.” (p. 10).

As an example generalization of common-sense belief-desire psychology, Fodor (1987) provides the following:

If x wants that P, and x believes that not-P unless Q, and x believes that x can bring it about that Q, then (ceteris paribus) x tries to bring it about that Q. (p. 2)

Generalizations like this are implicit in that they need not be—and often are not—explicitly entertained or represented when they are used to explain and predict behavior with reference to beliefs and desires. Taking an example from Fodor (1978), consider the following instance of the above generalization: if John wants that it rain, and John believes that it will not rain unless he washes his car, and John believes that he can bring it about that he washes his car, then (ceteris paribus) John tries to bring it about that he washes his car. According to Fodor, such explanations are causal, and the attitude ascriptions involved individuate attitudes of a given type (belief, desire) by their contents (that it will not rain unless he washes his car, that it will rain). Moreover, such explanations are largely successful: the predictions pan out more often than not; and, Fodor reasons, we therefore have grounds to think that these ascriptions are often true. If they are true, a scientific account of our cognitive architecture should accord with this.

These generalizations, moreover, are counterfactual-supporting: if a subject’s attitudes are different, we folk psychologists, equipped with these generalizations, will produce different predictions of their behavior. So, the generalizations have the characteristics of the laws that make up scientific theories. Granted, there are exceptions to the generalizations, that is, exceptional circumstances in which they fail to hold. In other words, these generalizations hold ceteris paribus, that is, all else being equal. For example: if one wants it to rain, and believes that it will not rain unless one washes one’s car, and one believes that one can wash one’s car, then one will wash one’s car—unless it suddenly begins to rain, or one is immobilized by fear of the neighbor’s unleashed dog, or one suffers a seizure, and so forth. (As competent folk psychologists, we are very capable of recognizing the exceptions.) However, argues Fodor (1987), this does not mean that the generalizations or instances thereof are empty—that is, true unless false; for this would make the success of folk psychology miraculous. Besides, all the generalizations of the special sciences (that is, all the sciences but basic physics) have exceptions, and that is no obstacle to their having theories. So, in fact, Fodor thinks that folk psychology is or involves a bona fide theory—and that this theory is vindicated by the best cognitive science. As vindicated, the posits of folk psychology, namely beliefs, desires, and the rest, are thereby shown to be real. We return to this in Section 4.

Since folk psychology is, on this view, a bona fide theory, Fodor reasons that the referents of its theoretical terms are unobservable. Beliefs, desires, and the rest are therefore unobservables and thus inner as opposed to outer mental states. If the ascriptions are largely true, then—on a non-instrumentalist reading—what they refer to must exist and moreover have the properties the truth of the ascriptions requires them to have. The explanations in which these ascriptions figure are again causal, so these inner states must be causally efficacious. Since, according to Fodor (1987), “whatever has causal powers is ipso facto material” (p. x, Preface), it follows that mental states are physically realized (presumably, in the brain). Since, once more, they are individuated by their contents, they are content-bearing. Putting this together, then, the propositional attitudes are neurally-realized, causally efficacious, content-bearing internal states. As Fodor (1987) states the view:

For any organism O, and any attitude A toward the proposition P, there is a (‘computational’/‘functional’) relation R and a mental representation MP such that

MP means that P, and

O has A iff O bears R to MP. (p. 17)

Importantly, according to Fodor, computational-representationalism—unlike any other theoretical framework before it—allows us to explain precisely how mental states like propositional attitudes can have both contents and causal powers (so the first two essential properties noted above). Attitudes, indeed mental states more generally, do not just cause behaviors. They also causally interact with one another. For example, believing that if it rains, it will pour, and then coming to believe that it will rain (say, on the basis of a perceptual experience), will typically cause one to believe that it will pour. What is interesting about this is that this causal pattern mirrors certain content-relations: If “it pours, if it rains” and “it rains” are true, “it pours” is true. This fact may be captured formally or syntactically: P → Q, P ⊢ Q. (This indicates that Q is derivable or provable from P → Q and P. This common inference pattern is known as Modus Ponens. See the article on propositional logic.) This, in turn, permits us to build machines—computers—which exhibit this causal-inferential behavior. In fact, not only may computer programs model our cognitive processes, we also are, on this view, computers of a sort ourselves.
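The idea that a purely shape-sensitive mechanism can mirror content relations can be sketched in a few lines of code. This is only an illustrative toy, not Fodor’s own proposal; the representation format (strings, with conditionals as ('if', P, Q) tuples) and the function name are invented for the example.

```python
# Toy sketch: a purely syntactic operation over sentence-like
# representations that nevertheless respects a content relation
# (modus ponens: from P -> Q and P, derive Q).

def modus_ponens_closure(beliefs):
    """Close a set of representations under modus ponens.

    The operation looks only at the *shape* of the representations
    (is it a conditional? does its antecedent literally match some
    other item?), never at what they mean -- yet if the inputs are
    true, the outputs are guaranteed to be true as well.
    """
    derived = set(beliefs)
    changed = True
    while changed:
        changed = False
        for b in list(derived):
            if isinstance(b, tuple) and b[0] == 'if' and b[1] in derived:
                if b[2] not in derived:
                    derived.add(b[2])
                    changed = True
    return derived

beliefs = {('if', 'it rains', 'it pours'), 'it rains'}
closed = modus_ponens_closure(beliefs)
assert 'it pours' in closed  # derived by syntactic matching alone
```

The mechanism manipulates uninterpreted shapes, yet the causal transitions it produces track truth-preserving inference, which is the core of the computational picture sketched above.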

Among those who accept the computational-representational account of propositional attitudes, some deny that the relevant representations are sentences, and some maintain agnosticism on this question (see, for example, Fred Dretske, Tyler Burge, Ruth Millikan). Moreover, not everyone who accepts that they are sentences accepts that they are sentences of a language of thought distinct from any public language (Harman 1973). Still others deny that only language is compositional—arguing, for example, that maps, too, can be compositional (see Braddon-Mitchell and Jackson 1996, Camp 2007). Views also differ on how the relevant mental representations get their content (see the article on conceptual role semantics, as well as the article on Fodor, for some of the views in this area). In any case, Fodor’s view has been the most influential articulation, and the above theoretical identification is general enough to be accepted by any computational-representationalist.

3. Propositions, Propositional Attitude Reports, and the Method of Truth in Metaphysics

Theorizing about propositions, propositional attitudes, and propositional attitude reports has traditionally gone together. The connection is what Davidson (1977) called “the method of truth in metaphysics”, or what Robert Matthews (2007) calls the “reading off method”—that is, the method of reading off the metaphysics of the things we talk about from the sentences we use to talk about these things, provided that the logical form and interpretation of the sentences have been settled. This section discusses this method and the metaphysics of propositional attitudes and propositions arrived at by its application.

a. Reading Off the Metaphysics of Propositional Attitudes

Many valid natural language inferences involving propositional attitude reports seem to require that these reports have relational logical forms—the reports thereby reporting the obtaining of a relation between subjects and certain objects to which we seem to be ontologically committed by existential generalization:

Galileo believes that the earth moves. Bgp
∴ Galileo believes something. ∴ ∃xBgx

(Ontology is the study of what there is. One’s ontological commitments are thus what one must take to exist. This notion of ontological commitment is most famously associated with Quine (1948, 1960); as he put it: “to be is to be the value of a bound variable”.) That is, if a report like

(1) Galileo believes that the earth moves.

has the logical form displayed above, then if Galileo believes that the earth moves, there is something—read: some thing, some object—Galileo believes, to wit, the proposition that the earth moves. If you believe that Galileo believes this, you are committed to the existence of this object.

Some of these inferences, moreover, appear to require that the objects to which subjects are related by such reports are truth-evaluable:

Galileo believes that the earth moves. Bgp
That the earth moves is true. Tp
∴ Galileo believes something true. ∴ ∃x(Bgx & Tx)

If Galileo’s belief is true, then the proposition that the earth moves is true. We thus say that the object of Galileo’s belief, the proposition, is also the content of the belief. Still other inferences appear to require that attitudes are shareable:

Galileo believes that the earth moves. Bgp
Sara believes that the earth moves. Bsp
∴ There is something they both believe. ∴ ∃x(Bgx & Bsx)

On the classical view owing to Frege and Russell (see Section 2a), being the objects and contents of beliefs, being truth-evaluable and shareable, being the referents of that-clauses (for example “that the earth moves”), and being expressible by sentences (for example “the earth moves”) are among the specs for propositions. Thus, a report like (1) appears to be true just in case Galileo stands in the belief-relation to the proposition that the earth moves, the subject and object being respectively the referents of “Galileo” and “that the earth moves”.

Of course, it is not just beliefs that we report. We also report fears and hopes and many other attitudes besides. For example:

(2) Bellarmine fears that the earth moves.

(3) Pia hopes that the earth moves.

So, we see that various attitudes may have the same proposition as their object (and content). Of course, the same type of attitude can be taken towards different propositions, referred to with different that-clauses (for example “that the earth is at the center of the universe”). Generalizing on these data, we therefore seem to be in a position to say the following:

Instances of x V that S are true if and only if x bears the relation expressed by V (the V-relation) to the referent of that S.

with (1)–(3) being instances of this schema.

This analysis is typically extended also to reports of speech acts of various kinds, which are sometimes included under the label propositional attitudes (Richard 2006). For example:

(4) Galileo {said/asserted/proclaimed/hypothesized…} that the earth moves.

In fact, replacing the attitude verb with a verb of saying, inferences like the following appear to be equally valid:

Galileo said that the earth moves. Sgp
∴ Galileo said something. ∴ ∃xSgx
Galileo said that the earth moves. Sgp
That the earth moves is true. Tp
∴ Galileo said something true. ∴ ∃x(Sgx & Tx)
Galileo said that the earth moves. Sgp
Sara said that the earth moves. Ssp
∴ There is something they both said. ∴ ∃x(Sgx & Ssx)

As one can believe what is said (asserted, proclaimed, and so forth), inferences like the following likewise appear valid:

Sara believes everything that Galileo says. ∀x(Sgx ⊃ Bsx)
Galileo said that the earth moves. Sgp
∴ Sara believes that the earth moves. ∴ Bsp

The objects of these reports are again often thought to be propositions. These inferences thereby lend further support to the above view of the form and interpretation of reports. This view, what Stephen Schiffer (2003) calls the Face-Value Theory, has long been the received view.
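The relational reading behind these inferences can be modeled in miniature: treat each attitude verb as a set of subject-proposition pairs, and the displayed inferences come out as simple set-theoretic checks. This is only an illustrative sketch; the data structures and names are invented for the example.

```python
# Toy model of the face-value reading of attitude reports: each verb
# denotes a set of (subject, proposition) pairs, with propositions
# serving as the shared objects of belief and saying.

believes = {('Galileo', 'that the earth moves')}
says = {('Galileo', 'that the earth moves'),
        ('Sara', 'that the earth moves')}
true_props = {'that the earth moves'}

# Existential generalization: Galileo believes something.
assert any(s == 'Galileo' for s, p in believes)

# Galileo said something true.
assert any(s == 'Galileo' and p in true_props for s, p in says)

# There is something Galileo and Sara both said.
assert any(('Galileo', p) in says and ('Sara', p) in says
           for s, p in says)

# Sara believes everything Galileo says: close the belief set under
# this conditional, then check that Sara believes that the earth moves.
believes |= {('Sara', p) for s, p in says if s == 'Galileo'}
assert ('Sara', 'that the earth moves') in believes
```

The point of the toy is just that, once reports are given relational logical forms, the validity of the quantified inferences falls out of elementary facts about relations.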

b. Reading Off the Metaphysics of Propositions

One remaining question concerns the nature of these propositions, taken to be the objects and contents of the attitudes. Getting clear on this would seem crucial to getting clear on what the propositional attitudes are, if they are indeed attitudes taken towards propositions. To this end, the same method has been used.

Provided that that-clauses like “that the earth moves” are replaced by individual constants in the logical translations of reports like (1)–(4), it seems right to construe that-clauses as singular referring terms (similar to proper names) and so their referents—propositions, by the foregoing reasoning—as objects. Moreover, if it is another property of propositions to be expressed by (indicative, declarative) sentences, then provided that these sentences have parts which compose in systematic ways to form wholes, it seems natural to think that the propositions are likewise structured—with the parts of the propositions corresponding to the parts of the sentences. So, propositions are structured objects, though distinct from the sentences used to express them. Moreover, since they are shareable, even countenancing vast spatiotemporal separation between subjects (both we and Galileo can believe that the earth moves), they must be, it is reasonable to think, abstract and mind-independent. Or so Frege (1918) reasoned.

With this granted, we might then ask what the nature of the constituents of propositions is. If, for example, when we use a sentence like

(5) The earth moves.

we refer to the earth and ascribe to it the property of moving, and we express propositions with sentences, then it is natural to think that the constituents of propositions are objects, like the earth, and properties (and relations), like the property of moving. This is the so-called Russellian view of the constituents of propositions, after Russell.

If the propositional contribution of a term just is a certain object, namely the one to which the term refers, then any other term that refers to the same object will have the same propositional contribution. This seems to mirror an observation made concerning sentences like (5). If, for example, “the earth” and “Ertha” are co-referring, then if (5) is true, so is

(6) Ertha moves.

That is, if

(7) The earth is Ertha.

is true, then—holding constant the sentences in which these terms are embedded—the one term should be substitutable for the other without change in truth value of the embedding sentence. The terms are, as it is sometimes put, intersubstitutable salva veritate (saving truth).

This seems, however, not to hold generally, as Frege (1892) famously observed. For suppose that Galileo believes that the earth and Ertha are distinct. Then even if the earth is Ertha and Galileo believes that the earth moves,

(8) Galileo believes that Ertha moves.

is false, or so it might seem. (Some Russellians deny this; see, for example, Salmon 1986.) Such apparent substitution failures are widely known as Frege cases. Frege thought that they cast doubt on the Russellian view of propositional constituents. For if propositional attitudes are relations between subjects and propositions, he reasoned, then provided that the type of attitude ascribed to Galileo is the same (belief, in the running example), there must be a difference in the proposition which accounts for the difference in the truth value of these reports.
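One way to see the structure of a Frege case is to model belief two ways in code: once as a relation to referents, and once as a relation to something finer-grained, such as a sentence. The models and all names in them are invented for illustration; the point is only that the coarse-grained model cannot represent the substitution failure, while the fine-grained one can.

```python
# Toy illustration of a Frege case. If belief relates a subject to the
# *referent* of a term, co-referring terms are intersubstitutable; if
# it relates the subject to something finer-grained (a sentence, or a
# mode of presentation), substitution can fail.

referent = {'the earth': 'EARTH', 'Ertha': 'EARTH'}  # co-referring terms

# Coarse-grained model: beliefs keyed by (referent, predicate).
galileo_coarse = {('EARTH', 'moves')}

def believes_coarse(term, pred):
    return (referent[term], pred) in galileo_coarse

# Fine-grained model: beliefs keyed by the sentence itself.
galileo_fine = {'the earth moves'}

def believes_fine(term, pred):
    return f'{term} {pred}' in galileo_fine

# Coarse-grained: substitution of co-referring terms preserves truth
# value, so this model cannot represent the Frege case.
assert believes_coarse('the earth', 'moves')
assert believes_coarse('Ertha', 'moves')

# Fine-grained: substitution can fail, as in the Frege case.
assert believes_fine('the earth', 'moves')
assert not believes_fine('Ertha', 'moves')
```

Frege’s senses play roughly the role the sentences play in the second model: something more fine-grained than reference, to which truth value in belief contexts can be sensitive.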

Frege cases are related to another puzzle discussed by Frege (1892), widely known as the puzzle of cognitive significance. To take a widely used example from Frege, while the Babylonians would have found

(9) Hesperus is Hesperus.

and

(10) Phosphorus is Phosphorus.

as trivial as anyone else, it would have come as a surprise to them that

(11) Hesperus is Phosphorus.

Establishing the truth of (11) was a non-trivial astronomical discovery. It turns out, contrary to what the Babylonians believed, that Hesperus, the heavenly body which shines in the evening, and Phosphorus, the heavenly body which shines in the morning, are one and the same—not distinct stars, as the Babylonians believed, but the planet Venus. Yet, if (11) is true, and “Hesperus” and “Phosphorus” are two names for one and the same object, there is a sense in which (9), (10), and (11) all say the same thing. Therefore, an explanation of the fact that (9) and (10) are trivial while (11) is cognitively significant seems to be owed.

One possible explanation for the difference in cognitive significance is that we may attach distinct senses to distinct expressions, even if they are co-referring. If propositions are what we grasp when we understand sentences, then perhaps, Frege hypothesized, propositional constituents are not individuals and relations but senses. This is the so-called Fregean view of propositions.

In the first instance, senses are whatever difference accounts for the difference in cognitive significance between (9) and (10), on the one hand, and (11) on the other. More exactly, Frege suggested that we think of senses as ways of thinking about or modes of presenting what we are talking about which are associated with the expressions we use. For example, the sense associated with “Hesperus” by the Babylonians could be at least roughly captured with the description “the star that shines in the evening” and the sense associated with “Phosphorus” by the Babylonians could be at least roughly captured with the description “the star that shines in the morning”. (For further discussion, see the articles on Frege’s philosophy of language and Frege’s Problem.)

Taking this idea on board, compatibly with accepting the idea that co-referring terms are intersubstitutable salva veritate, Frege suggested that we might then account for substitution data like the above by a systematic shift in the referents of the expressions embedded in the scope of an attitude verb like “to believe”—in particular, a shift from customary referent to sense. On this view, for example, the referent of “Hesperus”, when embedded in

(12) Bab believes that Hesperus shines in the morning.

is not Hesperus (that is, Venus) but the sense of “Hesperus”. Similarly, the semantic contribution of the predicate “shines in the morning” would be the sense of that expression, not the property of shining in the morning. Putting these senses together, we have the proposition (or thought) expressed by “Hesperus shines in the morning”, the referent of the that-clause “that Hesperus shines in the morning”. This way we can see how (12) and

(13) Bab believes that Phosphorus shines in the morning.

may have opposite truth values, even if (11) is true.

Employing his theory of descriptions, Russell (1905) offered a different solution which is compatible with accepting the Russellian view of propositions. Still other solutions have been proposed, motivated by a variety of Frege cases. Common to almost all positions in this literature is the assumption that attitudes are relations between subjects and certain objects. Not all of the proposed solutions, however, take propositions to be structured objects (see, for example, Stalnaker 1984). In fact, not all of the proposed solutions take the objects to be propositions. Some instead propose different proposition-like objects, including: natural language sentences, mental sentences, interpreted logical forms, and interpreted utterance forms (see, for example, Carnap 1947, Fodor 1975, Larson and Ludlow 1993, and Matthews 2007). On such views, the expression “propositional attitudes” turns out to be something of a misnomer, as they are not, strictly speaking, attitudes toward propositions.

c. A Challenge to the Received View of the Logical Form of Attitude Reports

It should be noted that the received view of the logical form of propositional attitude reports discussed in Section 3a has not gone unchallenged. Much recent work in this area has been motivated by a renewed attention to a puzzle known as Prior’s Substitution Puzzle (after Arthur Prior, who is often credited with introducing the puzzle in his 1971 book).

If “that the earth moves” refers to the proposition that the earth moves, then assuming (as is common) that co-referring terms are intersubstitutable salva veritate, we should expect that substituting “the proposition that the earth moves” for “that the earth moves” in (2) will not change the sentence’s truth value. But here is the result:

(14) Bellarmine fears the proposition that the earth moves.

It seems clear that one may fear that the earth moves without fearing any propositions. We could give up the commonly held substitution principle, but this would be a last resort.

A natural thought is that the problem is peculiar to fear, but the problem is seen with many other attitudes besides. Take (3), for example, and perform the substitution. The result:

(15) Pia hopes the proposition that the earth moves.

Clearly, something has gone wrong; for (15) is not even grammatical.

At this point, one might begin to question whether propositions are in fact the objects of the attitudes. However, it appears that none of the available alternatives to propositions will do:

Bellarmine fears the {proposition/(mental) sentence/interpreted logical form…} that the earth moves.

Some proponents of the received view of the logical form of attitude reports have provided responses to this problem which are compatible with maintaining the received view (see, for example, King 2002, Schiffer 2003). Others argue that the received view must be abandoned, and on some of these alternative views, that-clauses are not singular referring terms but predicates (see, for example, Moltmann 2017).

Insofar as one accepts the reading off method, different views of the logical form of attitude reports may lead to different views of the metaphysics of the propositional attitudes. Of course, not everyone accepts this method. For some general challenges to the method, see, for example, Chomsky (1981, 1992), and for challenges to the method specifically as it applies to theorizing about propositional attitudes, see, for example, Matthews 2007.

4. Folk Psychology and the Realism/Eliminativism Debate

This section discusses how propositional attitudes figure in folk psychology and how the success or lack of success of folk psychology has figured in debates about the reality of propositional attitudes.

a. Folk Psychology as a Theory (Theory-Theory)

The term “folk psychology” is sometimes used to refer to our everyday practice of explaining, predicting, and rationalizing one another and ourselves as minded agents in terms of the attitudes (and other mental constructs, such as sensations, moods, and so forth). Sometimes, it is more specifically used to refer to a particular understanding of this practice, according to which this practice deploys a theory, also referred to (somewhat confusingly) as “folk psychology”. This theory about the practice of folk psychology is sometimes referred to as the Theory-Theory (TT). Wilfrid Sellars (1956) is often credited with providing the first articulation of TT, and Adam Morton (1980) with coining the term.

The idea that folk psychology deploys a theory immediately raises the question of what a theory is. At the time TT was introduced, the dominant view of scientific theories in the philosophy of science was that theories are bodies of laws, that is, sets of counterfactual-supporting generalizations (see Section 2c), generally codifiable in the form:

If___, then ___.

where the first blank is filled by a description of antecedent conditions, and the second blank is filled by a description of consequent conditions. If the law is true, then if the described antecedent conditions obtain, the described consequent conditions will obtain. Thus, the law issues in a prediction and thereby gives us an explanation of the conditions described in its consequent—to wit, a causal explanation, the one condition (event, state of affairs) being the cause of the other.

This idea in turn raises the question of what the laws of folk psychology, understood as a theory (FP, for short), are supposed to be. The following example was provided in Section 2c:

If x wants that P, and x believes that not-P unless Q, and x believes that x can bring it about that Q, then (ceteris paribus) x tries to bring it about that Q.

And another example is the following (see Carruthers 1996):

If x has formed an intention to bring it about that P when R, and x believes that R, x will act so as to bring it about that P.

Additional laws take a similar form. It is acknowledged that they all admit of exceptions; but this, it is argued, does not undermine their status as laws: after all, all the laws of the special sciences have exceptions (again, see discussion in 2c).
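A toy implementation may make vivid what it would be for such a ceteris paribus law to issue predictions. This is a deliberately crude sketch, not a serious model of FP; the representation of wants and beliefs, the function name, and the treatment of the ceteris paribus clause as a list of "defeaters" are all invented for the example.

```python
# Toy predictor encoding the practical-syllogism law quoted above:
# if x wants that P, believes not-P-unless-Q, and believes x can
# bring about Q, then (ceteris paribus) x tries to bring about Q.

def predict_tries(wants, believes_unless, believes_can, defeaters=()):
    """Return the set of Q the agent is predicted to try to bring about.

    wants: set of P; believes_unless: dict P -> Q ("not P unless Q");
    believes_can: set of Q the agent believes she can bring about;
    defeaters: exceptional circumstances (the ceteris paribus clause).
    """
    if defeaters:  # all else is not equal; the law is silent
        return set()
    return {believes_unless[p] for p in wants
            if p in believes_unless and believes_unless[p] in believes_can}

# John wants rain, believes it won't rain unless he washes his car,
# and believes he can wash his car:
tries = predict_tries(
    wants={'it rains'},
    believes_unless={'it rains': 'John washes his car'},
    believes_can={'John washes his car'},
)
assert tries == {'John washes his car'}

# With a defeater (say, a sudden seizure), the prediction is withdrawn:
assert predict_tries({'it rains'},
                     {'it rains': 'John washes his car'},
                     {'John washes his car'},
                     defeaters=('seizure',)) == set()
```

The crudeness of the sketch is itself instructive: the hard part of folk psychology lies precisely in the open-ended ceteris paribus clause, which no finite list of defeaters captures.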

Given our competence as folk psychologists, one might expect many more laws like this to be easily formalizable. Perhaps surprisingly, however, only a few more putative laws of FP have ever been presented (but see Porot and Mandelbaum 2021 for a report on some recent progress). Considering how rich and sophisticated folk psychology is (we are, after all, remarkably complex beings), one would expect FP to be a very detailed theory. The relative dearth of explicitly articulated laws might thus be thought to cast doubt on the idea that folk psychology in fact employs a theory. The line that proponents of TT take is that the laws are implicitly or tacitly known and need not be explicitly entertained or represented when the theory is deployed. This position is not an ad hoc one, since many domains in the cognitive sciences take a similar view—for example, Chomskyan linguistics, which aims to provide an explicit articulation of natural language grammars. We are all competent speakers of a natural language and so must have mastered the grammar of this language. Manifestly, however, it is quite another thing to have an explicitly articulated grammar of the language in hand. We speak and comprehend our natural languages effortlessly, but coming up with an adequate grammar is devilishly difficult. The same may be true when it comes to our competence as folk psychologists.

It is generally agreed that the core of FP would comprise those laws concerning the attitudes (including, for example, the above laws). The key terms of the theory—its theoretical terms—would thus include “belief”, “desire”, “hope”, “fear”, and so forth, or their cognates. If FP is indeed a successful theory, and this is not a miraculous coincidence, then we have reason to think that its theoretical terms succeed in referring—that is, we have reason to think that there are beliefs, desires, hopes, fears, and the rest. If, however, FP is not a successful theory, then the attitudes may have to go the way of the luminiferous aether, phlogiston, and other theoretical posits of abandoned theories.

This understanding of folk psychology and the stakes at hand provide the shared background to the realism/eliminativism debate.

b. Realism vs. Eliminativism

The realist position is represented most forcefully and influentially by Fodor (whose position is described in Section 2c). Fodor takes the success of FP, considered independently of the cognitive sciences, to be obvious. Here is a typical passage from his (1987) book:

Commonsense psychology works so well it disappears… Someone I don’t know phones me at my office in New York from—as it might be—Arizona. ‘Would you like to lecture here next Tuesday?’ are the words that he utters. ‘Yes, thank you. I’ll be at your airport on the 3 p.m. flight’ are the words that I reply. That’s all that happens, but it’s more than enough; the rest of the burden of predicting behavior—of bridging the gap between utterances and actions—is routinely taken up by theory. And the theory works so well that several days later (or weeks later, or months later, or years later; you can vary the example to taste) and several thousand miles away, there I am at the airport, and there he is to meet me. Or if I don’t turn up, it’s less likely that the theory has failed than that something went wrong with the airline. It’s not possible to say, in quantitative terms, just how successfully commonsense psychology allows us to coordinate our behaviors. But I have the impression that we manage pretty well with one another; often rather better than we cope with less complex machines. (p. 3)

In fact, he adds: “If we could do that well with predicting the weather, no one would ever get his feet wet; and yet the etiology of the weather must surely be child’s play compared with the causes of behavior.” (p. 4)

What is more, he argues, signs are that the cognitive sciences—computational psychology, in particular—will vindicate FP by giving its theoretical posits pride of place (see also Fodor 1975). Eliminativists, of course, have a very different view.

Perhaps the most widely discussed and influential argument, or set of arguments, against the realist position and in favor of eliminativism is set forth by Paul Churchland in his (1981) essay “Eliminative Materialism and the Propositional Attitudes.” There, the eliminativist thesis is stated as follows:

Eliminative Materialism is the thesis that our commonsense conception of psychological phenomena constitutes a radically false theory, a theory so fundamentally defective that both the principles and the ontology of that theory will eventually be displaced, rather than smoothly reduced, by completed neuroscience. (p. 67)

He continues:

Our mutual understanding and even our introspection may then be reconstituted within the conceptual framework of completed neuroscience, a theory we may expect to be more powerful by far than the common-sense psychology it displaces, and more substantially integrated within physical science generally. (ibid.)

Churchland argues not only that FP will be shown to be false, but that it will be eliminated—that is, replaced by a more exact and encompassing theory, in terms of which we may then reconceptualize ourselves.

Whereas Churchland welcomes the prospect, Fodor (1990) has this to say:

If it isn’t literally true that my wanting is causally responsible for my reaching, and my itching is causally responsible for my scratching, and my believing is causally responsible for my saying… If none of that is literally true, then practically everything I believe about anything is false and it’s the end of the world. (p. 156)

This might strike some as hyperbolic at first. However, the eliminativist thesis itself seems to presuppose what it denies; for, after all, is not the assertion of the thesis an expression of belief? So, does Churchland not have to believe that there are no beliefs? (See Baker 1987.)

Churchland offers three main arguments for his eliminative view. The first is that FP does not explain a wide range of mental phenomena, including “the nature and dynamics of mental illness, the faculty of creative imagination…the nature and psychological function of sleep…the rich variety of perceptual illusions” (1981, p. 73), and so on. The second is that, unlike other theories, folk psychology seems resistant to change and has shown no development; it is, in a word, “stagnant”. The third is that the kinds of folk psychology (belief, desire, and so on) show no promise of reducing to, or being identified with, the kinds of cognitive science—indeed, no promise of cohering with theories in the physical sciences more generally.

A number of responses have been provided by those who take FP to be a successful theory. Regarding the first argument, one might simply reply that the theory is successful when applied to phenomena within its explanatory scope; FP need not be the theory of everything mental. Regarding the second, one might observe that a remarkably successful theory does not call for revision. The third argument is the strongest. However, at the beginning of the 21st century (and all the more so in the last decades of the 20th century, when this debate was an especially hot topic) it turns on little more than an empirical bet, about which there can be reasonable disagreement. Many theorists who appeal to the cognitive sciences in advancing eliminativism appeal in particular to developments in the connectionist paradigm or to other developments in lower-level computational neuroscience (see, besides Churchland 1981, Churchland 1986, Stich 1983, Ramsey et al. 1990), the empirical adequacy of which has been a subject of debate—particularly when it comes to explaining higher-level mental capacities, such as the capacities to produce and comprehend language, which are centrally implicated in folk psychology. (For more on this, see the article on the language of thought.)

There are many other responses to eliminativist arguments besides these, including some which involve rejecting TT (see Section 4c). If folk psychology does not involve a theory, then it cannot involve a false theory; and by the same token, then, beliefs, desires, and the rest cannot be written off as empty posits of a false theory. Even among those who accept that folk psychology involves a theory though, some might reject the idea that the falsity of the theory (namely, FP) entails the nonexistence of beliefs, desires, and the rest.

Stich (1996), a one-time prominent eliminativist, later came to suggest (following Lycan 1988) that the general form of the eliminativist argument—

(1) Attitudes are posits of a theory, namely FP;

(2) FP is defective;

(3) So, attitudes do not exist.

—is enthymematic, if it is not invalid. The suppressed premise Stich identifies is an assumption about the nature of theoretical terms, according to which their meanings are fixed by their relations with other theoretical terms in the theory in which they are embedded (see Lewis 1970, 1972). In other words, a form of descriptivism, according to which the meanings of terms are fixed by associated descriptions, is assumed. On this view, for example, the meaning of “water” is fixed by such descriptions as that “water falls from the sky, fills lakes and oceans, is odorless, colorless, potable, and so forth”. Water, in other words, just is whatever uniquely satisfies these descriptions. Similarly, then, beliefs would be those mental states which are, say, semantically evaluable, have causal powers, a mind-to-world direction-of-fit, and so forth—or in brief, those states of which the laws of FP featuring the term “belief” or its cognates are true. If it turns out that nothing satisfies the relevant descriptions, there are no beliefs. This works the same for desires and the rest. However, one might well reject descriptivism and so block this implication. In fact, the dominant view at the beginning of the 21st century in the theory of reference is not descriptivism but the causal-historical theory of reference.
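Stich’s point that the argument is enthymematic can be made vivid with a formal sketch. The following Lean snippet is a toy propositional rendering, not anything found in Stich’s text; the names `PositOfFP`, `Defective`, `Exist`, and `bridge` are illustrative labels. It shows that once the suppressed descriptivist assumption is supplied as `bridge`, the conclusion follows trivially, whereas premises (1) and (2) alone entail nothing about existence:

```lean
-- Toy rendering of the eliminativist argument (1)-(3).
-- Without the bridge premise, h1 and h2 alone do not entail ¬Exist;
-- with descriptivism supplied as `bridge`, the argument is valid.
example (PositOfFP Defective Exist : Prop)
    (h1 : PositOfFP)                           -- (1) attitudes are posits of FP
    (h2 : Defective)                           -- (2) FP is defective
    (bridge : PositOfFP → Defective → ¬Exist)  -- suppressed descriptivist premise
    : ¬Exist :=                                -- (3) attitudes do not exist
  bridge h1 h2
```

Rejecting descriptivism amounts to refusing the hypothesis labeled `bridge`, and with it the inference from (1) and (2) to (3).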

According to the causal-historical theory of reference (owing principally to the work of Saul Kripke and Hilary Putnam), the referent of a term is fixed by an original baptism (typically involving a causal connection to the referent), later uses of the term depending for their referential success on being linked to other successful uses of the term linking back to the original baptism. Such a view allows for the possibility that we can succeed in referring to things even when we have very mistaken views about them, as it in fact seems possible to many. For example, it seems right that the ancients succeeded in referring to the stars, despite having very mistaken views about what they are. Similarly, then, the idea goes, if the causal-historical theory is correct, it should be possible that we succeed in referring to propositional attitudes even if we have very mistaken views about their nature, so even if FP is defective.

But in fact, Stich’s (1996) skepticism about the eliminativist argument goes even deeper than this, extending to the reading off method in metaphysics itself (what Stich calls “the strategy of semantic ascent”). It is not clear, he argues, what is required of a theory of reference, or whether there might be such a thing as the correct theory of reference. After all, descriptivism might seem to better accord with cases where we do reject the posits of rejected theories—the luminiferous aether, phlogiston, and so on.

One idea might be to have a close look at historical cases where theoretical posits have been retained despite theory change and cases where theory change leads to a change in ontology and see if we can uncover implicit general principles for deciding between (a) “we were mistaken about Xs” and (b) “Xs do not exist”. But Stich (1996) despairs of the prospects:

It is entirely possible that there simply are no normative principles of ontological reasoning to be found, or at least none that are strong enough and comprehensive enough to specify what we should conclude if the [premises] of the eliminativist’s arguments are true. (pp. 66-7)

Moreover, he continues:

In some cases it might turn out that the outcome was heavily influenced by the personalities of the people involved or by social and political factors in the relevant scientific community or in the wider society in which the scientific community is embedded. (p. 67)

If this is correct, then it is indeterminate whether there are propositional attitudes, and it will remain indeterminate “until the political negotiations that are central to the decision have been resolved” (ibid., p. 72).

This gives us a very different view of the stakes at hand, as the reality of the attitudes no longer appears to be a question of what is the case independently of human interests and purposes. The question is, as Stich puts it, political. But this view, which is a sort of social constructivism, is controversial—even if some of the main lines of thought leading to this view have less controversial roots in pragmatism.

c. Alternatives to Theory-Theory

As noted in Section 4b, not every theorist of folk psychology accepts TT. One of the main alternatives to TT, which was developed against the backdrop of the realism/eliminativism debate, is the simulation theory (ST). According to this view, we do not deploy a theory in explaining, predicting, and rationalizing one another (that is, in brief, in practicing folk psychology). Instead, what we do is to simulate the mental states of others, put ourselves in their mental shoes, or take on their perspective (Heal 1986, Gordon 1986, Goldman 1989, 1993, 2006). Different theorists spell out the details differently, but there is a common core, helpfully summarized by Weiskopf and Adams (2015):

In trying to decide what someone thinks, we imaginatively generate perceptual inputs corresponding to what we think their own perceptual situation is like. That is, we try to imagine how the world looks from their perspective. Then we run our attitude-generating mechanisms offline, quarantining the results in a mental workspace the contents of which are treated as if they belonged to our target. These attitudes are used to generate further intentions, which can then be treated as predictions of what the target will do in these circumstances. Finally, explaining observed actions can be treated as a sort of analysis-by-synthesis process in which we seek to imagine the right sorts of input conditions that would lead to attitudes which, in turn, produce the behavior in question. These are then hypothesized to be the explanation of the target’s action. (p. 227)

One perceived advantage of this view, insofar as folk psychology is thought not to involve a theory, is that the attitudes (and other mental constructs) appear to be immune to elimination. Of course, just what folk psychology is or involves is itself an empirical question, and in particular a question for psychology. For a sustained empirical defense of ST, pointing to paired deficits and neuroimaging, among other lines of evidence, see Goldman 2006.

Above, TT and ST were described as alternatives. Indeed, ST was initially developed as an alternative to TT. However, subsequent theorists developed a number of hybrid views, suggesting that we both simulate and theorize, depending on how similar or dissimilar the targets of explanation are (Nichols and Stich 2003). In fact, it might be wondered why simulation could not just be seen as our way of applying a theory. After all, it is not claimed by the proponents of TT that the theory is consciously entertained; it is instead tacitly known. It could be, the suggestion goes, that simulation is what application of the theory looks like from the conscious, personal level (see Crane 2003). Whether this view is correct is, however, an empirical question. For more on the empirical literature, see the article on the theory of mind.

Perhaps unsurprisingly, there are many other views of folk psychology besides TT, ST, and the hybrid theories. Another widely discussed class of views goes under the label interpretationism, with the key theorists here being, among others, Davidson and Dennett. We focus on Dennett’s articulation.

Whereas Fodor and Churchland agree that if propositional attitude reports are true, they are made true by the presence of certain causally efficacious and semantically evaluable internal states, Dennett demurs. There might well not be such states as Fodor believes there are, as Churchland argues; but pace Churchland, Dennett thinks that this would not impugn our folk psychological practices or call into question the truth of our propositional attitude reports. In providing a folk psychological explanation, we adopt what Dennett (1981 [1998], 1987 [1998]) calls the “intentional strategy” or “intentional stance”, in which we treat objects or systems of interest as if they are rational agents with beliefs and desires and other mental states. According to Dennett (1981 [1998]):

Any object—or…any system—whose behavior is well-predicted by this strategy is in the fullest sense of the word a believer. What it is to be a true believer is to be an intentional system, a system whose behavior is reliably and voluminously predictable via the intentional strategy. (p. 15)

This view emphasizes that:

all there is to being a true believer is being a system whose behavior is reliably predictable via the intentional strategy, and hence all there is to really and truly believing that p (for any proposition p) is being an intentional system for which p occurs as a belief in the best (most predictive) interpretation. (p. 29)

This gives us what Dennett characterizes as a “milder sort of realism”.

One obvious objection to this view is that it is not just the systems we intuitively take to be intentional that are reliably predictable from the intentional stance. Thermostats, rocks, stars, and so forth are also reliably predictable from the intentional stance. Dennett’s response to this is to observe that taking the intentional stance toward the latter objects is gratuitous; the “physical” or “design” stances instead suffice, whereas, in the case of the systems we intuitively take to be intentional, the intentional stance is practically indispensable. This response has the curious result that whether a system is intentional is relative to who is trying to predict the behavior of the system. Presumably, Laplace’s Demon, capable of predicting any future state from any prior state under purely physical descriptions, would have no need for the intentional stance. For this reason, many have interpreted the view as a form of anti-realism, albeit not an eliminativist form of anti-realism: there are no attitudes, but it is useful to speak as if there were. For a view similar to Dennett’s in a number of respects but with a more decidedly realist slant, see Lynne Rudder Baker’s (1995) exposition of a view she calls “practical realism”.

Common to all the views of folk psychology mentioned above is the idea that folk psychology is for explaining and predicting. Such is the dominant view in the early 21st century. But it should be mentioned that on other views, folk psychology is not so much for reading minds as it is for shaping minds. It is, in other words, a normative-regulative system. For developments of this line, see, for example, Hutto 2008, McGeer 2007, Zawidzki 2013.

5. More on Particular Propositional Attitudes, or on Related Phenomena

Much of the foregoing has concerned belief and desire, as they are generally regarded to be the paradigmatic propositional attitudes, and especially belief, as is customary in much of the philosophy of mind and language. However, it must be noted that not every attitude fits easily under the general characterization of the attitudes discussed in Section 1, which divides the attitudes into the belief-like and desire-like. For example, imagining (make-believing) and entertaining, as well as delusions of various kinds, pose difficulties for this general classification, as they do not quite match either direction of fit associated with belief and desire. Other controversial candidate propositional attitudes, even if clear in their direction of fit, include judging, knowing, perceiving, and intending. What follows is a brief discussion of each.

a. Imagining

Typically, one does not imagine what is (or, what one takes to be) the case. One imagines what is not (or, what one takes not to be) the case. One imagines, for example, what the world will be like in 2100 if climate change is not meaningfully addressed. One imagines what it was like (or might have been like) to be a prehistoric hunter-gatherer. One imagines what the universe would have been like if one or another of the fundamental physical constants had been different. One imagines (or tries to imagine) what it would be like to be a bat, or to see a ripe tomato for the first time, or to taste Vegemite. One imagines that a particular banana is a telephone. One imagines flying by flapping one’s arms. One imagines that one’s best friend is a fairy. One imagines a weight and feather dropped from the Leaning Tower of Pisa. One imagines a replica of the Leaning Tower of Pisa made of Legos. One imagines listening to Charlie Parker live.

These examples seem to illustrate different kinds of imaginings, or imaginings with different kinds of contents: imagining an event or state of affairs, or that such and such is the case; imagining objects and experiences; counterfactual imaginings; and imaginings of the past and future. Whatever more we can say of them, these are, plausibly, attitudes of a kind. Moreover, they seem relevant to everything from philosophical and scientific thought experiments and the enjoyment of fiction to pretending and childhood make-believe. As such, imagining is of interest to a range of areas of research, including meta-philosophy, philosophy of science, aesthetics, and philosophy of mind.

The question that interests us here is whether imagining is a propositional attitude. Imagining that such and such is the case (that the banana is a telephone, that one can fly by flapping one’s arms, that one’s best friend is a fairy) seems very plausibly to be a kind of propositional attitude. However, it does not quite seem to be either a belief-like or a desire-like propositional attitude. Beliefs, remember (see Section 1), have a mind-to-world direction of fit; they aim at the truth, or at what is the case—that is, they aim to bring the content of the belief in accord with the world. Desires, by contrast, have a world-to-mind direction of fit; they aim to bring the world in accord with their contents. While imaginings are, like desires, typically about what is not the case, in imagining one does not aim to bring the world in accord with the content of the imagining.

If anything, imagining’s direction of fit is that of belief, though again its aim is not to get the world right. (The use of the expression “make-believe” is suggestive.) Perhaps imagining might be said to have mind-to-counterfactual-world direction of fit; though in this case it still might not be clear what ‘getting it right’ amounts to. Relatedly, although beliefs tend to cohere, there seems to be no requirement that imaginings cohere with one another and with what one believes (except, perhaps, to the extent that they presuppose some beliefs; for example, if you are to imagine an elephant, you must presumably have beliefs about what elephants look like). Another difference between imagination and belief emerges when we consider their relations to the will. It is widely agreed that we cannot directly decide what to believe; in other words, we do not have direct voluntary control of our beliefs, though we may have indirect control. The same, in fact, may be true of desire. By contrast, we seem to have direct voluntary control of at least some of our imaginings. (There do also seem to be involuntary imaginings.)

It could be that what might be called propositional imagining and believing lie on a continuum (Schellenberg 2013). Perhaps, in fact, there are some mental states which are something of a combination of the two (Egan 2009). It seems that much the same might be said of a number of other attitudes or other attitude-like phenomena, including entertaining, supposing, conceiving, and the like.

Imaginings of objects and experiences are perhaps not as obviously propositional attitudes. Nonetheless, such imaginings play an important role in our mental lives. Traditionally, they have been thought to involve images, taken to be rather like pictures (if we have in mind visual imagery), only in the mind. But the nature of such images has been the subject of much debate. For more on this, see the article on imagery and imagination.

b. Judging

Judging—though it often served as the focus of theorizing in Frege and Russell—might be better construed as an action or act (and so an event), rather than a state. Accordingly, some theorists distinguish between propositional attitudes and acts, where acts include not only judging, but also saying, asserting, entertaining, and hypothesizing. However, most theorists, it should be noted, do not draw this distinction. In fact, some identify occurrent beliefs and judgments, as Russell seems to do (see also, for example, Lycan 1988). The distinction may however be important to some views. For example, Hanks (2011, 2015) and Soames (2010, 2014) argue that propositional attitudes, like believing, are partly constituted by dispositions to perform certain propositional acts, like entertaining and judging; and that the contents of propositional attitudes can be identified with types of such acts. A primary motivation for this view is that it can constitute a solution to the problem of the cognitive accessibility of propositions, which the classical view (discussed in Section 2) gives rise to.

c. Knowing

Epistemologists distinguish between knowledge-that, knowledge-how, and knowledge-wh (which includes knowledge-who, -what, -when, -whether, and -why). Examples include knowing that Goldbach’s conjecture has not been proven, knowing how to construct a truth-table, and knowing who Glenn Gould is. (Knowing your uncle might be an example of yet another kind of knowledge. Compare the contrast between kennen and wissen in German, or connaître and savoir in French. Related to this is the distinction drawn by Russell between knowledge by acquaintance and knowledge by description.) Most theorists view knowledge-wh as a species of knowledge-that. On a commonly held view, knowledge-that involves propositions as objects: for example, the proposition that Goldbach’s conjecture has not been proven. Famously, Ryle (1946, 1949) argued that knowledge-how is an ability, construed as a complex of dispositions, and is not reducible to knowledge-that (characterized in the above manner). Others argue that the one is reducible to the other. For example, Stanley and Williamson (2001) argue that knowledge-how reduces to knowledge-that. While some argue that knowing-that is a sui generis propositional attitude (Williamson 2000), a more traditional view is that such states of knowledge are just belief states which meet certain additional extra-mental conditions (most obviously, being true and justified). Similar remarks go for other factive attitudes, including recognizing and discovering.

d. Perceiving

The question of whether perceiving, and seeing in particular, is a propositional attitude has inspired a voluminous literature in the epistemology of perception. According to certain classical views, for example sense-data theory, propositional content does not belong to perceptual states, which cannot therefore be propositional attitudes, but instead to the judgments or beliefs they occasion. On this view, perceptual states themselves, insofar as they are conscious, are composed of just raw feels or qualia; for example, the redness of a visual experience of a ripe tomato. Other theories hold that perceptual states have a kind of non-conceptual content (Evans 1982, Peacocke 2001, Crane 2009), while still others maintain that they must have conceptual or propositional content if they are to justify perception-based beliefs (McDowell 1994). These positions touch on many vexing issues at the intersection of epistemology, philosophy of mind, and cognitive science. For further discussion, see the article on Cognitive Penetrability of Perception and Epistemic Justification, in addition to those already linked to.

e. Intending

When you trip, sneeze, or blink, these are in some weak sense things you do. However, they are not, in a philosophically significant sense, actions you perform; they are not manifestations of your agency. It is less misleading to say that they are things that happen to you, or events of which you are subject, not agent. To distinguish between the events of which you are subject and those of which you are agent, most philosophers point to the involvement of intention. There are, however, many difficulties here.

To begin with, the way in which intention is a unified phenomenon, if it is, is not straightforward. Clearly, we can intend to thread the needle and pin the tail on the donkey. We can intentionally thumb our noses, swim the channel, and cross our eyes. We can move our fingers in a certain way, with the intention to tie our shoes. At first blush, it might seem that we have three distinct phenomena here: intending to act (prospective intention), acting intentionally, and acting with an intention (intention with which). We also have the suspicion that these phenomena are ultimately unified. After all, we use the term “intention” and its cognates to describe them all. Attempting to explain this unity has animated much of the philosophy of action in the late 20th and early 21st century.

In her very influential work Intention, Elizabeth Anscombe—partly influenced by the later Wittgenstein—identified intending to Φ (where “Φ” here stands in place of some action, for example: threading the needle) with Φ-ing intentionally and denied that intention is a mental state. This view has as a consequence that if, for example, you intend to visit Cairo in four years, you have already embarked. This consequence has struck many as implausible, though a view along these lines has been revived and given extensive defense by Michael Thompson (2008). Others, for example Davidson (1978), have claimed that one can intend to Φ even if one is not Φ-ing, intentionally or otherwise, and even if one is not currently performing any actions with the intention to Φ. In other words, there are cases of “pure intending” (for example, my intending to visit Cairo) and these are instances of prospective intention or intention for the future. Davidson also observed that the same intention may be present when the relevant actions are being performed, and this indicates that intention, whether prospective or present in the course of action, is the basic phenomenon. To the question of what is intention or intending, Davidson’s answer is that intention is a mental state, and an attitude more specifically. (The Anscombe-type view is usually paired with an anti-causal view of folk psychological explanation, while the Davidsonian view is usually paired with a causal view. See Section 1.) However, if intention is an attitude, it is still to be determined what kind of attitude it is and what (if anything) its object is; and answers vary widely.

Some hold that intention is a species of desire (for example, to intend to Φ is to desire to Φ), others that it is a species of belief (for example, to intend to Φ is to believe that one ought to Φ, or will Φ, or is Φ-ing), and still others that it is some combination of desire and belief (for example, the desire to Φ and a belief about how to go about Φ-ing). Finally, some argue that intention is a sui generis state—for example, something like a plan which controls behavior (Bratman 1987). In any case, if intentions are attitudes of some sort, there remains the question of whether they are more specifically propositional attitudes.

Above, it was noted that some theorists have held that attitudes generally are propositional attitudes and so in some way involve propositions—usually, with propositions as objects of the attitudes, where these latter are construed as relations of some kind. Unlike with, say, belief reports, which are naturally worded with that-clauses like “that the earth moves”, as featured in reports like “Galileo believes that the earth moves”, reports of intentions are typically more naturally worded without that-clauses. (On that-clauses, see Section 3.) For example, we typically say things like “I intend to pay off my debts this year”, not “I intend that I will have paid off my debts this year” or “I intend that I should pay off my debts this year”. Likewise, we say “I intend to sleep”, not “I intend that I will be sleeping” or “I intend that I should be sleeping”. This might suggest that the objects of intentions (if there are any) are not propositions but acts, activities, processes, events, or the like—in which case, intentions would not, strictly speaking, be propositional attitudes.

Detailed discussions of intention are to be found primarily in metaethics and philosophy of action, which rests at the intersection of metaethics and philosophy of mind, though an adequate account of intention would be foundational to research in other areas, including, for example, the philosophy of language, and in particular intention-based accounts of meaning and communication.

f. Non-Propositional Attitudes

Above, it was noted that reports of intentions are often most naturally worded without that-clauses. As it happens, the same is true of at least some reports of desires—desires being one of our paradigmatic propositional attitudes. Consider, for example, a report like “The baby desires to be held”. Here, instead of a that-clause we have a non-finite complement, “to be held”, which seems to designate not a proposition but an action. Of course, we do not see this just with desire reports. Consider: “You want to visit Cairo”, “Everyone wishes to live happily ever after”, “When he does that, he means to insult you”. Moreover, we sometimes have general terms and proper names instead of clauses following the attitude verb. For example: “Molly desires cake”, “Maxine desires Molly”. Similarly, “Jack fears dogs”, “Ibram likes chocolate ice cream”, “Jill loves Xi”, and so forth. In fact, we even see this with some belief reports. For example: “Sara believes Galileo”. If reports with these forms suggest that at least some attitudes are not propositional attitudes, that is because we are using the form of reports to guide our views about the nature of the things reported on (Ben-Yami 1997)—that is, in other words, we are using the reading off method (see Section 3).

Montague (2007), Buchanan (2012), and Grzankowski (2012) think that the reading off method reveals that there are, in addition to propositional attitudes, non-propositional (or objectual) attitudes, as ascribed, for example, by reports like “Desdemona loves Cassio”. Of course, there are competing views about the form of these reports, with some arguing that, despite surface appearances, these reports have the same form as reports like “Galileo believes that the earth moves” (Larson 2002). Also, the legitimacy of this method of reading off the nature of what is reported from the form and interpretation of reports has been challenged (again, see Section 3). Moreover, some arguments for the thesis that at least certain putative propositional attitudes, desires for example, are in fact not propositional attitudes are based not on linguistic observations but on comparative psychological and neuroscientific evidence (Thagard 2006). It could be that we need a finer-grained taxonomy.

g. Delusions

According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), a delusion is “a false belief based on incorrect inference about external reality that is firmly sustained despite what almost everyone else believes and despite what constitutes incontrovertible and obvious proof or evidence to the contrary” (p. 819). In other words, according to the DSM-5, delusions are pathological beliefs. These include, for example, Cotard delusion, the delusion that one is dead, and Capgras delusion, the delusion that one’s significant other has been replaced by an imposter.

At first blush, it is plausible that these delusions are indeed beliefs, however pathological. After all, people who have these delusions do sincerely assert that they are dead or that their significant other has been replaced by an imposter, and sincere assertions are typically expressions of belief. Moreover, these delusions have beliefs’ direction of fit, although, by the DSM-5 definition, they are false. In other respects, however, these and other delusions are unlike typical beliefs.

For one, they often do not cohere with the delusional subject’s other beliefs and behavior. For example, a person suffering from Capgras delusion may continue to live with the “imposter” and not seek out the “real” significant other who has supposedly been replaced. Additionally, delusions are often not sensitive to evidence in the way beliefs typically are, with regard to both their formation and their maintenance. For example, someone with Cotard delusion may have no perceptual evidence for their delusion and may maintain the delusion despite overwhelming evidence to the contrary. In these respects, delusions might be closer to imaginings than beliefs (Currie 2000). It might be best to say that delusions are a sui generis kind of propositional attitude, somewhat like belief and somewhat like imagining (Egan 2009). After all, it is often not the case that the delusion is completely severed from all non-verbal behavior and affect. Subjects with Cotard delusion, for example, may be very distressed by their delusion and may stop bathing or caring for themselves (Young and Leafhead 1996).

Delusions are not easy to place. On the one hand, reflection on delusions highlights respects in which many ordinary beliefs might themselves be considered delusional. Many sincerely professed religious, political, and philosophical beliefs, for example, might fail to cohere with one’s other beliefs and behavior or to be sensitive to evidence in the way other beliefs are. It is also very common to falsely believe, for example, that one is better than average at driving, that one’s children are smarter than average, and so on. It could be that delusions are not qualitatively different from many typical beliefs. Perhaps what marks off delusions from typical beliefs is the degree to which they depart from societal norms or norms of rationality.

On the other hand, the underlying mechanisms at work may be different. As we learn more about the brain, we are beginning to get plausible explanations of some delusions (for example Capgras) in terms of neuropathology. Perhaps not all delusions can be explained in this way. Some—for example, erotomania, the delusion that one is loved by someone, usually in a position of power or prestige—might be motivated in a way amenable to psychological explanations at the personal level, much as the typical beliefs noted above may be. This in turn may point up the fact that delusion and self-deception, a vexing and philosophically fraught topic in its own right, may overlap.

As their inclusion in the DSM would suggest, delusions are typically thought to be maladaptive—and certainly they often are. However, recent work by philosopher Lisa Bortolotti (2015, 2020) and others points up the fact that delusions as defined above, and irrational beliefs more generally (including optimistic biases, distorted memory-based beliefs, and confabulated explanations), can be not only psychologically adaptive but even, in Bortolotti’s terminology, “epistemically innocent”: they can help one to secure epistemic goods one would not otherwise enjoy (as may be seen, Bortolotti argues, in cases of anosognosia, in which the subject is unable to understand or perceive their illness or disorder). Epistemically innocent beliefs (including overestimations of our capacities) appear to be widespread in the non-clinical population—a fact which, if more widely appreciated, could modify potentially stigmatizing attitudes to clinical cases. Work in this area may also have significance for treatment.

h. Implicit Bias

Recent psychological research in implicit social cognition suggests that people often make judgments and act in ways contrary to their professed beliefs—ways that evidence implicit bias, including gender, racial, and prestige bias. For example, changing only the name at the top of a CV from a stereotypically White name to a stereotypically Black name, or from a stereotypically male name to a stereotypically female name, results on average in lower evaluations of the CV (Bertrand and Mullainathan 2004, Moss-Racusin et al. 2012). In a striking 1982 study, Peters and Ceci took papers already published in top peer-reviewed journals, altered the names and institutional affiliations associated with the papers, and resubmitted them to the same journals: 89% were rejected (only 8% being detected as resubmissions). Perhaps the most well-known and widely discussed measure of implicit bias is the Implicit Association Test (IAT), a reaction-time measure that requires subjects to sort images or words into categories as quickly and accurately as possible. Among the findings with this test are that subjects find it easier to sort stereotypically White names with positive words (like “morality” and “safe”) and stereotypically Black names with negative words (like “bad” and “murder”) than the other way round. Subjects are also more likely to mistake a harmless tool for a weapon when a Black face is flashed on the screen than when a White face is flashed on the screen (Payne 2001). Members of the stigmatized groups, it has been found, are not immune to these biases. Such findings raise many questions regarding the metaphysics of implicit bias and its epistemological ramifications (regarding self-knowledge, for example), as well as questions of ethics.

Given that we often read off people’s beliefs from their behaviors, it is natural to count implicit biases as beliefs. On the other hand, as noted in the preceding subsection, it is often thought that beliefs generally cohere with one another and are responsive to reasons. They are also often accessible to consciousness. In these respects, implicit biases appear to be unlike beliefs. One may be utterly oblivious to one’s implicit biases. One may find these biases disturbing, abhorrent, or irrational. However, it appears that sincerely believing that they are not tracking facts about the world is not enough to get rid of them.

On considerations like these, Tamar Gendler (2008b) has proposed that implicit biases fall into a category distinct from beliefs, namely what she calls “aliefs”, which are “associative, automatic, and arational” and typically “affect-laden and action-generating” mental states (p. 557). An example of a less ethically charged alief would be alieving that the Grand Canyon Skywalk is unsafe (as manifest, for example, in trembling, clutching the handrails, shrieking), while firmly believing that it is safe (Gendler 2008a).

Some theorists have cast doubt on the idea that aliefs form a separate psychological kind (Nagel 2012, Mandelbaum 2013). Some argue that implicit measures like IAT and explicit measures (for example, surveys) alike measure beliefs; it is just that one’s implicit and explicit beliefs, which may be contradictory, are activated (that is, come online) in different contexts. In other words, we have compartmentalized or fragmented systems of belief (Lewis 1982, Egan 2008). Proponents of this kind of view generally adopt a computational-representational view of belief. On a dispositionalist view, beliefs are rather like traits; and just as one can be kind of a jerk and kind of not, in different respects and in different contexts, perhaps one can sort of believe that the Skywalk is safe and sort of not: one imperfectly realizes or exemplifies a certain dispositional profile or stereotype associated with the relevant beliefs. Perhaps the phenomenon of belief, in brief, allows for in-between cases (Schwitzgebel 2010).

There are still other proposals besides, and criticisms of each. The literature here is vast and growing, and the issues are many and complex (see, for example, the collections by Brownstein and Saul 2016a, 2016b). This subsection offers only a gesture at that literature.

The foregoing should give some indication of the significance of the propositional attitudes, and so theories thereof, to a great many intellectual projects, across various domains of philosophy and allied disciplines. It should also give an idea of how contested the positions are. It is only somewhat recently that much work has been devoted to discussion of the complexities of particular attitudes, or related phenomena, as opposed to the attitudes generally. Indeed, much of the general discussion of the attitudes has drawn on consideration of belief alone. The literature on particular attitudes, or related phenomena, such as the above, is growing rapidly. It is an exciting, interdisciplinary area of research. To be sure, there is here much more work to be done.

6. References and Further Reading

  • American Psychiatric Association. 2013. Diagnostic and Statistical Manual of Mental Disorders (5th ed.). Arlington, VA: American Psychiatric Association.
  • Anscombe, G.E.M. 1957 [1963]. Intention. Harvard University Press.
  • Baker, L.R. 1987. Saving Belief: A Critique of Physicalism. Princeton University Press.
  • Baker, L.R. 1995. Explaining Attitudes: A Practical Approach to the Mind. Cambridge University Press.
  • Benacerraf, P. 1973. Mathematical Truth. Journal of Philosophy 70 (19): 661-79.
  • Ben-Yami, H. 1997. Against Characterizing Mental States as Propositional Attitudes. Philosophical Quarterly 47 (186): 84-9.
  • Bertrand, M., and Mullainathan, S. 2004. Are Emily and Greg More Employable than Lakisha and Jamal? American Economic Review 94 (4): 991–1013.
  • Bortolotti, L. 2015. The Epistemic Innocence of Motivated Delusions. Consciousness and Cognition 33: 490-99.
  • Bortolotti, L. 2020. The Epistemic Innocence of Irrational Beliefs. Oxford University Press.
  • Braddon-Mitchell, D. and Jackson, F. 1996. The Philosophy of Mind: An Introduction. Blackwell.
  • Bratman, M. 1987. Intention, Plans, and Practical Reason. Harvard University Press.
  • Brownstein, M. & J. Saul (eds.). 2016a. Implicit Bias and Philosophy: Volume I, Metaphysics and Epistemology. Oxford University Press.
  • Brownstein, M. & J. Saul (eds.). 2016b. Implicit Bias and Philosophy: Volume 2, Moral Responsibility, Structural Injustice, and Ethics. Oxford University Press.
  • Buchanan, R. 2012. Is Belief a Propositional Attitude? Philosophers’ Imprint 12 (1): 1-20.
  • Burge, T. 2010. Origins of Objectivity. Oxford University Press.
  • Camp, E. 2007. Thinking with Maps. Philosophical Perspectives 21 (1): 145-82.
  • Carnap, R. 1947 [1956]. Meaning and Necessity: A Study in Semantics and Modal Logic. The University of Chicago Press.
  • Carnap, R. 1959. Psychology in Physical Language. In A.J. Ayer (ed.). Logical Positivism. Free Press.
  • Carruthers, P. 1996. Simulation and Self-Knowledge: A Defence of Theory-Theory. In P. Carruthers and P. Smith (eds.), Theories of Theories of Mind. Cambridge University Press: 22-38.
  • Chisholm, R. 1957. Perceiving: A Philosophical Study. Cornell University Press.
  • Chomsky, N. 1980. Rules and Representations. Columbia University Press.
  • Chomsky, N. 1981. Lectures on Government and Binding. Mouton.
  • Chomsky, N. 1992. Explaining Language Use. Philosophical Topics 20 (1): 205-31.
  • Churchland, P.M. 1981. Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy 78: 67-90.
  • Churchland, P.S. 1986. Neurophilosophy: Toward a Unified Science of the Mind/Brain. MIT Press.
  • Crane, T. 2001. Elements of Mind: An Introduction to the Philosophy of Mind. Oxford University Press.
  • Crane, T. 2003. The Mechanical Mind: A Philosophical Introduction to Minds, Machines and Mental Representation. Routledge.
  • Crane, T. 2009. Is Perception a Propositional Attitude? Philosophical Quarterly 59 (236): 452-469.
  • Currie, G. 2000. Imagination, Delusion and Hallucination. Mind & Language 15 (1): 168-83.
  • Davidson, D. 1977. The Method of Truth in Metaphysics. In H. Wettstein et al. (eds.), Midwest Studies in Philosophy, II: Studies in Metaphysics. University of Minnesota Press.
  • Davidson, D. 1978. Intending. Philosophy of History and Action 11: 41-60.
  • Dennett, D. 1981. True believers: The Intentional Strategy and Why It Works. In A.F. Heath (Ed.), Scientific Explanation: Papers Based on Herbert Spencer Lectures given in the University of Oxford. Clarendon Press. [Reprinted in Dennett, D. 1987 [1998]. The Intentional Stance. MIT Press: 13-35]
  • Dennett, D. 1987 [1998]. The Intentional Stance. MIT Press.
  • Dennett, D. 1991. Real Patterns. The Journal of Philosophy 88 (1): 27-51.
  • Dretske, F. 1988. Explaining Behavior: Reasons in a World of Causes. MIT Press.
  • Dretske, F. 1989. Reasons and Causes. Philosophical Perspectives 3: 1-15.
  • Dretske, F. 1993. Mental Events as Structuring Causes of Behavior. In Mental Causation, J. Heil and A. Mele (eds.). Oxford University Press: 121-136.
  • Egan, A. 2008. Seeing and Believing: Perception, Belief Formation and the Divided Mind. Philosophical Studies 140 (1): 47–63.
  • Egan, A. 2009. Imagination, Delusion, and Self-Deception. In T. Bayne and J. Fernandez (eds.), Delusion and Self-Deception: Motivational and Affective Influences on Belief-Formation. Psychology Press: 263-80.
  • Evans, G. 1982. The Varieties of Reference. Oxford University Press.
  • Fodor, J. 1975. The Language of Thought. Harvard University Press.
  • Fodor, J. 1978. Propositional Attitudes. The Monist 61 (4): 501-23.
  • Fodor, J. 1987. Psychosemantics: The Problem of Meaning in The Philosophy of Mind. MIT Press.
  • Fodor, J. 1990. A Theory of Content and Other Essays. MIT Press.
  • Frege, G. 1892 [1997]. On Sinn and Bedeutung (trans. M. Black). In M. Beaney (ed.), The Frege Reader. Blackwell: 151-71.
  • Frege, G. 1918 [1997]. Thought (trans. P. Geach and R.H. Stoothoff). In M. Beaney (ed.), The Frege Reader. Blackwell: 325-45.
  • Geach, P. 1957. Mental Acts: Their Content and Their Objects. Routledge and Kegan Paul.
  • Gendler, T. 2008a. Alief and Belief. The Journal of Philosophy 105 (10): 634–63.
  • Gendler, T. 2008b. Alief in Action (and Reaction). Mind & Language 23 (5): 552–85.
  • Goldman, A. 1989. Interpretation Psychologized. Mind & Language 4 (3): 161-85.
  • Goldman, A. 1993. The Psychology of Folk Psychology. Behavioral and Brain Sciences 16 (1): 15-28.
  • Goldman, A. 2006. Simulating Minds. Oxford University Press.
  • Gordon, R. 1986. Folk psychology as simulation. Mind & Language 1 (2): 158-71.
  • Grzankowski, A. 2012. Not All Attitudes Are Propositional. European Journal of Philosophy 3: 374-91.
  • Hanks, P. 2011. Structured Propositions as Types. Mind 120 (477): 11-52.
  • Hanks, P. 2015. Propositional Content. Oxford University Press.
  • Harman, G. 1973. Thought. Princeton University Press.
  • Heal, J. 1986. Replication and Functionalism. In J. Butterfield (Ed.), Language, Mind, and Logic. Cambridge University Press: 135-50.
  • Humberstone, I.L. 1992. Direction of Fit. Mind 101 (401): 59-83.
  • Hutto, D. 2008. Folk Psychological Narratives. MIT Press.
  • King, J. 2002. Designating Propositions. Philosophical Review 111 (3): 341-71.
  • Kripke, S. 1972. Naming and Necessity. Harvard University Press.
  • Larson, R. 2002. The Grammar of Intensionality. In G. Preyer and G. Peter (eds.), Logical Form and Language. Oxford University Press: 228-62.
  • Larson, R. and Ludlow, P. 1993. Interpreted Logical Forms. Synthese 95 (3): 305-55.
  • Lewis, D. 1970. How to Define Theoretical Terms. Journal of Philosophy 67 (13): 427-46.
  • Lewis, D. 1972. Psychophysical and Theoretical Identifications. Australasian Journal of Philosophy 50 (3): 249-58.
  • Lewis, D. 1982. Logic for Equivocators. Noûs 16 (3): 431–441.
  • Lycan, W. 1988. Judgment and Justification. Cambridge University Press.
  • Mandelbaum, E. 2013. Against Alief. Philosophical Studies 165 (1): 197-211.
  • Marr, D. 1982. Vision. W.H. Freeman.
  • Matthews, R. 2007. The Measure of Mind: Propositional Attitudes and Their Attribution. Oxford University Press.
  • McDowell, J. 1994. Mind and World. Harvard University Press.
  • McGeer, V. 2007. The Regulative Dimension of Folk-Psychology. In D. Hutto and M. Ratcliff (eds.), Folk-Psychology Reassessed. Springer.
  • Millikan, R. 1993. White Queen Psychology and Other Essays for Alice. MIT Press.
  • Moltmann, F. 2017. Cognitive Products and the Semantics of Attitude Verbs and Deontic Modals. In F. Moltmann and M. Textor (eds.), Act-Based Conceptions of Propositional Content. Oxford University Press.
  • Montague, M. 2007. Against Propositionalism. Noûs 41 (3): 503-18.
  • Morton, A. 1980. Frames of Mind: Constraints on the Common-Sense Conception of the Mental. Oxford University Press.
  • Moss-Racusin, C. et al. 2012. Science Faculty’s Subtle Gender Biases Favor Male Students. Proceedings of the National Academy of Sciences of the United States of America 109 (41): 16474-9.
  • Nagel, J. 2012. Gendler on Alief. Analysis 72 (4): 774–88.
  • Nagel, T. 1974. What Is It Like to Be a Bat? Philosophical Review 83 (4): 435-50.
  • Nichols, S. and Stich, S. 2003. Mindreading. Oxford University Press.
  • Payne, B. 2001. Prejudice and Perception: The Role of Automatic and Controlled Processes in Misperceiving a Weapon. Journal of Personality and Social Psychology 81 (2): 181–92.
  • Peacocke, C. 2001. Does Perception Have Nonconceptual Content? The Journal of Philosophy 98 (5): 239-64.
  • Peters, D.P. and Ceci, S.J. 1982. Peer-Review Practices of Psychological Journals: The Fate of Published Articles, Submitted Again. Behavioral and Brain Sciences 5 (2): 187-255.
  • Porot, N. and Mandelbaum, E. 2021. The Science of Belief: A Progress Report. WIREs Cognitive Science 12 (2): e1539.
  • Prior, A.N. 1971. Objects of Thought. Clarendon Press.
  • Putnam, H. 1963. Brains and Behavior. In R.J. Butler (ed), Analytical Philosophy: Second Series. Blackwell: 1-19.
  • Putnam, H. 1975. The Meaning of “Meaning”. Minnesota Studies in the Philosophy of Science 7: 131-93.
  • Quilty-Dunn, J. and Mandelbaum, E. 2018. Against Dispositionalism: Belief in Cognitive Science. Philosophical Studies 175 (9): 2353-72.
  • Quine, W.V. 1948. On What There Is. Review of Metaphysics 2 (5): 21-38.
  • Quine, W.V. 1960. Word and Object. MIT Press.
  • Ramsey, W., Stich, S. and Garon, J. 1990. Connectionism, Eliminativism and The Future of Folk Psychology. Philosophical Perspectives 4: 499-533.
  • Richard, M. 2006. Propositional Attitude Ascription. In M. De Witt and R. Hanley (eds.), The Blackwell Guide to The Philosophy of Language. Blackwell.
  • Russell, B. 1903. The Principles of Mathematics. Allen and Unwin.
  • Russell, B. 1905. On Denoting. Mind 14 (56): 479-93.
  • Russell, B. 1918. The Philosophy of Logical Atomism, Lectures 1-2. The Monist 28 (4): 495-527.
  • Ryle, G. 1946. Knowing How and Knowing That: The Presidential Address. Proceedings of the Aristotelian Society 46 (1): 1-16.
  • Ryle, G. 1949. The Concept of Mind. University of Chicago Press.
  • Salmon, N. 1986. Frege’s Puzzle. Ridgeview.
  • Schiffer, S. 2003. The Things We Mean. Oxford University Press.
  • Schellenberg, S. 2013. Belief and Desire in Imagination and Immersion. Journal of Philosophy 110 (9): 497-517.
  • Schwitzgebel, E. 2002. A Phenomenal, Dispositional Account of Belief. Noûs 36 (2): 249-75.
  • Schwitzgebel, E. 2010. Acting Contrary to Our Professed Beliefs, or The Gulf between Occurrent Judgment and Dispositional Belief. Pacific Philosophical Quarterly 91 (4): 531-53.
  • Schwitzgebel, E. 2013. A Dispositional Approach to Attitudes: Thinking Outside the Belief Box. In N. Nottelmann (ed.), New Essays on Belief. Palgrave Macmillan: 75–99.
  • Searle, J. 1983. Intentionality. Cambridge University Press.
  • Searle, J. 1992. The Rediscovery of Mind. MIT Press.
  • Searle, J. 2001. Rationality in Action. MIT Press.
  • Sehon, S. 1997. Natural Kind Terms and the Status of Folk Psychology. American Philosophical Quarterly 34 (3): 333-44.
  • Sellars, W. 1956. Empiricism and the Philosophy of Mind. Minnesota Studies in the Philosophy of Science 1: 53-329.
  • Shah, N. and Velleman, D. 2005. Doxastic Deliberation. Philosophical Review 114 (4): 497-534.
  • Soames, S. 2010. What Is Meaning? Princeton University Press.
  • Soames, S. 2014. Cognitive Propositions. In J. King, S. Soames and J. Speaks (eds.), New Thinking About Propositions. Oxford University Press.
  • Stalnaker, R. 1984. Inquiry. MIT Press.
  • Stanley, J. and Williamson, T. 2001. Knowing How. The Journal of Philosophy 98 (8): 411-44.
  • Sterelny, K. 1990. The Representational Theory of Mind. Blackwell.
  • Stich, S. 1983. From Folk Psychology to Cognitive Science. MIT Press.
  • Stich, S. 1996. Deconstructing the Mind. Oxford University Press.
  • Thagard, P. 2006. Desires Are Not Propositional Attitudes. Dialogue 45 (1): 151-6.
  • Thompson, M. 2008. Life and Action. Harvard University Press.
  • Wedgwood, R. 2002. The Aim of Belief. Philosophical Perspectives 16: 267-97.
  • Weiskopf, D. and Adams, F. 2015. An Introduction to the Philosophy of Psychology. Cambridge University Press.
  • Williamson, T. 2000. Knowledge and Its Limits. Oxford University Press.
  • Wittgenstein, L. 1953. Philosophical Investigations. Wiley-Blackwell.
  • Young, A.W. and Leafhead, K. 1996. Betwixt Life and Death: Case Studies of the Cotard Delusion. In P. Halligan and J. Marshall (eds.) Method in Madness: Case Studies in Cognitive Neuropsychiatry. Psychology Press: 147-71.
  • Zawidzki, T. 2013. Mindshaping: A New Framework for Understanding Human Social Cognition. MIT Press.

Author Information

David Lindeman
Email: david.lindeman@georgetown.edu
Georgetown University
U. S. A.

Epistemic Modality

Epistemic modality is the kind of necessity and possibility that is determined by epistemic constraints. A modal claim is a claim about how things could be or must be given some constraints, such as the rules of logic (logical modality), moral obligations (deontic modality), or the laws of nature (nomic modality). A modal claim is epistemic when these constraints are epistemic in nature, meaning roughly that they are related to knowledge, justification, or rationality. An epistemic possibility is something that may be true, given the relevant epistemic constraints (for example, “Given what we know about the weather, it might rain tomorrow”), while an epistemic necessity is something that must be true given the relevant epistemic constraints (for example, “I don’t see Julie’s car in the parking lot, so she must have gone home”).

The epistemic modal status of a proposition is determined by some body of information, such as an individual or group’s knowledge, a set of data, or the available evidence. A proposition that is not ruled out or eliminated by the information is epistemically possible, whereas a proposition that is in some sense guaranteed by the information is epistemically necessary. As an analogy, consider a detective investigating a crime. Initially, there is little evidence, and so there are many suspects. As more evidence is acquired, suspects are gradually ruled out—it could not have been the butler, since he was in the gazebo at the time of the crime—until only one remains, who must be guilty. Similarly, an epistemic agent may start with limited evidence that leaves open many epistemic possibilities. As the agent acquires more evidence, various possibilities are ruled out until some propositions are epistemically necessary and so must be true.
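The elimination picture in the detective analogy can be given a minimal sketch in code, treating a body of information as the set of possibilities it has not yet ruled out (the suspect names and function names below are illustrative, not part of any standard formal semantics):

```python
# Toy model: epistemic possibility and necessity as quantification
# over the possibilities not yet eliminated by the evidence.

def epistemically_possible(prop, live):
    """prop is possible iff it holds in at least one uneliminated possibility."""
    return any(prop(w) for w in live)

def epistemically_necessary(prop, live):
    """prop is necessary iff it holds in every uneliminated possibility."""
    return all(prop(w) for w in live)

# The detective's initial possibilities: any suspect might be guilty.
live = {"butler", "gardener", "chauffeur"}
is_butler = lambda w: w == "butler"

print(epistemically_possible(is_butler, live))   # True: nothing ruled out yet

# Evidence: the butler was in the gazebo at the time of the crime.
live -= {"butler"}
print(epistemically_possible(is_butler, live))   # False: ruled out

# Further evidence eliminates the gardener; one possibility remains.
live -= {"gardener"}
print(epistemically_necessary(lambda w: w == "chauffeur", live))  # True
```

On this picture, gaining evidence only shrinks the set of live possibilities, and a proposition becomes epistemically necessary exactly when it holds in all the possibilities that remain.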

This article presents the distinctive features of epistemic modality and surveys different answers to the following questions about epistemic modality:

(1) Whose information determines the modal status of a proposition?
(2) How does information determine the modal status of a proposition?
(3) How is epistemic modality related to knowledge?

It concludes with a discussion of alternatives to the standard semantics for epistemic modal language.

Table of Contents

  1. Epistemic Modality and Other Modalities
    1. Modal Puzzles
  2. Whose Information Determines the Epistemic Modal Status of a Proposition?
    1. Context-Dependence
    2. Relativism
  3. How Does Information Determine the Epistemic Modal Status of a Proposition?
    1. Negation
    2. Entailment
    3. Probability
    4. Dismissing
  4. How is Epistemic Modality Related to Knowledge?
    1. Knowledge as the Relevant Information
    2. Epistemic Modality and Knowledge
    3. Concessive Knowledge Attributions
  5. Alternatives to the Standard View of Epistemic Modals
    1. Embedded Epistemic Modals
    2. Hedging
    3. Other Views of Epistemic Modals
  6. References and Further Reading

1. Epistemic Modality and Other Modalities

An epistemic modal is an epistemic use of a modal term, such as “might”, “necessarily”, or “possible”. On the standard view of epistemic modals, sentences in which these modals are the main operator are used to make epistemic modal claims that attribute an epistemic modal status, either possibility or necessity, to a proposition. For example, (1)-(8) can all be used to make epistemic modal claims:

(1) Maybe it will rain tomorrow.

(2) Terry may not do well on the test.

(3) Perhaps my grandmother is in Venezuela.

(4) The special theory of relativity might be true, and it might be false.

(5) Aristotle might not have been a philosopher.

(6) Given the angle of the blow, the killer must have been over six feet tall.

(7) Sam must be on her way by now.

(8) For all I know, there is no solution.

On this standard view, (1)-(5) can be used to attribute epistemic possibility using the different epistemic modals “maybe”, “may”, “perhaps”, and “might”. (1), for example, attributes epistemic possibility to the proposition that it will rain tomorrow, while (4) attributes epistemic possibility both to the proposition that the special theory of relativity is true and to its negation. (6) and (7), on the other hand, use the epistemic modal “must” to attribute epistemic necessity to the propositions that the killer was over six feet tall and that Sam is on her way, respectively. (8) is also naturally read as expressing an epistemic modal claim attributing epistemic possibility to the proposition that there is no solution, even though no modal term is explicitly used.

The distinguishing characteristic of epistemic modal claims is that their truth is determined by epistemic factors. The epistemic modal status of a proposition is determined by some body of information, and not by logical, metaphysical, or scientific laws. An epistemic possibility is not, for example, some way the world could have been, given the actual laws of physics. Instead, it is a way the world might yet be, given some body of information, such as what we currently know. So, (4), if read as a claim about epistemic possibility, does not assert that the truth and falsehood of the special theory of relativity are both compatible with the laws of physics. It says only that both the truth and falsehood of the theory are individually compatible with some information, such as what the speaker knows. Similarly, an epistemic necessity is not some way the world had to be, given the constraints of logic. Instead, it is a way the world must in fact be, given, for example, what we have discovered about it. So, an utterance of (7) does not assert that some logical contradiction or metaphysical impossibility follows from the assumption that Sam is not on her way. It says only that, given some information, such as what we know about Sam’s schedule, she must in fact be on her way.

As a result, epistemic modal claims are about the actual world in a way that some other modal claims are not. An epistemic possibility is not an alternative way the world might have been had things gone differently, but a way the world might yet turn out to be given the relevant information. (5), if read as expressing metaphysical possibility, is true just in case there is some metaphysically possible world in which Aristotle is not a philosopher. So, it is about the various alternative ways that the world could have been, asserting that at least one of them includes Aristotle not being a philosopher. But (5) is ambiguous and could also be used to make a claim about epistemic possibility: that Aristotle not being a philosopher in this world is left open by the relevant information. If, for example, there were not enough information to determine whether Aristotle had ever done any philosophy in the actual world, it would be epistemically possible that Aristotle was not a philosopher. Unlike the metaphysical possibility claim, this claim is not about an alternative way that the world could have been, but instead about how the past might turn out to have actually been.

Similarly, an epistemic necessity is a way the world must in fact be, but not a way the world had to be—that is, an epistemic necessity might very well not be a metaphysical or logical necessity (and vice versa). The claim that it is metaphysically necessary that 2+2=4, for example, is true just in case there are no metaphysically possible worlds in which the sum of 2 and 2 is something other than 4. So, this claim asserts that, in all possible ways the world could have been, 2+2=4. On the other hand, an epistemic necessity claim made using (6) is true just in case the killer being over six feet tall is in some sense guaranteed by the angle of the blow. This claim is therefore not about how things had to be in all of the various ways the world could have been, but merely about how things must be given our information about how the world in fact is.

Another feature that distinguishes epistemic modality from other kinds of modality is that, because the epistemic modal status of a proposition is determined by epistemic constraints, it can vary over time and between subjects. So, I may truly utter (1) today, but having seen no rain by the end of the day tomorrow, I would have different information and could no longer truly say that rain on that day is possible. Similarly, not knowing my grandmother’s travel itinerary, I could truly say (3), but my cousin who has just found out that our grandmother’s trip to Venezuela was cancelled could not. As a result, it is common to say that a proposition is epistemically possible or necessary for some person or group (for example, “it was possible for Aristotle that the Earth was at the center of our solar system”), meaning it is possible or necessary on that person or group’s information. In contrast, logical, metaphysical, and nomic modalities do not vary across time and between subjects.

a. Modal Puzzles

Distinguishing epistemic modality from other modalities is a key step in solving some philosophical puzzles. Consider Goldbach’s conjecture:

(GC) Every even integer greater than 2 is the sum of two primes.

There is a sense in which (GCT) and (GCF) both seem true:

(GCT) It is possible that Goldbach’s conjecture is true.

(GCF) It is possible that Goldbach’s conjecture is false.

But the truth value of Goldbach’s conjecture, like that of other mathematical claims, is standardly taken to hold of necessity: the conjecture is either necessarily true or necessarily false. So, if (GCT) and (GCF) are expressions of mathematical possibility, they generate a contradiction. If Goldbach’s conjecture is possibly true, then it is necessarily true, and if it is possibly false, then it is necessarily false. From (GCT) and (GCF), then, it would follow that Goldbach’s conjecture is both necessarily true and necessarily false.
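Schematically, with g for Goldbach’s conjecture and the modal operators read as mathematical necessity and possibility, the reasoning runs as follows (a sketch, assuming the principle that the conjecture’s truth value holds of necessity):

```latex
\[
\begin{aligned}
\text{(i)}   \quad & \Diamond g \rightarrow \Box g
             && \text{if possibly true, then necessarily true} \\
\text{(ii)}  \quad & \Diamond\neg g \rightarrow \Box\neg g
             && \text{if possibly false, then necessarily false} \\
\text{(iii)} \quad & \Diamond g \wedge \Diamond\neg g
             && \text{(GCT) and (GCF), read mathematically} \\
\text{(iv)}  \quad & \Box g \wedge \Box\neg g
             && \text{from (i)--(iii); a contradiction}
\end{aligned}
\]
```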

This result can be avoided by distinguishing mathematical possibility from epistemic possibility. Although (GCT) and (GCF) cannot both be true if they are read as claims about mathematical possibility, they can both be true when read as claims about epistemic possibility. According to one view of epistemic possibility, for example, because we do not yet know whether Goldbach’s conjecture is true or false, both options are epistemically possible for us. However, to avoid contradiction, this must not entail that they are both mathematically possible.

A puzzle involving co-referring names can similarly be solved by appeal to epistemic possibility. According to some views of names (for example, Kripke (1972)), the proposition “Hesperus is identical to Phosphorus” expresses a metaphysically necessary truth, since “Hesperus” and “Phosphorus” are both names for the planet Venus. Nevertheless, a person who has no reason to think that “Hesperus” and “Phosphorus” refer to the same thing could truly say “it is possible that Hesperus is not identical to Phosphorus”. But if it is (metaphysically) necessarily true that Hesperus is identical to Phosphorus, then it cannot be (metaphysically) possible that Hesperus is not identical to Phosphorus. The contradiction can be avoided by understanding “it is possible that Hesperus is not identical to Phosphorus” as a statement of epistemic, rather than metaphysical, possibility. It is epistemically possible, relative to the information of someone who does not know what these names refer to, that Hesperus is not identical to Phosphorus, even though it is not metaphysically possible. (See also Modal Illusions.)

The solutions to these puzzles demonstrate two ways in which epistemic possibility is distinct from other kinds of possibility. It is broader in the sense that a logical, mathematical or metaphysical impossibility may be epistemically possible, as in the case of Goldbach’s conjecture or its negation (whichever is false). Similarly, a proposition can be epistemically necessary (for some subject), but not metaphysically, mathematically, or logically necessary. It is not metaphysically necessary that Descartes existed—he could have failed to exist. Nevertheless, it was epistemically necessary for Descartes that he existed—given his information, he must have existed. Epistemic possibility is also narrower in the sense that many logical, mathematical and metaphysical possibilities, such as that no human beings ever existed, are not epistemic possibilities. Epistemic necessity, too, is narrower, in that many logical, mathematical and metaphysical necessities are not epistemically necessary, such as yet-unproven theorems of logic. Because of these differences, epistemic modality “cuts across” logical, mathematical and metaphysical modalities, with the result that facts about epistemic modality cannot be inferred from facts about these other modalities, and vice versa.
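This cutting-across can be made concrete with a toy possible-worlds model. The sketch below is a simplified illustration, not a rendering of any particular theory in the literature: worlds are dictionaries of truth values, metaphysical modality quantifies over all worlds, and epistemic modality quantifies only over worlds compatible with a body of information.

```python
# Toy model: a "world" assigns truth values to atomic propositions.
worlds = [
    {"rain": True,  "killer_tall": True},
    {"rain": True,  "killer_tall": False},
    {"rain": False, "killer_tall": True},
    {"rain": False, "killer_tall": False},
]

def possible(prop, accessible):
    # A proposition is possible iff true in at least one accessible world.
    return any(w[prop] for w in accessible)

def necessary(prop, accessible):
    # A proposition is necessary iff true in every accessible world.
    return all(w[prop] for w in accessible)

# Metaphysical modality: every world is accessible.
# Epistemic modality: only worlds compatible with our information are.
info = [w for w in worlds if w["killer_tall"]]  # evidence rules out a short killer

print(necessary("killer_tall", worlds))  # False: metaphysically contingent
print(necessary("killer_tall", info))    # True: epistemically necessary
```

The same proposition is thus contingent on one accessibility relation and necessary on the other, which is the sense in which the two modalities come apart.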

2. Whose Information Determines the Epistemic Modal Status of a Proposition?

The epistemic modal status of a proposition is determined by some body of information, but it is not always specified which information is relevant. Phrases like “given the evidence presented today…”, “in view of the information we have…”, and “for all I know…” often indicate the relevant information for a particular epistemic modal claim. However, many sentences used to make epistemic modal claims lack this kind of clear indicator. Claim (1), for example, does not specify for whom or on which body of information it is possible that it will rain tomorrow. Since people have different information about the weather, the proposition that it will rain tomorrow may be possible on the information possessed by some people, but not on the information possessed by others. As a result, a complete theory of epistemic modal claims must have some mechanism for determining whose information is relevant in determining their truth.

a. Context-Dependence

According to some theories, the truth of an epistemic modal claim varies with features of the context in which it is made. Call these “context-dependent theories” of epistemic modal claims. On these views, facts about the context of assertion—that is, the situation in which the epistemic modal claim is spoken, written, or otherwise conveyed—affect the truth of the claim. The simplest kind of context-dependent theory is one in which the relevant information is the information possessed by the speaker:

(Speaker) “It might be that p” is true as tokened by the speaker S at time t if and only if p is epistemically possible on the information possessed by S at t.

According to (Speaker), whether an epistemic possibility claim expressed by “it might be that p” is true is determined by whether p is epistemically possible on the information possessed by the person who asserts that claim. Thus, the feature of the context that is relevant to determining the claim’s truth value is the information possessed by the speaker. If Paul says, for example, “it might be that God exists”, his claim is true just in case it is epistemically possible for Paul (that is, on the information that he possesses), at the time of his speaking, that God exists.

However, (Speaker) gives counterintuitive results in dialogues about epistemic modality. Suppose, for example, that Katie and Julia are discussing Laura’s whereabouts. Julia knows that Laura left on a flight for Hungary this morning, but Katie does not know that Laura has left. They then have the following discussion:

Katie: Laura might be in the living room.

Julia: No. She can’t be in the living room, because she left for Hungary this morning.

Katie: Oops. I guess I was wrong.

The problem for (Speaker) is that it is epistemically possible on Katie’s information at the beginning of the dialogue that Laura is in the living room. So, according to (Speaker), she speaks truly when she says “Laura might be in the living room”. To some, this seems false—since Julia knows that Laura is on her way to Hungary, Katie’s claim that she might be in the living room cannot be true. Furthermore, Katie seems right to correct herself at the end of the dialogue, but if (Speaker) is true, then this is a mistake. Even though Laura being in the living room is not possible on the information Katie has after talking to Julia, it was possible on her original information at the time that she spoke, which is all that is necessary to make her claim true, according to (Speaker).

This problem can be avoided by expanding the relevant information to include information possessed by people other than the speaker, as in:

(Audience) “It might be that p” is true as tokened by the subject S at time t if and only if p is epistemically possible on the combined information possessed by S and S’s audience at t.

On this view, because Julia is Katie’s audience, her information is also used in determining whether Katie’s claim is true. Since Julia knows that Laura is on her way to Hungary, it is not possible on her information that Laura is in the living room, making Katie’s initial epistemic possibility claim false and her later self-correction warranted.

In some cases, though, the epistemic modal status of a proposition is evaluated relative to the information of some party other than the speaker or their audience. One appropriate response to the question “Is it true that the universe might continue expanding forever?”, for example, is “I don’t know; only a scientist would know if that’s possible.” But if it is just the information of the speaker and their audience that determines the truth of epistemic possibility claims, this response is bizarre. All it would take to know whether the universe might continue expanding forever is to check the information possessed by those two parties. Here, though, it seems that the possibility of the universe’s continued expansion is being evaluated relative to the information possessed by the scientific community, which the speaker does not have access to. This suggests that, in some contexts, the relevant information is not the information of the speaker or their audience, but the information possessed by the members of some other community. Incorporating this idea gives the following kind of principle:

(Community) “It might be that p” is true as tokened by the subject S at time t if and only if p is epistemically possible on the information possessed by the members of the relevant community at t.

This view is still context-dependent, as which community is relevant is determined by contextual factors, such as the topic and purpose of the conversation. Note that the relevant community may often include just the speaker or just the speaker and their audience (as in Katie and Julia’s case). (Community) simply allows that in some contexts the information of the scientific community, for example, determines the truth of the epistemic modal claim.
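The progression from (Speaker) to (Community) can be summarized in a toy implementation of Katie and Julia’s case. Everything here is a simplifying assumption: information is a set of labeled propositions, and a proposition is ruled out just when its negation (written with a “not-” prefix) is in the set.

```python
def epistemically_possible(p, information):
    # p is possible on a body of information iff not-p is not included in it.
    return ("not-" + p) not in information

def might_speaker(p, speaker_info):
    # (Speaker): only the speaker's own information matters.
    return epistemically_possible(p, speaker_info)

def might_community(p, community_infos):
    # (Audience)/(Community): pool the information of all relevant parties.
    pooled = set().union(*community_infos)
    return epistemically_possible(p, pooled)

katie = set()                          # Katie has no relevant information
julia = {"not-laura-in-living-room"}   # Julia knows Laura left for Hungary

print(might_speaker("laura-in-living-room", katie))             # True
print(might_community("laura-in-living-room", [katie, julia]))  # False
```

On (Speaker), Katie’s claim comes out true; once Julia’s information is pooled in, as (Audience) and (Community) require, it comes out false, matching Katie’s self-correction.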

Other examples suggest that even (Community) is insufficiently flexible to account for all epistemic modal claims, as they are sometimes evaluated relative to information that no one actually possesses. The classic example of this comes from Hacking (1967):

Imagine a salvage crew searching for a ship that sank a long time ago. The mate of the salvage ship works from an old log, makes some mistakes in his calculations, and concludes that the wreck may be in a certain bay. It is possible, he says, that the hulk is in these waters. No one knows anything to the contrary. But in fact, as it turns out later, it simply was not possible for the vessel to be in that bay; more careful examination of the log shows that the boat must have gone down at least thirty miles further south. The mate said something false… but the falsehood did not arise from what anyone actually knew at the time.

This kind of case leads some to include not only the information possessed by the relevant community, but also information that members of that community could acquire through investigation. No one in any community has information that rules out that the hulk is in the bay, so that cannot be why the mate’s claim is false. There is, however, an investigation the mate (or any member of the community) could make, a more careful examination of the log, that would yield information that rules out this possibility. If this is why the mate’s claim is false, then the truth of epistemic modal claims must be determined not only by the information possessed by the relevant community, but also by information that is in some way available to that community. Not just any way of gaining information can count, though, as for almost any false proposition, there will be some possible way of acquiring information that rules it out. Since many false propositions are epistemically possible, then, there must be some restriction on which kinds of investigations matter. There are many options for formulating this restriction, but one option is that it is also determined by context:

(Investigation) “It might be that p” is true as tokened by the subject S at time t if and only if:
(i) p is epistemically possible on the information possessed by the members of the relevant community at t, and
(ii) there is no relevant way for the members of that community to acquire information on which p is not epistemically possible.

According to (Investigation), which ways of acquiring information can affect the truth of epistemic modal claims is determined by features of the context in just the same way that the community is. Depending on the speaker’s background information, motivations, and so forth, different ways of acquiring information will be relevant. Since the mate has just checked the log and knows that he is basing his judgment on the data in the log, checking the log is a relevant way of acquiring information, and so his claim is false. By contrast, a student could truly say during an exam “the answer might be ‘Gettier’, but I’m not sure”. Although there are ways, such as reading through the textbook, for the student to acquire information that rules out that answer, none of those ways is relevant in the context of an exam. (Investigation) is thus in principle able to account for cases where information that no one has seems to determine the truth of epistemic modal claims.

b. Relativism

 According to relativist theories of epistemic modal claims, these claims are true relative to the context in which they are evaluated, rather than to the context in which they are asserted. Whereas context-dependent theories allow for different tokens of the same type of epistemic claim to have different truth values, these relativist views allow for the same token of some epistemic claim to have different truth values when evaluated in different contexts. The primary motivation for this kind of view is that a single token of an epistemic modal claim can be judged true in one context but false in another, and both judgments can seem correct. This happens in eavesdropping cases like the following:

Mara, Ian, and Eliza are playing a game of hide-and-seek. Mara is hiding in the closet while Ian and Eliza are searching for her. Ian and Eliza are discussing where Mara might be, and Ian says “She might be in the kitchen, since we haven’t checked there yet”. Listening from the closet, though, Mara knows that Ian is wrong—she is most definitely not in the kitchen.

The puzzle is that Ian’s reasoning seems perfectly good; any room which they have not checked yet is a room in which Mara might be hiding. Assuming the relevant community does not include Mara, no one in the relevant community has information that rules out that Mara is in the kitchen, and so it is epistemically possible on the community’s information that she is in the kitchen. However, Mara’s assessment that Ian is wrong also seems correct. She could not truthfully say “I know I am not in the kitchen, but Ian is right when he says that I might be in the kitchen”. So, the very same token of “She might be in the kitchen” seems true when evaluated by Ian and Eliza, but false when evaluated by Mara.

To accommodate this sort of intuition, relativists propose that epistemic modal claims are true only relative to the context in which they are assessed, resulting in a view like the following:

(Relativism) “It might be that p” is true as tokened by the subject S at time t1 and assessed by the agent A at time t2 if and only if p is epistemically possible on the information possessed by A at t2.

According to (Relativism), Ian’s claim that Mara might be in the kitchen is true when assessed by Ian and Eliza, since it is epistemically possible on their information that she is hiding in the kitchen. However, it is not true when assessed by Mara, since it is not epistemically possible on her information that she is in the kitchen; she knows full well that she is in the closet. On this kind of view, the token has no fixed truth value based on the context in which it is asserted; it has a truth value only relative to the context in which it is being assessed. On (Relativism), the feature of the context of assessment that determines this truth value is the information possessed by the person doing the assessing, but other relativist views may include other features such as the assessor’s intentions, the purpose of the assessment, information the assessor could easily obtain, and so forth.
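On this picture, a single token has no fixed truth value; it is re-evaluated by each assessor. A minimal sketch of (Relativism) for the hide-and-seek case, using a hypothetical encoding in which a proposition is ruled out when its negation (marked with a “not-” prefix) is in the assessor’s information:

```python
def might_true_relative(p, assessor_info):
    # (Relativism): the same token gets a truth value only relative
    # to the information of whoever is assessing it.
    return ("not-" + p) not in assessor_info

token = "mara-in-kitchen"
ian_and_eliza = set()             # they have not checked the kitchen
mara = {"not-mara-in-kitchen"}    # Mara knows she is in the closet

print(might_true_relative(token, ian_and_eliza))  # True for Ian and Eliza
print(might_true_relative(token, mara))           # False for Mara
```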

One way of addressing this puzzle within the confines of a context-dependent theory is to include Mara in the relevant community, such that her knowledge that she is not in the kitchen makes Ian’s claim false, regardless of what Ian and Eliza know. This would allow the proponent of a context-dependent theory to say that Ian’s claim is strictly false, but still conversationally appropriate, as he does not know about the information that makes it false. Generalizing this strategy gives an implausible result, however, as there would be equally good reasons to include in the relevant community anyone who will ever consider a claim of epistemic possibility. As a result, any claim of the form “it might be that p”, where p will at some point be discovered by someone to be false, is false. If we know this, then it is almost always inappropriate to assert that p might be true, since we know that if p is ever discovered to be false, then our assertion will have been false. So, if epistemic possibility claims are commonly appropriate to assert, as they seem to be, there is a reason to doubt the context-dependent account of this case.

3. How Does Information Determine the Epistemic Modal Status of a Proposition?

Theories of epistemic modality also differ in how a proposition must be related to the relevant information in order to have a given epistemic modal status. Even if it is agreed that a proposition is epistemically possible for a subject S just in case it is not ruled out by what S knows, for example, there remains the question of what it takes for S’s knowledge to rule out a proposition.

a. Negation

 The simplest view of this relation is that a proposition is possible on a body of information just in case the information does not include the negation of that proposition. If the relevant information is a subject’s knowledge, for example, this yields:

(Negation) p is epistemically possible for a subject S if and only if S does not know that not-p.

So, if Bozo knows that he is at the circus, then it is not epistemically possible for him that he is not at the circus, whereas if he does not know that the square root of 289 is 17, then it is possible for him that it is not.

A difficulty for this sort of view involves a proposition that is intuitively ruled out by what someone knows, even though that person does not explicitly know the proposition’s negation. Suppose, for example, that Holmes knows that Adler has stolen his pipe. Holmes is perfectly capable of deducing from this that someone stole his pipe, but he has not bothered to do so. So, Holmes has not formed the belief that someone stole his pipe. As a result, he does not know that someone stole the pipe. According to (Negation), then, it is still epistemically possible for Holmes that no one stole the pipe (that is, that it is not the case that someone stole the pipe), even though it is not epistemically possible for Holmes that Adler did not steal the pipe. This is problematic, as knowing that Adler stole the pipe seems sufficient to rule out that no one stole the pipe, as the former obviously entails the falsehood of the latter. So, S’s not knowing not-p is not sufficient for p to be epistemically possible for S.

b. Entailment

To accommodate this kind of case, some theories require only that the information include something that entails not-p, such as:

(Entailment) p is epistemically possible for a subject S if and only if nothing that S knows entails not-p.

This resolves the problem in the Holmes case, as Holmes knows something (that Adler stole the pipe) which entails that someone stole the pipe. So, it is not epistemically possible for Holmes that no one stole the pipe, regardless of whether Holmes has formed the belief that someone stole the pipe.
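The contrast between (Negation) and (Entailment) in the Holmes case can be modeled with a toy deductive closure. The encoding is entirely hypothetical: negation is a “not-” prefix, and entailment is a hand-listed relation standing in for full logical consequence.

```python
def neg(p):
    # Toggle the "not-" prefix to form a proposition's negation.
    return p[4:] if p.startswith("not-") else "not-" + p

knowledge = {"adler-stole-pipe"}
# Hand-listed entailments, standing in for logical consequence.
entailments = {"adler-stole-pipe": {"someone-stole-pipe"}}

def consequences(knowledge):
    # One-step closure of the knowledge set under the listed entailments.
    closed = set(knowledge)
    for k in knowledge:
        closed |= entailments.get(k, set())
    return closed

def possible_negation(p, knowledge):
    # (Negation): p is epistemically possible iff S does not know not-p.
    return neg(p) not in knowledge

def possible_entailment(p, knowledge):
    # (Entailment): p is possible iff nothing S knows entails not-p.
    return neg(p) not in consequences(knowledge)

p = "not-someone-stole-pipe"              # "no one stole the pipe"
print(possible_negation(p, knowledge))    # True: Holmes never formed the belief
print(possible_entailment(p, knowledge))  # False: his knowledge entails its negation
```

On (Negation) the proposition that no one stole the pipe remains possible for Holmes, which is the counterintuitive result; (Entailment) closes the gap by consulting what his knowledge entails.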

However, views like (Entailment) face problems involving logically and metaphysically necessary propositions. On the assumption that logically and metaphysically necessary propositions are entailed by any body of information, their negations will be epistemically impossible for any subject on this kind of view. If Goldbach’s conjecture is false, for example, then any subject’s knowledge entails the negation of Goldbach’s conjecture. Nevertheless, it is epistemically possible for many subjects that Goldbach’s conjecture is true. So, S not knowing anything that entails not-p cannot be necessary for p to be epistemically possible for S.

Another potential problem is that requiring the entailment of not-p to rule out p seems to result in too many epistemic possibilities. For example, if the detective knows that fingerprints matching the butler’s were found on the gun that killed the victim, that powder burns were found on the butler’s hands, that reliable witnesses testified that the butler had the only key to the room where the body was found, and that there is surveillance footage that shows the butler committing the murder, this would still be insufficient to rule out the butler’s innocence according to (Entailment), since none of these facts strictly entails that the butler is guilty. Similarly, if the relevant information is not a subject’s knowledge but instead her foundational beliefs and/or experiences, then very few propositions will not be epistemically possible for a given subject. With the exception of necessary truths and propositions about my own mental states, none of the propositions I believe is entailed by my experiences or foundational beliefs (for more, see Fallibilism). As a result, if this information must entail not-p in order to rule out p as an epistemic possibility, nearly all contingent propositions will be epistemically possible for every subject. Because of this, any subject could truly assert “given my evidence, I might be standing on the moon right now”, which is a prima facie problem for this kind of view.

c. Probability

One way of weakening the conditions necessary to rule out a proposition is to analyze epistemic possibility in terms of probability:

(Probability) p is epistemically possible for a subject S if and only if the probability of p given what S knows is greater than or equal to x, where x is some threshold of probability between 0 and 1.

As long as x is greater than 0, this kind of view allows for p to be ruled out even when S’s knowledge does not entail not-p. If the probability of p given S’s knowledge is greater than 0 but less than x, there is still some chance (given what S knows) that p is true, but it does not follow that p is epistemically possible for S. Note, however, that, for any understanding of probability that obeys the Kolmogorov Axioms, (Probability) will face the same problem with necessary falsehoods that (Entailment) does. On any such understanding, the probability of a logically necessary falsehood is 0, and so no logically and metaphysically necessary falsehoods can be epistemically possible.
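A toy numerical version of (Probability) for the butler case, conditioning an invented distribution (all the numbers and labels are made up purely for illustration):

```python
# Invented joint distribution: (verdict, evidence, probability weight).
worlds = [
    ("butler-guilty",   "prints-match", 0.98),
    ("butler-innocent", "prints-match", 0.01),
    ("butler-innocent", "no-match",     0.01),
]

def prob_given(p, evidence):
    # P(p | evidence), computed by conditioning the toy distribution.
    total = sum(w for (_, e, w) in worlds if e == evidence)
    hit = sum(w for (v, e, w) in worlds if e == evidence and v == p)
    return hit / total

def possible_prob(p, evidence, x=0.05):
    # (Probability): p is epistemically possible iff P(p | evidence) >= x.
    return prob_given(p, evidence) >= x

print(possible_prob("butler-innocent", "prints-match"))  # False
print(possible_prob("butler-guilty", "prints-match"))    # True
```

With the threshold x set above zero, the butler’s innocence is ruled out even though its conditional probability is nonzero, which is exactly the flexibility (Probability) adds over (Entailment).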

d. Dismissing

More complex theories about the “ruling-out relation” include Huemer’s (2007) proposal, which requires, among other things, that S have justification adequate for dismissing p in order for S to rule out p. “Dismissing” is meant to capture particularly strong disbelief in p or “disbelieving [p] and regarding the question as settled” (p. 132). Since, according to Huemer, the degree of justification adequate for dismissing a proposition varies with context, an epistemic possibility claim can be true in one context, but false in another. So, on this view, epistemic modal claims are context-sensitive even when the relevant information is specified, in a way that parallels the context-sensitivity of “know” argued for by epistemic contextualists. For example, on this kind of contextualist view, standards for dismissing may be low in ordinary contexts, but high when confronted with a skeptic. If so, then a subject can truly assert “it is not the case that I might be the victim of an evil demon” in an ordinary context, and also truly assert “I might be the victim of an evil demon” when confronted with a skeptic.

4. How is Epistemic Modality Related to Knowledge?

Since the modality under discussion is epistemic, it is natural to suppose that it is closely related to knowledge. This would account for the common use of “for all I know” and “for all anyone knows” to attribute epistemic possibility, as well as providing a straightforward explanation of what is epistemic about epistemic modality. It would also account for the apparent relevance of what might and must be true to what we know. There are, however, several different accounts of this relation.

a. Knowledge as the Relevant Information

One proposal is that knowledge is the relevant type of information that determines the epistemic modal status of a proposition. Whatever the correct theory of the ruling out relation, then, the following would be true:

p is epistemically possible for a subject S if and only if p is not ruled out by what S knows.

However, there are at least two problems with this sort of view. First, a subject may fail to know something for reasons that are intuitively irrelevant to the modal status of the proposition in question. Let q be a proposition such that if S knew q, then S’s knowledge would rule out p, and suppose that S satisfies every condition for knowledge of q except that S does not believe q. This may be because S lacks a concept required to believe q, has a psychological flaw that prevents her from believing q, or simply has not gotten around to forming the belief that q. These kinds of reasons for not believing q do not seem to affect what is epistemically possible for S, and yet if epistemic possibility is understood in terms of knowledge, they do.

The second problem is that epistemic modal claims are sometimes assessed relative to a body of information that no one actually knows. A computer hard drive, for example, may contain a tremendous amount of data, more than anyone could possibly know. Referencing such a drive, a person could assert “given the information on this drive, system X might contain a planet that would support life”. This is an epistemic modal claim that may be true or false, but its truth value cannot be determined by what anyone knows, since no one knows all of the data on the drive. As a result, the epistemic modal status of a proposition must at least sometimes be determined by a type of information other than knowledge.

An alternative proposal is that epistemic modality is determined by evidence. What this view amounts to depends on one’s theory of evidence. Evidence may include publicly available evidence, such as the data on a drive, the records in a log, or the results of an experiment. Evidence may also be understood as a subject’s personal evidence, consisting of experiences and other mental states (see Evidentialism).

b. Epistemic Modality and Knowledge

 Whether or not knowledge is the information that determines epistemic modality, many epistemological views connect knowledge to epistemic modality. In particular, ruling out the epistemic possibility that not-p is claimed by many to be a necessary condition for knowing that p:

(K1) S knows that p only if not-p is not epistemically possible for S.

Others also accept the potentially stronger claim that knowledge requires ruling out the epistemic possibility of any proposition incompatible with p:

(K2) S knows that p only if there is no q such that:
(i) q entails not-p, and
(ii) q is epistemically possible for S.

One motivation for this kind of connection between epistemic possibility and knowledge is the idea that epistemic necessity just is knowledge, such that p is epistemically necessary for S just in case S knows that p. On the assumption that epistemic necessity and possibility are duals, in the sense that a proposition is epistemically necessary just in case its negation is not epistemically possible, and vice versa (see Modal Logic), this would entail (K1).
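The duality assumption, and the route from it to (K1), can be stated compactly in standard modal notation (the identification of epistemic necessity with knowledge is the view under discussion, not a settled result):

```latex
% Duality of epistemic necessity and possibility:
\Box p \leftrightarrow \neg\Diamond\neg p
\qquad
\Diamond p \leftrightarrow \neg\Box\neg p

% If epistemic necessity just is knowledge, so that \Box_S p \equiv K_S p,
% then (K1) follows directly:
K_S\, p \;\rightarrow\; \neg\Diamond_S\neg p
```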

A second motivation is the intuitive idea that knowledge requires the exhaustion of alternative possibilities, that in order to know that p, one must perform a thorough enough investigation to exclude any alternatives to p. For a detective to know that the butler is guilty, for example, she must rule out all of the other suspects. In doing so, she would rule out every possibility in which the butler was not guilty, thereby satisfying the consequent of (K2). If knowledge requires this kind of elimination of alternatives, then, there is good reason to accept (K2).

The main reason to doubt principles like (K1) and (K2) is their apparent inconsistency with fallibilism about knowledge, the view that some or all of our knowledge has inconclusive justification. If our justification for p is inconclusive, then there is some chance, given that justification, that not-p is true. But this seems to commit us to saying that not-p might be true and is therefore epistemically possible. So, if (K1) is true, then we do not know p after all. Since conclusive justification for our beliefs is very rare, applying this reasoning generally has the implausibly skeptical consequence that we have very little knowledge of the world.

c. Concessive Knowledge Attributions

A related issue is the apparent incoherence of Concessive Knowledge Attributions (“CKAs”) in which a subject claims to know something while admitting the (epistemic) possibility of error. (9), for example, sounds odd:

(9) I know that I own a cat, but I might not own a cat.

Furthermore, (9) seems to be in some way self-defeating—admitting the epistemic possibility that the speaker does not own a cat seems like an admission that she does not in fact know that she owns a cat. (10) has similar problems:

(10) I know that I own a cat, but I might not own any animals.

As long as the speaker knows that all cats are animals, asserting (10) seems problematic in roughly the same way as asserting (9) does. The second conjunct seems to commit the speaker to denying the first. An account of the relationship between epistemic possibility and knowledge must therefore give some explanation of the apparent tension in CKAs like (9) and (10).

The most straightforward account of the oddness of CKAs is that they are self-contradictory and therefore false. If (K1) is true, then sentences of the form “S knows that p” and “not-p is epistemically possible for S” are mutually inconsistent. On this kind of view, (9) seems odd and self-defeating because its conjuncts are inconsistent with each other—if the speaker knows that he owns a cat, then it is not epistemically possible for him that he does not own a cat. If (K2) is true, then sentences of the form “S knows that p” and “q is epistemically possible for S” (where q entails not-p) are also mutually inconsistent. So, on this kind of view (10) seems odd and self-defeating for just the same reason. If the speaker knows that she owns a cat, then according to (K2) no proposition that entails that she does not own a cat is epistemically possible for her. Since not owning an animal entails not owning a cat, then, it cannot be epistemically possible for the speaker that she does not own any animals.

On Lewis’s (1996) account, CKAs are not strictly self-contradictory, but they can never be truly asserted. For Lewis, “S knows that p” is true just in case S’s evidence rules out all not-p possibilities that are not properly ignored. Which possibilities are properly ignored varies with the conversational context, such that “S knows that p” may be true in one context, but false in another in which fewer propositions are properly ignored. As a result, there may be contexts in which “S knows that p” and “it might be that q” would be true, even though q entails not-p, so long as q is one of the not-p possibilities that is properly ignored in that context. So, strictly speaking, these sentences are not mutually inconsistent.

However, there are rules that govern when a possibility is properly ignored in a context, one of which is the Rule of Attention. This rule entails that any not-p possibility that is explicitly mentioned is not properly ignored, since it is not ignored at all. Because of this, conjunctions of the form “S knows that p, but it might be that q”, where q is a not-p possibility, cannot be truly asserted. Mentioning the not-p possibility q prevents it from being properly ignored. So, if q is not ruled out by S’s evidence, then “S knows that p” is false. This accounts for the tension in (9) and (10), as mentioning the epistemic possibilities that the speaker does not own a cat or does not own any animals is sufficient to prevent them from being properly ignored. This makes the speaker’s claim that she knows she owns a cat false, which is why these CKAs seem self-defeating, even though they are not strictly inconsistent.

Other views, such as that of Dougherty & Rysiew (2009), hold that CKAs are often true, but conversationally inappropriate to assert. On their view, p is epistemically possible for a subject just in case the subject’s evidence (consisting of her mental states) does not entail not-p. So, nearly all contingent propositions are epistemically possible. Because of this, mentioning that a contingent proposition is epistemically possible would be a strange thing to do in most conversations, akin to noting the obvious truth that there is a metaphysical possibility that one’s beliefs are false. As a result, on this view, asserting that a proposition is epistemically possible pragmatically implicates something more, such as that one has some compelling reason for taking seriously the possibility that not-p, and so one is not confident that p. As a result, CKAs like (9) and (10) are often true, as they assert that the speaker knows that p and does not have entailing evidence for p. However, they are conversationally inappropriate to assert. Unless the speaker has some good reason to suppose that she does not own a cat, it is inappropriate to assert that she might not. However, if she does have such a reason, then she should not say that she knows that she owns a cat, because doing so implicates confidence in and adequate evidence for the proposition that she owns a cat, which are incompatible with having that kind of reason.

5. Alternatives to the Standard View of Epistemic Modals

The standard view of epistemic modals introduced in section 1 holds that epistemic modals like “might” and “must” are used to express epistemic modal claims attributing either epistemic possibility or epistemic necessity to propositions. This standard view is committed to two important theses, each of which has been challenged.

First, on the standard view, epistemic modals affect the semantic content of sentences. So, a sentence of the form “it might be that p” differs in its semantic content from simply “p”. For example, an utterance of proposition (2), “Terry may not do well on the test”, does not simply express the proposition that Terry will not do well on the test. Instead, it expresses the epistemic modal claim that the proposition that Terry will do well on the test is epistemically possible on the relevant information. This difference in meaning yields a difference in truth conditions, such that it is possible for someone to truly assert (2) even if Terry will in fact do well on the test (if, for example, the speaker does not know whether Terry will do well).

Second, sentences containing epistemic modals typically serve to describe the world, in the sense that they describe some proposition as having some particular modal status. Thus, whatever effect epistemic modals have on the meaning of sentences, they typically result in the expression of a descriptive claim about the world that can be evaluated for truth.

a. Embedded Epistemic Modals

The most significant challenge to the standard view is that epistemic modal sentences behave strangely when embedded in other sentences, which the standard view does not predict. As Yalcin (2007) first pointed out, conjunctions including epistemic modals yield unusual results when embedded in other kinds of sentences. Consider a case in which it is raining outside, but you have not looked out of the window. For you, then, it is epistemically possible that it is not raining, even though it is in fact raining. Nevertheless, the conjunction of “it is raining” and “it might not be raining” sounds odd when embedded in certain kinds of sentences. For example, the imperative sentence (11) sounds odd:

(11) Suppose that it is raining and it might not be raining.

(11) seems to give a command that is in some way defective; the conjunction in question cannot be coherently supposed. However, this seemingly cannot be because the conjuncts “it is raining” and “it might not be raining” are logically inconsistent. If they were, then “it might not be raining” would entail “it is not raining”, but the mere epistemic possibility that it is not raining seemingly cannot entail that it is in fact not raining. So, on the standard view, there is no obvious reason that (11) should be defective, as it simply asks you to suppose that two compatible claims are both true.

Similarly, (12) sounds odd:

(12) If it is raining and it might not be raining, then it is raining.

This oddness is unexpected, since, given the usual semantics for conditionals and the standard view of epistemic modals, (12) should be trivially true. Any material conditional of the form “If A and B, then A” should be obviously true, and yet the truth value of (12) is not obvious. This is not because (12) seems false, but because there seems to be something wrong with the antecedent of (12). On the standard view, though, there is no obvious reason that this should be the case. Each conjunct expresses a claim about the world, and whatever claim is expressed by the second conjunct, the consequent clearly follows from the first conjunct alone. In response to these puzzles, Yalcin (2007) develops a semantics according to which sentences like “it is raining and it might not be raining” really are strictly contradictory, but this is not the only way to account for the oddness of sentences like (11) and (12).
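The standard-view verdict on (12) can be checked mechanically. Whatever claim B the epistemic modal conjunct expresses about the world, “if A and B, then A” is true on every truth-value assignment, as a small sketch (with B left entirely schematic) confirms:

```python
from itertools import product

def material_conditional(antecedent, consequent):
    # "If A then C" is false only when the antecedent is true
    # and the consequent false.
    return (not antecedent) or consequent

# "If A and B, then A" comes out true on all four assignments,
# regardless of what B says.
tautology = all(material_conditional(a and b, a)
                for a, b in product([True, False], repeat=2))
print(tautology)  # True
```

This is precisely why the felt defectiveness of (12) is puzzling for the standard view: the sentence has the logical form of a tautology.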

b. Hedging

An alternative to the standard view is that the modals of epistemic possibility (“may”, “might”, “perhaps”, etc.) are used to “hedge” or express reduced confidence about the expressed proposition, rather than to attribute any modal status. As Coates (1983, 131) describes this kind of view: “MAY and MIGHT are the modals of Epistemic Possibility, expressing the speaker’s lack of confidence in the proposition expressed”. On this kind of view, epistemic modals do not affect the semantic content of a sentence but are instead used to indicate the speaker’s uncertainty about the truth of that content. For example, on Schnieder’s (2010) view, (2) and (2′) have the same semantic content:

(2) Terry may not do well on the test.

(2′) Terry will not do well on the test.

However, while a speaker who utters (2′) thereby asserts that Terry will not do well on the test, a speaker who utters (2) makes no assertion at all but instead engages in a different kind of speech act that presents that speaker as being uncertain about the proposition that Terry will not do well on the test. Thus, though the two sentences have the same semantic content, the epistemic modal in (2) results in an expression of the speaker’s uncertainty, rather than in an epistemic modal claim that describes a proposition as having some modal status. Views of this kind are therefore incompatible with both theses of the standard view, since epistemic modals do not affect the semantic content of a sentence, and sentences like (2) are not used to describe the world.

Hedging views can offer some account of the oddness of embedded epistemic modals, since on these views the speaker is using “may” or “might” to express uncertainty in situations in which it is inappropriate to do so. If, for example, it is only appropriate to suppose something that could be asserted, then (11) should sound odd, since “it might not be raining” cannot be used to make an assertion. Thus, hedging views seem to have some advantage over the standard view.

However, a significant objection to the underlying idea that epistemic modals do not affect semantic content, raised by Papafragou (2006), is that adding an epistemic modal to a sentence seems to change its truth conditions in many ordinary cases. Suppose, for example, that my grandmother is on vacation in South America, and I cannot recall her exact itinerary. On a hedging view, (3) and (3′) have the same semantic content, and thus the same truth conditions:

(3) My grandmother might be in Venezuela.

(3′) My grandmother is in Venezuela.

If my grandmother is in fact in Brazil, then (3′) is false. So, if epistemic modals do not affect truth conditions, then (3) must also be false. But since I cannot remember her itinerary, I seem to speak truly when I utter (3). This difference in truth values requires a difference in semantic content, contrary to what hedging views predict. Similarly, if epistemic modals do not affect semantic content, then the proposition expressed by claim (4) (that is, “the special theory of relativity might be true, and it might be false”) would be a contradiction.

This raises two problems. First, intuitively, a speaker could use (4) to assert something true (if, for example, she did not know whether the special theory of relativity was true or false). Second, if (4) is not used to assert anything but instead used to express the speaker’s uncertainty about the semantic content of the sentence, then an utterance of (4) would express uncertainty about the truth value of a contradiction. But, at least in ordinary circumstances, that would be a strange epistemic state for a speaker to express.

Defenders of hedging views have options for responding to these objections, however. Perhaps, for example, my utterance of (3) seems appropriate when my grandmother is in Brazil not because it is true, but because it is sincere—I am presenting myself as being uncertain that my grandmother is not in Venezuela, and in fact I am uncertain of that proposition. This would explain why an utterance of (3) can be intuitively correct in some sense, even though the only proposition expressed in that utterance is false. And perhaps (4) is not an expression of uncertainty about a contradiction but instead a combination of two different speech acts: one expressing uncertainty that the special theory of relativity is true and another expressing uncertainty that it is false.

c. Other Views of Epistemic Modals

Another alternative to the standard view is to accept the first thesis that epistemic modals affect semantic content but deny the second thesis that they are used descriptively to attribute some epistemic modal status to a proposition. For example, a view of this kind is considered in Yalcin (2011): “To say that a proposition is possible, or that it might be the case, is to express the compatibility of the proposition with one’s state of mind, with the intention of engendering coordination on this property with one’s interlocutor” (p. 312). On this view, (3) and (3′) do not have the same truth conditions, because (3) does not have truth conditions at all—it does not describe the world as being any particular way and so does not attribute any modal status to any proposition. Similarly, Willer’s (2013) dynamic semantics for epistemic modals does not assign truth conditions to epistemic modal claims but instead assigns them relations between information states, such that uttering (3) aims to change mere possibilities that are compatible with an agent’s evidence into “live” possibilities that the agent takes seriously in inquiry. On Swanson’s (2016) view, the content of an epistemic modal sentence is not a proposition but a constraint on credences, such that a speaker uttering (3) thereby advises their audience to adopt a set of credences that does not rule out or overlook the possibility that the speaker’s grandmother is in Venezuela. On all of these views, epistemic modals affect the semantic content of the sentences in which they occur, but the resulting contents are not propositions with truth values. Thus, though they each handle embedded epistemic modals differently, none of these views are committed to the same seemingly implausible verdicts about sentences like (11) and (12) as the standard view.
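The dynamic idea can be pictured with a minimal “test” semantics in the spirit of Veltman-style update semantics (an illustrative simplification of my own, not Willer’s or Swanson’s actual system): factual sentences shrink an information state by eliminating worlds, while “might p” tests the state, leaving it intact if some p-world survives and collapsing it to the empty state otherwise. The defective conjunction behind (11) and (12) then crashes any state:

```python
def update_fact(state, p):
    """Update with a factual claim: keep only the p-worlds."""
    return {w for w in state if p(w)}

def update_might(state, p):
    """Update with 'might p': pass the test iff some p-world remains."""
    return state if any(p(w) for w in state) else set()

raining = lambda w: w == "rain"
not_raining = lambda w: w != "rain"

state = {"rain", "no_rain"}               # open whether it is raining
state = update_fact(state, raining)       # "it is raining"
state = update_might(state, not_raining)  # "...and it might not be raining"
print(state)  # set(): the conjunction crashes every information state
```

Because the first conjunct removes exactly the worlds the second conjunct needs, no coherent body of information can incorporate the whole conjunction, which is one way of capturing why it cannot be supposed.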

6. References and Further Reading

  • Barnett, D. 2009. Yalcin on ‘Might’. Mind 118: 771-75.
    • A proposed solution to the embedding problem for epistemic modals.
  • Coates, J. 1983. The Semantics of Modal Auxiliaries. London: Croom Helm.
    • An account of the semantics of modals in English, including a discussion of hedging with epistemic modals.
  • DeRose, K. 1991. Epistemic Possibilities. The Philosophical Review 100: 581-605.
    • An overview of context-dependent accounts of epistemic modals, and a defense of (Investigation).
  • DeRose, K. 1998. Simple ‘Might’s, Indicative Possibilities and the Open Future. The Philosophical Quarterly 48: 67-82.
    • An argument that simple “might” and “possible” sentences are used to make epistemic modal claims.
  • Dougherty, T. and P. Rysiew. 2009. Fallibilism, Epistemic Possibility, and Concessive Knowledge Attributions. Philosophy and Phenomenological Research 78: 123-32.
    • Arguments for evidence being the relevant type of information, entailment being the relevant relation, and concessive knowledge attributions being typically true, but pragmatically inappropriate.
  • Egan, A. 2007. Epistemic Modals, Relativism, and Assertion. Philosophical Studies 133: 1-22.
    • A defense of relativism about epistemic modals, as well as a discussion of an objection to relativism based on the role of assertions.
  • Egan, A., J. Hawthorne, and B. Weatherson. 2005. Epistemic Modals in Context. In Contextualism in Philosophy, eds. G. Peter and P. Preyer. Oxford: Oxford University Press, 131-70.
    • An extended discussion of contextualism and a defense of relativism.
  • Hacking, I. 1967. Possibility. The Philosophical Review 76: 143-68.
    • On the salvage case as motivation for a view like (Investigation).
  • Hawthorne, J. 2004. Knowledge and Lotteries. Oxford: Oxford University Press.
    • On epistemic possibility and its relation to knowledge.
  • Hintikka, J. 1962. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Ithaca: Cornell University Press.
    • Development of a logic of epistemic modality, using a knowledge-based account of epistemic possibility.
  • Huemer, M. 2007. Epistemic Possibility. Synthese 156: 119-42.
    • Overview of problems for several different accounts of epistemic possibility, concluding in a defense of the dismissing view (section 3d).
  • Kripke, S. 1972. Naming and Necessity. Cambridge, MA: Harvard University Press.
    • Distinguishes epistemic from metaphysical modality using the Hesperus/Phosphorous example.
  • Lewis, D. 1996. Elusive Knowledge. Australasian Journal of Philosophy 74: 549-67.
    • Motivation for (K2), as well as Lewis’s account of CKAs.
  • MacFarlane, J. 2011. Epistemic Modals Are Assessment-Sensitive. In Epistemic Modality, eds. A. Egan and B. Weatherson. New York: Oxford University Press, 144-79.
    • Problems for context-dependent theories and a defense of relativism.
  • Moore, G. E. (ed.) 1962. Commonplace Book, 1919-1953. New York: Macmillan.
    • Notes on knowledge, epistemic possibility, and the standard view of epistemic modals.
  • Papafragou, A. 2006. Epistemic Modality and Truth Conditions. Lingua 116: 1688-702.
    • An explanation and critique of hedging views of epistemic modals.
  • Schnieder, B. 2010. Expressivism Concerning Epistemic Modals. The Philosophical Quarterly 60: 601-615.
    • An explanation and defense of a hedging view of epistemic modals.
  • Stanley, J. 2005. Fallibilism and Concessive Knowledge Attributions. Analysis 65: 126-31.
    • An argument that CKAs are self-contradictory and that something like (K1) holds.
  • Swanson, E. 2016. The Application of Constraint Semantics to the Language of Subjective Uncertainty. Journal of Philosophical Logic 45: 121-46.
    • An alternative to the standard view of epistemic modals formulated in terms of constraints on an agent’s credences.
  • Teller, P. 1972. Epistemic Possibility. Philosophia 2: 303-20.
    • Overview of some accounts of epistemic possibility, including the problem of necessary truths for entailment and probability views.
  • von Fintel, K. and A. Gillies. 2008. CIA Leaks. The Philosophical Review 117: 77-98.
    • A detailed overview of the motivations for context-dependence and relativism about epistemic modal claims.
  • Willer, M. 2013. Dynamics of Epistemic Modality. The Philosophical Review 122: 45-92.
    • A dynamic semantics for epistemic modals that rejects the standard view.
  • Wright, C. 2007. New Age Relativism and Epistemic Possibility: The Question of Evidence. Philosophical Issues 17: 262-283.
    • A series of objections to relativism and concerns about the motivations for it.
  • Yalcin, S. 2007. Epistemic Modals. Mind 116: 983-1026.
    • A presentation of the embedding problem for epistemic modals and a semantics designed to solve it.
  • Yalcin, S. 2011. Nonfactualism about Epistemic Modality. In Epistemic Modality, eds. A. Egan and B. Weatherson. New York: Oxford University Press, 295-333.
    • Arguments against the standard view of epistemic modals and development of a nonfactualist account.


Author Information

Brandon Carey
Email: brandon.carey@csus.edu
California State University, Sacramento
U. S. A.

Locke: Epistemology

John Locke (1632-1704), one of the founders of British Empiricism, is famous for insisting that all our ideas come from experience and for emphasizing the need for empirical evidence. He develops his empiricist epistemology in An Essay Concerning Human Understanding, which greatly influenced later empiricists such as George Berkeley and David Hume. This article uses Locke’s Essay to explain his criticism of innate knowledge and his empiricist epistemology.

The great divide in Early Modern epistemology is rationalism versus empiricism. The Continental Rationalists believe that we are born with innate ideas or innate knowledge, and they emphasize what we can know through reasoning. By contrast, Locke and other British Empiricists believe that all of our ideas come from experience, and they are more skeptical about what reason can tell us about the world; instead, they think we must rely on experience and empirical observation.

Locke’s empiricism can be seen as a step forward in the development of the modern scientific worldview. Modern science bases its conclusions on empirical observation and always remains open to rejecting or revising a scientific theory based on further observations. Locke would have us do the same. He argues that the only way of learning about the natural world is to rely on experience and, further, that any general conclusions we draw from our limited observations will be uncertain. Although this is commonly understood now, it was not obvious to Locke’s contemporaries. Locke was an enthusiastic supporter of the scientific revolution, and his empiricist epistemology can be seen as part of the same broader movement toward relying on empirical evidence.

Locke’s religious epistemology is also paradigmatic of the ideals of the Enlightenment. The Enlightenment is known as the Age of Reason because of the emphasis on reason and evidence. Locke insists that even religious beliefs should be based on evidence, and he tries to show how religious belief can be supported by evidence. In this way, Locke defends an Enlightenment ideal of rational religion.

The overriding theme of Locke’s epistemology is the need for evidence, and particularly empirical evidence. This article explains Locke’s criticism of innate knowledge and shows how he thinks we can acquire all our knowledge from reasoning and experience.

Table of Contents

  1. Criticism of Innate Ideas and Knowledge
    1. No Innate Ideas
    2. Empiricist Theory of Ideas
    3. No Innate Knowledge
      1. The Argument from Universal Consent
      2. The Priority Thesis
  2. Empiricist Account of Knowledge
    1. Types of Knowledge
    2. A Priori Knowledge
    3. Sensitive Knowledge
      1. The Interpretive Problem
      2. The Skeptical Problem
    4. The Limits of Knowledge
  3. Judgment (Rational Belief)
    1. Science
    2. Testimony
    3. Faith
  4. Conclusion
  5. References and Further Reading
    1. Locke’s Works
    2. Recommended Reading

1. Criticism of Innate Ideas and Knowledge

Many philosophers, including the Continental Rationalists, have thought that we are born with innate ideas and innate knowledge. Locke criticizes the arguments for innate ideas and knowledge, arguing that any innate ideas or knowledge would be universal, yet it is obvious from experience that not everyone has these ideas or knowledge. He also offers an alternative explanation, consistent with his empiricism, for how we come to have all our ideas and knowledge. So, he thinks, rationalists fail to prove that we have innate ideas or knowledge.

a. No Innate Ideas

Although Locke holds that all ideas come from experience, many of his contemporaries did not agree.

For example, in the Third Meditation, Descartes argues that the idea of an infinite and perfect God is innate. He argues that we cannot get the idea of an infinite God from our limited experience, and the only possible explanation for how we came to have this idea is that God created us so that we have the innate idea of God already in our minds. Other rationalists make similar arguments for other ideas. Following Noam Chomsky, arguments of this form are sometimes called Poverty of Stimulus Arguments.

Locke has two responses to the Poverty of Stimulus Arguments for innate ideas. First, Locke argues that some people do not even have the ideas that the rationalists claim are innate. For example, some cultures have never heard of the theistic conception of God and so have never formed this kind of idea of God (1.4.8). In reply, some might claim that the idea of God is in the mind even if we are not conscious of that idea. For example, Plato suggests we are born with the idea of equality but we become conscious of this idea only after seeing equal things and thus “recollect” the idea; Leibniz suggests innate ideas are “petites perceptions” that are present even though we do not notice them. However, Locke argues that saying an idea is “in the mind” when we are not aware of it is unintelligible. An idea is whatever we are aware of, and so if we are not aware of an idea, then it is not “in the mind” at all (1.2.5).

Second, the Poverty of Stimulus Argument claims that certain ideas cannot come from experience, but Locke explains how a wide variety of our ideas do come from experience. For example, in response to Descartes’ claim that the idea of God cannot come from experience, Locke explains how the idea of God can be derived from experience. First, we get the ideas of knowledge, power, and so forth by reflecting on ourselves (2.1.4). Second, we can take the idea of having some power and imagine a being that has all power, and we can take the idea of some knowledge and imagine a being that has all knowledge (2.23.33). In this way, we can use our ideas from experience to form an idea of an infinitely powerful and omniscient God. Since Locke can explain how we got the idea of God from experience, there is little reason to believe Descartes’ claim that the idea of God is innate.

b. Empiricist Theory of Ideas

Locke’s criticism of innate ideas would be incomplete without an alternative explanation for how we get the ideas we have, including the ideas that the rationalists claim are innate. This section, then, describes how Locke thinks we form ideas.

Locke famously says the mind is like a blank piece of paper and that it has ideas only by experience (Essay 2.1.2). There are two kinds of experience: sensation and reflection. Sensation is sense perception of the qualities of external objects. It is by sensation that we receive ideas such as red, cold, hot, sweet, and other “sensible qualities” (2.1.3). Reflection is the perception of “the internal operations of our minds” (2.1.2). Consider, for example, the experience of making a decision. We weigh the pros and cons and then decide to do x instead of y. Making the decision is an act of the mind. But notice that there is something it feels like to deliberate about the options and then decide what to do. That is what Locke means by reflection: it is the experience we have when we notice what is going on in our own minds. By reflection we come to have the ideas of “perception, thinking, doubting, believing, reasoning, knowing, willing, and all the different actings of our own minds” (2.1.4).

The central tenet of Locke’s empiricism is that all of our ideas come from one of these two sources. For many of our ideas, it is obvious how we got them from experience: we got the idea of red from seeing something red, the idea of sweet by tasting something sweet, and so on. But it is less obvious how other ideas come from experience. We can form some new ideas, such as the idea of a unicorn or the idea of a gold mountain, without ever having seen such an object in sense perception. Other abstract concepts, such as the idea of justice, seem like something we cannot observe. For Locke’s empiricism to be plausible, then, he needs to explain how we derived these kinds of ideas from experience.

Locke divides ideas into simple ideas and complex ideas. A simple idea has “one uniform appearance” and “enter[s] by the senses simple and unmixed” (2.2.1), whereas a complex idea is made up of several simple ideas combined together (2.12.1). For example, snow is both white and cold. The color of the snow has one uniform appearance (i.e., appearing white), and so white is one simple idea that is included in the complex idea of snow. The coldness of snow is another simple idea included in the idea of snow. The idea of snow is a complex idea, then, because it includes several simple ideas.

Locke claims that all simple ideas come from experience (2.2.2), but we can combine simple ideas in new ways. Having already gained from experience the idea of gold and the idea of a mountain, we can combine these together to form the idea of a gold-mountain. This idea depends on our past experience, but we can form the idea of a gold-mountain without ever seeing one. According to Locke’s empiricist theory of ideas, then, all of our complex ideas are combinations of simple ideas we gained from experience (2.12.2).

Abstract ideas also depend on experience. Consider first the abstract idea of white. We form the idea of white by observing several white things: we see that milk, chalk, and snow all have the same sensible quality and call that “white” (2.11.9). We form the idea of white by separating the ideas specific to milk (for example, being a liquid, having a certain flavor), or specific to snow (for example, being cold and fluffy), and attending only to what is the same: namely, being white. We can do the same process of abstraction for complex ideas. For example, we can see a blue triangle, a red triangle, and so on. By focusing on what is the same (having three straight sides) and setting aside what is different (for example, the color, the angles) we can form an abstract idea of triangle.
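Locke’s two mechanisms, combination and abstraction, can be pictured with a toy model (a loose illustration of my own, not Locke’s apparatus, with the particular simple-idea labels chosen for the example): represent an idea as a set of simple ideas, form complex ideas by combining sets, and form abstract ideas by retaining only what several ideas share:

```python
# Ideas as sets of simple ideas (an illustrative simplification).
milk  = {"white", "liquid", "sweetish"}
chalk = {"white", "solid", "powdery"}
snow  = {"white", "cold", "fluffy"}

# Abstraction: set aside what is specific to each idea, keep what is shared.
abstract_white = milk & chalk & snow
print(abstract_white)  # {'white'}

# Combination: a new complex idea built from simple ideas already acquired,
# formed without ever perceiving such an object.
gold = {"yellow", "shining", "heavy"}
mountain = {"large", "rocky", "elevated"}
gold_mountain = gold | mountain
```

The model makes vivid why, on Locke’s account, even ideas of never-observed objects still depend on experience: every element of the combined or abstracted set was first received through sensation or reflection.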

The Poverty of Stimulus argument for innate ideas claims that some of our ideas cannot come from experience and hence they are innate. However, Locke tries to explain how all of our ideas are derived, directly or indirectly, from experience. All simple ideas come from sensation or reflection, and we can then form new complex ideas by combining simple ideas in new ways. We can form abstract ideas by separating out what is specific to the idea of particular objects and retaining what is the same between several different ideas. Although these complex ideas are not always the objects of experience, they still are derived from experience because they depend on simple ideas that we receive from experience. If these explanations are successful, then we have little reason to believe our ideas are innate; instead, Locke concludes, all our ideas depend on experience.

c. No Innate Knowledge

Locke is also an empiricist about knowledge. Yet many philosophers at the time argued that some knowledge is innate. Locke responds to two such arguments: the Argument from Universal Consent and the argument from the Priority Thesis. He argues that neither is successful.

i. The Argument from Universal Consent

Many philosophers at the time disagreed with Locke’s empiricism. Some asserted that knowledge of God and knowledge of morality are innate, and others claimed that knowledge of a few basic axioms such as the Law of Identity and the Law of Non-Contradiction are innate. Let p be any proposition of the kind supposed to be innate. One argument, which Locke criticizes, uses universal consent to try to prove that some knowledge is innate:

Argument from Universal Consent:

    1. Every person believes that p.
    2. If every person believes that p, then knowledge of p is innate.
    3. So, knowledge of p is innate.

Locke argues that both premises of the Argument from Universal Consent are false because there is no proposition to which every person consents, and, even if there were universal consent about a proposition, this would not prove that it is innate.

Locke says premise 1 is false because no proposition is universally believed (2.2.5). For example, if p = “God exists,” then we know not everyone believes in God. And although philosophers sometimes assert that basic logical principles such as the Law of Non-Contradiction are innate, most people have never even thought about that principle, much less believed it to be true. We can know from experience, then, that premise 1 as stated is false. Perhaps premise 1 can be saved from Locke’s criticism by insisting that everyone who rationally thinks about it will believe the proposition (1.2.6) (Leibniz defends this view). Even if premise 1 can be revised so that it is not obviously false, Locke still thinks the argument fails because premise 2 is false.

Premise 2 is false for two reasons. First, premise 2 would prove too much. Usually, proponents of innate knowledge think there are only a few basic principles that are innately known (1.2.9-10). Locke argues, though, that every rational person who thinks about it would consent to all of the theorems of geometry, and “a million of other such propositions” (1.2.18), and thus premise 2 would count far too many truths as “innate” knowledge (1.2.13-23). Second, some things are obviously true, and so the fact that everyone believes them does not prove that they are innate (4.7.9). In Book 4, Locke sets out to explain how all of our knowledge comes from reason and experience and, if he is successful, then universal consent by rational adults would not imply that the knowledge is innate (1.2.1). Hence, premise 2 is false.

Sometimes innate instincts are mistaken for innate knowledge. For example, we are born with a natural desire to eat and drink (2.21.34), and this might be misconstrued as innate knowledge that we should eat and drink. But this natural desire should not be confused with knowledge that we should eat and drink. Traditionally, knowledge has been defined as justified true belief, whereas Locke describes knowledge as a kind of perception of the truth. On either conception, knowledge requires us to be aware of a reason for believing the truth. Locke can grant that a newborn infant has an innate desire for food while denying that the infant knows that it is good to eat food. Innate instincts and other natural capacities, then, are not the same as innate knowledge.

Locke’s criticism of innate knowledge can be put in the form of a dilemma (2.2.5). Either innate knowledge is something which we are aware of at birth or it is something we become aware of only after thinking about it. Locke objects that if we are unaware of innate knowledge, then we can hardly be said to know it. But if we become aware of the “innate” knowledge only after thinking about it, then “innate” knowledge just means that we have the capacity to know it. In that case, though, all knowledge would be innate, which is not typically what the rationalist wants to claim.

ii. The Priority Thesis

Some assert that we have innate knowledge of a few basic logical principles, or “maxims”, and that this is how we are able to come to know other things. Call this the Priority Thesis. Locke criticizes the Priority Thesis and explains how we can attain certain knowledge without it.

Locke and the advocates of the Priority Thesis disagree both about (i) what is known first and (ii) what is known on the basis of other things. According to the Priority Thesis, we first have innate knowledge of general maxims and then we have knowledge of particular things on the basis of these general maxims. Locke disagrees on both counts. He thinks we first have knowledge of particular things and therefore denies that we know them because of the general maxims.

Some rationalists, such as Plato and Leibniz, hold that knowledge of particulars requires prior knowledge of abstract logical concepts or principles. For example, knowing that “white is white” and “white is not black” is thought to depend on prior knowledge of the general Law of Identity. On this view, we can know that “white is white” only because we recognize it as an instance of the general maxim that every object is identical to itself.

Locke rejects the Priority Thesis. First, he uses children as empirical evidence that people have knowledge of particulars before knowledge of general maxims (1.2.23; 4.7.9). He therefore denies that we need knowledge of maxims before we can have knowledge of particulars, as the Priority Thesis asserts. Second, he thinks he can explain how we get knowledge of particulars without the help of the general maxim. For example, “white is white” and “white is not black” are self-evident claims: we need no information beyond the relevant ideas, and once we have those ideas it is immediately obvious that the propositions are true. Locke argues that we cannot be more certain of any general maxim than of these obvious truths about the particulars, nor would knowing these general maxims “add anything” to our knowledge about them (4.7.9). In short, Locke thinks that knowledge of the particulars does not depend on knowledge of general maxims.

Locke argues that a priori knowledge should not be confused with innate knowledge (4.7.9); innate knowledge is knowledge we are born with, whereas a priori knowledge is knowledge we acquire by reflecting on our ideas. For example, we can have a priori knowledge that “white is white.” The idea of white must come from experience. But once we have that idea, we do not need to inspect many white things in order to confirm that all the white things are white. Instead, we know by thinking about it that whatever is white must be white. Rationalists, such as Leibniz, sometimes argue that we become fully conscious of innate knowledge only by using a priori reasoning. Again, though, Locke argues that if it is not conscious then it is not something we really know. If we only come to know it when we engage in a priori reasoning, then we should say that we learned it by reasoning rather than positing some unconscious knowledge that was there all along. However, consistent with his empiricism, Locke denies that we can know that objects exist and what their properties are just by a priori reasoning.

Locke rejects innate knowledge. Instead, he thinks we must acquire knowledge from reasoning and experience.

2. Empiricist Account of Knowledge

In Book 4 of the Essay, Locke develops his empiricist account of knowledge. Empiricism emphasizes knowledge from empirical observation, but some knowledge depends only on reflection on ideas received from experience. This section explains the role of reason and empirical observation in Locke’s theory of knowledge.

a. Types of Knowledge

Locke categorizes knowledge in two ways: by what we know and by how we know it. As for what we can know, he says there are “four sorts” of things we can know (4.1.3):

    1. identity or diversity
    2. relation
    3. coexistence or necessary connection
    4. real existence

Knowledge of identity is knowing that something is the same, such as “white is white,” and knowledge of diversity is knowing that something is different, such as “white is not black” (4.1.4). Locke thinks this is the most obvious kind of knowledge and all other knowledge depends on this.

Knowledge of relation seems to be knowledge of necessary relations. He denies we can have universal knowledge of contingent relations (4.3.29). Technically, the categories of identity and necessary connections are both necessary relations, but they are important and distinctive enough to merit their own categories (4.1.7). Knowledge of relation, then, is a catch-all category that includes knowledge of any necessary relation.

Knowledge of coexistence or necessary connection concerns the properties of objects (4.1.6). If we perceive a necessary connection between properties A and B, then we would know that “all A are B.” However, Locke thinks this a priori knowledge of necessary connection is incredibly limited: we can know that figure requires extension, and causing motion by impulse requires solidity, but little else (4.3.14). In general, Locke denies that we can have a priori knowledge of the properties of objects (4.3.14; 4.3.25-26; 4.12.9). Alternatively, if we observe that a particular object x has properties A and B, then we know that A and B “coexist” in x (4.3.14). For example, we can observe that the same piece of gold is yellow and heavy.

Finally, knowledge of real existence is knowledge that an object exists (4.1.7 and 4.11.1). Knowledge of existence includes knowledge of the existence of the self, of God, and of material objects.

Locke also divides knowledge by how we know things (4.2.14):

    1. intuitive knowledge
    2. demonstrative knowledge
    3. sensitive knowledge

Intuitive knowledge comes from an immediate a priori perception of a necessary connection (4.2.1). Demonstrative knowledge is based on a demonstration, which is the perception of an a priori connection reached by going through multiple steps (4.2.2). For example, the intuitions that “A is B” and “B is C” can be combined into a demonstration to prove that “A is C.” Finally, the sensation of objects provides “sensitive” knowledge (or knowledge from sensation) that those objects exist and have certain properties (4.2.14).
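Locke’s picture of a demonstration as a chain of intuitions can be illustrated in a modern proof assistant. The following Lean fragment is only a sketch (the names a, b, and c are placeholders, not anything in Locke): each hypothesis plays the role of an intuitively perceived agreement, and transitivity chains them into the demonstrated conclusion.

```lean
-- Each hypothesis stands in for an intuited agreement between ideas:
-- h₁ : "A is B" and h₂ : "B is C". Chaining them demonstrates "A is C".
example (a b c : Nat) (h₁ : a = b) (h₂ : b = c) : a = c :=
  h₁.trans h₂
```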

Locke describes intuitive, demonstrative, and sensitive knowledge as “three degrees of knowledge” (4.2.14). Intuitive knowledge is the most certain. It includes only things that the mind immediately sees are true without relying on any other information (4.2.1). The next degree of certainty is demonstrative knowledge, which consists in a chain of intuitively known propositions (4.2.2-6). This is less certain than intuitive knowledge because the truth is not as immediately obvious. Finally, sensitive knowledge is the third degree of knowledge.

There is considerable scholarly disagreement about Locke’s account of sensitive knowledge, or whether it even is knowledge. According to the standard interpretation, Locke thinks that sensitive knowledge is certain, though less certain than demonstrative knowledge. Alternatively, Samuel Rickless argues that, for Locke, sensitive knowledge is not, strictly speaking, knowledge at all. For knowledge requires certainty and Locke makes it clear that sensitive knowledge is less certain than intuitive and demonstrative knowledge. Locke also introduces sensitive knowledge by saying it is less certain and only “passes under the name knowledge” (4.2.14), perhaps implying that it is called “knowledge” even though it is not technically knowledge. However, in favor of the standard interpretation, he does call sensitive knowledge “knowledge” and describes sensitive knowledge as a kind of certainty (4.2.14). This encyclopedia article follows the standard interpretation.

Putting together what we know with how we know it: we can have intuitive knowledge of identity, of some necessary relations, and of our own existence (4.9.1); we can have demonstrative knowledge of some necessary relations (for example, in geometry) and of God’s existence (4.10.1-6); and we can have sensitive knowledge of the existence of material objects and the coexistence of the properties of those objects (for example, this gold I see is yellow).

b. A Priori Knowledge

Both early modern rationalist and empiricist philosophers accept a priori knowledge. For example, they agree that we can have a priori knowledge of mathematics. Rationalists and empiricists disagree, however, about what we can know a priori. Rationalists tend to think we can discover claims about the nature of reality by a priori reasoning, whereas Locke and the empiricists think that we must instead rely on experience to learn about the natural world. This section explains what kinds of a priori knowledge Locke thinks we can and cannot have.

Locke defines knowledge as the perception of an agreement (or disagreement) between ideas (4.1.2). This definition of knowledge fits naturally, if not exclusively, within an account of a priori knowledge. Such knowledge relies solely on a reflection of our ideas; we can know it is true just by thinking about it.

Some a priori knowledge is (what Kant would later call) analytic. For example, knowledge of claims like “gold is gold” does not depend on empirical observation. We immediately and intuitively perceive that it is true. For this reason, Locke calls knowledge of identity “trifling” (4.8.2-3). Less obviously, we can have a priori knowledge that “gold is yellow.” According to Locke’s theory of ideas, the complex idea of gold is composed of the simple ideas of yellow, heavy, and so on. Thus, saying “gold is yellow” gives us no new information about gold and is, therefore, analytic (4.8.4).

We can also have (what Kant would later call) synthetic a priori knowledge. Synthetic propositions are “instructive” (4.8.3) because they give us new information. For example, a person might have the idea of a triangle as a shape with three sides without realizing that a triangle also has interior angles of 180 degrees. So, the proposition “a triangle has interior angles equal to 180 degrees” goes beyond the idea to tell us something new about a triangle (4.8.8), and thus is synthetic. Yet it can be proven, as a theorem in geometry, that the latter proposition is true. Further, this proof is a priori since the proof relies only on our idea of a triangle and not the observation of any particular triangle. So, we can have synthetic a priori knowledge.
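The geometric theorem in question can indeed be proven a priori from the idea of a triangle alone. A standard proof sketch (not one Locke himself gives) uses a line through the apex parallel to the base:

```latex
% Let the triangle have interior angles \alpha, \beta, \gamma, with
% \gamma at the apex C. Draw a line through C parallel to the base AB;
% it forms angles \alpha' and \beta' on either side of \gamma.
\begin{align*}
  \alpha' + \gamma + \beta' &= 180^\circ
      && \text{(angles on a straight line at } C\text{)} \\
  \alpha' = \alpha, \quad \beta' &= \beta
      && \text{(alternate interior angles)} \\
  \therefore\ \alpha + \beta + \gamma &= 180^\circ
\end{align*}
```

No particular triangle needs to be observed; only the relevant ideas are consulted, which is what makes the knowledge a priori yet instructive.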

Locke thinks we can have synthetic a priori knowledge of mathematics and morality. As mentioned above, any theorem in geometry will be proven a priori yet the theorem gives us new information. Locke claims we can prove moral truths in the same way (4.3.18 and 4.4.7). Yet, while moral theory is generally done by reflecting on our ideas, few have agreed with Locke that we can have knowledge as certain and precise about morality as we do of mathematics.

However, Locke denies that we can have synthetic a priori knowledge of the properties of material objects (4.3.25-26). Such knowledge would need to be instructive, and so tell us something new about the properties of the object, and yet be discovered without the help of experience. Locke argues, though, that knowledge of the properties of objects depends on observation. For example, if “gold” is defined as a yellow, heavy, fusible material substance, we might wonder whether gold is also malleable. But we do not perceive an a priori connection between this idea of gold and being malleable. Instead, we must go find out whether gold is malleable or not: “Experience here must teach me, what reason cannot” (4.12.9).

c. Sensitive Knowledge

Locke holds that empirical observation can give us knowledge of material objects and their properties, but he denies that we can have any sensitive knowledge (or knowledge from sensation) about material objects that goes beyond our experience.

Sensation gives us sensitive knowledge of the existence of external material objects (4.2.14). However, we can know that x exists while we perceive it but we cannot know that it continues to exist past the time we observe it: “this knowledge extends as far as the present testimony of the senses…and no farther” (4.11.9). For example, suppose we saw a man five minutes ago. We can be certain he existed while we saw him, but we “cannot be certain, that the same man exists now, since…by a thousand ways he may cease to be, since [we] had the testimony of the senses for his existence” (4.11.9). So, empirical observation can give us knowledge, but such knowledge is strictly limited to what we observe.

Empirical observation can also give us knowledge of the properties of objects (or knowledge of “coexistence”). For example, we learn from experience that the gold we have observed is malleable. We might reasonably conclude from this that all gold, including the gold we have not observed, is malleable. But we might turn out to be wrong about this. We have to rely on experience and see. So, for this reason, Locke considers judgments that go beyond what we directly observe “probability” rather than certain knowledge (4.3.14; 4.12.10).

Despite Locke’s insistence that we can have knowledge from experience, there are potential problems for his view. First, it is not clear how, if at all, sensitive knowledge is consistent with his general definition of knowledge. Second, Locke holds that we can only ever perceive the ideas of objects, and not the objects themselves, and so this raises questions about whether we can really know if external objects exist.

i. The Interpretive Problem

Locke’s description of sensitive knowledge (or knowledge from sensation) seems to conflict with his definition of knowledge. Some have thought that he is simply inconsistent, while others try to show how sensitive knowledge is consistent with his definition of knowledge.

The problem arises from Locke’s definition of knowledge. He defines knowledge as the perception of an agreement between ideas (4.1.2). But the perception of a connection between ideas appears to be an a priori way of acquiring knowledge rather than knowledge from experience. If we could perceive a connection between the idea of a particular man (for example, John) and existence, then we could know a priori, just by reflecting on our ideas, that the man exists. But this is mistaken. The only way to know if a particular person exists is from experience (see 4.11.1-2, 9). So, perceiving connections between ideas appears to be ill-suited for knowledge of existence.

Yet suppose we do see that John exists. There seems to be only one relevant idea needed for this knowledge: the sensation of John. It seems any other idea is unnecessary to know, on the basis of observation, that John exists. On what other idea, then, does sensitive knowledge depend? This is where interpretations of Locke diverge.

Some interpreters, such as James Gibson, hold that Locke is simply inconsistent. He defines knowledge as the perception of an agreement between two ideas, but sensitive knowledge is not the perception of two ideas; sensitive knowledge consists only in the sensation of an object. The advantage of this interpretation is that it just accepts at face value his definition of knowledge and his description of sensitive knowledge in the Essay. However, perhaps there is a consistent interpretation available. Moreover, Locke elsewhere identifies the two ideas that are supposed to agree, and so he thinks his account of sensitive knowledge fits his general definition.

Other interpreters turn to the passage in which Locke identifies the two ideas that in sensitive knowledge are supposed to agree. Locke explains:

Now the two ideas, that in this case are perceived to agree, and thereby do produce knowledge, are the idea of actual sensation (which is an action whereof I have a clear and distinct idea) and the idea of actual existence of something without me that causes that sensation. (Works v. 4, p. 360)

On one interpretation of this passage, by Lex Newman and others, the idea of the object (the sensation) is perceived to agree with the idea of existence. Locke describes the first idea as “the idea of actual sensation” and, on this interpretation, that means the sensation of the object. The second idea is “the idea of actual existence.” Excluding the parenthetical comment, this is a natural way to interpret the two ideas Locke identifies here. On this view, then, we know an object x exists when we perceive that the sensation of x agrees with the idea of existence.

On a second interpretation of the passage, by Jennifer Nagel and Nathan Rockwood, the idea of the object (the sensation) is perceived to agree with an idea of reflection: the awareness of having a sensation. Here “the idea of actual sensation” is not the sensation of the object; rather, it is a second-order awareness of having a sensation or, in other words, identifying the sensation as a sensation. This is because Locke describes the first idea as “the idea of actual sensation” and then follows that with the parenthetical comment “(which is an action…)”; an external object is not an action, but having a sensation is an action. So, perhaps the first idea should be taken as an idea of having a sensation. If so, then that makes the second idea the sensation of the object. This better captures Locke’s description of the first idea as an idea of an action. In any case, on this interpretation, we know an object x exists when we have the sensation of x and identify that sensation as a sensation.

However the interpretive problem is resolved, there remains a worry that Locke’s view inevitably leads to skepticism.

ii. The Skeptical Problem

The skeptical problem for Locke is that perceiving ideas does not seem like the kind of thing that can give us knowledge of actual objects.

Locke has a representational theory of perception. When we perceive an object, we are immediately aware of the idea of the object rather than the external object itself. The idea is like a mental picture. For example, after seeing the photograph of Locke above, we can close our eyes and picture an image of John Locke in our minds. This mental picture is an idea. According to Locke, even if he were right here before our very eyes, we would directly “see” only the idea of Locke rather than Locke himself. However, an idea of an object represents that object. Just as looking at a picture can give us information about the thing it represents, “seeing” the idea of Locke allows us to become indirectly aware of Locke himself.

Locke’s representational theory of perception entails that there is a veil of perception. On this view, there are two things: the object itself and the idea of the object. Locke thinks we can never directly observe the object itself. There is, then, a “veil” between the ideas we are immediately aware of and the objects themselves. This raises questions about whether our sensations of objects really do correspond to external objects.

Berkeley and Thomas Reid, among others, object that Locke’s representational theory of perception inevitably leads to skepticism. Locke admits that, just as a portrait of a man does not guarantee that the man portrayed exists, the idea of a man does not guarantee that the man exists (4.11.1). So, goes the objection, since on Locke’s view we can only perceive the idea, and not the object itself, we can never know for sure the external object really exists.

While others frequently accuse Locke’s view of inevitably leading to skepticism, he is not a skeptic. Locke offers four “concurrent reasons” to believe sensations correspond to external objects (4.11.3). First, we cannot have an idea of something without first having the sensation of it, suggesting that it has an external (rather than an internal) cause (4.11.4). Second, sensations are involuntary. If we are outside in the daylight, we might wish the sun would go away, but it does not. Hence, it appears the sensation of the sun has an external cause (4.11.5). Third, some veridical sensations cause pain in a way that merely dreaming or hallucinating does not (4.11.6). For example, if we are unsure if the sensation of a fire is a dream, we can stick our hand into the fire and “may perhaps be wakened into a certainty greater than [we] could wish, that it is something more than bare imagination” (4.11.8). Fourth, the senses confirm each other: we can often see and feel an object, and in this way the testimony of one sense confirms that of the other (4.11.7). In each case, Locke argues that sensation is caused by an external object and thus the external object exists.

Perhaps the external cause of sensation can provide a way for Locke to escape skepticism. As seen above, he argues that sensation has an external cause. We thus have some reason to believe that our ideas correspond to external objects. Even if there is a veil of perception, then, sensations might nonetheless give us a reason to believe in external objects.

d. The Limits of Knowledge

One of the stated purposes of the Essay is to make clear the boundary between, on the one hand, knowledge and certainty, and, on the other hand, opinion and faith (1.1.2).

For Locke, knowledge requires certainty. As explained above, we can attain certainty by perceiving an a priori necessary connection between ideas (either by intuition or demonstration) or by empirical observation. In each case, Locke thinks, the evidence is sufficient for certainty. For example, we can perceive an a priori necessary connection between the idea of red and the idea of a color, and thus we see that it must be true that “red is a color.” Given the evidence, there is no possibility of error and hence we are certain. More controversially, Locke also thinks that direct empirical observation is sufficient evidence for certainty (4.2.14).

Any belief that falls short of certainty is not knowledge. Suppose we have seen many different ravens in a wide variety of places over a long period of time, and so far, all the observed ravens have been black. Still, we do not perceive an a priori necessary connection between the idea of a raven and the idea of black. It is possible, though perhaps unlikely, that we will discover ravens of a different color in the future. Yet no matter how high the probability is that all ravens are black, we cannot be certain that all ravens are black. So, Locke concludes, we cannot know for sure that all ravens are black (4.3.14). There is a sharp boundary between knowledge and belief that falls short of knowledge: knowledge is certain, whereas other beliefs are not.

Locke describes knowledge as “perception” whereas judgment is a “presumption” (4.14.4). To perceive that p is true guarantees the truth of that proposition. But we can presume the truth of p even if p is false. For example, given that all the ravens we have seen thus far have been black, it would be reasonable for us to presume that “all ravens are black” even though we are not certain this is true. Thus, judgment involves some epistemic risk of being wrong, whereas knowledge requires certainty.

In his account of empirical knowledge, Locke takes the knowledge-judgment distinction to the extreme in two important ways. First, while we can know that an object exists while we observe it, this knowledge does not extend at all beyond what we immediately observe. For example, suppose we see John walk into his office. While we see John, we know that John exists. Ordinarily, if we just saw John walk into his office a mere few seconds ago, we would say that we “know” that John exists and is currently in his office. But Locke does not use the word “know” in that way. He reserves “knowledge” for certainty. And, intuitively, we are more certain that John exists when we see him than when we do not see him any longer. So, the moment John shuts the door, we no longer know he exists. Locke concedes it is overwhelmingly likely that John continues to exist after shutting the door, but “I have not that certainty of it, which we strictly call knowledge; though the likelihood of it puts me past doubt, …this is but probability, not knowledge” (4.11.9). So, we can know something exists only for the time that we observe it.

Second, any scientific claim that goes beyond what is immediately observed cannot be known to be true. We know that observed ravens have all been black. From this we may presume, but cannot know, that unobserved ravens are all black (4.12.9). We know that friction of observed (macroscopic) objects causes heat. By analogy, we might reasonably guess, but cannot know, that friction among unobserved particles heats up the air (4.16.12). In general, experience of observed objects cannot give us knowledge of any unobserved objects. This distinction between knowledge of the observed versus uncertainty (or even skepticism) about the unobserved remains important for contemporary empiricism in the philosophy of science.

Thus, Locke, and empiricists after him, sharply distinguish between the observed and the unobserved. Further, he maintains that we can have knowledge of the observed but never of the unobserved. When our evidence falls short of certainty, Locke holds that probable evidence ought to guide our beliefs.

3. Judgment (Rational Belief)

Locke holds that all rational beliefs require evidence. For knowledge, that evidence must give us certainty: there must not be the possibility of error given the evidence. But beliefs can be rational even if they fall short of certainty so long as the belief is based on what is probably true given the evidence.

There are two conditions for rational judgment (4.15.5). First, we should believe what is most likely to be true. Second, our confidence should be proportional to the evidence. The degrees of probability, given the evidence, depend on our own knowledge and experience and on the testimony of others (4.15.4).

The three kinds of rational judgment that Locke is most concerned with are beliefs based on science, testimony, and faith.

a. Science

Locke was an active participant in the scientific revolution and his empiricism was an influential step towards the modern view of science. He was a close associate of Robert Boyle and Isaac Newton, and was also a member of the Royal Society, a group of the leading scientists of the day. He describes himself not as one of the “master-builders” who are making scientific discoveries, but as an “under-labourer” who would be “clearing Ground a little, and removing some of the Rubbish, that lies in the way to knowledge” about the natural world (Epistle to the Reader, p. 9-10). Locke contributes to the scientific revolution, then, by developing an empiricist epistemology consistent with the principles of modern science. His emphasis on the need for empirical observation and the uncertainty of general conclusions of science helped shape the modern view of science as being a reliable, though fallible, source of information about the natural world.

The prevailing view of science at the time was the Aristotelian view. According to Aristotle, a “science” is a system of knowledge with a few basic principles that are known to be true, from which all other propositions in the science are deduced. (On this view, Euclid’s geometry is the paradigmatic science.) The result of Aristotelian science, if successful, is a set of necessary truths that are known with perfect certainty. While Locke is willing to grant that God and angels might have such knowledge of nature, he emphatically denies that we mere mortals are capable of this kind of certainty about a science of nature (4.3.14, 26; 4.12.9-10).

In Locke’s view, an Aristotelian science of nature would require knowledge of the “real essence” of material objects. The real essence of an object is a set of metaphysically fundamental properties that make an object what it is (3.6.2). Locke draws a helpful analogy to geometry (4.6.11). We can start with the definition of a triangle as a shape with three sides (its real essence) and then we can deduce other properties of a triangle from that definition (such as having interior angles of 180 degrees). Locke thinks of the real essence of material objects in the same kind of way. For example, the real essence of gold is a fundamental set of properties, and this set of properties entails that gold is yellow, heavy, fusible, and so on. Now, if we knew the real essence of gold, then we could deduce its other properties in the same way that we can deduce the properties of a triangle. But, unlike with a triangle, we do not know the real essence of gold, or of any other material object. Locke thinks the real essence of material objects is determined by the structure of imperceptibly small particles. Since, at that time, we could not see the atomic structure of different material substances, Locke thinks we cannot know the real essences of things. For this reason, he denies that we can have an Aristotelian science of nature (4.3.25-26).

The big innovation in Locke’s philosophy of science is the introduction of the concept of a “nominal essence” (3.6.2). It is often assumed that when we classify things into natural kinds, this tracks some real division in nature. Locke is not so sure. We group objects together, and call them by the same name, because of superficial similarities. For example, we see several material substances that are yellow, heavy, and fusible, and we decide to call that kind of stuff “gold.” The set of properties we use to classify gold as gold (that is, being yellow, heavy, and fusible) is the “nominal” essence of gold (“nominal” meaning here “in name only”). Locke does not think that the nominal essence is the same as the real essence. The real essence is, at the time, the unobservable chemical structure of the object, whereas the nominal essence is the set of observable qualities we use to classify objects. Locke therefore recognizes that there is something artificial about the natural kinds identified by scientists.

In general, Locke denies that we can have synthetic a priori knowledge of material objects. Because we can have knowledge of only the nominal essence of an object, and not its real essence, we are unable to make a priori inferences about what other properties an object has. For example, if “gold” is defined as a yellow, heavy, fusible material substance, then “gold is yellow” would be analytic. The claim “gold is malleable” would be synthetic, because it gives us new information about the properties of gold. There is no a priori necessary connection between the defining properties of gold and malleability. Therefore, we cannot know with certainty that all gold is malleable. For this reason, Locke says, “we are not capable of scientifical knowledge; nor shall [we] ever be able to discover general, instructive, unquestionable truths concerning” material objects (4.3.26).

Most of Locke’s comments about science emphasize that we cannot have knowledge, but this does not mean that beliefs based on empirical science are unjustified. In Locke’s view, a claim is “more or less probable” depending on “the frequency and constancy of experience” (4.15.6). The more frequently it is observed that “A is B” the more likely it would be that, on any particular occasion, “A is B” (4.16.6-9). For example, since all the gold we have observed has been malleable, it is likely that all gold, even the gold we have not observed, is also malleable. In this way, we can use empirical evidence to make probable inferences about what we have not yet observed.
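Locke’s rule that a claim is “more or less probable” according to “the frequency and constancy of experience” can be given a minimal numerical sketch. The function below is only an illustration of that idea, not anything found in Locke: it grades assent to “A is B” by the fraction of past observations in which it held.

```python
# A toy model (assumed for illustration, not Locke's own formalism):
# probability is estimated as the observed frequency of "A is B".

def probability_from_experience(observations):
    """Return the fraction of observations in which "A is B" held."""
    if not observations:
        raise ValueError("no experience to judge from")
    return sum(observations) / len(observations)

# All observed gold has been malleable: assent with maximal assurance.
gold_malleable = [True] * 50
print(probability_from_experience(gold_malleable))  # 1.0

# Mixed experience warrants proportionally weaker assent.
mixed = [True] * 9 + [False]
print(probability_from_experience(mixed))  # 0.9
```

On this picture, the inference to unobserved gold never reaches certainty (and so never counts as knowledge for Locke), but the degree of rational confidence rises with the frequency and constancy of the observations.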

For Locke, then, all knowledge and rational beliefs about material objects must be grounded in empirical observation, either by observation or probable inferences made from observation.

b. Testimony

Testimony can be a credible source of evidence. Locke develops an early and influential account of when testimony should be believed and when it should be doubted.

In Locke’s view, we cannot know something on the basis of testimony. Knowledge requires certainty, but there is always the possibility that someone’s testimony is mistaken: perhaps the person is lying, or honestly stating her belief but is mistaken. So, although credible testimony is likely to be true, it is not guaranteed to be true, and hence we cannot be certain that it is so.

Yet credible testimony is often likely to be true. A high school math teacher knows a theorem in geometry is true because she has gone through the steps of the proof. She might then tell her students that the theorem is true. If they believe her on the basis of her testimony, rather than going through the steps of the proof themselves, then they do not know the theorem is true; yet they would have a rationally justified belief because it is likely to be true given the teacher’s testimony (4.15.1).

Whether someone’s testimony should be believed depends on (i) how well it conforms with our knowledge and past experience and (ii) the credibility of the testimony, which in turn depends on several factors (4.15.4):

    1. the number of people testifying (more witnesses provide more evidence)
    2. the integrity of the people
    3. the “skill” of the witnesses (that is, how well they know what they are talking about)
    4. the intention of the witnesses
    5. the consistency of the testimony
    6. contrary testimonies (if any)

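Locke gives no numerical procedure for combining these factors. As a rough illustrative sketch only (the scoring scheme is ours, not Locke's), one could treat the six factors as a checklist:

```python
# A rough checklist model of Locke's six credibility factors (4.15.4).
# Locke gives no numerical procedure; scoring each factor as present
# or absent and averaging is our own illustrative simplification.
FACTORS = ("many witnesses", "integrity", "skill",
           "honest intention", "consistency", "no contrary testimony")

def credibility(report):
    """Fraction of Locke's factors that a piece of testimony satisfies."""
    return sum(bool(report.get(f)) for f in FACTORS) / len(FACTORS)

# A lone traveler of good character reporting something that
# conflicts with all local experience:
travelers_report = {"many witnesses": False, "integrity": True,
                    "skill": True, "honest intention": True,
                    "consistency": True, "no contrary testimony": False}
print(credibility(travelers_report))  # 4 of 6 factors
```

The point of the sketch is only that, on Locke's account, credibility is a matter of degree fixed by several independent considerations, not a yes-or-no property of testimony.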
We can be confident in a testimony that conforms with our own past experience and the reported experience of others (4.16.6). As noted above, the more frequently something is observed the more likely it happened on a given occasion. For example, in our past experience fire has always been warm. When we have the testimony that, on a specific occasion, a fire was warm, then we should believe it with the utmost confidence.

We should be less confident in the testimony when it conflicts with our past experience (4.16.9). Locke relates the story of the King of Siam who, knowing only the warm climate of south-east Asia, is told by a traveler that in Europe it gets so cold that water becomes solid (4.15.5). On the one hand, the King has credible testimony that water becomes solid. On the other hand, this conflicts with his past experience and the reported experience of those he knows. Locke implies that the King rationally doubted the testimony because, in this case, the evidence from the King’s experience is greater than the evidence from testimony. Experience does not always outweigh the evidence from testimony. The evidence from testimony depends in part on the number of people: “as the relators are more in number, and of more credit, and have no interest to speak contrary to the truth, so that matter of fact is like to find more or less belief.” So, if there is enough evidence from testimony, then that could in principle outweigh the evidence from experience.

That the evidence from testimony can sometimes outweigh the evidence from experience is particularly relevant to the testimony of miracles. Hume famously argues that because the testimony of miracles conflicts with our ordinary experience we should never believe the testimony of a miracle. Locke, however, remains open to the possibility that the evidence from testimony could outweigh the evidence from experience. Indeed, he argues that we should believe in revelation, particularly in the Bible, because of the testimony of miracles (4.16.13-14; 4.19.15).

Although Locke thinks that testimony can provide good evidence, it does not always provide good evidence. We should not believe “the opinion of others” just because they say something is true. One difference between the testimony Locke accepts as credible and the testimony he rejects is that credible testimony begins with knowledge, whereas the testimony of “the opinion of others” is merely speculation about things “beyond the discovery of the senses,” which are “not capable of any such testimony” (4.16.5). In taking this attitude, Locke follows the Enlightenment sentiment of rejecting authority. Speculative theories should not be believed based on testimony. Instead, we should base our opinions on observation and our own reasoning.

Testimony provides credible evidence when the testimony makes it likely that a claim is true. When the person is in a position to know, either from reasoning or from experience, it can be reasonable for us to believe the person’s testimony.

c. Faith

Locke insists that all of our beliefs should be based on evidence. While this seems obviously true for most beliefs, some people want to make an exception for religion. Faith is sometimes thought to be belief that goes beyond the evidence. However, Locke thinks that, if faith is to be rational, even faith must be supported by evidence.

Some religious claims can be proven by the use of reason or natural theology. For example, Locke makes a cosmological argument for the existence of God (4.10.1-6). He thinks that, given this proof from natural theology, we can know that God exists. This kind of belief in God is knowledge and not faith, since faith implies some uncertainty.

Many religious claims cannot be proven by the use of natural reason; we must instead rely on revelation. Locke defines faith as believing something because God revealed it (4.18.2). We do not perceive the truth, as we do in knowledge, but instead presume that it is true because God has told us so in a revelation. Just as human testimony can provide evidence, revelation, as God’s testimony, can provide evidence. Yet revelation is better evidence than human testimony because human testimony is fallible whereas divine revelation is infallible (4.18.10).

We should believe whatever God has revealed, but we first must have good reason to believe that God revealed it. This makes faith dependent on reason. For reason must judge whether something is a genuine revelation from God (4.18.10). Further, we must be sure that we interpret the revelation correctly (4.18.5). Since whatever God reveals is guaranteed to be true, if we have good evidence that God revealed that p, then that provides us with evidence that p is true. In this way, Locke can insist that all religious beliefs require evidence and yet believe in the truths of revealed religion.

Locke only admits publicly available evidence as evidence for revelation. He criticizes “enthusiasm,” which is, as he describes it, believing (what one claims is) a revelation without evidence that it is a revelation (4.19.4). The enthusiast believes God has revealed that p only because it seems to the person to have come from God. Some religious epistemologists take religious experience as evidence for religious belief, but Locke is skeptical of religious experience. Locke demands that this subjective feeling that God revealed p be backed up with concrete evidence that it really did come from God (4.19.11). Instead of relying on private religious experience, Locke appeals to miracles as publicly available evidence supporting revelation (4.16.13; 4.19.14-15).

Locke also limits the kind of things that can be believed on the basis of revelation. Some propositions are according to reason, others are contrary to reason, and still others are above reason. Only the things above reason can appropriately be believed on the basis of revelation (4.18.5, 7).

Claims that are “according to reason” should not be believed on the basis of revelation because we already have all the evidence we need. By “according to reason” Locke means those propositions that we have a priori knowledge of, either from intuition or demonstration. If we already have a priori knowledge that p, then we need no further evidence. In that case, revelation would be useless because we already have certainty without it.

Claims that are “contrary to reason” should not be believed on the basis of revelation because we know with certainty that they are false. By “contrary to reason” Locke means those propositions that we have a priori knowledge are false. If it is self-evident that p is false, then we cannot rationally believe it under the pretense that it was revealed by God. For God only reveals true things, and if we know with certainty p is false, then we know for sure that God did not reveal it.

Faith, then, concerns those things that are “above reason.” By “above reason” Locke means those propositions that cannot be proven one way or the other by a priori reasoning. For example, the Bible teaches that some of the angels rebelled and were kicked out of heaven, and it predicts that the dead will rise again at the last day. We cannot know with certainty these claims are true, nor that they are false. We must instead rely on revelation: “faith gave the determination, where reason came short; and revelation discovered which side the truth lay” (4.18.9).

There is some disagreement about how much weight Locke gives revelation when there is conflicting evidence. For example, suppose we have good reason to believe that God revealed that p to Jesus, but that given our other evidence p appears to be false. Locke says “evident revelation ought to determine our assent even against probability” (4.18.9). On one interpretation, if there is good reason to believe God revealed that p we should believe p no matter how likely it is that p is false given our other evidence. On another interpretation, Locke carefully sticks with his evidentialism: if the evidence that God revealed p outweighs the evidence that p is false, then we should believe that p; but if the evidence that p is false outweighs the evidence that God revealed it, then we should not believe p (nor that God revealed it).

According to Locke, God created us as rational beings so that we would form rational beliefs based on the available evidence, and he thinks religious beliefs are no exception.

4. Conclusion

Locke makes a number of important contributions to the history of epistemology.

He makes the most sustained criticism of innate ideas and innate knowledge, which convinced generations of later empiricists such as Berkeley and Hume. He argues that there is no universal knowledge we are all born with and, instead, all our ideas and knowledge depend on experience.

He then develops an explanation for how we acquire knowledge that is consistent with his empiricism. All knowledge is either known a priori or based on empirical evidence. Later empiricists, such as Hume and the logical positivists, follow Locke in thinking some knowledge is known a priori whereas other knowledge is based on empirical evidence. However, Locke allows for synthetic a priori knowledge whereas Hume and the logical positivists hold that all a priori knowledge is analytic.

Locke’s emphasis on the need for empirical evidence also supported the development of the scientific revolution. Locke argues that a priori knowledge of nature is not possible, a thesis for which Hume would later become more famous. Rather than a priori knowledge of nature, Locke emphasizes the need for empirical evidence. Although inferences from empirical observations cannot give us certainty, Locke thought that they can give us evidence about what is most likely to be true. So, science should be both based on empirical evidence and acknowledge its uncertainty. In this way, Locke helped shift attitudes about science away from the Aristotelian view towards the modern conception of empirical science that is always open to revision upon further observation.

Locke gives one of the earliest careful treatments of how testimony can serve as evidence. He argues that testimony should be evaluated by its internal consistency and consistency with other things we know, including past observations about similar cases. Hume later takes this view of testimony and then, as he famously argues, claims that since the testimony of miracles conflicts with past experience we should not believe the testimony of miracles.

Finally, Locke insists that religious belief should be based on evidence. Locke himself thought that there was sufficient evidence from the testimony of miracles and revelation to support his belief in Christianity. However, many have criticized Locke’s evidentialism for undermining the rationality of religious belief. Critics such as Hume agreed that religious belief needs to be supported by evidence, but they argue there is no good evidence, and hence religious belief is not rational. Others, such as William Paley, attempted to provide the evidence needed to support religious belief. Needless to say, whether there is good evidence for religion or not was just as controversial then as it is now, yet many agree with Locke that religious belief requires evidence.

The guiding principle of Locke’s epistemology is the need for evidence. We can acquire evidence from a priori reasoning, from empirical observation, or most often from inferences from empirical observation. Limiting our beliefs to those supported by evidence, Locke thinks, is the most reliable way to get at the truth.

5. References and Further Reading

a. Locke’s Works

  • Locke, John. 1690/1975. An Essay Concerning Human Understanding (ed. Peter Nidditch). Oxford: Oxford University Press.
    • References to the Essay are cited by book, chapter, and section. For example, 2.1.2 is book 2, chapter 1, section 2.
  • Locke, John, The Works of John Locke in ten volumes (ed. Thomas Tegg). London.

b. Recommended Reading

  • Anstey, Peter. 2011. John Locke & Natural Philosophy. Oxford: Oxford University Press.
    • This book is on Locke’s view of science and its relationship to Robert Boyle and Isaac Newton.
  • Ayers, Michael. 1993. Locke: Epistemology and Ontology. New York: Routledge.
    • Volume 1 of this two-volume work is dedicated to Locke’s epistemology. It includes chapters on Locke’s theory of ideas, theory of perception, probable judgment, and knowledge.
  • Gibson, James. 1917. Locke’s Theory of Knowledge and its Historical Relations. Cambridge: Cambridge University Press.
    • This book gives a thorough overview of Locke’s epistemology with chapters showing how Locke’s view relates to other early modern philosophers such as Descartes, Leibniz, and Kant. The comparison of Locke and Kant on a priori knowledge is particularly helpful.
  • Gordon-Roth, Jessica and Weinberg, Shelley. 2021. The Lockean Mind. New York: Routledge.
    • A survey of all aspects of Locke’s philosophy by different scholars. It has several articles on Locke’s epistemology, including on Locke’s criticism of innate knowledge, account of knowledge, account of probable judgment, and religious belief, as well as his theory of ideas and theory of perception.
  • Jacovides, Michael. 2017. Locke’s Image of the World. Oxford: Oxford University Press.
    • This book is on Locke’s view of science and how Locke was influenced by and exerted an influence on scientific developments happening at the time.
  • Nagel, Jennifer. 2014. Knowledge: A Very Short Introduction. Oxford: Oxford University Press.
    • This introduction to epistemology includes a chapter on the early modern rationalism-empiricism debate and a chapter on Locke’s view of testimony.
  • Nagel, Jennifer. 2016. “Sensitive Knowledge: Locke on Skepticism and Sensation,” A Companion to Locke.
    • A discussion of Locke’s account of sensitive knowledge and response to skepticism.
  • Newman, Lex. 2007. “Locke on Knowledge.” Cambridge Companion to Locke’s Essay. Cambridge: Cambridge University Press.
    • This is an accessible and excellent overview of Locke’s theory of knowledge.
  • Newman, Lex. 2007. Cambridge Companion to Locke’s Essay. Cambridge: Cambridge University Press.
    • A survey of Locke’s Essay by different scholars. It has several articles on Locke’s epistemology, including his criticism of innate knowledge, theory of ideas, account of knowledge, probable judgment, and religious belief.
  • Chappell, Vere. 1994. The Cambridge Companion to Locke. Cambridge: Cambridge University Press.
    • A survey of all aspects of Locke’s philosophy by different scholars. It includes articles on Locke’s theory of ideas, account of knowledge, and religious belief.
  • Rickless, Samuel. 2007. “Locke’s Polemic Against Nativism.” Cambridge Companion to Locke’s Essay. Cambridge: Cambridge University Press.
    • This is an accessible and excellent overview of Locke’s criticism of innate knowledge.
  • Rickless, Samuel. 2014. Locke. Malden, MA: Blackwell.
    • This book is an introduction and overview of Locke’s philosophy, which includes chapters on Locke’s criticism of innate knowledge, account of knowledge, and religious belief.
  • Rockwood, Nathan. 2018. “Locke on Empirical Knowledge.” History of Philosophy Quarterly, v. 35, n. 4.
    • The article explains Locke’s view on how empirical observation can justify knowledge that material objects exist and have specific properties.
  • Rockwood, Nathan. 2020. “Locke on Reason, Revelation, and Miracles.” The Lockean Mind (ed. Jessica Gordon-Roth and Shelley Weinberg). New York: Routledge.
    • This article is a good introduction to Locke’s religious epistemology.
  • Wolterstorff, Nicholas. 1996. John Locke and the Ethics of Belief. Cambridge: Cambridge University Press.
    • This book gives an overview of Locke’s epistemology generally and specifically his account of religious belief.


Author Information

Nathan Rockwood
Email: nathan_rockwood@byu.edu
Brigham Young University
U. S. A.

Lewis Carroll: Logic

Charles L. Dodgson (also known as Lewis Carroll), 1832-1898, was a British mathematician, logician, and the author of the ‘Alice’ books, Alice’s Adventures in Wonderland and Through the Looking-Glass and What Alice Found There. His fame derives principally from his literary works, but in the twentieth century some of his mathematical and logical ideas found important applications. His approach to these subjects led him to invent various methods that lend themselves to mechanical reasoning. He was not a traditional mathematician. Rather, he applied mathematical and logical solutions to problems that interested him. As a natural logician at a time when logic was not considered to be a part of mathematics, he successfully worked in both fields. Everything he published in mathematics reflected a logical way of thinking, particularly his works on geometry. Dodgson held an abiding interest in Euclid’s geometry. Of the ten books on mathematics that he wrote, including his two logic books, five dealt with geometry. From his study of geometry, he developed a strong affinity for determining the validity of arguments not only in mathematics but in daily life too. Dodgson felt strongly about logic as a basis for cogent thought in all areas of life, yet he did not realize he had developed concepts that would be explored or expanded upon in the twentieth century. Dodgson’s approach to solving logic problems led him to invent various methods, particularly the method of diagrams and the method of trees. As a method for a large number of sets, Carroll diagrams are easier than Venn diagrams to draw because they are self-similar. His uncommon exposition of elementary logic has amused modern authors who continue to take quotations from his logic books. The mathematician and logician Hugh MacColl’s views on logic were influenced by reading Dodgson’s Symbolic Logic, Part I. Their exchanges show that both had a deep interest in the precise use of words.
And both saw no harm in attributing arbitrary meanings to words, as long as that meaning is precise and the attribution agreed upon. Dodgson’s reputation as the author of the ‘Alice’ books cast him primarily as an author of children’s books and prevented his logic books from being treated seriously. The barrier created by the fame Carroll deservedly earned from his ‘Alice’ books combined with a writing style more literary than mathematical, prevented the community of British logicians from properly recognizing him as a significant logician during his lifetime.

Table of Contents

  1. Dodgson’s Life
  2. The Logic Setting in His Time
  3. Logic and Geometry
    1. Syllogisms, Soriteses, and Puzzle Problems
    2. Venn and Carroll Diagrams
    3. Dodgson’s ‘Methods’
  4. The Automation of Deduction
  5. Dodgson’s Logic Circle
    1. The ‘Alice’ Effect
  6. Logic Paradoxes
    1. The Barbershop Paradox
    2. Achilles and the Tortoise
  7. Dodgson and Modern Mathematics
  8. Carroll as Popularizer
  9. Conclusion
  10. References and Further Reading
    1. Primary
    2. Secondary

1. Dodgson’s Life

Charles Lutwidge Dodgson (1832-1898), better known by the pen name Lewis Carroll that he adopted in 1856, entered Christ Church, Oxford University in England in 1852. After passing Responsions, the first of the three required examinations, and achieving a First Class in Mathematics and a Second Class in Classics in Moderations, the second required examination, he earned a bachelor’s degree in 1854, placing first on the list of First-Class Honours in Mathematics and earning Third-Class Honours in the required Classics. He received the Master of Arts degree in 1857. He remained at Christ Church College for the rest of his life.

He began teaching individual students privately in differential calculus, conics, Euclidean geometry, algebra, and trigonometry. In 1856 the Dean of Christ Church, the Reverend Henry Liddell, appointed him the Mathematical Lecturer, a post he held for 25 years, before resigning it in 1881.

In 1856 he took up photography, eventually taking about 3,000 photos, many of eminent people in government, science, the arts, and theatre. Prime Minister Salisbury, Michael Faraday, and John Ruskin were some of his subjects. He became one of the most eminent photographers of his time. He also was a prolific letter writer, keeping a register of the letters he received and sent, 98,721 of them, in the last thirty-five years of his life.

Taking holy orders was a requirement for all faculty. He chose to be a deacon rather than a priest so that he could devote his time to teaching and continue to go to the theatre in London which was his favorite pastime. He was ordained a Deacon in 1861. Dodgson developed a deeply religious view in his life. His father, Archdeacon Charles Dodgson, had been considered a strong candidate for the post of Archbishop of Canterbury before he married.

His first publications (pamphlets and books from 1860 to 1864) were designed to help students: A Syllabus of Plane Algebraic Geometry, Systematically Arranged with Formal Definitions, Postulates and Axioms; Notes on the First Two Books of Euclid; Notes on the First Part of Algebra; The Formulae of Plane Trigonometry; The Enunciations of Euclid I, II; General List of [Mathematical] Subjects, and Cycle for Working Examples; A Guide to the Mathematical Student.

In the mid-1860s Dodgson became active in college life, writing humorous mathematical ‘screeds’ to argue on various issues at Christ Church, voting on the election of Students (Fellows), and on physical alterations to the College’s buildings and grounds. These activities piqued his interest in ranking and voting methods. He became a member of the Governing Board in 1867 and remained on it for his entire life. In 1868 he acquired an apartment in the NW corner of Tom Quad, part of Christ Church, where he constructed a studio for his photography on its roof. His apartment was the choicest, most expensive one in the College.

Becoming active in political affairs outside the College in the 1880s, he sent many letters to the Editors of The St. James’s Gazette, the Pall Mall Gazette, and other newspapers presenting his position on various issues of national importance. Through his photography, he became friendly with Lord Salisbury, who became Prime Minister in 1885. Their social relationship, begun in 1870, lasted throughout Dodgson’s life and spurred him to consider the problem of fairness both in representation and apportionment, culminating in his pamphlet of 1884, The Principles of Parliamentary Representation.

His publications, pamphlets and two books, during the remainder of the 1860s reflect these interests as well as those of mathematics, and those that provide evidence of his considerable literary abilities: The Dynamics of a Particle with an Excursus on the New Method of Evaluation as Applied to Π; Alice’s Adventures in Wonderland; An Elementary Treatise on Determinants with Their Applications to Simultaneous Linear Equations and Algebraical Geometry; The Fifth Book of Euclid Treated Algebraically, so Far as It Relates to Commensurable Magnitudes, with Notes; Algebraical formulae for the Use of Candidates for Responsions; Phantasmagoria and Other Poems.

His publications in the 1870s continued in the same vein: Algebraical Formulae and Rules for the Use of Candidates for Responsions; Arithmetical Formulae and Rules for the Use of Candidates for Responsions; Through the Looking-Glass, and What Alice Found There; The Enunciations of Euclid, Books I–VI; Examples in Arithmetic; A Discussion of the Various Methods of Procedure in Conducting Elections; Suggestions as to the Best Method of Taking Votes; Euclid Book V, Proved Algebraically So Far as It Relates to Commensurable Magnitudes, with Notes; The Hunting of the Snark; A Method of Taking Votes on More than Two Issues; Euclid and His Modern Rivals.

After resigning his position as Mathematical Lecturer, Dodgson now had more time for writing. In the first half of the 1880s Dodgson published Lawn Tennis Tournaments, The Principles of Parliamentary Representation, A Tangled Tale, Alice’s Adventures Underground (facsimile edition).

But the second half of the 1880s saw a tectonic shift with his first book on logic: The Game of Logic, as well as his cipher, Memoria Technica, two more books, Curiosa Mathematica, Part I: A New Theory of Parallels, and Sylvie and Bruno. In 1887 he published the first of three articles in Nature, “To Find the Day of the Week for any Given date”.

In the last decade of his life more books were published. The Nursery Alice appeared in 1890. Curiosa Mathematica, Part II: Pillow Problems, and Sylvie and Bruno Concluded appeared in 1893. His only other publications in logic came out between 1894 and 1896. These were the two articles in Mind, “A Logical Paradox” and “What the Tortoise Said to Achilles”, and a book, Symbolic Logic, Part I: Elementary. From 1892 to 1897 he worked on a chapter of a projected book on games and puzzles that was never published. It included his “Rule for Finding Easter-Day for Any Date till A. D. 2499”. His final publications were “Brief Method of Dividing a Given Number by 9 or 11” (1897) and “Abridged Long Division” (1898). Both appeared in the journal Nature.

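Dodgson’s Nature note of 1887 gave a rule for computing the weekday of any given date. His exact rule is not reproduced here; as an illustration of the kind of calendrical congruence involved, the sketch below uses Zeller’s congruence for the Gregorian calendar, a standard method in the same spirit rather than a transcription of Dodgson’s own procedure:

```python
# A compact weekday rule in the spirit of Dodgson's 1887 Nature note.
# This is Zeller's congruence for the Gregorian calendar, a standard
# method given here for illustration; it is not Dodgson's own rule.
def day_of_week(year, month, day):
    if month < 3:            # Zeller treats January and February as
        month += 12          # months 13 and 14 of the previous year
        year -= 1
    k, j = year % 100, year // 100
    h = (day + (13 * (month + 1)) // 5 + k + k // 4 + j // 4 + 5 * j) % 7
    return ["Saturday", "Sunday", "Monday", "Tuesday",
            "Wednesday", "Thursday", "Friday"][h]

print(day_of_week(2000, 1, 1))  # Saturday
```

Like Dodgson’s rule, the whole computation can be carried out mentally with small sums and remainders, which is what made such rules publishable as curiosities.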
2. The Logic Setting in His Time

The treatment of logic in England began to fundamentally change when George Boole published a short book in 1847 called The Mathematical Analysis of Logic. In it he developed the notion that logical relations could be expressed by algebraic formulas. Boole, using his laws of calculation, was able to represent algebraically all of the methods of reasoning in traditional classical logic. And in a book that he published in 1854, An Investigation of the Laws of Thought, Boole set out for himself the goal of creating a completely general method in logic.

Paralleling Boole’s work was that of De Morgan, whose book, Formal Logic, appeared in the same year as Boole’s, 1847. De Morgan became interested in developing the logic of relations to complement Boole’s logic of classes. His purpose was to exhibit the most general form of a syllogism. De Morgan’s belief that the laws of algebra can be stated formally without giving them a particular interpretation, such as the number system, influenced Boole.

Although Boole and his followers understood that they were just algebraizing logic, that is, rewriting syllogisms in a new notational system rather than inventing a new logical calculus, they correctly claimed that not all valid arguments can be reduced to these forms. Venn understood this; he published an article in Mind in 1876 that included the following problem as an illustration of the inadequacies of Aristotelian forms of reasoning and the superiority of Boolean methods. Venn had set the problem, whose conclusion is that no shareholders are bondholders, as a test question to Cambridge University undergraduates. He remarked that of the 150 or so students, only five or six were able to solve this simple problem:

A certain Company had a Board of Directors. Every Director held either Bonds or Shares; but no Director held both. Every Bondholder was on the Board. Deduce all that can logically be deduced, in as few propositions as possible.

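Venn’s point, that such problems yield to mechanical Boolean treatment, can be checked by brute force. Because each premise is a universal statement about any one individual, it suffices to test every combination of three attributes for an arbitrary person; the encoding below is ours and purely illustrative:

```python
from itertools import product

# d = is a director, b = holds bonds, s = holds shares.
# Each universal premise becomes a truth-function of (d, b, s).
def premises(d, b, s):
    return ((not d or b or s)            # every director holds bonds or shares
            and not (d and b and s)      # no director holds both
            and (not b or d))            # every bondholder is on the board

def conclusion(b, s):
    return not (b and s)                 # no shareholder is a bondholder

valid = all(conclusion(b, s)
            for d, b, s in product([False, True], repeat=3)
            if premises(d, b, s))
print(valid)  # True: the conclusion follows from the premises
```

The check confirms the intended chain of reasoning: anyone holding both bonds and shares would, by the third premise, be a director, contradicting the second premise.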
For Dodgson and his contemporaries, the central problem of the logic of classes, known as the elimination problem, was to determine the maximum amount of information obtainable from a given set of propositions. In his 1854 book, Boole made the solution to this problem considerably more complex when he provided the mechanism of a purely symbolic treatment which allowed propositions to have any number of terms, thereby introducing the possibility of an overwhelming number of computations.

Logical arguments using rules of inference are a major component of both geometry and logic. To Dodgson, logic and geometry shared the characteristics of truth and certainty, qualities that held him in thrall. From the mid-1880s on, he shifted his focus from the truth given by geometrical theorems (true statements) to the validity of logical arguments, the rules that guarantee that only true conclusions can be inferred from true premises, and he pushed beyond the standard forms of the prevailing logic of his time, which was Aristotelian.

Dodgson began writing explicitly about logic in the 1870s when he began his magnum opus, Symbolic Logic, the first part appearing in 1896. Dodgson’s formulation of formal logic came late in his life following his publications on Euclid’s geometry in the 1860s and 1870s. In mathematics generally, and in geometry particularly, one begins with a set of axioms and certain inference rules to infer that if one proposition is true, then so is another proposition. To Dodgson, geometry and logic shared the characteristic of certainty, a quality that always interested him. But by the early 1890s he had shifted his focus away from the truth given by geometrical theorems to the validity of logical arguments.

Dodgson worked alone but he was not at all isolated from the community of logicians of his time. He corresponded with a number of British logicians. These include: James Welton, author of the two-volume A Manual of Logic; John Cook Wilson, Professor of Logic at Oxford from 1889 until his death in 1915; Thomas Fowler, Wykeham Professor of Logic at Oxford (1873 to 1889) and author of The Elements of Deductive Logic; William Ernest Johnson, a collaborator of John Neville Keynes at Cambridge and author of “The Logical Calculus,” a series of three articles that appeared in Mind in 1892; Herbert William Blunt; Henry Sidgwick, Professor of Moral Philosophy at Cambridge; John Venn, author of the influential book, Symbolic Logic; as well as F. H. Bradley, author of The Principles of Logic; and Stewart. He also cited the book Studies in Logic, edited by Peirce, which includes pieces by Peirce’s students: Marquand, Ladd-Franklin, Oscar Howard Mitchell, and B. I. Gilman. We know from Venn’s review of Studies in Logic, appearing in the October 1883 edition of Mind soon after Peirce’s book was published, that Peirce was well-known to the British symbolists and that they were aware of Peirce’s publications.

Marquand’s contributions, a short article, “A Machine for Producing Syllogistic Variations”, and his “Note on an Eight-Term Logic Machine”, contain ideas that Dodgson captured in his Register of Attributes, a tool he constructed to organize the premises when he applied his tree method to soriteses. (A sorites is an argument having many premises and a single conclusion. It can be resolved as a list of syllogisms, the conclusion of each becoming a premise of the next syllogism.) Dodgson had used ideas associated with a logic machine even earlier in The Game of Logic.

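The resolution of a sorites into a chain of syllogisms can be sketched mechanically. In the toy example below, the premises are adapted from a well-known Carroll-style sorites, and reducing each premise to a single implication is our own simplification, not Dodgson’s Register of Attributes:

```python
# Each premise of the sorites is recast as one implication between
# terms; chaining them, syllogism by syllogism, yields the conclusion.
implications = {
    "baby": "illogical person",               # all babies are illogical
    "illogical person": "despised",           # illogical persons are despised
    "despised": "cannot manage a crocodile",  # nobody who can manage a
                                              # crocodile is despised
}

def chain(start):
    """Follow the implications from a starting term as far as they go."""
    steps = [start]
    while steps[-1] in implications:
        steps.append(implications[steps[-1]])
    return steps

print(" -> ".join(chain("baby")))
# baby -> illogical person -> despised -> cannot manage a crocodile
```

Each link of the chain is one syllogism whose conclusion feeds the next, which is exactly the sense in which a sorites "can be resolved as a list of syllogisms."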
The sale of Dodgson’s library at his death included works on logic by Boole, Venn, Allan Marquand, Mitchell, Ladd-Franklin, Benjamin Ives Gilman, Peirce, John Neville Keynes, Rudolph Hermann Lotze (in English translation by Bernard Bosanquet), James William Gilbart, De Morgan, Bernard Bosanquet, Francis H. Bradley, John Stuart Mill, William Stirling Hamilton, William Whewell, and Jevons, among others. Some of these works influenced his own writing and also provided material he needed in his dealings with Oxford adversaries.

3. Logic and Geometry

On an implicit level, Dodgson wrote about logic throughout his entire professional career. Everything he published in mathematics reflected a logical way of thinking, particularly his works on geometry. Dodgson’s heightened concern with logic followed his publications on Euclid’s geometry from the 1860s and 1870s.

From the mid-1880s on, he shifted his focus from the truth given by geometrical theorems (true statements) to the validity of logical arguments, that is, to the rules that guarantee that only true conclusions can be inferred from true premises. On p. xi of the preface to the third edition (1890) of his book about geometry, Curiosa Mathematica Part I: A New Theory of Parallels, he pointed out that the validity of a syllogism is independent of the truth of its premises. He gave this example:

‘I have sent for you, my dear Ducks,’ said the worthy Mrs. Bond, ‘to enquire with what sauce you would like to be eaten?’ ‘But we don’t want to be killed!’ cried the Ducks. ‘You are wandering from the point’ was Mrs. Bond’s perfectly logical reply.

Dodgson held an abiding interest in Euclid’s geometry. Of the ten books on mathematics that he wrote, including his two logic books, five dealt with geometry. From his study of geometry, he developed a strong affinity for determining the validity of arguments not only in mathematics but in daily life too. Arguably, Dodgson’s formulation of formal logic came late in his life as the culmination of his publications on Euclid’s geometry in the 1860s and 1870s. Exactly one month before he died, in an unpublished letter to Dugald Stewart criticizing a manuscript Stewart had given him for his opinion, Dodgson commented:

Logic, under that view, would become to me, a science of such uncertainty that I shd [should] take no further interest in it. It is its absolute certainty which at present fascinates me. (Dodgson, Berol Collection, New York University, 14 December 1897)

We also know that Dodgson was proficient in proving theorems by the contradiction method in his many publications on geometry. Just as logic informed his geometric work, so geometry informed his logic writings. In his logic book, he used geometric notation and terms: for example, the reverse paragraph symbol for the main connective of a syllogism (the implication relation), and the corresponding symbol for ‘therefore.’

a. Syllogisms, Soriteses, and Puzzle Problems

In classical Aristotelian logic there are four forms of propositions:

A: All x is y
E: No x is y
I: Some x is y
O: Some x is not y.

These Boole wrote as:

x(1 – y) = 0
xy = 0
xy ≠ 0
x(1 – y) ≠ 0.

The x, y, z symbols denote classes; Boole used the ordinary algebraic laws governing calculations with numbers to interpret his system of classes and the permissible operations on them. He assumed that each of these laws, such as xy = yx, expresses a proposition that is true. Boole also developed rules to deal with elimination problems: if the equation f(x) = 0 expresses the information available about a class x, eliminating x yields the relations that hold among the other classes (y, z, and so forth) to which x is related. Using his laws of calculation, Boole was able to represent algebraically all of the methods of reasoning in traditional classical logic. For example, syllogistic reasoning involves reducing two class equations (premises) to one equation (conclusion), eliminating the middle term, and then solving the equation of the conclusion for the subject term. The mechanical nature of these steps is apparent.
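The mechanical flavor of this procedure can be illustrated in a few lines of Python (the encoding is mine, not Boole's own workflow): read Boole's A form as the equation a(1 − b) = 0 over {0, 1}, and check the syllogism Barbara, 'All m are y; all x are m; therefore all x are y', by exhausting the membership patterns.

```python
from itertools import product

# Sketch (encoding mine, not Boole's own workflow): read Boole's A form
# as the equation a(1 - b) = 0 over {0, 1}, and verify the syllogism
# Barbara: All m are y; All x are m; therefore All x are y.
def all_are(a, b):
    # "All a are b" in Boole's form: a(1 - b) = 0
    return a * (1 - b) == 0

def barbara_is_valid():
    # Validity: whenever both premise equations hold for a membership
    # pattern, the conclusion equation holds for it too.
    return all(all_are(x, y)
               for x, m, y in product((0, 1), repeat=3)
               if all_are(m, y) and all_are(x, m))

print(barbara_is_valid())  # True
```

The check treats each of the eight membership patterns as a point of the algebra; an invalid form would turn up a pattern satisfying the premise equations but not the conclusion.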

Dodgson, like most of his peers, used classical forms, such as the syllogism and sorites, to solve logical problems. These forms of traditional Aristotelian logic were the basis of the system of logical reasoning that prevailed in England up to the first quarter of the twentieth century. But Dodgson went much further, creating logical puzzle problems, some of which contained arguments intended to confuse the reader, while others could be described as paradoxical because they seemed to prove what was thought to be false. With these purposes in mind, he wanted to show that the classical syllogistic form, the prevailing logic system of his time, permits much more general reasoning than what was commonly believed.

Medieval Aristotelian logicians had formulated classifications of either fifteen, nineteen, or twenty-four valid syllogisms, depending on a number of assumptions. And in part II of Symbolic Logic, Bartley includes three more valid syllogistic formulas that Dodgson had constructed primarily to handle syllogisms that contain “not-all” statements.

Syllogistic reasoning, from the time of Aristotle until George Boole’s work in logic in the mid nineteenth century, was the essential method of all logical reasoning. In a syllogism, there are three terms (classes) in its three statements: subject, predicate (an expression that attributes properties), and the middle term which occurs once in each premise. There are several classification systems for syllogisms involving the relative position of the repeated middle term (which determines its figure, or case—there are four cases) and the way that a syllogism can be constructed within a figure (which determines its mood).

Dodgson created the first part of his visual proof system, a diagrammatic system, beginning in 1887 in a small book titled The Game of Logic. His diagrammatic system could detect fallacies, a subject that greatly interested him. He defined a fallacy as an “argument which deceives us, by seeming to prove what it does not really prove….” (Bartley 1977, p. 129)

The “game” employs two-set and three-set diagrams only. His diagrams can represent both universal and existential statements. This textbook, intended for young people, has many examples and their solutions.

With a view to extending his proof method, Dodgson went on to expand his set of diagrams, eventually creating diagrams for eight sets (classes), and describing the construction of nine-set and ten-set diagrams.

He believed that mental activities and mental recreations, like games and particularly puzzles, were enjoyable and conferred a sense of power to those who make the effort to solve them. In an advertisement for the fourth edition of Symbolic Logic, Part I. Elementary addressed to teachers he wrote:

I claim, for Symbolic Logic, a very high place among recreations that have the nature of games or puzzles….Symbolic Logic has one unique feature, as compared with games and puzzles, which entitles it, I hold, to rank above them all….He may apply his skill to any and every subject of human thought; in every one of them it will help him to get clear ideas, to make orderly arrangement of his knowledge, and more important than all, to detect and unravel the fallacies he will meet with in every subject he may interest himself in. (Bartley 1977, p. 46)

Dodgson felt strongly about logic as a basis for cogent thought in all areas of life, yet he did not realize he had developed concepts that would be explored and expanded upon in the twentieth century. Although he recognized his innovations as significant, the fact that he presented them primarily in a didactic context, rather than a research context, affected how they were perceived and evaluated in his time, and even after Bartley’s publication.

Carroll’s idea of syllogistic construction differed from the classical and medieval traditions as well as from those of his contemporaries. Among the reasons he gave for consolidating the nineteen different forms appearing in the textbooks of his day were these: the syllogistic rules were too specialized; many conclusions were incomplete; and many legitimate syllogistic forms were ignored. Although Boole believed that the solutions found when his methods were used were complete, it has been shown that this was not always the case.

Carroll made several changes to syllogistic constructions compared to what was accepted in his time. The result is the fifteen valid syllogisms that Carroll recognized, although he did not actually list them. A syllogism is an argument having two premises and a single conclusion, with each proposition being one of four kinds, A: ‘all…are…’; E: ‘no…is…’; I: ‘some…are…’; O: ‘some…are not…’. As with the medieval classifications, the total count of valid syllogisms varies with the assumptions made, ranging up to twenty-four.
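Carroll's figure of fifteen can be checked by brute force. The sketch below (my own check, not Carroll's procedure) enumerates all 256 syllogistic forms, four figures times four choices for each of the three propositions, under the modern reading without existential import, and counts the valid ones; a three-element universe suffices to expose every countermodel in this fragment.

```python
from itertools import product

# Brute-force check (mine, not Carroll's procedure): count the valid
# syllogisms under the modern reading without existential import.
# All subsets of a three-element universe suffice as countermodels.
subsets = [frozenset(i for i in range(3) if bits[i])
           for bits in product((0, 1), repeat=3)]

def holds(form, s, p):
    if form == 'A':
        return s <= p            # all s are p
    if form == 'E':
        return not (s & p)       # no s are p
    if form == 'I':
        return bool(s & p)       # some s are p
    return not s <= p            # 'O': some s are not p

# The four figures fix where the middle term M sits in the premises;
# the conclusion always relates S to P.
figures = [lambda M, P, S: ((M, P), (S, M)),
           lambda M, P, S: ((P, M), (S, M)),
           lambda M, P, S: ((M, P), (M, S)),
           lambda M, P, S: ((P, M), (M, S))]

def is_valid(f1, f2, fc, fig):
    for S, M, P in product(subsets, repeat=3):
        (a1, b1), (a2, b2) = fig(M, P, S)
        if holds(f1, a1, b1) and holds(f2, a2, b2) and not holds(fc, S, P):
            return False         # countermodel found
    return True

count = sum(is_valid(f1, f2, fc, fig)
            for f1 in 'AEIO' for f2 in 'AEIO' for fc in 'AEIO'
            for fig in figures)
print(count)  # 15
```

Under assumptions that grant existential import to certain premises, further forms become valid, which is how the larger classical counts arise.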

In his earlier book, The Game of Logic, Carroll created a diagrammatic system to solve syllogisms. Ten years later, in Symbolic Logic, Part I, he extended the method of diagrams to handle the construction of up to ten classes (sets) depicting their relationships and the corresponding propositions. This visual logic method, which employs triliteral and biliteral diagrams, is a proof system for categorical syllogisms whose propositions are of the A, E, I type. He subsumed the O type under I, that is, ‘some x are not-y’ is equivalent to ‘some x are y and some x are not-y.’ But he did not use the method as a proof system beyond syllogisms. For the more complex soriteses, he settled on the ‘methods of barred premises and barred groups,’ and his final visual method, the method of trees, which remained unpublished until 1977 when it appeared in W. W. Bartley III’s book, Lewis Carroll’s Symbolic Logic. In Bartley’s construction of part II of Symbolic Logic, using Dodgson’s extant papers, letters, and manuscripts, the main topics in the eight books are: fallacies, logical charts, the two methods of barred premises and of trees, and puzzle problems. In part I of Symbolic Logic Dodgson used just three formulas, which he called Figures or Forms to designate the classical syllogisms. In the fourth edition of Symbolic Logic, Part I. Elementary, Dodgson pointed this out in an Appendix, Addressed to Teachers where he wrote:

As to Syllogisms, I find their [in textbooks] nineteen forms, with about a score of others which they have ignored, can all be arranged under three forms, each with a very simple Rule of its own. (Bartley 1977, p. 250)

In Symbolic Logic, Part I, which appeared in four editions in 1896, Dodgson represented syllogisms as in this example:

No x are mʹ;

All m are y.

∴ No x are yʹ

in the form of conditional statements using a subscript notation that is written symbolically as: xmʹ0 † m1yʹ0 (reverse ¶) xyʹ0 (Bartley 1977, p. 122), with the reverse paragraph sign signifying the connecting implication relation, which he defined as: the propositions on the left side “would, if true, prove” the proposition on the right side. (Bartley 1977, p. 119) Dodgson’s algebraic notation is a modification of Boole’s, which he thought was unwieldy.

Why did Dodgson choose to write his logic books under his pseudonym? Bartley suggests a combination of motives: He wanted the material to appeal to a large general audience, particularly to young people, a task made easier using the wide acclaim accorded him as the writer, Lewis Carroll. Then too, there was the financial motive; books authored by Lewis Carroll could generate greater revenue than books by the mathematician Charles Dodgson. By 1896, Dodgson was very much concerned about his mortality and the responsibility he bore for the future care of his family, especially his unmarried sisters. But there were other reasons why he wanted the exposure his pseudonym would offer. A deeply religious man, Dodgson considered his mathematical abilities to be a gift that he should use in the service of God. In a letter to his mathematically talented sister, Louisa, dated 28 September 1896, he wrote:

[W]hereas there is no living man who could (or at any rate would take the trouble to) & finish, & publish, the 2nd Part of the Logic. Also I have the Logic book in my head….So I have decided to get Part II finished first….The book will be a great novelty, & will help, I fully believe, to make the study of Logic far easier than it now is: & it will, I also believe, be a help to religious thoughts, by giving clearness of conception & of expression, which may enable many people to face, & conquer, many religious difficulties for themselves. So I do really regard it as work for God. (Bartley 1977, pp. 366-371)

b. Venn and Carroll Diagrams

In their diagrammatic methods, both Venn and Carroll used simple symmetric figures, and they valued visual clarity and ease of drawing as the most important attributes. Like Boole and Jevons, both were in the tradition of calculus ratiocinator, that is, mechanical deduction. Each of them used a system of symbolic forms isomorphic to their diagrammatic forms.

Both Venn diagrams and Carroll diagrams are maximal, in the sense that no additional logic information like inclusive disjunctions is representable by them. But Carroll diagrams are easier to draw for a large number of sets because of their self-similarity and algorithmic construction. This regularity makes it simpler to locate and thereby erase cells corresponding to classes destroyed by the premises of an argument. Although both Venn and Carroll diagrams can represent existential statements, Carroll diagrams are capable of easily handling more complex problems than Venn’s system can without compromising the visual clarity of the diagram. Carroll only hinted at the superiority of his method when he compared his own solution to a syllogism with one that Venn had supplied.  (Carroll 1958, pp. 182-183)

In both Dodgson’s and Venn’s systems, existential propositions can be represented. The use of a small plus sign ‘+’ in a region to indicate that it is not empty did not appear until 1894, and Dodgson reported it in his symbolic logic book. However, Dodgson may have been the first to use it. A MS worksheet on logic problems, probably from 1885, contains a variant of a triliteral diagram that has a ‘+’ representing a nonempty region. But in his published work, Dodgson preferred the symbol ‘1’ for a nonempty region and the symbol ‘0’ to designate an empty region.

Both Venn and Carroll diagrams can represent exclusive disjunctions; neither can represent inclusive disjunctive statements like x + y when x and y have something in common. Exclusive disjunctions are important in syllogistic logic because existential statements like ‘some x are y’ can be written as the disjunction, xyz or xyzʹ; and the statement, ‘some y are zʹ’ can be written as the disjunction, xyzʹ or xʹyzʹ. Actually, it is not possible to represent general disjunctive information in a diagram without adding an arbitrary additional syntactic device, and that addition would result in a loss in the visual power of the diagram. Carroll also represented the universal set by enclosing the diagram, a feature Venn did not think important enough to bother with, but one that is essential in depicting the universe of discourse, a key concept in modern logic discussed by Boole and developed further by him.

Carroll’s fifteen syllogisms can be represented by Venn and even Euler diagrams, but not with the visual clarity of Carroll diagrams. Carroll himself showed this when he presented a solution to a syllogism by Euler’s method, one that involves eighteen diagrams, and a solution that Venn provided for the same syllogism in which Venn, possibly for the first time (the usage does not appear in the second edition of his symbolic logic book), used a small ‘+’ to indicate a non-empty region. (Carroll 1958, pp. 180-182)

Anthony Macula constructed an iterative method to produce new Carroll diagrams, which he called (k+n)-grams, where k > 4 and a multiple of four and n = 1, 2, 3, 4, by putting the 2^k partitions of a k-gram into each of the partitions of an n-gram, respectively. The algorithm constructs a (k+n)-gram for any such k by iteration. It is now easy to see that Dodgson’s description in part I of Symbolic Logic of a 9-set diagram as composed of two 8-set diagrams, one for the inside and one for the outside of the eighth set, is the result of placing the partitions of an 8-gram into each of the two partitions of a 1-gram. And the 10-set diagram, which he described as an arrangement of four octo-literal diagrams in a square, is the result of putting the partitions of an 8-gram into each of the four partitions of a 2-gram. We observe that when k > 4, the construction of a new (k+n)-gram reverses the order of the insertion of the partitions because the insertions are multiples of 4-grams into n-grams. (Carroll 1958, pp. 178-9; Macula 1995, pp. 269-274)

Although Venn’s system is isomorphic to Boole’s logic of classes, it is not isomorphic to a Boolean algebra because there is no way to illustrate inclusive disjunctive statements, that is, statements other than those that can be expressed in terms of the removal of classes as in the previous example, and in other exclusive disjunctive expressions like x′w(yz′ + y′z), that is to say, what is not x but is w, and is also either y but not z, or z but not y. (Venn 1881, p. 102) Existential statements can be represented in Venn diagrams, and he provided the mechanism in the second edition of Symbolic Logic (actually two different representations: horizontal line shading and integers). The choice of a small plus sign ‘+’ in a region to indicate that it is not empty appears to have been made after 1894 and was reported by Carroll in his symbolic logic book. (Venn 1971, pp. 131-132; Carroll 1958, p. 174)

In 1959, Trenchard More, Jr. proved what Venn knew to be true, that Venn diagrams can be constructed for any number of simply connected regions. His construction preserves the property Venn deemed essential, that each subregion is simply connected and represents a different combination of overlapping of all the simply connected regions bounded by the Jordan curves. But the diagrams resulting from More’s construction are quite complex and involve what More called a ‘weaving curve’. (More 1959, pp. 303-304)

For a large number of sets, Carroll diagrams are easier to draw because they are self-similar (each diagram remains invariant under a change of scale), discontinuous, and capable of being constructed algorithmically. Their regularity makes it simpler to locate and erase cells that must be destroyed by the premises of a syllogistic argument, a task that is difficult to accomplish in Venn diagrams for five or more classes. For example, a five-set diagram results from placing a vertical line segment in each of the sixteen partitions of a four-set diagram, and a six-set diagram is obtained by putting the 2^2 partitions of a suitably reduced two-set diagram into each of the sixteen partitions of a four-set diagram. Seven-set and eight-set diagrams are similarly constructed. We see that each k-gram (a k-set diagram) has 2^k partitions; for example, a five-set diagram has thirty-two partitions, while an 8-set diagram has two hundred fifty-six.
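The algorithmic character of this construction is easy to make concrete. The following sketch (the encoding is mine) represents the cells of a k-set diagram as 0/1 tuples and performs the insertion step described above: placing a copy of one diagram's cells inside each cell of another.

```python
from itertools import product

# Sketch (encoding mine): cells of a k-set diagram as 0/1 tuples, and
# the insertion step as pairing every inner cell with every outer cell.
def cells(k):
    """All 2^k cells (membership combinations) of a k-set diagram."""
    return list(product((0, 1), repeat=k))

def insert(inner, outer):
    """Place a copy of the inner diagram's cells in each outer cell."""
    return [o + i for o in outer for i in inner]

# A five-set diagram: a 1-gram placed in each of the sixteen partitions
# of a four-set diagram, giving thirty-two cells.
five_gram = insert(cells(1), cells(4))
print(len(five_gram), len(cells(8)))  # 32 256
```

The cell counts double with each added set, matching the 2^k partitions noted above.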

c. Dodgson’s ‘Methods’

Dodgson’s approach to solving logic problems led him to invent various methods. In Symbolic Logic, Part I these are the method of underscoring, the method of subscripts, and the method of diagrams. In part II they are the methods of barred premises and barred groups, although he did not refer to them as ‘methods’, and, most importantly, the method of trees. In Book (chapter) XII of part II of Symbolic Logic, instead of just exhibiting the solution tree piecemeal for a particular problem, he gives a “soliloquy” as he works it through, accompanied by “stage directions” that show what he is doing and enable the reader to construct the tree in an amusing way. Bartley provides many examples of sorites problems solved by the tree method in Book XII, and several intricate puzzle problems solved by the tree method appear in Book XIII of part II of Symbolic Logic.

While his distinction as a logician relies on these visual innovations, Dodgson’s methods depend essentially on his idiosyncratic algebraic notation which he called the Method of Subscripts. He used letters for terms which can represent classes or attributes. (In part II of Symbolic Logic, letters are used to represent statements as well.) The subscript 0 on a letter denotes the negation of the existence of the object; the subscript 1 denotes the object’s existence. When there are two letters in an expression, it doesn’t matter which of them is first or which of them is subscripted because each subscript takes effect back to the beginning of the expression, that is, from right to left.

Bartley observed that existential import is implicit in Dodgson’s Method of Subscripts. Using this notation, Dodgson had no other way to separate subject from predicate: for example, xy1z′0, which expresses ‘all xy are z’, implies that there are some xy. But we can interpret this as either ‘no xy are not z’ or ‘all xy are z’, which are equivalent in modern logic usage. However, Dodgson may not have held existential import as a philosophical belief.
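A small sketch may make the subscript notation concrete (the finite-set reading is mine, not Dodgson's formalism): take juxtaposition as intersection, subscript 0 as 'this class is empty' (a nullity), and subscript 1 as 'this class is non-empty' (an entity).

```python
# Sketch (the finite-set reading is mine, not Dodgson's formalism):
# juxtaposition is intersection; subscript 0 marks a nullity (empty
# class); subscript 1 marks an entity (non-empty class).
def nullity(*classes):
    return not set.intersection(*classes)

def entity(*classes):
    return bool(set.intersection(*classes))

universe = set(range(6))
x, y, z = {1, 2}, {1, 2, 3}, {1, 2, 4}
z_prime = universe - z  # the complement z'

# xy1z'0 read with existential import: some xy exist, and no xy are z',
# that is, "all xy are z".
print(entity(x, y) and nullity(x, y, z_prime))  # True
```

On this reading the existential import Bartley noted is visible directly: the subscript 1 contributes the entity clause.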

As George Englebretsen points out, “A good notation makes hidden things obvious…Carroll saw his own notation as at least simpler than Boole’s.” (Englebretsen 2007, p. 145)

When did Dodgson first use his tree method? Certainly earlier than 16 July 1894, when he wrote in his diary that he had worked a problem of forty premises. This is the date when he constructed his last formal method, which he called the Method of Trees. The essential characteristic of the method is that it uses a reductio ad absurdum approach, a standard proof method in geometry: to prove that a set of retinends (the terms in the conclusion) is a nullity (empty), we start by assuming instead that it is an entity; a process of deduction then arrives at a contradiction of this assumption, which proves that the set of retinends is indeed a nullity. In other words, when a conclusion following from a set of premises is assumed to be false, and reasoning from it together with all the premises results in a contradiction, the original argument is proved to be valid. Dodgson needed this new formal method to solve more complicated problems because he understood that his diagram method would no longer suffice. This is the earliest modern use of a truth tree employed to reason efficiently in the logic of classes.
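The reductio idea can be sketched in miniature (the encoding below is mine, a toy propositional reading rather than Dodgson's class logic): a sorites is valid exactly when its premises together with the denial of its conclusion admit no satisfying case, which is what closing every branch of a tree establishes.

```python
from itertools import product

# Toy sketch of the reductio behind the tree method (encoding mine,
# a propositional reading rather than Dodgson's class logic): a sorites
# is valid iff its premises plus the denial of the conclusion admit no
# satisfying assignment.
def unsatisfiable(statements, n_terms):
    return not any(all(s(*row) for s in statements)
                   for row in product((False, True), repeat=n_terms))

# Sorites: all a are b; all b are c; all c are d; hence all a are d.
premises = [lambda a, b, c, d: not a or b,
            lambda a, b, c, d: not b or c,
            lambda a, b, c, d: not c or d]
denied = lambda a, b, c, d: a and not d  # assume the conclusion false

print(unsatisfiable(premises + [denied], 4))  # True: the sorites is valid
```

Dodgson's trees reach the same verdict without enumerating cases, by growing and closing branches.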

On 4 August 1894 he connected his tree method with his Method of Underscoring, writing in his diary, “I have just discovered how to turn a genealogy into a scored Sorites.” (Abeles 1990, p. 30) It appears he planned to do further work with this method and its natural extensions, barred premises, and barred groups.

Three months later, he recorded:

Made a discovery in Logic,…the conversion of a “genealogical’ proof into a regular series of Sorites….Today I hit on the plan of working each column up to the junction – then begin anew with the Prem. just above and work into it the results of the columns, in whatever order works best….This is the only way I know for arranging, as Sorites, a No. of Prems much in excess of the No. of  Elims, where every Attribute appears 2 or 3 times in each column of the Table. My example was the last one in the new edition of Keynes. (Wakeling 2005, p.155)

In another letter to Louisa Dodgson, dated 13 November 1896, in which he answered questions she had raised about one of his problems that she was attempting to solve, we again see that Dodgson’s use of his visual methods progressed from his method of diagrams to his method of trees. He wrote:

As to your 4 questions,…The best way to look at the thing is to suppose the Retinends to be Attributes of the Univ. Then imagine a Diagram, assigned to that Univ., and divided, by repeated Dichotomy, for all the Attributes, so as to have 2^n Cells, for n Attributes. (A cheerful Diagram to draw, with, say, 50 Attributes!)

(There would be about 1000,000,000,000 Cells.) If the Tree vanishes, it shows that every Cell is empty. (Weaver Collection, reproduced in Abeles 2005, p. 40)

Dodgson considered the tree method to be superior to the barred premises ‘method’. He wrote:

We shall find that the Method of Trees saves us a great deal of the trouble entailed by the earlier process. In that earlier process we were obliged to keep a careful watch on all the Barred Premisses so as to be sure not to use any such premiss until all its “Bars” had appeared in that Sorites. In this new Method, the Barred Premises all take care of themselves. (Bartley 1977, p. 287)

Before creating his tree method, Dodgson used his ‘method’ of Barred Premises to guide the generation of the most promising (ordered) lists of the premises and partial conclusions to produce the complete conclusion of a sorites. He realized that too many of these lists would not lead to a proper conclusion, so he abandoned this approach in favor of his tree method. But modern automated reasoning programs can use a direct approach, suitably guided to prevent the proving of spurious partial results that are irrelevant to obtaining the complete result.

When Dodgson used his ‘method’ of Barred Premises to verify a tree, he guided the generation of the ordered lists by employing an ordering strategy known now as unit preference which selects first the propositions with the fewest number of terms. In his own words:

“[W]hen there are two Branches, of which one is headed by a single Letter, and the other by a Pair, to take the single Letter first, turn it into a Sorites, and record its Partial Conclusion: then take the double-Letter Branch: turn it also into a Sorites.” (Bartley 1977, p. 295)

When verifying a tree, he also employed a rule to eliminate superfluous premises (those premises that don’t eliminate anything). His rule was to ignore such a premise, even if it caused a branching of the tree. But in the absence of more powerful inference rules and additional strategies first developed in the twentieth century, he had no way to approach the solution of these multiliteral problems more efficiently.

The tree method is an extension of truth tables, and migrating from the tables to trees is easy to do. (For a complete discussion of this topic, see Anellis 2004.) Using truth tables to verify inconsistency is straightforward but very inefficient, as anyone who has worked with truth tables involving eight or more cases knows. Instead, the truth tree method examines sets of cases simultaneously, thereby making it efficient to test the validity of arguments involving a very large number of sentences by hand or with a computer. To test the validity of an argument consisting of two premises and a conclusion (equivalently, to determine whether the set consisting of the two premise sentences and the denial of the conclusion sentence is inconsistent) by the method of truth tables involving, say, three terms requires calculating the truth values in eight cases to determine whether or not there is any case in which all three sentences are true. A finished closed tree establishes the validity of the argument by showing there are no cases in which the three sentences are true. However, if any path in a finished tree cannot be closed, the argument is invalid because an open path represents a set of counterexamples.
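A compact tableau procedure shows the principle (this is my own minimal construction for propositional logic, not Dodgson's class-logic trees): deny the conclusion, expand formulas branch by branch, and close a branch when it contains a literal and its negation. A closed tree proves validity; an open branch yields a counterexample.

```python
# Minimal propositional truth-tree (tableau) sketch; my own construction,
# not Dodgson's class-logic trees, but the same principle: deny the
# conclusion, expand branch by branch, close a branch on a contradiction.
# Formulas are tuples: ('var', 'p'), ('not', f), ('and', f, g),
# ('or', f, g), ('imp', f, g).
def satisfiable(fs):
    """Return True if some branch stays open after full expansion."""
    for i, f in enumerate(fs):
        rest = fs[:i] + fs[i + 1:]
        if f[0] == 'and':
            return satisfiable(rest + [f[1], f[2]])
        if f[0] == 'or':  # branching rule
            return satisfiable(rest + [f[1]]) or satisfiable(rest + [f[2]])
        if f[0] == 'imp':
            return (satisfiable(rest + [('not', f[1])])
                    or satisfiable(rest + [f[2]]))
        if f[0] == 'not' and f[1][0] != 'var':
            g = f[1]
            if g[0] == 'not':
                return satisfiable(rest + [g[1]])
            if g[0] == 'and':
                return (satisfiable(rest + [('not', g[1])])
                        or satisfiable(rest + [('not', g[2])]))
            if g[0] == 'or':
                return satisfiable(rest + [('not', g[1]), ('not', g[2])])
            return satisfiable(rest + [g[1], ('not', g[2])])  # not-imp
    # Only literals remain: the branch closes iff it has p and not-p.
    pos = {f[1] for f in fs if f[0] == 'var'}
    neg = {f[1][1] for f in fs if f[0] == 'not'}
    return not (pos & neg)

def is_valid(premises, conclusion):
    # A closed tree for premises + denied conclusion proves validity.
    return not satisfiable(premises + [('not', conclusion)])

p, q = ('var', 'p'), ('var', 'q')
print(is_valid([('imp', p, q), p], q))  # True  (modus ponens)
print(is_valid([('imp', p, q), q], p))  # False (affirming the consequent)
```

Because branching rules split cases only as needed, the tree typically examines far fewer cases than the full truth table.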

The modern tree method, as a decision procedure for classical propositional logic and for first order logic, originates in Gentzen’s work on natural deduction, particularly his formulation of the sequent calculus known as LK. But the route is not a direct one; the chief contributors being Evert W. Beth, Jaakko Hintikka, Raymond Smullyan, and Richard Jeffrey.

On 16 July 1894 Dodgson connected his tree method with his earlier work, the Method of Diagrams. He wrote, ‘It occurred to me to try a complex Sorites by the method I have been using for ascertaining what cells, if any, survive for possible occupation when certain nullities are given’ (Bartley 1977, p. 279)

The Journal Editor, in a Note to the article, Lewis Carroll’s Method of Trees: Its Origins in ‘Studies in Logic,’ remarked:

The trees developed by Carroll in 1894, which anticipate concepts later articulated by Beth in his development of deductive and semantic tableaux, have their roots in the work of Charles Peirce, Peirce’s students and colleagues, and in particular in Peirce’s own existential graphs. (Anellis 1990, p. 22)

In a comprehensive article of his own, he suggested that “Perhaps this valuable contribution to proof theory [Dodgson’s tree method] ought to be called the Hintikka-Smullyan tree method, or even the Dodgson-Hintikka-Smullyan tree….” (Anellis 1990, p. 62).

In the eight books or chapters of Symbolic Logic, Part I. Elementary, Carroll introduces the concepts of things and their attributes, propositions and their types, diagrams and the diagrammatic method, syllogisms and their types, the more complex soriteses, and the two methods of subscripts and of underscoring.


While contemporaries such as Venn used diagrams to represent logical problems, Dodgson took the visual approach to a new level with his Method of Trees. It was one of two additional methods of formal logic he presented in part II of Symbolic Logic. The first, a direct approach to the solution of multi-literal soriteses that he called barred premises, is an extension of his underscoring method. A barred premise is one in which a term t occurs in one premise and its negative tʹ occurs in two or more premises, and conversely. For example, if a premise contains the term a and the two eliminand terms bc, then abc is a nullity, implying that a has the pair of attributes bcʹ or bʹc or bʹcʹ; that is, a is barred by the nullity from having attributes bc.

Dodgson extended this idea to what he called a barred group: a term t occurs in two or more premises and tʹ also occurs in two or more premises. His rule for working with barred premises requires that all the premises barring a given premise be used first. Dodgson did not define this method explicitly, so we will call these definitions and the rule for working with them his Method of Barred Premises. It is an early formal technique to guide the order of use of the premises of a sorites to arrive at the conclusion.

It appears he planned to do further work with his tree method and method of barred groups. In an unpublished letter whose first page is missing, probably from late in 1896 or early in 1897, he wrote, most probably to his sister Louisa:

 I have been thinking of that matter of “Barred Groups”…. It belongs to a most fascinating branch of the Subject, which I mean to call “The Theory of Inference”:…. Here is one theorem. I believe that, if you construct a Sorites, which will eliminate all along, and will give the aggregate of the Retinends as a Nullity, and if you introduce in it the same letter, 2 or 3 times, as an Eliminand, and its Contradictory the same number of times, and eliminate it each time it occurs, you will find, if you solve it as a Tree, that you don’t use all the Premisses! (Weaver Collection, undated; reproduced Abeles 2005, p.40)

An example, called ‘The Pigs and Balloons Problem,’ appears in Bartley on pp. 378-80. There Dodgson created a Register of Attributes showing the eliminands (terms that appear in both rows of the Register, that is, in positive and in negative form in two premises). When a term appears in both rows and in one row in more than two premises, we have the case of barred premises. All other terms are retinends.
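The bookkeeping of the Register is simple enough to sketch (the premise encoding and term names below are mine, not Dodgson's example): record which terms occur positively and which negatively across the premises, and classify accordingly.

```python
# Sketch (premise encoding and term names mine): build a Register of
# Attributes from a sorites and classify its terms. Each tuple below is
# one premise read as a nullity; a trailing apostrophe marks a negative.
premises = [("a", "b'"), ("b", "c'"), ("c", "d")]

positive_row = {t for prem in premises for t in prem if not t.endswith("'")}
negative_row = {t[:-1] for prem in premises for t in prem if t.endswith("'")}

# Eliminands appear in both rows (positively and negatively);
# every other term is a retinend.
eliminands = positive_row & negative_row
retinends = (positive_row | negative_row) - eliminands
print(sorted(eliminands), sorted(retinends))  # ['b', 'c'] ['a', 'd']
```

Here the retinends a and d are exactly the terms that survive into the conclusion 'all a are d'.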

His almost obsessive concern with exactness introduced a certain stiffness into many of his serious mathematical writings, but the humor he uses is infectious and infuses these works, particularly those on logic, with an appealing lightness. That his use of humor set his work apart is apparent in reviews of Symbolic Logic, Part I. Elementary that appeared during his lifetime.

An anonymous reviewer of the book wrote in The Educational Times that “[T]his very uncommon exposition of elementary logic appears to have tickled the fancy of folk.” (July 1, 1896, 316) The quotations that continue to be cited by modern authors, particularly from his logic books, reinforce this view. However, the reaction of the mathematician Hugh MacColl, the anonymous reviewer of Symbolic Logic, Part I. Elementary in The Athenaeum, was mixed. He described Carroll’s diagrammatic method for solving logical problems as elegant, but he was critical of Carroll’s notation (the subscript method) and his use of existential import, which asserts the existence of the subject in A propositions. For example, the proposition “All philosophers are logical” implies the existence of at least one philosopher. MacColl added, ‘[W]e cannot say what important surprises parts ii. and iii. of his system may have in store for us when they make their appearance.’ (October 17, 1896, pp. 520–521)

Hugh MacColl’s views on logic were influenced by reading Dodgson’s Symbolic Logic, Part I. Both MacColl and Dodgson were active contributors to the ‘Mathematical Questions and Solutions’ section of The Educational Times. And at least once, they were concerned with the same question in probability. MacColl submitted a solution to Dodgson’s logical problem, Question 14122, a version of the Barbershop Paradox published posthumously.

In addition to clear exposition and the unusual style that characterize his books, there seems to be one more essential affinity that supported MacColl’s attraction to Carroll’s work. Their exchanges show that both had a deep interest in the precise use of words. And both saw no harm in attributing arbitrary meanings to words, as long as the meaning is precise and the attribution agreed upon.

It seems clear that between August and December of 1894 Dodgson may have been considering a direction that Hugh MacColl developed more formally as early as 1896–97 and expanded in his 1906 book, Symbolic Logic and Its Applications. There MacColl defined strict implication, in which the content of the antecedent and consequent has a bearing on the validity of the conditional, twenty years before modal logic began to be placed on a modern footing with the work of the American philosopher and logician Clarence Irving Lewis.

4. The Automation of Deduction

The beginning of the automation of deduction goes back to the 1920s with the work of Thoralf Skolem, who studied the problem of the existence of a model satisfying a given formula, and who introduced functions to handle universal and existential quantifiers. Other logicians such as David Hilbert, Wilhelm Ackermann, Leopold Löwenheim, Jacques Herbrand, Emil Post, and a little later, Alonzo Church, Kurt Gödel, and Alan Turing introduced additional important ideas. One of the most important, a consequence of Hilbert’s metamathematical framework, was the notion that formalized logic systems can be the subject of mathematical investigation. But it was not until the 1950s that computer programs, using a tree as the essential data structure, were used to prove mathematical theorems.

The focus of these early programs was on proofs of theorems of propositional and predicate logic. Describing the 1957 ‘logic machine’ of Newell, Shaw, and Simon, Martin Davis noted that a directed path in a tree gave the proof of a valid argument where its premises and conclusion were represented as nodes, and an edge joining two premise nodes represented a valid derivation according to a given set of rules for deriving the proofs.

The modern tree method, as a decision procedure for classical propositional logic and for first order logic, originated in Gerhard Gentzen’s work on natural deduction, particularly his formulation of the sequent calculus known as LK. But the route was not a direct one, the main contributors being Evert Beth, Richard Jeffrey, Jaakko Hintikka and Raymond Smullyan. In 1955, Beth presented a tableau method he had devised consisting of two trees that would enable a systematic search for a refutation of a given (true) sequent. A tree is a left-sided Beth tableau in which all the formulae are true. The rules for decomposing the tree, that is, the inference rules, are equivalent to Gentzen’s rules in his sequent calculus.
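
The decomposition rules at the heart of the tree method can be sketched as a small satisfiability test on propositional formulas. The following is an illustrative toy, not a reconstruction of Beth’s two-tree tableaux: formulas are nested tuples, a branch closes when it contains an atom and its negation, and each rule either extends or splits the branch.

```python
def tableau(branch):
    """Return True if some branch stays open: the formulas are jointly satisfiable."""
    for f in branch:
        if isinstance(f, tuple) and f[0] == 'not' and isinstance(f[1], tuple):
            g = f[1]
            if g[0] == 'not':                                  # double negation
                return tableau(branch - {f} | {g[1]})
            if g[0] == 'and':                                  # ¬(A∧B): branch on ¬A, ¬B
                return (tableau(branch - {f} | {('not', g[1])}) or
                        tableau(branch - {f} | {('not', g[2])}))
            if g[0] == 'or':                                   # ¬(A∨B): add ¬A and ¬B
                return tableau(branch - {f} | {('not', g[1]), ('not', g[2])})
        if isinstance(f, tuple) and f[0] == 'and':             # A∧B: add both conjuncts
            return tableau(branch - {f} | {f[1], f[2]})
        if isinstance(f, tuple) and f[0] == 'or':              # A∨B: split the branch
            return tableau(branch - {f} | {f[1]}) or tableau(branch - {f} | {f[2]})
    # only literals remain: open unless some atom appears with its negation
    return not any(('not', x) in branch for x in branch if isinstance(x, str))

print(tableau(frozenset({('and', 'p', ('not', 'p'))})),   # a contradiction: every branch closes
      tableau(frozenset({('or', 'p', ('not', 'p'))})))    # a tautology: an open branch exists
```

A formula is valid exactly when every branch for its negation closes, which is the refutation style both Beth’s tableaux and Dodgson’s trees exploit.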

Bartley had this to say about Dodgson’s tree method for reaching valid conclusions from sorites and puzzle problems:

Carroll’s procedure bears a striking resemblance to the trees employed . . .according to a method of ‘Semantic Tableaux’ published in 1955 by the Dutch logician, E. W. Beth. The basic ideas are identical. (Bartley 1977, p. 32)

Dodgson was the first person in modern times to apply a mechanical procedure, his tree method, to demonstrate the validity of the conclusion of certain complex problems. The tree method is a direct extension of truth tables and Dodgson had worked with an incomplete truth table in one of the solutions he gave to his Barbershop Problem in September 1894. Bartley writes, “The matrix is used…for the components; but the analysis and assignment of truth values to the compounds are conducted in prose commentary on the table.” (Bartley 1977, p. 465n.)

On 4 August, he connected the tree method with a scored sorites:

I have just discovered how to turn a genealogy into a scored Sorites: the difficulty is to deal with forks. Say ‘all a is b or c’ = ‘all A is b’ and ‘all α is c,’ where the two sets A, α make up a. Then prove each column separately. (Wakeling, 2005, p. 158)

On 30 October, using a problem from a new edition of Keynes’ book, Studies and Exercises in Formal Logic, he discovered how to navigate a tree representing a sorites with 21 premises having 10 attributes, of which 8 are eliminated. (Wakeling 2005, p. 181)

When an open branch is divided into two branches and a term, here bʹ, appears in one of the branches and its negation is added to the other branch, we have an example of the use of the cut rule. Dodgson has anticipated a method that was not fully worked out until the 1930s. He wrote:

It is worthwhile to note that in each case, we tack on to one of the single Letters, the Contradictory of the other: this fact should be remembered as a rule….We have now got a Rule of Procedure, to be observed whenever we are obliged to divide our Tree into two Branches. (Bartley 1977, p. 287)

He continued to discover new ways to improve his handling of trees, recording in his diary on November 12/13, 1896, “Discovered [a] method of combining 2 Trees, which prove abcʹ0 † abdʹ0, into one proving ab(cd)ʹ0, by using the Axiom cd(cd)ʹ0.” (Wakeling 2005, p. 279)

In an exchange of letters in October and November of 1896 with John Cook Wilson, Wykeham Professor of Logic at Oxford, Dodgson modified an eighteen-premise version of a problem containing superfluous premises to one with fifteen premises. Bartley includes both versions as well as their solutions by the tree method.

In an unpublished letter dated 25 September 1896 to Cook Wilson, in connection with a sorites problem, Dodgson wrote:

What you say about ‘superfluous Premisses’ interests me very much. It is a matter that beats me, at present . . . &, if you can formulate any proof enabling you to say ‘all the Premises are certainly needed to prove the Conclusion,’ I shall be very glad to see it. (Dodgson, Sparrow Collection, 25 September 1896. Courtesy Morton N. Cohen)

The difficulty of establishing a theorem to determine superfluous premises troubled him. It was a problem he was unable to solve.

5. Dodgson’s Logic Circle

John Venn was another English logician whose work Dodgson was familiar with and with whom he had contact. Venn, a supporter of Boole’s approach to logic, published the first edition of his Symbolic Logic in 1881. It included his now familiar diagrams to depict the relations between classes so that the truth or falsity of propositions employing them could be established.

In 1892 William E. Johnson had published the first of three papers in Mind titled “The Logical Calculus” in which he distinguished the term ‘conditional’ from the term ‘hypothetical’. Dodgson, like most logicians of his time, did not make this distinction, using the term ‘hypothetical’ for both. Johnson’s view was that a conditional expresses a relation between two phenomena, while a hypothetical expresses a relation between two propositions of independent import. So a conditional connects two terms, while a hypothetical connects two propositions. John Neville Keynes, with whose work Dodgson was quite familiar, agreed with Johnson’s view. Venn, however, although he, too, knew Johnson’s work, held a very different view of hypotheticals, contending that because they are of a non-formal nature, they really should not be considered part of symbolic logic.

William Stanley Jevons was another supporter of Boole, whose books Pure Logic; or, the Logic of Quality Apart From Quantity (1864) and The Principles of Science: A Treatise on Logic and Scientific Method (1874) Dodgson owned. Jevons introduced a logical alphabet for class logic in 1869, and the following year he exhibited to the Royal Society in London a machine that used it to solve problems in logic mechanically, which he called the logical piano.

Dodgson was very familiar with Keynes’ Studies and Exercises in Formal Logic in its second edition from 1887, quoting directly from it in chapter II of Book X in part II of Symbolic Logic. Keynes included Dodgson’s Barbershop Paradox as an exercise in chapter IX of the 1906 edition of his book. (Keynes 1906, pp. 273–274)

a. The ‘Alice’ Effect

In an exchange of letters between Venn and Dodgson in 1894, and from the reviews that appeared soon after the publication of both The Game of Logic and Symbolic Logic, Part I, we see that Dodgson’s reputation as the author of the ‘Alice’ books cast him primarily as a writer of children’s books. The barrier created by the fame Carroll deservedly earned from those books, combined with a writing style more literary than mathematical, prevented the community of British logicians from properly recognizing him as a significant logician.

That this was his reputation is apparent in reviews of Symbolic Logic, Part I that appeared during his lifetime. Certainly, most of his contemporaries were unaware of the importance of the diagrammatic method for solving syllogisms that he first presented in The Game of Logic. In an unpublished letter to Venn dated 11 August 1894, he wrote:

‘You are quite welcome to make any use you like of the problem I sent you, & (of course) to refer to the article in ‘Mind’ – [A Logical Paradox, N. S. v. 3, 1894, pp. 436-438 concerning an example of hypothetical propositions] Your letter has, I see crossed one from me, in which I sent you ‘Nemo’s algebraical illustration. I hope you may be able to find room for it in your next book. Perhaps you could add it, as a note, at the end of the book, & give it, at p. 442, a reference thereto? I shall be grateful if you will not mention to anyone my real name, in connection with my pseudonym. I look forward with pleasure to studying the new edition of your book.’ (Venn Papers, Gonville and Caius Libraries, Cambridge University)

And on p. 442 of the second revised edition of his Symbolic Logic Venn wrote:

[T]hat the phrase ‘x implies y’ does not imply that the facts concerned are known to be connected, or that the one proposition is formally inferrible from the other. This particular aspect of the question will very likely be familiar to some of my readers from a problem recently circulated, for comparison of opinions, amongst logicians. As the proposer is, to the general reader, better known in a very different branch of literature, I will call it Alice’s Problem.

6. Logic Paradoxes

a. The Barbershop Paradox

An appendix to Book XXI contains eight versions of Dodgson’s Barbershop Paradox, one of which was published in Mind as “A Logical Paradox”. In another appendix to this book Bartley discusses Carroll’s other contribution to Mind, “What the Tortoise said to Achilles.” These two appendices make the issues Carroll dealt with in these published articles—along with the commentaries they engendered from modern logicians and philosophers—much more accessible.

The Barbershop Paradox was Dodgson’s first publication in the journal Mind. It is a transcription of a dispute between him and John Cook Wilson. Bertrand Russell used the problem in his Principles of Mathematics to illustrate his principle that a false proposition implies all others. Venn was one of the first to discuss it in print, in the second edition of his Symbolic Logic.

In the Barbershop Paradox, there are two rules governing the movements of three barbers, Allen, Brown, and Carr. The first is that when Carr goes out, then if Allen goes out, Brown stays in. The second rule is that when Allen goes out, Brown goes out. The challenge is to use these rules to determine Carr’s possible movements. In a lively two-year correspondence from late 1892, preserved in the Bodleian Library, Dodgson and Cook Wilson honed their differing views on the Barbershop Paradox. Cook Wilson believed that all propositions are categorical and therefore hypotheticals could not be propositions.

The unsettled nature of the topic of hypotheticals during Dodgson’s lifetime is apparent at the beginning of the Note that Carroll wrote at the end of his article:

This paradox…is, I have reason to believe, a very real difficulty in the Theory of Hypotheticals. The disputed point has been for some time under discussion by several practised logicians, to whom I have submitted it; and the various and conflicting opinions, which my correspondence with them has elicited, convince me that the subject needs further consideration, in order that logical teachers and writers may come to some agreement as to what Hypotheticals are, and how they ought to be treated. (Carroll 1894, p. 438)

Bartley remarks in his book that the Barbershop Paradox is not a genuine logical paradox as is the Liar Paradox. Generally, a paradox is a statement that appears to be either self-contradictory or contrary to expectations.

The many versions of the Barbershop Paradox that Dodgson developed demonstrate an evolution of his thoughts on hypotheticals and on material implication, in which the connection between the antecedent and the consequent of the conditional (if (antecedent), then (consequent)) is purely formal: the truth of the conditional depends only on the truth values of its parts, not on any connection between their contents. This is a result of Boole’s logic. Six versions of the Barbershop Paradox provide insight into Dodgson’s thinking about the problem as it evolved. Bartley published five of these six, as well as three others, two of which are examples; one is almost the same as one of the others Bartley published. Additionally, there are three earlier versions that Bartley did not publish; all are from March 1894.

Earlier versions of the “Barbershop Paradox” show the change in the way Dodgson represented conditionals. In the earlier versions, he expressed a hypothetical proposition in terms of classes, that is, if A is B, then C is D. Only later did he designate A, B, C, and D as propositions.

A version of the Barbershop Paradox that was not recognized as such by Bartley, Question 14122, was published in February 1899 in The Educational Times after Dodgson’s death and reprinted in Mathematical Questions and Solutions the next year. Two different solutions appeared that same year, one by Harold Worthington Curjel, a member of the London Mathematical Society, the other by Hugh MacColl. (For a more detailed discussion of the Barbershop Paradox, see A. Moktefi’s publications.)

An article titled, “A Logical Paradox”, published in 1894 in Mind, generated responses in the form of subsequent articles published in Mind by many of the eminent logicians of Dodgson’s time, including Hugh MacColl, E. E. Constance Jones, Lecturer in Logic at Girton, one of the Women’s colleges at Cambridge, Alfred Sidgwick, the author of Fallacies. A View of Logic from the Practical Side, as well as Johnson, Keynes, Cook Wilson, and Russell.

A letter dated 11 August 1894 from Dodgson to John Venn resulted in Venn including a version of the Barbershop Paradox in the second edition (1894) of his book, Symbolic Logic. Keynes included a version of the Barbershop Paradox in his book, and Bradley discussed it in a book of Selected Correspondence.

Bertrand Russell gave what is now the generally accepted conclusion to this problem in his 1903 book, The Principles of Mathematics. If p represents ‘Carr is out’, q represents ‘Allen is out’, and r represents ‘Brown is out’, then the Barbershop Paradox can be written as (1) q implies r; (2) p implies that q implies not-r. Russell asserted that the only correct inference from (1) and (2) is: if p is true, q is false, that is, if Carr is out, Allen is in. (Russell 1903, p. 18)
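
Russell’s resolution of the puzzle can be checked mechanically by enumerating truth values. The sketch below uses Russell’s p, q, r; the helper function and flag names are ours:

```python
from itertools import product

def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

entailed = True          # do (1) and (2) force 'if p then not-q'?
carr_can_be_out = False  # do the premises leave p = True (Carr out) possible?

# p = 'Carr is out', q = 'Allen is out', r = 'Brown is out'
for p, q, r in product([False, True], repeat=3):
    premise1 = implies(q, r)                    # (1) q implies r
    premise2 = implies(p, implies(q, not r))    # (2) p implies (q implies not-r)
    if premise1 and premise2:
        if not implies(p, not q):
            entailed = False
        if p:
            carr_can_be_out = True

print(entailed, carr_can_be_out)  # True True
```

The premises thus entail ‘if Carr is out, Allen is in’ yet remain satisfiable with Carr out, which is why the supposed paradox dissolves under material implication.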

b. Achilles and the Tortoise

 Dodgson published a more consequential paradox in Mind the following year: “What the Tortoise Said to Achilles.” Although it did not generate any responses during Dodgson’s lifetime, many responses were received after his death, and it remains an unsolved problem to the current day. (See Moktefi and Abeles 2016.)

This is the paradox:

    1. Things that are equal to the same thing are equal to each other,
    2. The two sides of this triangle are things that are equal to the same.
    3. The two sides of this triangle are equal to each other.

Dodgson was the first to recognize that when making a logical inference, the rule that permits drawing a conclusion from the premises cannot be considered to be a further premise without generating an infinite regress.

Both the Barbershop and Achilles paradoxes involve conditionals and Dodgson employed material implication to argue them, but he was uncomfortable with it. He struggled with several additional issues surrounding hypotheticals. In the Note to the published version of the Barbershop Paradox in July 1894, Dodgson asked several questions, the first being whether a hypothetical can be legitimate when its premise is false; the second being whether two hypotheticals whose forms are ‘if A then B’ and ‘if A then not-B’ can be compatible.

Bartley published a second edition of Symbolic Logic, Part II in 1986 in which he included solutions to some of Carroll’s more significant problems and puzzles, additional galley proof discoveries, and a new interpretation, by Mark R. Richards, of Carroll’s logical charts.

By 1897, Dodgson may have been rethinking his use of existential import. Bartley cites a diary entry from 1896, and an undated letter to Cook Wilson as evidence (Bartley 1977, pp. 34–35.) However, there is even more evidence, incomplete in Bartley’s book, to support this break with the idea of existential import. Book (chapter) XXII contains Dodgson’s solutions to problems posed by other logicians. One of these solutions to a problem posed by Augustus De Morgan that concerns the existence of their subjects appears in an unaddressed letter dated 15 March 1897. (Bartley 1977, pp. 480–481) From Dodgson’s response to this letter six days later, we now know it was sent to his sister, Louisa, responding to her solution of the problem. In this unpublished letter, Dodgson suggested:

[I]f you take into account the question of existence and assume that each Proposition implies the existence of its Subject, & therefore of its Predicate, then you certainly do get differences between them: each implies certain existences not implied by the others. But this complicates the matter: & I think it makes a neater problem to agree (as I shall propose to do in my solution of it) that the Propositions shall not be understood to imply existence of these relationships, but shall only be understood to assert that, if such & such relationships did exist, then certain results would follow. (Dodgson, Berol Collection, New York University, 21 March 1897)

7. Dodgson and Modern Mathematics

In part II of Symbolic Logic, Dodgson’s approach led him to invent various methods that lend themselves to mechanical reasoning. These are the ‘methods’ of barred premises and barred groups and, most importantly, the method of trees. Although Dodgson worked with a restricted form of the logic of classes and used rather awkward notation and odd names, the methods he introduced foreshadowed modern concepts and techniques in automated reasoning like truth trees, binary resolution, unit preference and set of support strategies, and refutation completeness.

His system of logic diagrams is a sound and complete proof system for syllogisms. The soundness of a proof system ensures that only true conclusions can be deduced. (A proof system is sound if and only if the conclusions we can derive from the premises are logical consequences of them.) Its completeness guarantees that all true conclusions can be deduced. (A proof system is complete if and only if whenever a set of premises logically implies a conclusion, we can derive that conclusion from those premises.)

Several of the methods Dodgson used in his Symbolic Logic contain kernels of concepts and techniques that have been employed in automatic theorem proving beginning in the twentieth century. The focus of these early programs was on proofs of theorems of propositional and predicate logic.

His only inference rule, underscoring, which takes two propositions, selects a term in each of the same subject or predicate having opposite signs, and yields another proposition, is an example of binary resolution, the most important of these early proof methods in automated deduction.

Although Dodgson did not take the next step, attaching the idea of inconsistency to the set of premises and conclusion, this method for handling multi-literal syllogisms in the first figure is a formal test for inconsistency that qualifies as a finite refutation of the set of eliminands and retinends. His construction of a tree uses one inference rule (algorithm), binary resolution, and he guides the tree’s development with a restriction strategy, now known as a set of support, that applies binary resolution at each subsequent step of the deduction only if the preceding step has been deduced from a subset of the premises and denial of the conclusion, that is, from the set of retinends. This strategy improves the efficiency of reasoning by preventing the establishment of fruitless paths. And this tree test is both sound and complete, that is, if the initial set of the premises and conclusion is consistent, there will be an open path through the tree rendering it sound; if there is an open path in the finished tree, the initial set of the premises and conclusion is consistent, rendering it complete.
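
The refutation style described above can be illustrated with a toy prover. The following is a minimal sketch in modern clausal form, not Dodgson’s notation: terms are signed letters, and the function names and encoding are ours.

```python
def resolve(c1, c2):
    """All binary resolvents of two clauses, each a frozenset of signed literals."""
    resolvents = []
    for lit in c1:
        complement = lit[1:] if lit.startswith('-') else '-' + lit
        if complement in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {complement})))
    return resolvents

def refutes(clauses):
    """Saturate under binary resolution; True when the empty clause appears."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a != b:
                    for r in resolve(a, b):
                        if not r:          # empty clause: the set is inconsistent
                            return True
                        new.add(r)
        if new <= clauses:                 # nothing new: the set is consistent
            return False
        clauses |= new

# Sorites: all a are b; all b are c; therefore all a are c.
# Premises plus the *denied* conclusion ('some a is not c') in clausal form:
print(refutes([frozenset({'-a', 'b'}), frozenset({'-b', 'c'}),
               frozenset({'a'}), frozenset({'-c'})]))  # True
```

Here the denied conclusion yields the empty clause, mirroring the closed tree that certifies a valid sorites.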

A comparison of the two parts of Symbolic Logic reveals the progress Dodgson made toward an automated approach to the solution of multiply connected syllogistic problems (soriteses), and puzzle problems bearing intriguing names such as “The Problem of Grocers on Bicycles”, and “The Pigs and Balloons Problem”.

Many modern automated reasoning programs employ a reductio ad absurdum argument, while other reasoning programs that are used to find additional information do not seek to establish a contradiction. In 1985, one of Dodgson’s puzzle problems, the “Problem of the School Boys”, was modified by Ewing Lusk and Ross Overbeek to be compatible with the direct generation of statements (in clausal form) by an automated reasoning program. Their program first produced a weaker conclusion before generating the same stronger conclusion Dodgson produced using his tree method. The solutions to Dodgson’s ‘Salt and Mustard Problem’, by Lusk and Overbeek in 1985 and by A. G. Cohn in 1989, used a many-sorted logic to illustrate the power of two of these programs.

In computer science, a database has a state, which is a value for each of its elements. A trigger can test a condition specified by a when clause; that is, a certain action will be executed only if the rule is triggered and the condition holds when the triggering event occurs.
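
As a toy illustration of that mechanism (the Database class, the 'stock' element, and the 'reorder' action are invented for this sketch):

```python
class Database:
    """Toy state-plus-trigger model: a 'when' condition guards the triggered action."""
    def __init__(self):
        self.state = {'stock': 10}
        self.log = []

    def update(self, key, value):
        self.state[key] = value                      # the triggering event
        # the trigger's 'when' clause: fire only if the condition holds now
        if key == 'stock' and self.state['stock'] < 5:
            self.log.append('reorder')               # the triggered action

db = Database()
db.update('stock', 7)   # event occurs, condition false: no action
db.update('stock', 3)   # event occurs, condition true: action executes
print(db.log)  # ['reorder']
```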

Dodgson defined the term, Cosmophase, as “[t]he state of the Universe at some particular moment: and I regard any Proposition, which is true at that moment, as an Attribute of that Cosmophase.” (Bartley 1977, p. 481) Curiously, Dodgson’s definition of a Cosmophase fits nicely into this modern framework.

8. Carroll as Popularizer

Dodgson was both a popularizer and an educator of mathematics and logic. He began teaching mathematics at St. Aldate’s School, across from Christ Church, in 1856. He considered The Game of Logic and, to a greater degree, Symbolic Logic, Part I. Elementary, to be far superior to the textbooks in current use and to be useful in teaching students between the ages of twelve and fourteen. The objective of the game, played with a board and counters, was to solve syllogisms. He believed his entire Symbolic Logic book, including the projected parts II and III, would appeal to pupils up to the age of twenty, and hence be useful at the university level.

While he was the Mathematical Lecturer at Christ Church, he often gave free private instruction to family groups of parents, their children and their children’s friends in their private homes on such mathematical topics as ciphers, particularly his Memoria Technica cipher, arithmetical and algebraical puzzles, and an algorithmic method to find the day of the week for any given date. He originally created the Memoria Technica cipher in 1875 to calculate logarithms but found many more uses for it as a general aid for remembering, writing a simplified version of it for teaching purposes in 1888.

The topics he chose to teach privately focused on memory aids, number tricks, computational shortcuts, and problems suited to rapid mental calculation, developing this last topic into a book, Curiosa Mathematica, Part 2: Pillow Problems Thought Out During Wakeful Hours, published in 1893. He continued to provide instruction in this way on logic topics. He also gave logic lessons in his rooms at Christ Church. In June 1886 he gave lectures at Lady Margaret Hall, Oxford, and in May 1887 at the Oxford High School for Girls. There he lectured to both students and, separately, their teachers. He gave lectures at St. Hugh’s Hall, another of the women’s colleges at Oxford, in May and June of 1894. In January 1897 he began a course of lectures on symbolic logic at Abbot’s Hospital in Guildford.

He used material that he eventually incorporated into his book, The Game of Logic, a work he had essentially completed in July 1886 but that did not appear until November, in an edition Dodgson rejected for being substandard. The second (published) edition came out in February of the following year. Dodgson hoped the book would appeal to young people as an amusing mental recreation. He found this book, and even more so his Symbolic Logic, Part I. Elementary, essential in teaching students.

On 21 August 1894, answering a letter from a former child friend, Mary Brown, now aged thirty-two, he wrote:

You ask what books I have done…. At present I’m hard at work (and have been for months) on my Logic-book. (It really has been on hand for a dozen years: the “months” refer to preparing for the Press.) It is Symbolic Logic, in 3 Parts – and Part I is to be easy enough for boys and girls of (say) 12 or 14. I greatly hope it will get into High Schools, etc. I’ve been teaching it at Oxford to a class of girls at the High School, another class of the mistresses(!), and another class of girls at one of the Ladies’ Colleges. (Cohen 1979, p. 1031)

In a letter dated 25 November 1894 to his sister, Elizabeth, he wrote:

One great use of the study of Logic (which I am doing my best to popularise) would be to help people who have religious difficulties to deal with, by making them see the absolute necessity of having clear definitions, so that, before entering on the discussion of any of these puzzling matters, they may have a clear idea what it is they are talking about. (Cohen 1979, p. 1041)

The statements of almost all the problems in both parts of his symbolic logic books are amusing to read. This attribute derives from the announced purpose of the books, to popularize the subject. But Dodgson naturally incorporated humor into much of his serious mathematical writing, infusing this work with the mark of his literary genius.

Edward Wakeling notes that his logic teaching took three forms: a series of lessons in a school, lessons to a small group of friends or families he knew, or teaching a single confident, intelligent, and alert child-friend. This last method was his favorite. Edith Rix, to whom he dedicated A Tangled Tale (1885) in the form of an eight-line acrostic poem in which the second letter of each line spells her name, was his first logic pupil. Dodgson wrote many letters to her concerning problems in logic. She was, he reportedly said, the cleverest woman he ever knew.

In the Appendix Addressed to Teachers from part I of Symbolic Logic, fourth edition, Carroll indicated some of the topics he planned for part II. These include “[T]he very puzzling subjects of Hypotheticals, Dilemmas, and Paradoxes.” (Bartley 1977, p. 229) Dodgson was generally interested in the quality of arguments, particularly those that could confuse. Paradoxes fall in this category because they appear to prove what is known to be false. And paradoxes certainly challenged him to create ingenious methods to solve them, such as his tree method.

Dodgson expressed his thoughts about how best to teach logic to young people in “A Fascinating Mental Recreation for the Young” when he wrote:

As to the first popular idea – that Logic is much too hard for ordinary folk, and specially for children, I can only say that I have taught the method of Symbolic Logic to many children, with entire success…High-School girls take to it readily. I have had classes of such girls, and also of the mistresses,….As to Symbolic Logic being dry, and uninteresting, I can only say, try it! I have amused myself with various scientific pursuits for some forty years and have found none to rival it for sustained and entrancing attractiveness. (Carroll 1896, reproduced in Abeles 2010, pp. 96-97)

9. Conclusion

The inspiration for much of what Dodgson wrote about logic came from his contacts with faculty members at other colleges in Oxford, in Cambridge and elsewhere. He communicated his work within a circle of colleagues and solicited their opinions. Unlike most of them, he did not seek membership in the professional mathematical and philosophical societies, nor did he attend their meetings or give lectures, with few exceptions. He was not a traditional mathematician. Rather, he applied mathematical and logical solutions to problems that interested him. As a natural logician at a time when logic was not considered to be a part of mathematics, he successfully worked in both fields.

Although the ingenuity of the puzzles and examples Dodgson created was generally applauded, Bartley’s claims about the significance of Dodgson’s work were questioned, so that its value in the development of logic was not fully appreciated when the book was first published. But subsequently, other scholars working on Carroll’s logic and mathematical writings, such as Duncan Black, George Englebretsen, Amirouche Moktefi, Adrian Rice, Mark Richards, Eugene Seneta, Edward Wakeling, and Robin Wilson, have made important discoveries that have greatly enhanced Carroll’s reputation.

Why did scholars become interested in Dodgson’s serious work only in the second half of the twentieth century? In addition to Bartley’s publication of Carroll’s Symbolic Logic book, there are several more reasons. One of the most important is the role certain publishers played in making his work available. These include Clarkson N. Potter and Dover in the USA, and Kluwer in the Netherlands, whose books were distributed both in the USA and in the UK. The articles in Martin Gardner’s popular ‘Mathematical Games’ section of Scientific American magazine also included several of Dodgson’s mathematical ideas and were invaluable sources of information for scholars. Another important reason is that only in the twentieth century did some of his mathematical and logical ideas find application, in the sense that his work foreshadowed their use. Dodgson’s mathematical and logical work was broadly based, but his influence on important developments in the twentieth century occurred primarily after his death.

10. References and Further Reading

a. Primary

  • Boole, G. An Investigation of the Laws of Thought. London, Macmillan, 1854.
  • Boole, G. The Mathematical Analysis of Logic. London, Macmillan, 1847.
  • Bradley, F.H. The Principles of Logic, London, Oxford University Press, 1883.
  • Carroll, L. The Game of Logic. London, Macmillan, 1887.
  • Carroll, L. Symbolic Logic, Part I. London, Macmillan, 1896.
  • Carroll, L. The Game of Logic. Published with Symbolic Logic, Part I, as The Mathematical Recreations of Lewis Carroll, New York, Dover, 1958.
  • Carroll, L. “A Logical Paradox.” Mind, v. 3, n. 11, 1894, pp. 436-438.
  • Carroll, L. “What the Tortoise Said to Achilles.” Mind, v. 4, n. 14, 1895, pp. 278-280.
  • Cohen, M. N. The Letters of Lewis Carroll. 2 vols. New York, Oxford University Press, 1979.
  • De Morgan, A. Formal Logic. London, Taylor & Walton, 1847.
  • De Morgan, A. On the Syllogism and Other Logical Writings. London, Routledge & Kegan Paul, 1966.
  • Dodgson, C. L. Euclid and his Modern Rivals. London, Macmillan, 1879.
  • Dodgson, C. L. Curiosa Mathematica. Part I: A New Theory of Parallels. London, Macmillan, 1888.
  • Dodgson, C. L. Curiosa Mathematica. Part II: Pillow Problems. London, Macmillan, 1893.
  • Jevons, W. S. Pure Logic, or the Logic of Quality Apart from Quantity, London, E. Stanford, 1864.
  • Johnson, W. E. “The Logical Calculus I, II, III.” Mind, v. 1, 1892, I: pp. 3-30; II: pp. 235-250; III: pp. 340-357.
  • Keynes, J. N. Studies and Exercises in Formal Logic, 3rd ed. London, Macmillan, 1894.
  • Russell, B. Principles of Mathematics. Cambridge, Cambridge University Press, 1903.
  • Sidgwick, A. Fallacies: A View of Logic from the Practical Side. London, Kegan, Paul, Trench, 1883.
  • Venn, J. Symbolic Logic. London, Macmillan, 1881.
  • Venn, J. Symbolic Logic, 2nd revised ed. London, Macmillan, 1894.
  • Wakeling, E., ed. Lewis Carroll’s Diaries. v. 6. Clifford, Herefordshire, The Lewis Carroll Society, 2001.
  • Wakeling, E., ed. Lewis Carroll’s Diaries. v. 8. Clifford, Herefordshire, The Lewis Carroll Society, 2004.
  • Wakeling, E., ed. Lewis Carroll’s Diaries. v. 9. Clifford, Herefordshire, The Lewis Carroll Society, 2005.

b. Secondary

  • Abeles, F.F. “Lewis Carroll’s Method of Trees: Its Origins in Studies in Logic.” Modern Logic, v. 1, n. 1, 1990, pp. 25-35.
  • Abeles, F. F., ed. The Mathematical Pamphlets of Charles Lutwidge Dodgson and Related Pieces. New York, Lewis Carroll Society of North America, 1994.
  • Abeles, F. F. “Lewis Carroll’s Formal Logic.” History and Philosophy of Logic v. 26, 2005, pp.33-46.
  • Abeles, F. F. “From the Tree Method in Modern Logic to the Beginning of Automated Theorem Proving.” In: Shell-Gellash, A. and Jardine, D., eds. From Calculus to Computers. Washington DC, Mathematical Association of America, 2005, pp. 149-160.
  • Abeles, F. F. “Lewis Carroll’s Visual Logic.” History and Philosophy of Logic v. 28, 2007, pp. 1-17.
  • Abeles, F. F., ed. The Logic Pamphlets of Charles Lutwidge Dodgson and Related Pieces.  New York, Lewis Carroll Society of North America, 2010.
  • Abeles, F. F. “Toward a Visual Proof System: Lewis Carroll’s Method of Trees.” Logica Universalis, v. 6, n. 3/4, 2012, pp. 521-534.
  • Abeles, F. F. “Mathematical Legacy.” In: Wilson, R. and Moktefi, A. eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll). Oxford, Oxford University Press, 2019, pp. 177-215.
  • Anellis, Irving. “From Semantic Tableaux to Smullyan Trees: the History of the Falsifiability Tree Method.” Modern Logic, v. 1, n. 1, 1990, pp. 36- 69.
  • Corcoran, J. “Information-Theoretic Logic.” In Martinez, C. et al. eds. Truth in Perspective, Aldershot, Ashgate, 1998, pp. 113-135.
  • Englebretsen, G. “The Tortoise, the Turtle and Deductive Logic.” Jabberwocky, v. 3, 1974, pp.11-13.
  • Englebretsen, G. “The Properly Victorian Tortoise.” Jabberwocky, v. 23, 1993/1994, pp.12-13.
  • Englebretsen, G., “The Dodo and the DO: Lewis Carroll and the Dictum de Omni.” Proceedings of the Canadian Society for the History and Philosophy of Mathematics, v. 20, 2008, pp. 142-148.
  • Macula, A. “Lewis Carroll and the Enumeration of Minimal Covers.” Mathematics Magazine, v. 69, 1995, pp. 269-274.
  • MacColl, H. “Review of Symbolic Logic, Part I, by Lewis Carroll.” The Athenaeum, 17 October 1896, pp. 520-521.
  • Marion, M. and Moktefi, A. “La Logique Symbolique en Débat à Oxford à la Fin du XIXe Siècle: Les Disputes Logiques de Lewis Carroll et John Cook Wilson.” Revue d’Histoire des Sciences, v. 67, n. 2, 2014, pp. 185-205.
  • Moktefi, A. “Beyond Syllogisms: Carroll’s (Marked) Quadriliteral Diagram.” In: Moktefi, A., Shin, S.-J.,eds. Visual Reasoning with Diagrams, Basel, Birkhäuser, 2013, pp. 55-72.
  • Moktefi, A. “On the Social Utility of Symbolic Logic: Lewis Carroll against ‘The Logicians’.” Studia Metodologiczne, v. 35, 2015, pp. 133-150.
  • Moktefi, A., “Are Other People’s Books Difficult to Read? The Logic Books in Lewis Carroll’s Private Library.” Acta Baltica Historiae et Philosophiae Scientiarum, v. 5, n. 1, 2017, pp.28-49.
  • Moktefi, A. “Logic.” In: Wilson, R. J., Moktefi, A., eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll), Oxford, Oxford University Press, 2019, pp. 87-119.
  • Moktefi, A. and Abeles, F. F. “The Making of ‘What the Tortoise Said to Achilles’: Lewis Carroll’s Logical Investigations toward a Workable Theory of Hypotheticals.” The Carrollian, v. 28, 2016, pp. 14-47.
  • Moktefi, A. “Why Make Things Simple When You Can Make Them Complicated? An Appreciation of Lewis Carroll’s Symbolic Logic.” Logica Universalis, v. 15, 2021, pp. 359-379.
  • More, T., Jr. “On the Construction of Venn Diagrams.” J. of Symbolic Logic, v. 24, n. 4, 1959, pp. 303-304.
  • Rice, Adrian.  “Algebra.”  In: Wilson, R. J., Moktefi, A., eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll), Oxford, Oxford University Press, 2019, pp. 57- 85.
  • Richards, M. Game of Logic. https://lewiscarrollresources.net/gameoflogic/.
  • Seneta, E. “Lewis Carroll as a Probabilist and Mathematician.” Mathematical Scientist, v. 9, 1984, pp. 79-84.
  • Seneta, E. “Victorian Probability and Lewis Carroll.” Journal of the Royal Statistical Society Series A-Statistics in Society, v. 175, n. 2, 2012, pp. 435-451.
  • Van Evra, J. “The Development of Logic as Reflected in the Fate of the Syllogism 1600-1900.” History and Philosophy of Logic, v. 21, 2000, pp. 115-134.
  • Wilson, R. “Geometry.” In: Wilson, R. and Moktefi, A. eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll). Oxford, Oxford University Press, 2019, pp. 31-55.
  • Wilson, R. and Moktefi, A. eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll). Oxford, Oxford University Press, 2019.

 

Author Information

Francine F. Abeles
Email: fabeles@kean.edu
Kean University
U. S. A.

Existence

Since Thales fell into a well while gazing at the stars, philosophers have invested considerable effort in trying to understand what, how, and why things exist. Even though much ink has been spilled on these questions, this article focuses on the following three:

(1) What is the nature of existence?

(2) Are there different ways/modes of existing?

(3) Why does something exist instead of nothing?

First, we review the main attempts to answer (1) and (2). These are questions about existence as such. Then, we show how those attempts have been used to address question (3). This is an ontological question, that is, a question not about existence as such but about what exists.

Question (1) is addressed in Sections 1 and 2. In Section 1, we discuss the orthodox view of existence: existence is not a property of individual objects (often called a first-order property); rather, it is a property of properties of individual objects (a second-order property). On the orthodox view, this leads to a tight connection between existence and quantification, which is expressed by terms like ‘something’ or ‘everything’ in natural language; the connection is illustrated by the common practice of referring to the particular quantifier (‘something’) as the existential quantifier. In Section 2, we discuss two recent views that disagree with the orthodox view: Meinongianism and universalism. Meinongianism claims that the (unrestricted) particular quantifier is separated from existence (it is existentially unloaded) and that existence is a first-order property. In other words, some objects in the domain of quantification lack the first-order property of existence. Universalism also takes existence to be a first-order property, but disagrees with Meinongianism on two points: first, it takes existence to be a universal property, namely, a property that everything has; second, it takes the (unrestricted) particular quantifier to be existentially loaded.

Question (2) is the subject matter of Section 3. To begin with, we introduce ontological pluralism, that is, the view according to which some things exist in a different way from others. After a brief historical introduction, we present a theological reason, a phenomenological reason, and a philosophical reason for endorsing such a controversial view. Moreover, we focus on how Kris McDaniel develops his own account of ontological pluralism in relation to Heidegger’s philosophy. To conclude, we briefly analyze van Inwagen’s argument against ontological pluralism and some possible replies.

Section 4 gives an example of how views on the nature of existence bear on ontological questions by wrestling with question (3). We begin with van Inwagen’s statistical argument: we present the argument and summarize some of the critiques raised against it. Then, we present McDaniel’s approach to question (3): relying on ontological pluralism, he argues that, instead of wondering why there is something rather than nothing, it would be more profitable to ask why there are ‘concrete material things’ rather than no ‘concrete material things’. To conclude, we compare McDaniel’s view with the Meinongian one defended by Priest.

Table of Contents

  1. Existence as a Second-Order Property and Its Relation to Quantification
  2. Existence as a First-Order Property and Its Relation to Quantification
    1. Meinongianism
    2. Universalism
  3. How Many Ways of Being Existent?
  4. Why Is There Something Rather than Nothing?
  5. Conclusion
  6. Appendix
  7. References and Further Reading

1. Existence as a Second-Order Property and Its Relation to Quantification

The orthodox view of existence, influenced by Frege’s and Quine’s views, is summarized by the following two claims:

FQ1 Existence is not a first-order property of individual objects, but rather a second-order property.

FQ2 Quantifiers are existentially loaded.

This section gives a brief explanation of these two claims.

To begin with, let us see how FQ1 is connected with the famous slogan that existence is not a predicate, which is often understood in the light of Kant’s claim that ‘being’ is not a real predicate. By a real predicate, he means a predicate that can be contained in a concept (or the definition of a concept) of an object. For example, the concept of the Empire State Building contains the following predicates: being a building, having a total height of 443.2 meters, and so on. These are real predicates. According to Kant, ‘being’ cannot be a part of any concept of any object. He says:

When I think a thing, through whichever and however many predicates I like (even in its thoroughgoing determination), not the least bit gets added to the thing when I posit in addition that this thing is. (Kant, 1781/1787, A600/B628, English translation, p. 567)

Then, what does ‘A is’ do? Kant distinguishes two different usages of ‘be’. First, being is used logically in the judgements of the form ‘A is B’, and “[i]n the logical use it [that is, being] is merely the copula of a judgment” (Kant 1781/1787, A 598/B 626, English translation, p. 567). On the other hand, when being is used in judgements of the form ‘A is’, such judgements state that all predicates in A are instantiated by an object. Since Kant regards being used in the latter way as existence, ‘A is’ is the same as ‘A exists’. So, according to Kant, the judgement ‘A exists’ tells us that some object instantiates all predicates in the concept A without adding any new predicate in A. (For more exegetical details about Kant’s notion of real predicates, see Bennett (1974) and Wiggins (1995) as classics, and Kannisto (2018) as recent work.)

From the contemporary viewpoint, one crucial feature of Kant’s view on existence is that it takes existence not as a first-order property (a property of individual objects) but as a second-order property (a property of properties of individual objects). Frege is one of the most prominent proponents of this view. To see his point, let us first examine his view on numbers. According to Frege, a statement about how many things there are is not about individual objects, but about a property (concept, in his terminology) of individual objects. For example,

If I say “Venus has 0 moons”, there is simply no moon nor agglomeration of moons for anything to be asserted of; but what happens is that a property is assigned to the concept “moon of Venus”, namely that of including nothing under it. (Frege, 1884, p. 59, our translation)

Furthermore, Frege claims that existence is essentially a matter of number. He says “[a]ffirmation of existence is in fact nothing but denial of the number nought” (Frege, 1884, p. 65, English translation, p. 65), that is, existence is the second-order property of being instantiated by at least one individual object. Or, more properly, an existential statement does not attribute a first-order property to individual objects; rather, it attributes a second-order property of being instantiated by at least one individual object to a first-order property—in this sense, the apparent first-order property of existence is analyzed away. Thus, (4a) and (5a) are paraphrased as (or, their logical forms are) (4b) and (5b), respectively:

     (4) a. Dogs exist.
b. The property of being a dog is instantiated by at least one individual object.
     (5) a. Unicorns do not exist.
b. The property of being a unicorn is not instantiated by any individual object.

This way of understanding existence shows how existence is related to quantification. It is helpful for understanding the notion of quantification to compare it with the notion of reference. Reference is a way to talk about a particular object as having a property. For example, ‘Gottlob Frege’ is an expression to refer to a particular man, that is, Gottlob Frege, and by using a singular statement ‘Gottlob Frege is a mathematician’, we can talk about him as having the property of being a mathematician. Quantification is not a way to talk about a particular object, but a way to talk about quantities, that is, it is about how many things in a given domain have a property. Quantifiers are expressions for quantification. For example, ‘everything’ is a quantifier, and a statement ‘everything is fine’ says that all of the things in a given domain have the property of being fine. ‘Everything’ is a universal quantifier, since by using it we state that a property is universally instantiated by all things in the domain. ‘Something’ is also a quantifier, but it is a particular quantifier: By using it we only state that a property is instantiated by at least one particular thing in the domain, without specifying by which one(s) the property is instantiated.

By using the particular quantifier ∃, (4b) is restated as (6a), which is read as (6b), and (5b) as (7a), which is read as (7b).

     (6) a. ∃x dog(x)
b. Something is a dog.
     (7) a. ¬∃x(unicorn(x))
b. Nothing is a unicorn.
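As a toy illustration of this second-order analysis, one might model existence as a function on predicates that is true just when the predicate is instantiated by at least one object in the domain. The domain and predicates below are invented for illustration, not drawn from Frege:

```python
# Toy domain and predicates (illustrative assumptions, not from the article).
domain = ["Fido", "Rex", "Socrates"]

def is_dog(x):
    return x in ("Fido", "Rex")

def is_unicorn(x):
    return False  # nothing in the domain is a unicorn

def exists(predicate):
    # Second-order 'existence': the property of being instantiated by at
    # least one object in the domain (Frege's "denial of the number nought").
    return any(predicate(x) for x in domain)

print(exists(is_dog))      # True: (4a)/(6a), dogs exist
print(exists(is_unicorn))  # False: (5a)/(7a), unicorns do not exist
```

Note that `exists` is never applied to an individual object, only to predicates, which is exactly the sense in which the first-order property of existence is analyzed away.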

Since existential statements are properly paraphrased by using particular quantifiers as illustrated above, Frege holds that the particular quantifier is an existential quantifier. This connection is also endorsed in his ‘Dialog with Puenjer on Existence’ (1979), where he claims:

Every particular judgement is an existential judgement that can be converted into the ‘there is’ [‘es gibt’] form. E. G. ‘Some bodies are light’ is the same as ‘There are light bodies’ (Frege, 1979, p. 63.)

Frege thus endorses the view that the particular quantifier is existentially-loaded. (Even though this is a standard interpretation of Frege on quantifier and existence, it is an exegetical issue how much metaphysical significance we should find in Frege’s comments on existence. Priest claims that Frege’s usage of ‘exist’ is its idiomatic use in mathematics and thus “it is wrong to read heavy-duty metaphysics into this” (Priest, 2005/2016, p. 331).)

The view that existence is properly expressed by quantification is hard-wired in Quine’s (1948) criterion of ontological commitment, one of the most influential theories of metaontology in the 20th century (a brief explanation of the technical notions appearing in this paragraph is found in the Appendix). According to his criterion, the ontological commitment of a theory is revealed by what the theory quantifies over: more precisely, a theory is committed to the existence of objects if and only if they must be values of variables bound by quantifiers appearing in the theory in order for the theory to be true (for Quine, a theory is a set of sentences of first-order predicate logic). For example, if a biological theory contains the sentence ‘∃x PopulationWithGeneticDiversity(x)’ (that is, ‘there are populations with genetic diversity’), the theory is committed to the existence of such populations. Quine’s criterion is popularized as the following slogan:

(8) To be is to be the value of a bound variable.

‘To be’ here is understood as ‘to exist’, given that this is a criterion of ontological, that is, existential commitment. In this way, Quine’s criterion of ontological commitment inseparably ties existence and quantification. To sum up, the orthodox view holds FQ1 and FQ2:

FQ1 Existence is not a first-order property of individual objects, but rather a second-order property.

FQ2 Quantifiers are existentially loaded.

In other words, the apparent first-order property of existence is analyzed away in terms of the second-order property of being instantiated by at least one object, and this second-order property is expressed by the particular quantifier.
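Quine’s slogan can also be given a toy illustration: take a one-sentence ‘theory’ over a small domain and ask which values of the bound variable are needed to make it true. The domain and predicate names below are illustrative assumptions:

```python
# Invented mini 'theory' and domain to illustrate Quine's criterion.
domain = ["population_1", "population_2", "electron_e"]

def population_with_genetic_diversity(x):
    return x.startswith("population")

# The theory contains one sentence, ∃x PopulationWithGeneticDiversity(x);
# it is true iff some value of the bound variable satisfies the predicate.
theory_is_true = any(population_with_genetic_diversity(x) for x in domain)

# The values of the bound variable needed for truth are what the theory
# is ontologically committed to.
witnesses = [x for x in domain if population_with_genetic_diversity(x)]

print(theory_is_true)  # True
print(witnesses)       # ['population_1', 'population_2']
```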

2. Existence as a First-Order Property and Its Relation to Quantification

So far, our discussion has been about the orthodox view on the nature of existence. In this section, we review two unorthodoxies. First of all, it became popular in the early twenty-first century to deny FQ1. According to such a view, existence is a first-order property of individual objects. The proponents of this view can be further divided into two main camps. The first camp is what we call universalism, which holds that the first-order property of existence is universal in the sense that every object has it. Maintaining FQ2, the advocates of this camp usually use the unrestricted existential quantifier to define the first-order property of existence so that everything in the domain of quantification exists. The second camp is called Meinongianism, which rejects not only FQ1 but also FQ2. According to Meinongianism, existence is a non-universal first-order property in the sense that some objects lack it, and the domain of quantification contains such nonexistent objects in addition to existent ones.

a. Meinongianism

The main claim of Meinongianism is that some objects exist, but some don’t. Contemporary Meinongians cash out this claim by detaching existence from quantifiers. The domain of (at least unrestricted) quantification contains not only existent objects but also nonexistent ones. Thus, with all due respect to Quine, to be existent is not to be the value of a variable. This claim is usually accompanied by another unorthodoxy, that is, the view that existence is a first-order property of individual objects. Moreover, Meinongianism holds a specific version of this claim. Let us call a property instantiated by all objects a universal property, and one instantiated by only some objects a non-universal property. Then, Meinongians hold that existence is a first-order non-universal property.

It is not easy to characterize what existence as a first-order property is (we will address this question soon). However, whatever it is, we have some intuitive ideas on what exists. Merely possible objects like flying pigs or talking donkeys do not exist; impossible objects like the round square or the perpetual motion machine do not exist; fictional characters like Doyle’s Sherlock Holmes or Murakami’s Sheep Man do not exist; mythical objects like Zeus or Pegasus do not exist; and so on. There can be some disagreement on exactly what objects should be counted as nonexistent, but such disagreement does not undermine the fact that we (at least some of us) have the intuition that some objects do not exist. Meinongians take our intuition at face value: these objects lack the property of existence. But, Meinongians continue, this doesn’t prevent us from talking about or referring to them nor from quantifying over them. We can, as the sentences in this paragraph clearly illustrate.

Meinongians take existence to be just one of many properties of individual objects: an object may have it or may lack it. The nonexistence of an object does not deprive it of its status as a genuine object, its objecthood. As a genuine object, a nonexistent object can have various properties, like being possible or being a flying pig. Some objects even have both the properties of being round and being square, and are thus impossible objects.
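A minimal sketch of this picture, with an invented domain and invented properties: the particular quantifier ranges over every object in the domain, existent or not, while existence is a first-order property that some objects lack.

```python
# Invented objects and properties; 'exists' is treated as just another
# first-order property, which some objects in the domain lack.
domain = {
    "Empire State Building": {"exists": True, "building": True},
    "Pegasus": {"exists": False, "winged": True},
    "the round square": {"exists": False, "round": True, "square": True},
}

def some(pred):
    # Existentially UNLOADED particular quantifier: it ranges over
    # everything in the domain, existent or not.
    return any(pred(props) for props in domain.values())

print(some(lambda o: o.get("winged", False)))     # True: something is winged
print(some(lambda o: not o["exists"]))            # True: some objects do not exist
print(all(o["exists"] for o in domain.values()))  # False: existence is non-universal
```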

Quine says that such an “overpopulated universe is in many ways unlovely” (Quine, 1948, p. 4), and many other theorists seem to agree with him. Putting aside such aesthetic evaluations, there are two main objections to Meinongianism that have had great influence in establishing the standard view in contemporary philosophy that Meinongianism is wrong. One is due to Russell (1905), according to which a theory that admits even inconsistent objects as genuine objects entails contradictions: the non-square square is square and not square, and this is a contradiction. The other objection is due to Quine (1948), who argues that there is no identity condition for nonexistent objects and that we therefore should not admit any such objects as genuine. However, contemporary Meinongians have provided several different replies to these objections, and these lead to different versions of Meinongianism: nuclear Meinongianism (Parsons, 1980; Routley, 1980; Jacquette, 2015), dual-copula Meinongianism (Zalta, 1988), and modal Meinongianism (Priest, 2005/2016; Berto, 2013). Since this is not the place to survey contemporary Meinongian theories in detail, we simply point out that there are consistent Meinongian theories that provide well-defined identity conditions for existent and nonexistent objects. For a comprehensive and useful survey of this topic, see Berto (2013).

Then, what is existence for Meinongians? Some Meinongians (in particular Parsons) simply rely on our intuitive notion of existence according to which some objects do not exist (Parsons, 1980, p. 10). In so doing, they do not try to define existence (Parsons, 1980, p. 11). On the other hand, some Meinongians have proposed several different definitions of existence.

To begin with, according to Lambert, “Meinong held that existent objects are objects having location in space-time” (Lambert, 1983, p. 13). Partly echoing Meinong, Zalta says “[b]y ‘exists,’ we mean ‘has a location in space’” (Zalta, 1988, p. 21). Priest has a different definition: drawing on Plato’s Sophist, he claims that “to exist is to have the potential to interact causally” (Priest, 2005/2016, p. xxviii). Given these definitions, they typically treat abstract objects like numbers or propositions as nonexistent: they have neither spatial location nor causal power. Routley (1980) proposes two alternative definitions, but their formulations heavily depend on the details of his theory of nonexistent objects, and thus we do not discuss them here (see also Rapaport, 1984; Paoletti, 2013).

At this point, one may wonder whether Meinongians equate existence with concreteness. This is not the case. At least, Meinongians need not commit themselves to the equation. First, Parsons explicitly refuses to “define ‘exist’ to mean something like ‘has spatio-temporal location’” (Parsons, 1980, p. 10). Moreover, he claims that his distinction between existence and nonexistence is a distinction among concrete objects: some concrete objects exist, but some do not (cf. ibid., p. 10). Second, Priest points out that existence and concreteness behave differently in modal contexts. For example, according to Priest (2005/2016), (9a) is true from the viewpoint of Meinongianism, but (9b) is false. This is because either (i) Holmes is concrete, or (ii) if it is not concrete but abstract, it could never have been concrete, since being abstract is a necessary property.

     (9) a. Holmes doesn’t exist, but could have existed.
b. Holmes is not a concrete object, but could have been a concrete object.

Note that Linsky and Zalta consider a third option which could undermine this argument: an object may be neither concrete nor abstract, but could have been concrete (see Linsky and Zalta, 1994; 1996).
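Priest’s contrast between (9a) and (9b) can be sketched in a toy two-world model. The worlds and their assignments below are illustrative assumptions, chosen so that the model reproduces the verdicts reported above (with Holmes concrete at the actual world, as under option (i)):

```python
# Invented two-world model: 'exists' varies across worlds, while the
# assignment makes (9a) come out true and (9b) come out false.
worlds = {
    "actual": {"Holmes": {"exists": False, "concrete": True}},
    "w1": {"Holmes": {"exists": True, "concrete": True}},
}

def possibly(name, prop):
    # 'Possibly P' is modeled as: P holds at some world.
    return any(w[name][prop] for w in worlds.values())

holmes_actual = worlds["actual"]["Holmes"]
nine_a = (not holmes_actual["exists"]) and possibly("Holmes", "exists")
nine_b = (not holmes_actual["concrete"]) and possibly("Holmes", "concrete")
print(nine_a)  # True: Holmes does not exist but could have existed
print(nine_b)  # False: on this assignment Holmes is already concrete
```

The point of the sketch is only that the two predicates can diverge in modal contexts, which is what blocks the equation of existence with concreteness.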

b. Universalism

Even though some have an intuition that some objects do not exist (and consistent theories of nonexistent objects are available), many contemporary philosophers believe that everything exists. We call this view universalism. The main tenet of contemporary universalism is that existence is a first-order universal property which every object has. Thus it rejects FQ1. In what follows, we see how universalists define existence and confirm that they still hold FQ2.

To begin with, let us consider Frege. Answering the question of what ‘exist’ does in explicitly existential statements like ‘some men exist’, Frege claims that it does nothing, in the sense that the word ‘exist’ is a predicate that any object universally satisfies. He tries to make this point clear by comparing ‘exist’ with ‘identical with itself’. Assuming that ‘A exists’ means the same as ‘A is identical with itself’ for any A, he claims:

the judgements ‘This table exists’ and ‘This table is identical with itself’ are completely self-evident, and that consequently in these judgements no real content is being predicated of this table. (Frege, 1979, pp. 62-63)

Note that, from this, Frege concludes that the word ‘exist’ does not properly express the property of existence. Indeed, he claims that this is an example that shows how we are easily deceived by natural language (Frege, 1979, p. 67). The true notion of existence is not expressed by the predicate ‘exist’. Rather, as we have seen, according to him, it is expressed by a particular quantifier.

Some contemporary philosophers accept the first half of Frege’s claim and reject its second half (cf. Evans, 1982; Kripke, 2013; Plantinga, 1976; Salmon, 1987). For them, it is quite legitimate to use the first-order predicate ‘exist’ as a predicate that is universally satisfied. Moreover, some philosophers claim not only that ‘exist’ is a first-order predicate universally satisfied but also that it expresses the property of existence, a first-order universal property. Salmon says that “the [first-order] property or concept of being identical with something… is the sense or content of the predicate ‘exists’” (Salmon, 1987, p. 64). Evans is less straightforward. According to him, the reference of the predicate ‘exist’ is “a first-level concept, true of everything” (Evans, 1982, p. 345), where a first-level concept is understood as a function from individual objects to truth values, and its sense is shown by the formula ‘∀x (x satisfies ‘exist’)’. Finally, Plantinga says:

Among the properties essential to all objects is existence. Some philosophers have argued that existence is not a property; these arguments, however, even when they are coherent, seem to show at most that existence is a special kind of property. And indeed it is special; like self-identity, existence is essential to each object, and necessarily so. For clearly enough, every object has existence in each world in which it exists. (Plantinga, 1976, p. 148)

In short, he makes the following two points: (i) existence is a first-order property; (ii) it is necessarily the case that everything exists. The claim (ii) should not be confused with the claim that everything exists in every possible world. Thus, the view is compatible with the fact that whether an object exists or not is, in many cases, a contingent matter: The Empire State Building exists, but may not; the 55th state of USA does not exist, but it may; and so on.

Kripke makes the same points with a definition of existence, while carefully distinguishing existence from self-identity (Kripke, 2013, especially pp. 36-38).

He suggests defining x’s existence as ∃y (y = x) (both x and y are individual variables), and claims that every object satisfies it. Two comments should be made. First, it is clear that this definition is based on equating the extension of existence with the domain of quantification, and thus on the endorsement of FQ2. Second, from this definition, it follows that “‘for every x, x exists’ will be a theorem of quantification theory” (Kripke, 2013, p. 37). Thus, it is necessarily the case that everything exists: □∀x∃y(y = x) holds. However, Kripke emphasizes that this does not entail that everything necessarily exists. Indeed, ∀x□∃y(y = x) does not hold, while everything is necessarily self-identical, that is, ∀x□(x = x) holds. Existence and self-identity should not be equated with each other.
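Kripke’s definition can be given a toy model in which the extension of ‘exists’ simply coincides with the domain of quantification. The domain below is an invented example:

```python
# Invented domain; Kripke's definiens for 'x exists' is ∃y (y = x),
# with the quantifier ranging over the domain.
domain = ["a", "b", "c"]

def exists(x):
    return any(y == x for y in domain)  # ∃y (y = x)

# Every member of the domain trivially satisfies the definiens,
# so 'for every x, x exists' holds in the model.
print(all(exists(x) for x in domain))  # True
print(exists("d"))                     # False: 'd' is outside the domain
```

The sketch makes FQ2 vivid: to have the property so defined just is to be a value of the bound variable, so nothing in the domain can fail to exist.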

Finally, let us review two main arguments for universalism. The first appeals to the paradox of negative singular existentials (cf. Cartwright, 1960). Berto (2013, p. 6) summarizes this argument as follows:

(P1) To deny the existence of something, one refers to that thing;

(P2) If one refers to something, then that thing has to exist;

(C) To deny the existence of something, that thing has to exist.

Since (C) means that denying the existence of something is self-refuting, universalists claim, we cannot deny the existence of any object.

This argument has had a huge influence on debates in contemporary metaphysics and convinces many contemporary metaphysicians, with the exception of Meinongians. Meinongians avoid the conclusion (C) by rejecting (P2). Rejecting FQ2, Meinongianism holds that the domain of quantification contains nonexistent objects, and we can refer to such nonexistent objects by using referential expressions like proper names.

Another major argument for universalism is proposed by Lewis (1990). According to Lewis, Meinongians have two different particular quantifiers: an existentially-unloaded one and an existentially-loaded one. The universalist, on the other hand, has only one particular quantifier, which is existentially-loaded. Lewis claims that, contrary to what Meinongians (in particular Routley) take for granted, the Meinongian existentially-unloaded quantifier is translated as the universalist existentially-loaded quantifier: objecthood for Meinongians is existence for universalists. Moreover, under this translation, Lewis suspects that the Meinongian distinction between what exists and what does not is just a distinction between what is concrete and what is not.

However, as we have seen, a Meinongian need not equate existence with concreteness. Moreover, the Meinongian can adopt a notion of objecthood which is different from the universalist notion of existence. As we have seen, Kripke defines x’s existence as ∃y(x = y). Priest, on the other hand, takes objecthood to be logically equivalent to self-identity: x is an object iff x = x (Priest, 2014a, p. 437). As Kripke points out, these two notions behave differently. If we define x’s existence as x’s self-identity, it follows that everything necessarily exists; since this contradicts the fact that many objects exist only contingently, Kripke rejects this definition of existence (Kripke, 2013, p. 38). If, on the other hand, we logically equate x’s objecthood with x’s self-identity, it follows that everything is necessarily an object. From the Meinongian point of view, this consequence is compatible with the fact that many objects exist only contingently, since quantification ranges not only over existent but also over non-existent objects.

So far we have seen two contemporary alternatives to the orthodox views. Both Meinongianism and universalism hold that existence is a first-order property of individual objects. The main difference between them concerns whether existence is a universal property that every object has. According to Meinongianism, it is not: some objects lack the property of existence. Universalism, on the other hand, holds that existence is a universal property. At this point, one may wonder why we cannot synthesize these two theories by holding that there are two different kinds of existence, one universal and one not. While we leave this as an open question, in the next section we introduce the basic ontological framework to which this line of thought straightforwardly leads us: ontological pluralism.

3. How Many Ways of Being Existent?

The world is packed with entities. There are tables, chairs, monuments and dreams. There are holes, cracks and shadows. There is the Eiffel Tower in Paris, Leonardo’s Mona Lisa at the Louvre and the empty set as well. Needless to say, all these entities are very different from each other. At the end of the day, we can climb the Eiffel Tower and add a mustache to the Mona Lisa; however, neither of these activities is possible with the empty set. Facing such an abundant variety of entities, some philosophers think that, even though all these entities exist, they exist in different ways. The philosophical view according to which there are different ways of existence is known as ontological pluralism.

As Turner (2010) and McDaniel (2009; 2017) have discussed, several historical figures have been interpreted as being committed to ontological pluralism: from Aristotle (1984a; 1984b) to Saint Thomas (1993; 1961), from Meinong (1983; 1960) to Moore (1983; 1904), from Russell (1988) to Husserl (2001) and Heidegger (1962). Having said that, ontological pluralism does not simply represent an important idea in the history of philosophy. Far from being an archaeological piece in the museum of ideas, in the early twenty-first century ontological pluralism has undergone an important revival in analytic philosophy through the works of McDaniel (2009; 2010; 2017) and Turner (2010; 2012). As Spencer points out, such a revival consists in a “defence” and “explication of the [historical] views” (Spencer 2012, p. 910).

If we look back at the history of philosophy, it is possible to find at least two motivations in support of ontological pluralism. The first one is theological. Famously, God has some features that no other entity seems to have: for instance, He is eternal, infinite and omniscient. Indeed, some theologians believe that God is so different in kind from His creatures that it is impossible for any given feature to be truly ascribed to both. Unfortunately, this claim seems patently false: there is at least one feature that they must share, namely existence. At this point, philosophers and theologians have tried to overcome this conundrum by endorsing ontological pluralism and admitting that God’s existence is different from the existence His creatures enjoy (compare McDaniel 2010, p. 693).

The second motivation is phenomenological. Phenomenologists are famous for claiming that all sorts of entities are given to us. In our everyday life, we experience tables, chairs, people and even logical concepts such as existential quantifiers and negation. Following the interpretation favoured by McDaniel (2009; 2010), Heidegger believes that, among all these entities, different ways of existence are given to us as well. For instance, we experience a first way of existence proper to pieces of equipment (that is, readiness-to-hand), a second way of existence proper to abstract entities (that is, subsistence) and a third way of existence proper to entities that are primarily characterized by spatio-temporal features (that is, presence-at-hand). If so, ontological pluralism might have a phenomenological ground (compare McDaniel 2010, p. 694).

More recently, analytic philosophers have added a third motivation in support of ontological pluralism. Consider a material object and the space-time region in which that material object is located. There is a sense in which both of these things exist. However, a material object exists at a certain region of space-time and, therefore, its existence is relative to that region. This is not the case for space-time regions: their existence is not relative to another space-time region. Their existence is relative to nothing at all.

All this is supposed to show that, as suggested by McDaniel (2010) and summarized by Spencer (2012, p. 916), existence is systematically variably polyadic. On the one hand, existence can be either relative to something (see material objects) or relative to nothing (see space-time regions). This is what makes existence variably polyadic. On the other hand, there are many clusters of entities which systematically share the same kind of existence. For instance, material objects always exist at space-time regions, and space-time regions simply exist. This is what makes existence systematic. Now, according to some analytic philosophers, the fact that existence is systematically variably polyadic should nudge us to believe that material objects have one mode of existence (let’s say existence-at) and space-time regions have another mode of existence (let’s say simply-existence). Ontological pluralism is thus needed.

Until now, we have reviewed what ontological pluralism is and what its main motivations are. What about the different ways in which ontological pluralism has been articulated? Needless to say, in the history of philosophy we can find many different kinds of ontological pluralism. In this section, however, we focus our attention on the one endorsed by McDaniel, because it represents an original way of combining traditions and ideas that are not commonly merged: on the one hand, McDaniel appeals to some ideas rooted in continental philosophy; on the other hand, he employs some of the formal tools proper to contemporary logic.

McDaniel abandons the familiar landscape of analytic philosophy by arguing that, according to Heidegger, ‘existence’ is an analogical expression. (For an historical review of Being as an analogical expression which goes beyond Heidegger’s philosophy, see McDaniel, 2009, footnote 13.) In other words, “[‘existence’] has a generic sense, which, roughly, applies to objects of different sorts in virtue of these objects exemplifying very different features” (McDaniel 2009, p. 295). Following McDaniel’s favourite interpretation, Heidegger would certainly agree with the idea that there is a general concept of existence: exactly in virtue of its generality, this concept covers all entities whatsoever. However, Heidegger would also argue that some of these entities, in virtue of the features they exemplify, exist in a different way than others. As such, “there is multiplicity of modes of being [that is, existence]” (McDaniel 2009, p. 296). For instance, a hammer, a stone and a number all exist. However, a hammer is ready-to-hand, a number subsists and a stone is present-at-hand, as explained above.

Having said that, McDaniel tries to formulate this idea in the most precise way possible and, in so doing, he appeals to some of the resources offered by formal logic. According to McDaniel, the general sense of existence can be spelled out through the unrestricted existential quantifier: For any entity, x, that exists, we can truly say that ∃y(y = x) (compare McDaniel 2009, p. 301). Furthermore, McDaniel believes that the various modes of existence can be represented by restricted quantifiers, that is, quantifiers ranging over some proper subsets of the domain of the unrestricted one (McDaniel 2009, p. 302). This means that, according to what we have said until now, in order to properly articulate Heidegger’s phenomenology, we should employ at least three kinds of restricted quantifiers: (1) a ready-to-hand quantifier (∃ready-to-hand) which ranges only over pieces of equipment; (2) a present-at-hand quantifier (∃present-at-hand) which ranges only over entities that are uniquely characterized by spatio-temporal features and (3) a subsistential quantifier (∃subsistence) which ranges only over abstract entities.
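The structure of this proposal, generic existence as the unrestricted quantifier and each mode of existence as a quantifier over a proper subset of the domain, can be sketched in a few lines of Python. The entities and subset names below are hypothetical stand-ins for McDaniel’s three Heideggerian modes:

```python
# A minimal model of the quantificational proposal (illustrative only):
# the generic sense of existence is the unrestricted quantifier over the
# whole domain; each mode of existence is a quantifier restricted to a
# proper subset of that domain.
domain = {"hammer", "stone", "number_7"}
modes = {
    "ready_to_hand": {"hammer"},     # pieces of equipment
    "present_at_hand": {"stone"},    # spatio-temporal entities
    "subsistence": {"number_7"},     # abstract entities
}

def exists_unrestricted(pred):
    # generic existence: ranges over everything whatsoever
    return any(pred(x) for x in domain)

def exists_in_mode(mode, pred):
    # a restricted quantifier: ranges only over the mode's subset
    return any(pred(x) for x in modes[mode])

# The number 7 exists in the generic sense...
print(exists_unrestricted(lambda x: x == "number_7"))             # True
# ...and subsists, but it is not ready-to-hand.
print(exists_in_mode("subsistence", lambda x: x == "number_7"))   # True
print(exists_in_mode("ready_to_hand", lambda x: x == "number_7")) # False
```

The point of the model is simply that every true restricted claim entails the corresponding unrestricted one, while the converse fails: this mirrors the relation between a mode of existence and existence in the generic sense.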

Before continuing, it might be interesting to notice that McDaniel’s approach to ontological pluralism seems to be faithful to its Heideggerian roots at least in the following sense. Coherently with what Heidegger labels ontological difference, McDaniel’s ontological pluralism does not treat existence as an entity. Existence is neither a constant symbol (compare McDaniel 2009, p. 301) nor any special kind of property (compare McDaniel 2009, p. 301); otherwise, existence would appear to be, in the eyes of a philosopher engaging with first-order logic, something over which we can quantify and, therefore, an entity as well. Existence, following McDaniel, is captured by the quantifiers themselves and, for this reason, it cannot be taken to be an entity of any sort.

Regardless of the prestigious historical heritage that grounds ontological pluralism, in the contemporary analytic debate this theory has not always been welcome. Even though many arguments have been levelled against it (compare McManus 2009; Turner 2010), the one proposed by Peter van Inwagen (1998) has resonated particularly loudly. To begin with, van Inwagen underscores the deep connection between quantifying over entities and counting entities. After all, when we say that, in the drawer of my desk, there are a pen, an eraser and a ruler, we are simply saying that there are three entities in the drawer of my desk. Now, in light of this connection, it seems fair to say that, if we believe that there are different ways of existing and that these different ways are captured by different quantifiers, there should be different ways of counting too. Moreover, if there are different ways of counting, there should be different kinds of numbers as well. However, with all due respect to the ontological pluralists, this seems evidently false. When we refer to three stones, three hammers and three geometrical figures, we do not use different numerical systems. In all these cases, the number three seems to be pretty much the same. Therefore, facing such an absurd conclusion, van Inwagen declares that there cannot be more than one way of existing.

Of course, the reply was not long in coming. First of all, while McDaniel (2009) agrees with van Inwagen that there is only one way of counting, which is represented by the unrestricted quantifier, he denies the validity of the inference from the fact that there are many ways of existing to the fact that there are many ways of counting. Secondly, Turner (2010) argues that, from the fact that there are different ways of counting, it does not follow that there are different numbers. To argue this, he distinguishes between numbering relations (the relation between, for instance, two pens and the number two) and the numbers themselves. Against van Inwagen, Turner holds that there might be many different numbering relations and only one kind of number.

A final remark: contemporary advocates of ontological pluralism presuppose the Quinean interpretation of quantification and hold that being, with its modes, is to be equated with existence, with its modes. Someone might wonder whether this is an essential feature of ontological pluralism. It is not. For example, Meinong is an ontological pluralist to the extent that he recognizes at least two different ways of being, namely existence and subsistence, but he never equated being with existence.

4. Why Is There Something Rather than Nothing?

This article has been concerned with the notion of existence as such. It is natural to ask how different views on existence can influence ontological questions. In this section we examine how the views presented in the previous sections have been used to address a long-standing worry in the history of ontology: Why is there something rather than nothing? In particular, we discuss a Quinean strategy (that is, the strategy presented by van Inwagen), a strategy which employs ontological pluralism (that is, the strategy presented by McDaniel) and a Meinongian strategy (that is, the strategy presented by Priest).

Let’s begin with Peter van Inwagen, the champion of the so-called statistical argument (1996). Consider the actual world. This is the world we inhabit: in the actual world, the Eiffel Tower is in Paris, Leonardo painted the Mona Lisa and Duchamp added a moustache and a goatee to it. In the actual world, there is St. Andrews University, I miss the amazing lasagne of my grandmother and a terrible war was declared after the 11th of September 2001. This is the world we live in.

Of course, just by using our imagination, we can fantasize about other worlds that are not ours, even though they could have been. We can imagine a first possible world in which Duchamp painted the Mona Lisa and Leonardo added a moustache and a goatee to it. We can imagine a second possible world in which St. Andrews University does not exist, a third possible world in which I hate the horrible lasagne of my grandmother and a fourth possible world in which there was no war after the 11th of September 2001. In front of this uncontrolled proliferation of worlds, someone might wonder how many possible worlds we can actually conceive. Given the boundless power of our imagination, van Inwagen replies that there are infinitely many possible worlds. Among them, one and only one is an empty world, that is, a world with no entities whatsoever.

At this point, it is important to recall that, since van Inwagen is a faithful Quinean, he understands existence in quantificational terms. This is the reason why, according to him, the theoretical framework introduced above can help us to understand why there is something (that is, there is at least one entity in the actual world) rather than nothing (that is, there are no entities in the actual world). Now, think about a lottery in which there are infinitely many tickets: only one of them is the lucky winner. In this case, the chance that a given ticket is the lucky winner is 0. By analogy, think about the infinite number of possible worlds described above: only one of them is the empty one. In this case, the chance that the actual world is the empty one is 0. Thus, the reason why there is something rather than nothing is that the empty world is, if not impossible, as improbable as anything can be.

Needless to say, many philosophers did not miss the opportunity to challenge this argument. Some of them have discussed the explanatory power of van Inwagen’s argument (see Sober 1983). Some others have debated van Inwagen’s assumption about the uniqueness of the empty world (see Grant 1981; Heil 2013; Carroll 1994). However, a very different approach has been proposed by Kris McDaniel (2013). He provocatively asks: What if it is not really philosophically important to establish why there is something rather than nothing? What if this riddle is just a silly question we might want to forget and move on to more interesting topics? Perhaps, McDaniel suggests, there are better questions to be asked. Let’s see why.

McDaniel is very serious about absences. This might be taken to be unsurprising since we all engage with them in a fairly liberal way. According to McDaniel, all of us are able to count absences, be gripped by grief because of them and ruminate on them. If we have a philosophical training, we might be able to classify them into different kinds as well. For instance, a shadow is one kind of absence (that is, the absence of light) while a hole is another (that is, the absence of matter). In light of all these observations, McDaniel concludes that (a) an absence is something and (b) this something exists. He writes: “the absence of Fs exists if and only if there are no Fs” (McDaniel, 2013, p. 277).

Now, given what we wrote above, worrying about why there is something rather than nothing turns into a trivial matter. Consistent with the intuition defended by Baldwin (1996) and Priest (2014b), McDaniel argues that, when we use ‘nothing’ as a term, it is natural to think that it refers to a global absence: the absence of everything. If so, this absence is something and, consequently, it exists. This means that, even if there were nothing, the absence of everything would exist and, therefore, there would be something.

At this point, two remarks are necessary. First of all, McDaniel does not want to be committed to the line of thought presented above. At best, he cautiously claims that “it is not clearly incorrect” (McDaniel, 2013, p. 278). However, McDaniel is convinced that such a line of thought represents a warning light: in seeking the reason why there is something rather than nothing, we might easily discover that this worry is a pretty superficial one. According to McDaniel, we might avoid this danger by looking at the problem as an ontological pluralist and by moving our attention from the general meaning of existence (that is, the unrestricted quantifier) to some specific modes of existence (that is, the restricted quantifier). In other words, McDaniel suggests that it would be safer and more profitable to engage with what he labels a Narrow Question: why are there ‘concrete material things’ rather than no ‘concrete material things’?

The second remark concerns Meinongianism. In a passing remark (2014b, p. 56), Priest runs an argument similar to the one presented by McDaniel. Like McDaniel, Priest takes ‘nothing’ to be a term which refers to a global absence. Furthermore, like McDaniel, he argues that there is something rather than nothing because, even when there is nothing, there is something, namely the absence of everything. It is interesting to notice that the difference between the two positions lies entirely in their understanding of existence. According to McDaniel, existence is always spelled out in quantificational terms. As such, since there is something like the absence of everything, this something exists. Priest also believes that there is something like the absence of everything; however, contrary to McDaniel and given his Meinongian credo, he can still hold that it does not exist. From this point of view, given Priest’s Meinongian stance, wondering why there is something rather than nothing is not necessarily equivalent to wondering why something exists rather than nothing. This all shows how different accounts of existence can generate different ways of understanding and answering one of the oldest philosophical questions: why is there something rather than nothing?

5. Conclusion

Here we are at the end of our survey of existence. Even though we have covered a lot of material, it is fair to say that there are even more topics that, unfortunately, we do not have space to address. For instance, in this article we did not review the current philosophical debate devoted to understanding what kinds of things exist. A prominent example of this philosophical enterprise is the dispute between nominalism and realism about properties. Furthermore, much more could have been said about the philosophical attempt to spell out which things, of a given kind, exist: we covered neither the debate on whether properties are abundant or sparse nor the dispute about whether mathematical objects or fictional characters exist. Finally, in suggesting that there are many different ways in which ontological pluralism has been understood, we could not do more than present a long and nonetheless incomplete list of names.

Having said that, we hope that this article helps readers to navigate the extremely rich secondary literature about existence. We have tried to give a fairly complete overview of the most important ways of understanding such a complicated topic while, at the same time, trying to underscore the importance of the more unorthodox and under-considered philosophical accounts of existence. Given this abundance of intuitions, ideas, philosophical traditions, and ontological accounts, the hope is that the present work can represent, on the one hand, a helpful map to orient ourselves in this vast debate and, on the other hand, a nudge to explore new directions of research.

6. Appendix

In the formal language of logic, a quantifier like ∃ or ∀ is prefixed to a sentence together with a variable to form a new sentence. For example, dog(x) is a sentence, and prefixing ∃ together with a variable x to it results in a new quantificational sentence, that is, ∃xdog(x). A quantifier prefixed to a sentence together with a variable x binds every occurrence of x in the sentence in so far as it is not bound by another quantifier. So the variable x in dog(x) in ∃xdog(x) is bound by ∃ appearing at its beginning. Now, consider a formula that contains an unbound—free—variable x:

(10) dog(x) (‘x is a dog’)

In (10), x is, so to speak, a blank which we can fill with different values. Note that it is nonsense to ask whether (10) is intrinsically true or not, just as ‘is x plus 4 equal to 6?’ is an unanswerable question. Sentences like (10) are truth-evaluable only relative to a value that fills the blank—a value that is assigned to x: If we fill the blank with a particular dog, say, Fido, the result is true; if we do so with a particular man, say, Frege, the result is false. The truth conditions of quantificational sentences like ∃xdog(x) or ∀xdog(x) are defined by using values of variables bound by quantifiers:

(11) ∃xdog(x) is true iff for some value d, dog(x) is true relative to d assigned to x
(12) ∀xdog(x) is true iff for any value d, dog(x) is true relative to d assigned to x

The domain of quantification is the set of things that can be values of variables of quantification.
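These clauses can be mimicked directly in code. Below is a minimal sketch with a hypothetical three-element domain, in which the quantificational sentences of (11) and (12) come out as `any` and `all` over the values assignable to x:

```python
# A sketch of truth conditions (11) and (12): an open formula like dog(x)
# is truth-evaluable only relative to an assignment of a value to x.
domain = {"fido", "rex", "frege"}   # hypothetical domain of quantification
dogs = {"fido", "rex"}

def dog(x):
    # the open formula dog(x), evaluated relative to a value assigned to x
    return x in dogs

# (10) dog(x) relative to particular assignments:
print(dog("fido"))   # True
print(dog("frege"))  # False

# (11) Ex dog(x): true iff dog(x) is true relative to some value assigned to x
print(any(dog(d) for d in domain))  # True
# (12) Ax dog(x): true iff dog(x) is true relative to every value assigned to x
print(all(dog(d) for d in domain))  # False
```

The domain plays exactly the role described above: it is the set of things over which the assigned values, and hence the quantifiers, range.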

7. References and Further Reading

  • Aquinas, T. (1993). Selected Philosophical Writings, Oxford University Press.
  • Aquinas, T. (1961). Commentary on the Metaphysics of Aristotle. Volume I, Henry Regnery Company.
  • Aristotle. (1984a). The Complete Works of Aristotle. Volume I, Oxford University Press.
  • Aristotle. (1984b). The Complete Works of Aristotle. Volume II, Oxford University Press.
  • Baldwin, T. (1996). ‘There might be nothing’, Analysis, 56: 231 ‒ 38.
  • Bennett, J. (1974). Kant’s Dialectic, Cambridge, Cambridge University Press.
  • Berto, F. (2013). Existence as A Real Property: The Ontology of Meinongianism, London: Springer.
  • Berto, F. and Plebani, M. (2015). Ontology and Metaontology: A contemporary Guide, London: Bloomsbury Academic.
  • Carroll, J. W. (1994). Laws of Nature, Cambridge: Cambridge University Press.
  • Cartwright, R. (1960). ‘Negative existentials’, Journal of Philosophy, 57, 629-639.
  • Casati, F., and Fujikawa, N. (2019). ‘Nothingness, Meinongianism and inconsistent mereology’. Synthese, 196(9), 3739-3772.
  • Chisholm, R. (1960). Realism and the Background of Phenomenology, The Free Press.
  • Evans, G. (1982). The Varieties of Reference, Oxford: Clarendon Press.
  • Findlay, J. N., (1963). Meinong’s Theory of Objects and Values, Second, Clarendon Press, Oxford University Press.
  • Fine, K. (1982). ‘The problem of non-existents’, Topoi, 1: 97-140.
  • Frege, G. (1884). Die Grundlagen der Arithmetik: Eine Logisch Mathematische Untersuchung ueber den Begriff der Zahl, Breslau: Verlag von Wilhelm Koebner. (English translation, The Foundation of Arithmetic: A Logico-Mathematical Enquiry into the Concept of Number, Second Revised Edition, translated by J. L. Austin, Oxford: Basil Blackwell, 1968).
  • Frege, G. (1979). Posthumous Writings, Oxford: Basil Blackwell.
  • Grant, E. (1981). Much ado about Nothing. Theories of Space and Vacuum from the Middle Ages to the Scientific Revolution, Cambridge: Cambridge University Press.
  • Heidegger, M. (1962). Being and Time, Harper & Row Publishing.
  • Heil, J. (2013). ‘Contingency’, in The Puzzle of Existence, Tyron Goldschmidt (ed.), New York: Routledge, 167 ‒ 181
  • Husserl, E. (2001). Logical Investigations. Volume II, Routledge Press.
  • Jacquette, D. (2015). Alexius Meinong: The Shepherd of Non-Being, (Synthese Library, vol. 360), Springer.
  • Kannisto, T. (2018). ‘Kant and Frege on existence’, Synthese, 195: 3407-3432.
  • Kant, I. (1781/1787). Kritik der reinen Vernunft. English translation: Guyer, P. and A. W. Wood (trans. and eds.). (1998), Critique of Pure Reason, The Cambridge Edition of the Works of Immanuel Kant, Cambridge: Cambridge University Press.
  • Kripke, S. (2013). Reference and Existence: The John Locke Lectures, Oxford: Oxford University Press.
  • Lambert, K. (1983). Meinong and the Principle of Independence: its Place in Meinong’s Theory of Objects and its Significance in Contemporary Philosophical Logic, Cambridge: Cambridge University Press.
  • Lewis, D. (1990). ‘Noneism or allism?’, Mind, 99, 23-31.
  • Linsky, B., and Zalta, E. (1994). ‘In defense of the simplest quantified modal logic’, Philosophical Perspectives 8: Logic and Language, J. Tomberlin (ed.), Atascadero: Ridgeview, 431-458.
  • Linsky, B., and Zalta, E. (1996). ‘In defense of the contingently concrete’, Philosophical Studies, 84: 283-294.
  • McDaniel, K. (2009). ‘Ways of being’ in Chalmers, Manley and Wasserman (eds.) Metametaphysics, Oxford University Press.
  • McDaniel, K. (2010). ‘A return to the analogy of being’, Philosophy and Phenomenological Research, 81: 688-717.
  • McDaniel, K. (2013). ‘Ontological pluralism, the gradation of being, and the question of why there is something rather than nothing’, in The Puzzle of Existence, Tyron Goldschmidt (ed.), New York: Routledge, 290-320.
  • McDaniel, K. (2017). The Fragmentation of Being. Oxford University Press.
  • Meinong, A. (1960). ‘On the Theory of Objects’ in Chisholm. 1960.
  • Meinong, A. (1983). On Assumptions, University of California Press.
  • Paoletti, M. (2013), ‘Commentary on Exploring Meinong’s Jungle and Beyond: an Investigation of Noneism and the Theory of Items’, Humana. Mente: Journal of Philosophical Studies, 25. 275-292.
  • Parsons, T. (1980). Nonexistent Objects, New Haven: Yale University Press.
  • Plantinga, A. (1976). ‘Actualism and possible worlds’, Theoria, 42, 139-160.
  • Priest, G. (2014a). ‘Sein language’, The Monist, 97(4), 430-442.
  • Priest, G. (2014b). One: Being an Investigation into the Unity of Reality and of its Parts, including the Singular Object which is Nothingness, Oxford: Oxford University Press.
  • Priest, G. (2005/2016). Towards Non-Being: The Logic and Metaphysics of Intentionality, the 2nd Edition, Oxford: Oxford University Press.
  • Rapaport, W. (1984). ‘Review of Exploring Meinong’s Jungle and Beyond’, Philosophy and Phenomenological Research, 44(4), 539-552.
  • Quine, W. V. O. (1948). ‘On what there is’, in Review of Metaphysics, 48, 21-38, reprinted in Quine, W. V. O.(1953). From a Logical Point of View, Cambridge: Harvard University Press, pp. 1-19.
  • Routley, R. (1980). Exploring Meinong’s Jungle and Beyond, Canberra: RSSS, Australian National University.
  • Russell, B. (1905). ‘On denoting’, Mind, 14, No. 56, pp. 479-493.
  • Russell, B. (1988). The Problems of Philosophy, Prometheus Books.
  • Salmon, N. (1987). ‘Existence’, Philosophical Perspectives, 1: 49-108.
  • Sober, E. (1983). ‘Equilibrium explanation’, Philosophical Studies, 43: 201-210.
  • Sylvan, R. (1995). ‘Re-exploring item-theory’, Grazer Philosophische Studien, 50, 47-85.
  • Sylvan, R. (1997). Transcendental Metaphysics: From Radical to Deep Pluralism, Isle of Harris: White Horse Press.
  • Spencer, J. (2012). ‘Ways of being’, Philosophy Compass, 12: 910 ‒ 918.
  • Turner, J. (2010). ‘Ontological pluralism’, The Journal of Philosophy, 107: 5 ‒ 34.
  • Turner, J. (2012). ‘Logic and ontological pluralism’, Journal of Philosophical Logic, 41: 419-448.
  • Van Inwagen, P. (1996). ‘Why Is there anything at all?’, Proceedings of the Aristotelian Society, 70: 95-110.
  • Van Inwagen, Peter. (1998). ‘Meta-ontology’, Erkenntnis, 48: 233-250.
  • Wiggins, D. (1995). ‘The Kant-Frege-Russell view of existence: toward the rehabilitation of the second-level view’, in Modality, Morality and Belief, Essays in Honor of Ruth Barcan Marcus.

 

Author Information

Filippo Casati
Email: filippo.g.e.casati@gmail.com
Lehigh University
U. S. A.

and

Naoya Fujikawa
Email: fjnaoya@gmail.com
The University of Tokyo
Japan

 

Roderick M. Chisholm: Epistemology

Roderick M. Chisholm, a luminary of 20th century philosophy, is best known for his contributions in epistemology and metaphysics. His groundbreaking theory of knowledge opened the door to the late 20th and early 21st century work on the analysis of knowledge, skepticism, foundationalism, internalism, the ethics of beliefs, and evidentialism, to name just a few topics. Chisholm’s analysis of knowledge was the basis of the Gettier problem.

Chisholm treats skepticism as one of three responses to the ancient, seemingly insoluble problem of the wheel, which he termed the problem of the criterion: the vicious circle encountered in answering the two fundamental questions of epistemology, ‘What kinds of things can we know?’ and ‘What are the sources of knowledge?’, since answering either requires first answering the other. Chisholm adopts particularism, the ‘common sense’ approach of Thomas Reid and G. E. Moore, which proceeds by proposing a tentative answer to the first question in order to answer the second.

Chisholm provides an analysis of epistemic justification as a response to the Socratic question “What is the difference between knowledge and true opinion?” He explains justification as epistemic preferability, a primitive relationship based on the epistemic goals and ethics of belief. Chisholm defines terms of epistemic appraisal associated with various levels of justified belief to elucidate the level required for knowledge. The sufficiency of Chisholm’s analysis is examined in light of the Gettier problem.

Chisholm’s epistemology is the standard-bearer of foundationalism, first proposed by René Descartes. In its defense, Chisholm offers a distinctive explanation of why empirical knowledge rests on foundational certainties about one’s mental or phenomenal experiences, that is, sense-data propositions.

Chisholm resolves the metaphysical objections to sense-data raised by philosophers such as Gilbert Ryle. Chisholm argues that under certain conditions, sense-data propositions about how things appear are self-presenting, certain, and directly evident—the foundation of empirical knowledge.

Chisholm defines a priori knowledge to explain how necessary truths are also foundational. This definition explains Kant’s claims about synthetic a priori propositions and provides insight into the status of Chisholm’s epistemic principles.

Finally, Chisholm answers the problem of empiricism that plagued philosophers since John Locke, the problem of accounting for the justification of beliefs about the external world (non-foundational propositions) from propositions about the contents of one’s mind (foundational propositions). Chisholm proposes epistemic principles explaining the roles of perception, memory, and coherence (confirmation and concurrence) to complete his account of justification.

Table of Contents

  1. Introduction
    1. The Fundamental Questions of Epistemology
    2. Chisholm’s Philosophical Method
  2. The Traditional Analysis of Knowledge
    1. Chisholm’s Analysis
    2. The Gettier Problem
  3. Why Foundationalism?
    1. The Myth of the Given and the Stopping Place for Socratic Questioning
  4. The Directly Evident—The Foundation
    1. Sense-Data and the Problem of the Speckled Hen
    2. The Directly Evident—Seeming, Appearing, and the Self-Presenting
  5. The Truths of Reason and A Priori Knowledge
    1. The Justification of A Priori Knowledge
    2. Chisholm, Kant, and the Synthetic A Priori
  6. The Indirectly Evident
    1. Chisholm’s Solution to the Problem of Empiricism
    2. Epistemic Principles—Perception and Memory, Confirmation and Concurrence
  7. Postscript
  8. References and Further Reading

1. Introduction

Roderick M. Chisholm, one of the greatest philosophers of the 20th century (Hahn 1997), was not only a prolific writer best known for his works on epistemology (theory of knowledge) and metaphysics, but also the teacher of many students who became prominent philosophers. In epistemology, Chisholm is best known as the leading proponent of foundationalism, claiming that:

  • empirical knowledge is built on a foundation of the evidence of the senses; and
  • we have privileged epistemic access to the evidence of our senses.

Foundationalism has its roots in René Descartes’ classic work of early modern philosophy, Meditations on First Philosophy. Foundationalism was central to the debate concerning the nature of human knowledge among the Continental Rationalists (Descartes, Spinoza, and Leibniz), the British Empiricists (Locke, Berkeley, and Hume), and Kant. In the 20th century, Bertrand Russell and A. J. Ayer, luminaries of British Analytic Philosophy, and C. I. Lewis, the American Pragmatist, defended foundationalism, while logical positivists, including Hans Reichenbach, a leader of the Berlin Circle, argued that foundationalism was untenable. After World War II, Chisholm entered this debate defending foundationalism from attacks by W. V. O. Quine and Wilfrid Sellars, all three having been students of C. I. Lewis at Harvard University.

Chisholm’s writings on epistemology first appeared in a 1941 article and his comprehensive and detailed account of perceptual knowledge was first presented in his 1957 book Perceiving (Chisholm 1957). He refined his epistemology over the next forty years in response to counterexamples, objections, and questions raised by his colleagues, students, and critics. These refinements first appeared in Chisholm’s numerous published articles, and were incorporated into the three editions of Theory of Knowledge published in 1966, 1977 and 1989.

Chisholm’s epistemology was unique not only in addressing the “big questions”, but in presenting a detailed theory accounting for the structure of knowledge and epistemic justification.

a. The Fundamental Questions of Epistemology

Chisholm opens his final edition of Theory of Knowledge by addressing the most basic problem of epistemology, the challenge of skepticism—the view that we do not know anything. (For an explanation of skepticism, see: https://iep.utm.edu/epistemo/#SH4b). Chisholm explains that answering this challenge requires answering the question:

    1. What do we know or what is the extent of our knowledge?

This, in turn, requires an answer to the question:

    2. How are we to decide, in a particular case, whether we know or what are the criteria of knowing?

But to answer this second question, about the criteria of knowing, we must answer the first question, about what we know. The challenge of skepticism thus ensnares us in the ancient problem of “the diallelus”—the problem of “the wheel” or, as Chisholm calls it, “the problem of the criterion”. (For a detailed explanation of this problem, see https://iep.utm.edu/criterio/.)

The problem of the criterion can only be resolved by adopting one of three views: skepticism, particularism, or methodism. Skepticism claims that we do not know the extent of our knowledge and we do not know the criteria of knowledge, hence, we do not or cannot know anything. Particularism claims that there are particular propositions or types of propositions that we know, so we have at least a partial answer to the first question, and we can use these paradigm cases of knowledge to develop criteria of knowing, answering the second question. Methodism claims that we can identify certain criteria of knowing, answering the second question, which in turn provides criteria which can be employed to determine the extent of our knowledge, answering the first question.

Chisholm asserts that deciding between these three views cannot be done without involving ourselves in a vicious circle or begging questions: assuming an answer to one or both of the questions posed. That is, Chisholm maintains that the problem of the criterion cannot be solved. Chisholm adopts a “common sense” brand of particularism following in the footsteps of Thomas Reid (the 18th century Scottish philosopher) and G. E. Moore (the 20th century English philosopher). The “common sense” assumption is that we know more or less what, upon careful reflection, we think that we know. To justify this working assumption of commonsense particularism, Chisholm sets out as a goal of epistemology to improve our beliefs by ranking them with respect to their relative reasonability. Doing this leads him to adopt an internalist conception of justified belief, presupposing “that one can know certain things about oneself without the need of outside assistance” (Chisholm 1989, pg. 5).

The breadth and depth of Chisholm’s epistemology require focusing here on his solution to four fundamental questions and problems in the theory of knowledge:

    1. The analysis of knowledge or, in his terms, the problem of the Theaetetus;
    2. Why must knowledge rest on a foundation of sense-data (that is, why foundationalism)?
    3. What is the nature of the data of the senses (the Directly Evident) and the truths of reason (the a priori), conferring on them privileged epistemic status to serve as a foundation of knowledge?
    4. How is epistemic justification transmitted from sense-data to empirical propositions about the external world (the Indirectly Evident)?

The primary focus of this discussion is Chisholm’s account of empirical (or a posteriori) knowledge. In the process, Chisholm’s account of the knowledge of necessary truths, or a priori knowledge, is also examined.

b. Chisholm’s Philosophical Method

Chisholm introduces his epistemology by clearly articulating the specific philosophical puzzles or problems he proposes to solve: stating the problem unambiguously, presenting his solution, clearly defining the terms in which his proposal is cast, considering a series of counterexamples or conceptual tests, and responding to each in detail. This approach characterized not only Chisholm’s philosophical writings but also his pedagogical methodology. He conducted seminars attended by graduate students and faculty members at Brown University (where he was on the faculty for virtually his entire academic career) and at the University of Massachusetts at Amherst (to which, for many years, he traveled 100 miles to conduct a weekly seminar). In addition to his colleagues at Brown, notable attendees at these seminars included Edmund Gettier, Herbert Heidelberger, Gareth Matthews, Bruce Aune, Vere Chappell, and his former graduate students Robert Sleigh and Fred Feldman.

Chisholm would present philosophical puzzles and his solutions, and the attendees would challenge his solutions by raising counterexamples, objections, and problems. Chisholm would respond to them, and then return the next week with a revised set of definitions and principles to be defended from the welcomed onslaught of a new set of counterexamples. Honoring this methodology, the Philosophical Lexicon defined a term:

chisholm, v. To make repeated small alterations in a definition or example. “He started with definition (d.8) and kept chisholming away at it until he ended up with (d.8””””).”

2. The Traditional Analysis of Knowledge

Chisholm opens the first edition of Theory of Knowledge (Chisholm 1966) by considering Socrates’ claim in Plato’s Meno that, even though he does not know much, he knows that there is a difference between true opinion and knowledge. The Socratic challenge is to explain the difference between knowing a proposition and making a lucky guess. Plato’s answer, known as the Traditional Analysis of Knowledge (TAK), can be expressed as:

(TAK) S knows p =Df
1. S believes (or accepts) p;
2. p is true; and
3. S is justified in believing p.
(where S is the knower and p is a proposition or statement believed).

Thus, according to TAK, the difference between knowing that the Mets won last year’s World Series and making a lucky guess is having an account for, having a good reason for, or being justified in believing that they won the World Series.

Chisholm raises Socrates’ criticism of the traditional analysis of knowledge in Plato’s Theaetetus:

We may say of this type of definition, then, what Socrates said of the attempt to define knowledge in terms of reason or explanation: “If, my boy, the command to add reason or explanation means learning to know and not merely getting an opinion…, our splendid definition of knowledge, would be a fine affair! For learning to know is acquiring knowledge, is it not?” (Theaetetus 209E; cited in Chisholm 1966, pg. 7).

Chisholm explains that justified belief is ordinarily understood to presuppose the concept of knowledge. The traditional analysis of knowledge therefore appears to be circular, that is, it defines knowledge in terms that are themselves defined in terms of knowledge. Chisholm sets out to explain a notion of epistemic justification that inoculates TAK against the charge of pernicious circularity.

a. Chisholm’s Analysis

Chisholm proposes to define knowledge as follows:

(TAK-C) S knows p =Df p is true, S accepts p, and p is evident for S. (Chisholm 1977 pg. 102)

He then undertakes to explain how his version of the traditional analysis avoids the circularity problem of the Theaetetus.

In this analysis the requirement of justified belief, the justification condition of knowledge, is replaced by the condition that the proposition is evident, where evident is a technical term of epistemic appraisal for which Chisholm provides a definition. Roughly speaking, a proposition is evident for a person on the condition that the evidence available to that person is sufficiently strong to constitute a good reason for believing the proposition.

Chisholm does not think that replacing the term justified belief with the term evident magically solves the circularity problem of the Theaetetus. In fact, Chisholm concedes that his terms of epistemic appraisal, for example, evident, justified belief, know, more reasonable, certain, beyond reasonable doubt, acceptable, having some presumption in its favor, gratuitous, and unacceptable, possibly form a closed circle of concepts that cannot be defined in non-epistemic terms. To avoid this seeming circularity, Chisholm first specifies a primitive term expressing a relationship of epistemic preferability, more reasonable than, explaining it in terms of a specified set of epistemic goals or intellectual requirements of rationality. Next, he defines all of the terms of epistemic appraisal in terms of this primitive relationship. These technical terms, in turn, define various levels of epistemic justification. His final step is to identify the level of epistemic justification, namely evident, that is required for knowledge.

Chisholm fills in the details by providing an ethics of belief and a logic of epistemic terms. The full account of empirical knowledge is completed with a set of epistemic rules or principles, analogous to the rules of morality or logic, explaining the structure of the justification of empirical beliefs. Chisholm believed that the adequacy of a theory of knowledge is dependent on these principles.

The first condition of Chisholm’s analysis of knowledge requires the knower to accept (or, as more commonly expressed, believe) the proposition. Acceptance is one of three mutually exclusive propositional attitudes a person may take with respect to any proposition that he/she considers; the other two are: (1) denying the proposition, that is, accepting the denial (negation) of the proposition, and (2) withholding or suspending judgment about the proposition, that is, neither accepting nor denying the proposition. For example, a person who has considered the proposition that God exists is either (i) a theist who accepts the proposition that God exists, (ii) an atheist who denies that God exists (accepts that ‘God exists’ is false), or (iii) an agnostic who withholds or suspends judgment with respect to the proposition that God exists.

Chisholm draws a parallel between epistemology and ethics to explain epistemic appraisal. Ethics and epistemology are essentially normative or evaluative disciplines. Ethics is concerned with the justification of actions, and analogously, epistemology with the justification of belief (Chisholm 1977 pp. 1-2). A goal of ethics is to provide an account or explanation of moral appraisal, for example, good, right, and so forth. Similarly, epistemology seeks to provide an account or explanation of epistemic appraisal or evaluation. Chisholm’s account of knowledge proceeds by defining knowing a proposition in terms of the proposition’s being evident; and a proposition’s being evident in terms of the primitive relationship of epistemic evaluation, that is, more reasonable than.

Chisholm distinguishes two types of evaluation, absolute and practical, illustrating the distinction as follows. It may have been absolutely morally right to have killed Hitler or Stalin when they were infants; however, it would not have been practically right from the moral standpoint because no one could have foreseen the harm they would cause. Absolute rightness depends on an objective view of reality, that is, taking into consideration all of the truths related to an action. By contrast, practical rightness depends only on what a person could have foreseen.

Chisholm’s view is that justified belief depends on what is practically right for the person to believe. He is committed to the view that epistemic justification and, hence, knowledge, depends on evidence ‘internally’ available to the knower, a view known as Internalism. In support of this view, he points out that we rarely have direct access to the truth of propositions, that is, to reality. Being justified in believing a proposition is dependent on how well believing a given proposition meets the goals or requirements of rationality. The degree to which a proposition meets these goals is relative to the evidence available to a person; not relative to the absolute or ‘God’s eye’ view. Epistemic appraisal or evaluation is a function of the relative rationality of a person’s adopting a propositional attitude (acceptance, denial, withholding) given the evidence available to that person.

In his classic essay “The Ethics of Belief”, W. K. Clifford suggested that “it is always wrong … to believe on the basis of insufficient evidence” (Clifford 1877 pg. 183). In explicating one’s epistemic duties, Chisholm adopts the somewhat lower standard that one’s beliefs are innocent until proven guilty, that is, it is permissible to believe a proposition unless one has evidence to the contrary. At the foundation of Chisholm’s account of justified belief is the primitive concept of epistemic preferability, more reasonable than, which he explains by appealing to “the concept of what might be called an ‘intellectual requirement’.” (Chisholm 1977 pg. 14). He elaborates:

We may assume that every person is subject to a purely intellectual requirement–that of trying his best to bring it about that, for every proposition h that he considers, he accepts h if and only if h is true (Chisholm 1977 pg. 14).

Epistemic preferability, captured by Chisholm’s term more reasonable than, expresses a relationship between two propositional attitudes a person may take. The more reasonable propositional attitude is the one that fulfills one’s epistemic goals or intellectual requirements better than the other. Chisholm explains:

What is suggested when we say that one attitude is more reasonable than another is this: If the person in question were a rational being, if his concerns were purely intellectual, and if he were to choose between the two attitudes, then he would choose the more reasonable in preference to the less reasonable. (Chisholm 1966, pp. 21-22).

It may occur to some that this requirement could be satisfied by suspending judgment concerning every proposition, thereby never believing anything false. However, Chisholm appeals to William James’s criteria, explaining that:

“There are two ways of looking at our duty in the matter of opinion–ways entirely different… We must know the truth: and we must avoid error…”

Each person, then, is subject to two quite different requirements in connection with any proposition he considers; (1) he should try his best to bring it about that if a proposition is true then he believe it; and (2) he should try his best to bring it about that if a proposition is false then he not believe it. (Chisholm 1977 pp. 14-15).

Analogizing believing a proposition to betting on the truth of a proposition is a useful way of thinking about this requirement (Mavrodes 1973). Our epistemic goals can be thought of as providing the bettor with two pieces of advice: (1) win many bets; and (2) lose few bets. If we refrain from betting, we will have followed the second piece of advice and disregarded the first, and, if we bet on everything, we will have followed the first piece of advice to the exclusion of the second.

On Chisholm’s view, the epistemic goals or duties require not merely that we avoid believing false propositions, but that we also strive to find the truth. Our “intellectual requirement” is to strike a happy medium between believing everything and believing nothing. While these are two distinct and independent intellectual requirements, one’s epistemic duty is to do one’s best to fulfill both at the same time. If, for example, you saw a small furry animal with a tail in the distance but could not discern what kind of animal it was, you could better fulfill your epistemic duties by believing that it is a dog than by denying that it is a dog (believing that it is not a dog); but you would best meet both epistemic goals by withholding or suspending belief rather than by either believing or denying it.
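The betting analogy can be made concrete with a toy scoring rule. The sketch below is a hypothetical illustration only, not a formalism found in Chisholm or Mavrodes: believing a truth scores +1 (the Jamesian goal of knowing the truth), believing a falsehood scores -1 (the goal of avoiding error), and withholding scores 0 on both counts.

```python
# Toy scoring rule for the two Jamesian epistemic goals (illustration only):
# +1 for believing a truth, -1 for believing a falsehood, 0 for withholding.

def score(attitude, truth):
    """attitude: 'accept', 'deny', or 'withhold'; truth: whether the proposition is true."""
    if attitude == "withhold":
        return 0
    # Denying p is modeled as accepting non-p, so denying a falsehood counts
    # as believing a truth.
    believes_truly = (attitude == "accept") == truth
    return 1 if believes_truly else -1

propositions = [True, False, True, True]

# Withholding on everything never loses a bet but never wins one:
assert sum(score("withhold", t) for t in propositions) == 0
# Believing everything wins on the truths and loses on the falsehood:
assert sum(score("accept", t) for t in propositions) == 2
```

On this toy rule, the all-withholding policy perfectly satisfies the error-avoidance goal while entirely neglecting the truth-seeking goal, mirroring the point in the text that one’s duty is to pursue both requirements at once.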

Chisholm elaborates that more reasonable than is a relationship between propositional attitudes p and q which person S may adopt with regard to propositions at time t, which means that “S is so situated at t that his intellectual requirement, his responsibility as an intellectual being, is better fulfilled by p than by q” (Chisholm 1977 pp. 14-15). More reasonable than is an intentional concept, that is, “if one proposition is more reasonable than another for any given subject S, then S is able to understand or grasp the first proposition.” It expresses a transitive and asymmetric relationship. “And finally, if withholding is not more reasonable than believing, then believing is more reasonable than disbelieving” (Chisholm 1977 pg. 13), for example, if agnosticism is not more reasonable than theism, then theism is more reasonable than atheism. Thus, for example, to say that accepting a proposition, p, is more reasonable than denying p for person, S, at time, t, is to say that believing p better fulfills S’s epistemic duties than does S’s denying p. In other words, S ought (epistemically) to accept p rather than deny p, given the evidence available to S.

Chisholm distinguishes levels of justified belief in terms of one propositional attitude being more reasonable than another, given a person’s evidence at a specific time. He defines terms of epistemic appraisal or evaluation that correspond to the level of justification a person has for a given proposition. Chisholm defines these terms to specify a hierarchy of levels of justified belief and specifies the minimum level of justified belief required for knowing. In Chisholm’s hierarchy of justified belief any proposition justified to a specific level is also justified to every lower level.

The term beyond reasonable doubt has a familiar use as the standard of proof in English common law. It names the level to which the members of the jury must be justified in believing that the accused is guilty in order to render a guilty verdict in a criminal trial. In this context it means that, given the available evidence, there is no reasonable explanation of the facts other than that the accused committed the crime. The underlying idea is an epistemic one: the jurors must have good reason for believing that the accused committed the crime in order to convict.

Chisholm adopts the term beyond reasonable doubt to identify a key level of epistemic justification, the level at which a person has an epistemic duty to believe a proposition. He defines it differently than the common-law standard, as follows:

D1.1 h is beyond reasonable doubt for S =Df accepting h is more reasonable for S than withholding h. (Chisholm 1977 pg. 7).

In this sense, a proposition is beyond reasonable doubt for a person if and only if accepting it better fulfills the person’s intellectual requirements of believing all and only true propositions than withholding or suspending belief. More simply put, propositions which are beyond reasonable doubt are ones that a person epistemically ought to believe given the evidence he or she has.

Chisholm considers the proposition that the Pope will be in Rome on October 5th five years from now as an example of a proposition that has a positive level of justification for most of us but does not quite meet this standard of beyond reasonable doubt. Although it is more reasonable to believe it than to deny it (given that the Pope is in Rome on most days), it is even more reasonable to withhold judgment about the Pope’s location five years from now. While the Pope spends most of his time in Rome, circumstances five years from October 5th may require that he be somewhere else. Chisholm defines this slightly lower level of justified belief, having some presumption in its favor, as follows:

D1.2. h has some presumption in its favor for S =Df Accepting h is more reasonable for S than accepting non-h. (Chisholm 1977 pg. 8).

Given the limited evidence that we have about the Pope’s whereabouts five years from now, it is more reasonable to believe that the Pope will be in Rome than to believe that he will not be. However, it is even more reasonable in these circumstances to withhold judgment about the Pope’s whereabouts. According to Chisholm, the proposition about the Pope’s future whereabouts has some presumption in its favor, that is, it is more reasonable to believe it true than false.

Chisholm defines a level of epistemic justification for propositions that are not beyond reasonable doubt yet have a higher positive epistemic status than merely having some presumption in their favor. This level of justified belief is that of the proposition’s being acceptable, which is defined as follows:

D1.3. h is acceptable for S =Df Withholding h is not more reasonable for S than accepting h. (Chisholm 1977 pg. 9).

An example of a proposition with this level of justified belief is the proposition that I actually see something red when I seem to see something red under certain questionable lighting conditions. Withholding belief that I actually see something red is not more reasonable than believing it, and yet believing that I actually see something red may not be more reasonable than withholding it; that is, the two attitudes may be equally reasonable. Anything that is beyond reasonable doubt is acceptable, and anything that is acceptable also has some presumption in its favor. As noted at the outset of this discussion, every higher level of justified belief is also justified to every lower level.

Chisholm holds that propositions that are beyond reasonable doubt have a high level of justification, that is, they ought to be believed, yet they still lack a level of justification sufficient for knowledge. The lower levels of justified belief, that is, acceptable and having some presumption in their favor, play an important role in Chisholm’s account, since propositions at these levels may be raised to the level of evident when they support and are supported by other propositions at the lower levels.

Occupying the high end of Chisholm’s justification hierarchy are propositions that are certain. Chisholm distinguishes this epistemological sense of certainty from the psychological sense. We may feel psychologically certain of the truth of a proposition even though we are not justified in believing it. The epistemological sense of certainty represents the highest level of justified belief and is not merely a feeling of confidence. Chisholm defines certainty as:

D1.4. h is certain for S =Df (i) h is beyond reasonable doubt for S, and (ii) there is no i, such that accepting i is more reasonable for S than accepting h. (Chisholm 1977, pg. 10)

As with every level of epistemic justification, any proposition that is certain for a person also meets the criteria of every lower level of positive epistemic justification. Chisholm claims that propositions describing the way things appear to a person, and some truths of logic and mathematics, are, under the right conditions, certain for us. The levels of justification do not come in degrees. Thus, no proposition that is certain is, according to Chisholm, more reasonable (or, for that matter, less reasonable) than any other proposition having this epistemic status; certainty in Chisholm’s technical sense, like every level of justified belief, does not come in degrees.

Some philosophers thought that the criterion of epistemic justification required for knowledge was certainty. Descartes equated certainty with not being subject to any possible doubt. Chisholm argues that this standard is too high for knowledge, because it would rule out, by definition, the possibility of our knowing many contingent truths which we think we can know. It would make skepticism about empirical knowledge true by definition.

Believing that the President is in Washington today, because we saw him there yesterday and he spends most of his time there, is beyond reasonable doubt. However, we need stronger justification to know that he is there today; knowledge requires justification to a level higher than the proposition’s merely being beyond reasonable doubt. These considerations indicate to Chisholm that the minimum level of justification required for knowledge is higher than beyond reasonable doubt and lower than certainty. Capturing this intuition, Chisholm defines evident to single out the level of justification required for knowledge:

D1.5 h is evident for S =Df (i) h is beyond reasonable doubt for S, and (ii) for any i, if accepting i is more reasonable for S than accepting h, then i is certain. (Chisholm 1977 pg. 10).

According to Chisholm’s version of the Traditional Analysis of Knowledge, a necessary condition of knowledge is that the proposition is evident.

Chisholm’s ethics of belief solves the problem of the Theaetetus as follows. He identifies the epistemic goals with respect to any proposition under consideration as: (1) believe it if true and (2) do not believe it if false. These goals determine the relative merit of adopting the attitudes of accepting, denying, and withholding judgment with respect to any given proposition and the person’s evidence, that is, they determine which attitude is more reasonable than another. Chisholm’s definition of knowledge identifies the level of justification required for knowing a proposition as being evident, meaning that (i) it is more reasonable for the person to believe the proposition than to withhold belief (beyond reasonable doubt) and (ii) any proposition that is more reasonable is maximally reasonable (certain). This analysis sheds light on the level of justification required for knowing a proposition. Moreover, it is not defined in terms of something which, in turn, is defined in terms of knowledge. Chisholm’s analysis of knowledge thus avoids the circularity problem of the Theaetetus.
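The definitional hierarchy D1.1–D1.5 can also be sketched mechanically. The Python sketch below is a toy model under a strong simplifying assumption: Chisholm treats more reasonable than as an undefined primitive, whereas here each attitude is assigned a hypothetical numeric reasonableness score purely so that the definitions can be checked against the examples discussed above.

```python
from dataclasses import dataclass

# Toy model of Chisholm's levels of justified belief (D1.1-D1.5).
# Assumption (not Chisholm's): "more reasonable than" is modeled by comparing
# hypothetical numeric scores; Chisholm takes the relation as primitive.

@dataclass
class Prop:
    name: str
    accept: float    # reasonableness of accepting h
    withhold: float  # reasonableness of withholding h
    deny: float      # reasonableness of accepting non-h

def has_some_presumption(h):         # D1.2: accepting beats denying
    return h.accept > h.deny

def beyond_reasonable_doubt(h):      # D1.1: accepting beats withholding
    return h.accept > h.withhold

def acceptable(h):                   # D1.3: withholding does not beat accepting
    return not (h.withhold > h.accept)

def certain(h, props):               # D1.4: BRD, and nothing beats accepting h
    return beyond_reasonable_doubt(h) and all(
        not (i.accept > h.accept) for i in props)

def evident(h, props):               # D1.5: BRD, and anything that beats h is certain
    return beyond_reasonable_doubt(h) and all(
        certain(i, props) for i in props if i.accept > h.accept)

# The Pope example: accepting beats denying, but withholding beats both, so the
# proposition has some presumption in its favor and nothing more.
pope = Prop("Pope in Rome five years hence", accept=0.4, withhold=0.6, deny=0.2)
# A self-presenting appearance proposition at the top of the hierarchy.
seems_red = Prop("I seem to see something red", accept=1.0, withhold=0.1, deny=0.0)
props = [pope, seems_red]

assert has_some_presumption(pope) and not beyond_reasonable_doubt(pope)
assert beyond_reasonable_doubt(seems_red)
assert certain(seems_red, props) and evident(seems_red, props)
```

On this toy model, being beyond reasonable doubt entails being acceptable (accept > withhold entails accept ≥ withhold); the remaining entailments in Chisholm’s hierarchy are stipulated by his definitions rather than derived here.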

b. The Gettier Problem

Before leaving the analysis of knowledge to explore the other important parts of Chisholm’s theory of knowledge, a critical objection to Chisholm’s analysis of knowledge, the Gettier Problem, needs to be outlined. Edmund Gettier, in a monumental paper “Is Justified True Belief Knowledge?” (Gettier 1963), proposes a set of counterexamples to Chisholm’s definition of knowledge which identify a genetic defect in the Traditional Analysis of Knowledge. Gettier argues that these counterexamples demonstrate a defect in Chisholm’s version of the Traditional Analysis of Knowledge, as well as in Plato’s and Ayer’s versions.

To illustrate the problem, consider one of Gettier’s examples. Suppose that Smith and Jones are the only two applicants for a job. Smith believes that the person who will get the job has ten coins in his pocket, because he knows that he himself has ten coins in his pocket and the boss has told him that he will be offered the job. Unbeknownst to Smith, the other applicant, Jones, also has ten coins in his pocket and ultimately gets the job. Thus, Smith believes that the person who will get the job has ten coins in his pocket, the proposition is true, and Smith is justified in believing it (it is evident to him). However, Smith’s evidence is defective and, thus, Smith does not know that the man who will get the job has ten coins in his pocket.

Gettier examples, as they have become known, point to a genetic defect in the Traditional Analysis of Knowledge: a person can have a justified (evident) true belief which is not knowledge. The Gettier problem became a major focus of epistemology in the 1960s and continues to be one today, more than a half century later. Solutions were proposed that add a fourth condition of knowledge or explain why the Gettier examples are not really problematic. Chisholm notes the Gettier Problem in the first edition of his Theory of Knowledge (Chisholm 1966), suggesting that the solution lies in adding a fourth condition of knowledge. In his characteristic style, Chisholm presented major revisions of his definitions, intended among other things to address the Gettier Problem, in the second edition of Theory of Knowledge in 1977 and in the third edition in 1989.

3. Why Foundationalism?

Chisholm’s epistemology does not begin and end with the analysis of knowledge. His work on the analysis of knowledge clears the conceptual landscape for answering fundamental questions about the structure of empirical knowledge and for providing an account of the justification of empirical beliefs. In the process, he provides an answer to the much debated question of whether empirical knowledge rests on a foundation of epistemically privileged beliefs. Philosophers have thought that the answer is obvious; the problem is that some maintain it is obviously yes, while others maintain it is obviously no. Those answering the question in the affirmative defend foundationalism, and those answering in the negative defend the coherence theory of justification, or simply coherentism.

Chisholm characterizes foundationalism (or “the myth of the given”, as its detractors refer to it) as supporting two claims:

  1. The knowledge which a person has at any time is a structure or edifice, many parts and stages of which help support each other, but which as a whole is supported by its foundation.
  2. The foundation of one’s knowledge consists (at least in part) of the apprehension of what has been called, variously, “sensations”, “sense impressions”, “appearances”, “sensa”, “sense-qualia”, and “phenomena”. (Chisholm 1946, pp. 262-3).

Chisholm joins philosophy luminaries including Rene Descartes, Bertrand Russell, and C. I. Lewis as a leading defender of foundationalism. However, his unique take on why empirical knowledge rests on a foundation of self-justified beliefs reveals much about his approach to epistemology.

Foundationalism’s historical roots are found in the work of Rene Descartes, the father of Modern Philosophy. In his Meditations on First Philosophy, Descartes embarks on securing a firm basis for empirical knowledge, having discovered many of the beliefs on which he based this knowledge to be false. He proposes to rectify this by purging all beliefs for which there is any possible reason for doubt. By applying this methodological doubt, he finds a set of propositions about which he cannot be mistaken. These include certainties about the contents of his mind, for example, about the way things appear to him. Applying the infallible method of deductive reasoning to this foundation of certainties, Descartes claims to build all knowledge in the way that the theorems of geometry are derived from axioms and definitions, thereby eliminating the possibility of error. Descartes argues that foundationalism is the only way to account for our knowledge of the world external to ourselves, and that it thereby refutes skepticism.

Locke, Berkeley, and Hume, the British Empiricists, reject the claim that knowledge requires certainty and argue that Descartes’ deductive proof of the external world is unsound. They agree with Descartes that the foundation of empirical knowledge is sense-data but maintain that knowledge of the external world is merely justified as probable. As this probable justification is fallible and can justify false beliefs, the British Empiricists think that foundationalism is true, but compatible with skepticism.

Bertrand Russell, the 20th century founder of Anglo-American Analytic Philosophy, picks up the mantle of British empiricism. He too advocates for foundationalism, claiming that we have epistemically privileged access, which he calls knowledge by acquaintance, to a foundation of sense-data. Russell, like his British Empiricist predecessors, thought that all empirical knowledge rested on this foundation, but did not claim that external world skepticism was refuted by foundationalism.

Russell’s empiricist successors, the logical empiricists or logical positivists, assumed that empirical knowledge was possible, as they viewed science as the paradigm example of knowledge. Hans Reichenbach, a leading proponent of logical empiricism, rejects Russell’s and Descartes’ view, arguing that empirical knowledge is not justified, and need not be justified, by a set of foundational beliefs to which we have epistemically privileged access. He claimed that, like scientific claims, empirical propositions are justified by their conformance with other merely probable propositions.

C.I. Lewis, a leading figure in 20th century American Philosophy (and Chisholm’s doctoral dissertation advisor), engages in a famous debate with Reichenbach on this very issue (see: Lewis 1952, Reichenbach 1952, van Cleve 1977, Legum 1980). In “The Given Element in Empirical Knowledge” (Lewis 1952), he argues that empirical knowledge must rest on a foundation of certainty, hence, that foundationalism is the only viable alternative. Lewis’s rejection of Reichenbach’s position is based on the claim that there cannot be an infinite regress of justified beliefs. While agreeing that many empirical beliefs are justified because they are probable, he argues:

  1. There cannot be an infinite regress of merely probable beliefs;
  2. Therefore, no proposition is probable unless some proposition is certain;
  3. Some empirical propositions are justified because they are probable;
  4. Therefore, empirical knowledge must rest on a foundation of epistemic certainties.

a. The Myth of the Given and the Stopping Place for Socratic Questioning

Chisholm finds Descartes’ approach to epistemology attractive but is not persuaded by Descartes’ defense of foundationalism. Chisholm also endorses Lewis’s premise that if any proposition is probable then some proposition is certain (Chisholm 1989, pg. 14), but takes a different tack to defend foundationalism. Chisholm, not one to accept any philosophical dogma on something akin to religious faith, appeals to his method of developing a theory of knowledge in support of foundationalism. He suggests adopting the hypothesis that we know, more or less, what we think we know, and then, by asking Socratic questions about what justifies our believing the things we think we know, developing the account of their justification.

In Perceiving (Chisholm 1957, pg. 54) and in Theory of Knowledge (Chisholm 1966, pg. 18), Chisholm compares justifying a belief to confirming a scientific hypothesis. A hypothesis is confirmed by the evidence the scientist has supporting it. Similarly, a belief is justified by the evidence a person has that supports it. Scientists often seek out further evidence to confirm a hypothesis. By contrast, only evidence already ‘internally’ possessed can justify belief.

Chisholm claims that when we consider cases of empirical propositions which we think we know, we discover through Socratic questioning what justifies our believing these propositions. This process may begin by considering a proposition that I think I know, for example, that:

    1. There is a tree.

To identify what justifies my believing this, we need to answer the Socratic question, ‘What justifies me in believing this proposition?’. One might mistakenly think that, since this proposition is obviously true, the proper answer to the Socratic question is the proposition itself, that is, that there is a tree. However, this would be to misunderstand what the Socratic question is asking. It is not asking from what other beliefs I have inferred this proposition. The question is, “What evidence do I currently possess that supports my believing this proposition?” Sitting here in my office, looking out my window at a tree, I clearly have evidence that there is a tree, namely, that:

    2. I see a tree.

This answer does not imply that I am currently thinking, or have ever thought, about seeing a tree, nor that I consciously believe that I see a tree and from this I infer that there is a tree. It merely implies that the proposition that I see a tree is evidence which is already available to me and which would serve to justify my belief that there is a tree.

This answer, however, is only the first part of a complete answer to the Socratic question “Why do you think that there is a tree?” or “What justifies you in thinking that there is a tree?”. The second part of the answer to the Socratic question is a rule of evidence, in this case a rule specifying conditions related to the proposition serving as evidence which are sufficient for being justified in believing the proposition in question, for example:

RE1. If S is justified in believing that S sees a tree, then S is justified in believing that there is a tree.

The answer to a Socratic question identifies two things: (1) a proposition that serves as evidence for the proposition in question, and (2) a rule of evidence specifying conditions met by the evidence which are sufficient for a person to be justified in believing the proposition in question.

This does not yet amount to a complete account of my being justified in believing that there is a tree. A proposition cannot justify a person’s belief unless the person is justified in believing it. This in turn suggests another step or level is required in the process of Socratic questioning, that is, “What justifies my believing that I see a tree?” The first part of the answer is the evidence that I have for believing this, for example:

    3. I seem to see a tree.

This proposition asserts that I am experiencing a certain psychological or phenomenological state of the kind that I would have in cases where I am actually seeing a tree, dreaming that I am seeing a tree, or hallucinating that I am seeing a tree. The second part of the answer to this question is a rule of evidence, in this case:

RE2. If S is justified in believing that S seems to see a tree and has no evidence that S is not seeing a tree, then S is justified in believing that S sees a tree.

This in turn raises the next step or level in the process of Socratic questioning, that is, “What justifies my believing that I seem to see a tree?” The appropriate answer to this question in this case is not some other proposition that serves as evidence that I seem to see a tree. Rather, the truth of the proposition, that is, my having the psychological experience of seeming to see a tree, is my evidence for believing that I seem to see a tree. The second part of the answer is a rule of evidence like the following:

RE3. If it is true that S seems to see a tree, then S is justified in believing that S seems to see a tree.

This rule of evidence is a different kind than the others encountered in the process of Socratic questioning. This rule conditions justified belief on the truth of a proposition, in contrast to the other rules which condition justified belief on being justified in believing another proposition.
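The three rules of evidence form a finite chain: RE1 and RE2 pass justification from one proposition to the evidence cited for it, while RE3 grounds the chain in the truth of a self-presenting proposition. The sketch below is my own toy model of that structure, not Chisholm’s formalism; the dictionary layout and function names are illustrative.

```python
# Toy model of Chisholm's chain of rules of evidence (RE1-RE3).
# evidence_for maps each proposition to the proposition cited in
# answer to the Socratic question, or to None when the proposition
# is basic (self-presenting): its own truth is its evidence.
evidence_for = {
    "There is a tree": "I see a tree",          # RE1
    "I see a tree": "I seem to see a tree",     # RE2
    "I seem to see a tree": None,               # RE3: basic
}

def justified(prop, basic_prop_is_true):
    """Follow the Socratic chain downward until it bottoms out;
    the basic proposition is justified by its truth alone."""
    evidence = evidence_for[prop]
    if evidence is None:
        # RE3: the truth of the proposition itself justifies belief.
        return basic_prop_is_true
    # RE1/RE2: justification is inherited from the evidence.
    return justified(evidence, basic_prop_is_true)

# If I really am in the state of seeming to see a tree, the whole
# chain is justified in a finite number of steps -- the "proper
# stopping place" of Socratic questioning.
print(justified("There is a tree", basic_prop_is_true=True))
```

The recursion terminates because the chain of evidence is finite, which is exactly the point Chisholm presses against an infinite regress of justification.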

Chisholm asks whether our process of Socratic questioning goes on without end, ad infinitum, either justifying one claim with another or going around in a circle. He believes that “…if we are rational beings, we will do neither of these things. For we will find that our Socratic questions lead us to a proper stopping place.” (Chisholm 1977, pg. 19). We come to a final empirical proposition whose justification is that the proposition believed is true.

When we encounter an answer that the truth of the proposition justifies believing the proposition, Chisholm points out, we have reached the stopping place of Socratic questioning, that is, we have completed the account of the justification of the initial proposition. Furthermore, according to Chisholm, we typically find that the proposition reached at the end of the process of Socratic questioning describes the person’s own psychological state, that is, describes the way things appear to that person. Thus, at least hypothetically, Chisholm identifies a class of beliefs which may serve as the foundation of all empirical knowledge. This is tantamount to saying that if we know any empirical or perceptual propositions, then believing these propositions is, at least in part, justified by the relationship they have to the psychological proposition describing the way things appear to us.

Chisholm claims that when we consider cases of empirical propositions which we think we know, we discover that the process of Socratic questioning, that is, the account of the justification for believing these propositions, comes to a proper stopping place in a finite number of steps. When we reach the final stage of Socratic questioning, we have discovered, as foundationalism implies, the foundation on which empirical knowledge rests. In contrast to Descartes, Chisholm does not think that the only alternative to skepticism is foundationalism. While he may agree with C. I. Lewis that you cannot have an infinite regress of propositions which are probable, he does not claim that this proves that there is no viable alternative to foundationalism. Chisholm thinks that the fact that we find a proper stopping place to Socratic questioning makes it plausible to accept foundationalism as a postulate or an axiom for developing a theory of knowledge. Whether foundationalism is acceptable, for Chisholm, should be judged by how well it explains the justification of empirical beliefs. Thus, to defend foundationalism, Chisholm presents one of the most detailed and complete explanations answering the two fundamental questions: (i) What makes the foundational beliefs about one’s psychological states or sense-data ‘self-justified’ (as Chisholm calls them, ‘directly evident’)?, and (ii) How does the foundation of sense-data serve as the ultimate justification of all empirical knowledge? Chisholm’s answer to these two critical questions is discussed in the next section.

4. The Directly Evident—The Foundation

Descartes proposed that empirical knowledge must rest on a foundation of certainties, propositions about which one cannot be mistaken. His foundation was composed of propositions about which even an all-powerful and all-knowing evil genius could not deceive him. It included not only logical or necessary truths, but also psychological propositions about himself. These propositions are of the form that I exist…, I doubt …., I understand …, I affirm …, I deny …, I imagine …., and, most importantly, I have sensory perceptions of … . Propositions of the last type, that is, propositions about one’s psychological or phenomenological states describing the raw data of the senses, are perhaps the most crucial propositions upon which all empirical knowledge is founded. These sense-data propositions, expressed by statements like ‘I seem to see a fire’, describe the inner workings of one’s mind, and do not imply the existence of anything in the external world.

Locke, Berkeley, and Hume (the British Empiricists) recognize that the data of the senses, that is, sense-data, serves as the foundation of empirical knowledge. Bertrand Russell agrees with them that these propositions constitute the foundation of empirical knowledge and claims that they have a privileged epistemic status which he dubbed knowledge by acquaintance. C. I. Lewis agreed that empirical knowledge rests on a foundation of sense-data, which are the given element in empirical knowledge.

Many 20th century empiricists were skeptical about the existence of sense-data which led to their doubting that empirical knowledge rests on a foundation of epistemically privileged sense-data. Gilbert Ryle, for example, raised this type of objection and argued that sense-data theory is committed to an untenable view of the status of appearance. Chisholm enters the historical debate, defending foundationalism from Ryle’s objection in one of his earliest papers on epistemology, “The Problem of the Speckled Hen” (Chisholm 1942).

a. Sense-Data and the Problem of the Speckled Hen

Ryle asks you to suppose you take a quick look at a speckled hen which has 48 speckles in your field of vision. According to the view under consideration, you have a 48-speckled sense-datum. If, as foundationalism claims, you can never be wrong about the sense-data that present themselves to the mind, it would seem that you could never be mistaken in thinking that you were presented with a 48-speckled datum. Ryle’s point is that while we might concede that one could never be mistaken in thinking that a sense-datum had two or three speckles, as the number of speckles gets sufficiently large, for example, 48, we may be mistaken about the number of speckles in the sense-datum. Chisholm points out that this is not an isolated problem in an odd situation, but that similar issues can be raised concerning most perceptual presentations, that is, most sense-data are complex like the one in the speckled hen case.

A.J. Ayer, a leading logical positivist and a defender of foundationalism, replies that the example is mistaken, arguing that any inaccuracy introduced in counting the speckles can be accounted for because sense-data do not have any definite number of speckles (Ayer 1940). Chisholm points out that it is odd to think that the sense-data caused by looking at a hen having a definite number of speckles do not have a specific number of speckles. Thus, Ayer must adopt one of two unacceptable positions. The first is that it is neither true nor false that the hen sense-datum has 48 speckles. This amounts to saying that certain propositions expressing sense-data are neither true nor false. But this is hardly acceptable because it would commit one to denying the logical law of the excluded middle. The alternative would be that while the hen sense-datum had many speckles, it did not have any specific number of speckles. Chisholm argues that this is untenable because it is like claiming that World War II would be won in 1943, but not in any of the months that make up 1943.

Chisholm thinks that the Problem of the Speckled Hen demonstrates that not all sense-data propositions are foundational. One’s justification for believing a complicated sense-data proposition, for example, that I seem to see a 48-speckled hen, is not the proposition itself, but other sense-data propositions, for example, that I seem to see an object with red speckles. Chisholm grants that complex sense-data propositions can and often do involve judgments that go past what is presented or given in experience. Such propositions assign properties to our experience that compare the given experience to another experience. Any such judgment goes beyond what is given or presented in the experience and, as such, introduces the possibility of being mistaken in the comparison. When a sense-data judgment goes beyond what is presented in experience, its justification is not the truth of the proposition, but other simpler sense-data propositions, which in turn are either justified by still simpler ones or are themselves foundational.

Chisholm concludes that the class of sense-data propositions is larger than the class of epistemically foundational or basic propositions. Only the subset comprised of simple sense-data propositions, for example, propositions about colors and simple shapes that appear, may be foundational beliefs. The challenge for Chisholm is two-fold: (a) to provide an account of which sense-data propositions are foundational, or as he calls them Directly Evident, that avoids the metaphysical pitfalls Ryle identified with sense-data; and (b) to identify what enables them to serve as the foundation of perceptual knowledge, that is, to explain their privileged epistemic status.

b. The Directly Evident—Seeming, Appearing, and the Self-Presenting

In short, Chisholm claims that what justifies our believing some proposition can be discovered by a process of Socratic questioning which identifies the evidence we have for believing the proposition, and then the evidence we have for believing that evidence, until we reach a proper stopping point. The proper stopping point arises when the proper answer is that the evidence that justifies one in believing the proposition is the truth of the proposition itself. These propositions whose truth constitutes their own evidence are the given element in empirical knowledge, that is, the Directly Evident.

The following example of a line of Socratic questioning illustrates Chisholm’s point. Suppose I know that:

    1. There is a blue object.

In response to the question of what evidence I have for believing this I may cite that:

    2. I perceive (see, feel, hear, and so forth) that there is a blue object.

In response to the question of what evidence I have for accepting (2), I would cite that:

    3. I seem to see something blue (or, alternatively, I have a blue sense-datum).

When we reach an answer like (3), we have reached Chisholm’s proper stopping point of Socratic questions. On Chisholm’s view, psychological or phenomenological propositions like (3) are self-justifying or self-presenting, they are the given element in empirical knowledge, and they serve as the foundation of perceptual knowledge.

Chisholm defends Descartes’ and C. I. Lewis’ assertion that propositions which describe a person’s phenomenological experience, that is, propositions which describe the way that things seem or appear to a person, are important constituents of the foundation of perceptual knowledge. These phenomenological propositions which constitute the foundation of perceptual knowledge are expressed by statements using ‘appears’ or ‘seems’, and they do not imply that one believes, denies, has evidence supporting, or is hedging about whether there is something that actually has a certain property. Rather, ‘appears’ or ‘seems’ describes one’s sensory or phenomenological state, for example, that I seem to see something white.

Chisholm distinguishes comparative and non-comparative uses of ‘appears’ in statements describing one’s sensations or phenomenological state. The comparative use describes the way that we are appeared to by comparing it with the way that certain physical objects have appeared in the past. Thus, when I use ‘appears’ in the comparative way, I am “saying that there is a certain manner of appearing, f, which is such that: (1) something now appears f, and (2) things that are blue may normally be expected to appear f.” (Chisholm 1977 pg. 59). By contrast, if I use ‘appears’ in the noncomparative way, I am saying that there is a blue way of appearing (or seeming) and I am now in this phenomenological state or having this kind of phenomenological experience. Chisholm claims that only those propositions expressed by sentences using the noncomparative descriptive phenomenological sense of ‘appear’ or ‘seems’ are directly evident.

Chisholm’s solution to the Problem of the Speckled Hen is that sense-data compose the given element in empirical knowledge, that is, the foundation on which all perceptual knowledge stands, but not all sense-data are foundational. Only sense-data statements referring to some sensory characteristics are candidates for this special status, and these can be called basic sensory characteristics. It should be said that, at least for most of us, a characteristic like the speckled hen’s appearing to have 48 speckles is not a basic sensory characteristic, and beliefs about it are therefore not foundational. Rather, only appearance propositions using the basic or simple sensory characteristics, for example, basic visual characteristics (for example, blue, green, red), olfactory characteristics, gustatory characteristics, auditory characteristics, or tactile characteristics, will be candidates for the directly evident.

One might wonder what distinguishes appearing blue from appearing to have 48 speckles, such that the former is a basic sensory characteristic while the latter is not. Most people can recognize a triangle at a glance and do not need to count the three sides or angles in order to recognize that the object is a triangle. Moreover, at a glance, we can distinguish it from a square, rectangle, or pentagon. Contrast that with recognizing a chiliagon (a 1000-sided polygon). Other than perhaps a few geometric savants (perhaps Thomas Hobbes, who claimed to have squared the circle), we cannot recognize a chiliagon at a glance. In fact, we would have to go through a long process of counting to discover that a given polygon in fact had 1000 sides. Clearly, appearing chiliagon-shaped is not going to be a basic sensory characteristic, in contrast to appearing triangular.

Like most adults I can discern a triangle immediately, while very young children cannot. A child playing with a toy containing holes of different shapes and blocks to be inserted into the corresponding shaped hole may have difficulty matching the triangle to the triangular hole, indicating that it is difficult for the very young child to recognize a triangle. It seems reasonable to conclude that appearing triangular is a basic sensory characteristic for me but not for the very young child. Thus, one and the same characteristic may be a basic sensory characteristic for one person while not a basic characteristic for another depending on their visual acuity. Moreover, visual acuity may change from time to time for the same person, hence, at different times, the same characteristic may be a basic sensory characteristic at one time and not at another time. (Chudnoff 2021 discusses empirical evidence that training can help one develop new recognitional abilities).

The distinction between basic sensory characteristics and non-basic ones is based on whether or not a person requires evidence to be justified in believing that the sensation has a certain characteristic. For most of us (at least those of us who are not color-blind), being justified in believing the proposition that I seem to see something green requires no evidence beyond our phenomenological state or experience. By contrast, being justified in believing that I seem to see a 48-speckled thing would require our having evidence from counting up the speckles. Thus, being 48-speckled would not be a basic sensory characteristic. By contrast, being 5-speckled (or fewer) would, for most of us, be a basic sensory characteristic. The test of whether a sensory characteristic is basic is the answer to the Socratic question of what justifies the person in believing that it is an experience of that characteristic.

Chisholm’s solution to the Problem of the Speckled Hen addresses the metaphysical concerns about sense-data. A standard view of sense-data is that if I am looking at a white wall that is illuminated by red lights, there is a red wall sense-datum, which is really red, and this object is what ‘appears before my mind’. Philosophers have objected to the sense-data theory’s dependence upon the existence of non-physical ghost-like entities serving as intermediaries between physical objects and the perceiver. Ayer, for example, proposed that these odd metaphysical entities may have seemingly contradictory properties, for example, having many speckles but no specific number of speckles. Others rejected these metaphysical claims as entailing skepticism about the external world, since we only have access to the sense-data. Chisholm intends his theory to account for the epistemic properties of sense-data, that is, that they are directly evident, without entailing the objectionable metaphysical assumption that sense-data are ghost-like entities.

Chisholm explains that, if we are going to be precise, the proposition that something appears f to me is not directly evident, because it implies that there are objects, sense-data, which appear to me and for which it is proper to seek further justification. Consider a case of my hallucinating a green table, as a result of which it is true that:

I seem to see a green table.

A defender of sense-data would say that this means the same as:

There is a green sense datum which is appearing to me.

But, it is perfectly proper to seek justification for the belief that a green sense-datum exists, hence, the proposition that I seem to see a green table is not directly evident.

Such examples suggest to Chisholm that a better formulation of the statement which expresses the directly evident is “I am experiencing an f appearance.” “But,” he is concerned that, “in introducing the substantive ‘appearances’ we may seem to be multiplying entities beyond necessity.” (Chisholm 1977, pg. 29). Chisholm, therefore, wants to avoid referring to any unusual entities like sense-data in statements intended to express the directly evident. The reason cited for avoiding references to sense-data is parsimony, that is, the principle of assuming the existence of no more types of objects than required.

Chisholm shows us how to get rid of the substantives in appear-statements. We begin with a statement:

a. Something appears white to me;

which is to be rewritten more clearly as:

b. I have a white appearance.

But this sentence contains the substantive ‘appearance’. To avoid reference to any strange metaphysical entities, sense-data, the sentence must be rewritten to read as:

c. I am appeared white to.

Chisholm notes that we have not yet succeeded in avoiding referring to sense-data, for ‘white’ is an adjective and, thus, must describe some entity (at least according to the rules of English grammar). We, however, want ‘white’ to function as an adverb that describes the way that I am appeared to. Thus, the sentence becomes:

d. I am appeared whitely to;

 or, to put it in a somewhat less awkward way,

e. I sense whitely.

Chisholm does not propose that we should use this terminology in our everyday discourse, nor even that we should use this terminology whenever we are discussing epistemology. Rather, he wants us to keep in mind that when we use sentences like (a), (b), and (c) to express the foundation of empirical knowledge, what we are asserting is what (d) and (e) assert.

Chisholm concludes that (d) and (e), the directly evident propositions, do not imply that there are non-corporeal entities, sense-data, that are something over and above the physical objects and their properties which we perceive. When our senses deceive us, we are not seeing, hearing, feeling, smelling, or tasting (that is, perceiving) a non-physical entity. Rather, we are misperceiving or sensing wrongly, which is why Chisholm calls this the Adverbial Theory. We are sensing in a way that, if taken at face value, gives us prima facie evidence for a false proposition, namely, that we are actually perceiving something that has this property; hence, we do not know that we are perceiving something to have this property.

Chisholm maintains that our empirical knowledge rests on a foundation of propositions expressed by true noncomparative appear statements which we are sufficiently justified in believing to know or, to use his technical term, which are evident. He further asserts that they have the highest level of epistemic justification, that is, they are certain. However, in saying that they are certain, Chisholm is not endorsing Descartes’ view that they are incorrigible, that is, that we can never be wrong in believing these propositions. Nonetheless, Chisholm agrees with Descartes that they have a special level of justification, that is, they are in some sense self-evident or, as he prefers, directly evident.

To explain this special epistemic status Chisholm appeals to Gottfried Wilhelm Leibniz’s notion of primary truths, which are of two types: primary truths of reason and primary truths of fact. A paradigm primary truth of reason is a mathematical truth like a triangle has three sides. Such truths are knowable, a priori, independently of experience, because the predicate of the statement, having three sides, is contained in the subject, triangle (Leibniz 1916, Book IV Chapter IX). Knowing primary truths of reason requires no proof, rather they are immediately obvious. Leibniz claims that similarly our knowledge of our own existence and our thoughts (the contents of our own minds) are primary truths of fact immediately known through experience, a posteriori. There is no difference between our knowing them, their being true, and our understanding or being aware of them.

Leibniz likens our immediately intuiting the truth of basic logical truths to our direct awareness of our psychological or phenomenological states at the time they occur. We are directly aware of both primary truths of reason and primary truths of fact because the truth of the propositions themselves is what justifies us in believing them. We reach the proper stopping point in Socratic questioning when we have reached a primary truth of fact, a proposition describing our psychological or phenomenological state when it occurs. Its truth (or the occurrence of the state) constitutes our immediate justification in believing, hence, knowing such propositions. There is no need to appeal to any other reason to justify our believing them; hence, they are directly evident.

In explaining the epistemic status of appearance, Chisholm appeals to Alexius Meinong’s observation that psychological or phenomenological states of affairs expressed by propositions of the form: ‘I think …’, ‘I believe …’, ‘I am appeared to …’, and so forth, are self-presenting in the sense that whenever these propositions are true, they are certain for the person (Chisholm 1989 pg. 19). Thus, when one is appeared to whitely (in the non-comparative sense of ‘appeared to’), one is justified in believing that one is so appeared to, and there is no proposition that the person is more justified in believing. On Chisholm’s view these self-presenting propositions, for example, about the way things (non-comparatively) appear, are paradigm examples of the directly evident, hence, serve as the foundation of empirical knowledge.

Chisholm explains that while all self-presenting propositions are directly evident, not all directly evident propositions are self-presenting. The directly evident, the foundation on which empirical knowledge rests, also contains propositions that are related to, but are not themselves, propositions that are self-presenting. Chisholm writes:

But isn’t it directly evident to me now both that I am thinking and that I do not see a dog? The answer is yes. But the concept of the directly evident is not the same as that of the self-presenting. (Chisholm 1977, pg. 23)

Thus, according to Chisholm, the foundation of empirical knowledge comprises a broader class of propositions that includes the self-presenting propositions, their logical implications, and the negations of self-presenting propositions. He uses the term directly evident to designate this class of foundational beliefs, which he defines as:

Def 2.2  h is directly evident for S =Df h is logically contingent; and there is an e such that (i) e is self-presenting for S, and (ii) necessarily, whoever accepts e accepts h. (Chisholm 1977 pp. 23-24)

Thus, for example, Descartes’ first foundational proposition, that I exist, is directly evident for me whenever I think, for this latter proposition is self-presenting for me, and (per Descartes’ insight) necessarily whoever accepts the proposition, that I think, accepts the proposition, that I exist.
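The structure of this example can be displayed against Def 2.2 directly (a schematic reconstruction, with e and h as in the definition, not Chisholm’s own notation):

```latex
% Def 2.2 applied to the cogito:
%   e = the proposition that I am thinking   (self-presenting for me)
%   h = the proposition that I exist         (logically contingent)
\begin{align*}
&\text{(i)}\quad e \text{ is self-presenting for } S\\
&\text{(ii)}\quad \Box\,(\forall x)(x \text{ accepts } e \;\rightarrow\; x \text{ accepts } h)\\
&\therefore\quad h \text{ is directly evident for } S. \tag{Def 2.2}
\end{align*}
```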

The class of propositions which are directly evident for a person includes propositions concerning a person’s occurrent beliefs and thoughts and propositions describing the way that things appear to a person. These latter propositions are expressed by noncomparative appear-statements of the form, ‘I am appeared F-ly to’. This class of propositions serves as the foundation of knowledge, the set of propositions in relation to which all other propositions are justified.

5. The Truths of Reason and A Priori Knowledge

Leibniz divided true propositions into two types: truths of reason and truths of fact. Truths of reason are necessarily true and their negation is impossible, while truths of fact are contingent and their negation is possible. Leibniz’s division is also based on the source of knowledge of propositions of each kind. We find out that a necessary proposition is true by analyzing it into simpler ideas or truths until we reach what he calls ‘primary truths’. He concludes that the source of knowledge of necessary truths is reason and can be known a priori, that is, independently of experience, while the source of knowledge of contingent truths is experience and can be known a posteriori.

The focus to this point has been on Chisholm’s views on the empirical foundation of knowledge or foundational knowledge a posteriori. Chisholm believes that some of our knowledge is based on necessary truths which are known a priori. He provides the following account of the justification of necessary truths, including logical truths, mathematical truths, and conceptual truths, explaining how some of these truths serve as evidence for empirical knowledge.

a. The Justification of A Priori Knowledge

Chisholm appeals to Leibniz, Frege (the late 19th and early 20th century German philosopher, logician, and mathematician), and Aristotle to explain the basis of a priori knowledge. Leibniz writes: “The immediate awareness of our existence and of our thoughts furnishes us with the first a posteriori truths of facts… while identical propositions”, [propositions of the form A=A], “embody the first a priori truths or truths of reason… Neither admits of proof, and each can be called immediate.” (Leibniz 1705, Book IV, Ch 9). The traditional term for Leibniz’s first truths of reason is ‘axioms’. Frege explains: “Since the time of antiquity an axiom has been taken to be a thought whose truth is known without being susceptible of proof by a logical train of reasoning” (Chisholm 1989 pg. 27). Chisholm explains the meaning of ‘incapable of proof’ by appealing to Aristotle’s suggestion in the Posterior Analytics that “[a]n axiom or ‘basic truth’… is a proposition ‘which has no proposition prior to it’; there is no other proposition which is ‘better known’ than it is” (Chisholm 1989 pg. 27).

Chisholm proposes that axioms are necessary propositions known a priori serving as foundational propositions. They are similar in the following respect to the self-presenting (directly evident) propositions about how we are appeared to, for example, that we are appeared redly to. When we are appeared to in a certain way, we are justified in believing the proposition that we are appeared to in this way; in Chisholm’s terminology, they are evident to us. We ‘immediately’ know about these mental states because they present themselves to us. Analogously, there are some necessary truths which are evident to us ‘immediately’ upon thinking about them.

Chisholm defines axioms, the epistemically foundational propositions, as follows:

D1 h is an axiom =Df h is necessarily such that (i) it is true, and (ii) for every S, if S accepts h, then h is certain for S. (Chisholm 1989 pg. 28)

His examples of axioms include:

If some men are Greeks, then some Greeks are men;
The sum of 5 and 3 is 8;
All squares are rectangles.

Notice that according to this definition, if a person accepts a proposition which is an axiom, the proposition is certain for that person. But being an axiom is not sufficient for the proposition’s being evident or justified for a person. The person may never have considered the proposition or, worse, may believe or accept that it is false, and hence cannot be justified in believing it at all. For an axiom to be certain or evident (justified) for a person, the person must also accept it.

Chisholm, therefore, adds the condition that the person accepts the proposition, defining a proposition’s being axiomatic for S as:

D2 h is axiomatic for S =Df (i) h is an axiom, and (ii) S accepts h. (Chisholm 1989 pg. 28).

Thus, for the proposition that all squares are rectangles to be axiomatic for a person requires not only that the proposition be an axiom, that is, necessarily true and necessarily such that if the person accepts it then it is certain for the person, but also that the person believes or accepts the proposition. Note that a proposition which is axiomatic for a person has the highest level of justification for the person, putting axiomatic propositions on a par, epistemically, with propositions that are directly evident.

Chisholm claims that the class of propositions that are axiomatic is the class of foundational propositions known a priori. There are also non-foundational propositions known a priori. For example, propositions that are implied by axioms may also be known a priori. However, it is not sufficient that the axiom implies the other proposition for the second proposition to be known a priori, as that would imply that all implications of axioms are also justified, whether the person is aware of the implications or not. Rather, it must also be axiomatic for the person that the axiom implies the other proposition. Suppose, for example, that it is axiomatic for a person that all interior angles of a rectangle are right angles, and also that it is axiomatic for that person that something’s being a square implies that it is a rectangle. In that case, the proposition that all the interior angles of a square are right angles is also known a priori for that person.
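Chisholm’s geometric example can be set out as an instance of D3 (a schematic sketch; the labels e and h follow the definition):

```latex
% D3 applied to the square/rectangle example:
%   e = every interior angle of a rectangle is a right angle
%   h = every interior angle of a square is a right angle
\begin{align*}
&\text{(i)}\quad e \text{ is axiomatic for } S\\
&\text{(ii)}\quad \text{the proposition } (e \Rightarrow h) \text{ is axiomatic for } S\\
&\qquad\text{(since it is axiomatic for } S \text{ that every square is a rectangle)}\\
&\text{(iii)}\quad S \text{ accepts } h\\
&\therefore\quad h \text{ is known a priori by } S. \tag{D3}
\end{align*}
```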

As it is axiomatic for every person that any proposition implies itself, axiomatic propositions are also known a priori. The theorems of logic or mathematics are also known a priori, as long as the person accepts the axiom and as long as it is axiomatic for that person that the axiom implies the theorem. Chisholm adds these additional propositions to the class of a priori knowledge by defining a priori knowledge as:

D3 h is known a priori by S =Df There is an e such that (i) e is axiomatic for S, (ii) the proposition, e implies h, is axiomatic for S, and (iii) S accepts h. (Chisholm 1989 pg. 29)

Chisholm defines a proposition’s being a priori as:

D4 h is a priori =Df It is possible that there is someone for whom h is known a priori.

(Chisholm 1989 pg. 31).

b. Chisholm, Kant, and the Synthetic A Priori

Kant distinguished two types of a priori propositions, analytic and synthetic. Roughly, an analytic proposition is one in which the predicate adds nothing new to the subject, for example, ‘all squares are rectangles’. It asserts that a component of the complex concept of the subject (square, that is, equilateral rectangle) is the concept of the predicate (rectangle). The underlying idea is that the concept of the subject can be analyzed in such a way that it includes the concept of the predicate. By contrast, synthetic propositions are propositions in which the predicate ascribes properties to the subject over and above what is contained in the concept of the subject, for example, the square is large.

It is generally thought that analytic propositions are not only necessarily true, but also a priori. However, as analytic propositions seem to be redundant and trivial, they appear to contribute little or no content to a person’s knowledge. This led Kant to raise the much-debated question of whether there are propositions which are synthetic and known a priori.

Chisholm argues that much of the debate concerning Kant’s question is based on a much broader concept of ‘analytic’ than the one which Kant had in mind. To clarify the epistemological importance of Kant’s question, Chisholm provides definitions of ‘analytic’ and ‘synthetic’. Underlying the concept of an analytic proposition are two further concepts: that of one property implying another, and that of two properties being conceptually equivalent. He defines the first as:

D5 The property of being F implies the property of being G =Df The property of being F is necessarily such that if something exemplifies it then something exemplifies the property of being G. (Chisholm 1989 pg. 33)

Thus, for example, the property of being a bachelor implies the property of being single: the property of being a bachelor is necessarily such that if something exemplifies it, then something exemplifies the property of being single. He then defines what it is for two properties to be conceptually equivalent as:

D6 P is conceptually equivalent to Q =Df Whoever conceives P conceives Q, and conversely. (Chisholm 1989 pg. 33)

For example, the property of being a bachelor is conceptually equivalent to being a single male, as anyone conceiving of or thinking of a bachelor conceives of a single male, and vice versa, anyone who conceives of a single male conceives of a bachelor.

Chisholm defines the concept of an ‘analytic proposition’ in terms of the foregoing concepts as follows:

D7 The proposition that all Fs are Gs is analytic =Df The property of being F is conceptually equivalent to a conjunction of two properties, P and Q, such that: (i) P does not imply Q, (ii) Q does not imply P, and (iii) the property of being G is conceptually equivalent to Q. (Chisholm 1989 pg. 34)

A proposition which is not analytic is synthetic, as per the following definition:

D8 The proposition that all Fs are Gs is synthetic =Df The proposition that all Fs are Gs is not analytic. (Chisholm 1989 pg. 34)
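A worked instance, using the bachelor example already in play and assuming the usual analysis of bachelor as single male, shows how D7 classifies a proposition as analytic:

```latex
% D7 applied to "All bachelors are single":
%   F = bachelor, G = single;  P = male, Q = single
\begin{align*}
&\text{being a bachelor} \;\equiv_c\; \text{being male} \wedge \text{being single}\\
&\text{(i)}\quad \text{being male does not imply being single}\\
&\text{(ii)}\quad \text{being single does not imply being male}\\
&\text{(iii)}\quad \text{being single } (G) \;\equiv_c\; Q\\
&\therefore\quad \text{``All bachelors are single'' is analytic.} \tag{D7}
\end{align*}
```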

Chisholm’s definitions clarify the philosophical importance of Kant’s question, which is whether a synthetic proposition—a proposition in which the predicate cannot be found in the analysis of the subject, that is, a proposition that is not redundant and adds content to the subject—can be known to be true a priori. Finding such a proposition implies that “the kind of cognition that can be attributed to reason alone may be more significant” (Chisholm 1989 pg. 34).

Chisholm suggests that there are four types of examples of synthetic a priori propositions. Examples of the first type are the propositions expressed by the following sorts of sentences: “Everything that is a square is a thing that has shape” and “Everyone who hears something in C-sharp major hears a sound”. Some have claimed that the property of being a square is conceptually equivalent to the conjunction of having a shape and some additional properties, which would make the first proposition analytic. The second possible type of synthetic a priori propositions are what Leibniz refers to as ‘disparates’. An example of such a proposition is ‘Nothing that is red is blue’. Chisholm notes that while attempts have been made to show propositions of these two types to be analytic, none has been successful.

The third possible type of synthetic a priori propositions are statements of moral judgments like the one expressed by the following sort of sentence: “All pleasures, as such, are intrinsically good, or good in themselves, whenever and wherever they may occur”. Chisholm concurs with Leibniz’s assertion that while such propositions can be known, no experience of the senses could serve as evidence in their favor.

The final possible type of synthetic a priori propositions are propositions of arithmetic. Kant asserted that propositions like 2 + 1 = 3 are not analytic and hence are synthetic. Some might question whether such propositions are of the right form, that is, all Fs are Gs, but there may be a way to reformulate them into that form. While the principles of arithmetic have been analyzed in terms of sets, this has not been done in such a way that the predicate can be analyzed out of the subject, in which case they have yet to be shown to be analytic.

While various epistemic principles to account for the justification of certain types of propositions are discussed in this section, more of these epistemic principles are examined in the next one. Chisholm’s use of these principles raises meta-epistemological questions related to the status of these principles and the nature of epistemology itself. Are the epistemic principles necessary or contingent? Can they be known a priori or a posteriori? And are they analytic or synthetic? While Chisholm’s answers to these questions are not clear, it would not be surprising if he thought that they are synthetic a priori necessary truths.

6. The Indirectly Evident

Chisholm’s account of the Directly Evident explains and defends one of the two main theses of foundationalism. It identifies a set of propositions describing one’s psychological or phenomenological states, the way things appear or seem, as being self-presenting. These propositions and some of their logical consequences are epistemic certainties composing the directly evident foundation of empirical knowledge. The second main thesis of foundationalism is that these directly evident propositions ultimately serve, at least in part, as the justification of all empirical knowledge. To complete his theory of knowledge, Chisholm undertakes to explain how empirical propositions which are indirectly evident are justified. In the process he undertakes to solve the problem of empiricism that has plagued epistemology since Descartes.

Chisholm’s account of the Indirectly Evident proposes to answer the fundamental question of how propositions about the external world are justified by propositions about one’s internal psychological experiences or states, solving what he calls the problem of empiricism. This problem finds its roots in Descartes and was inherited by his British Empiricist successors, Locke, Berkeley, and Hume.

Descartes proposed to solve this problem by employing deductive logic. Starting from his discovered foundation of certainties, composed of propositions about the contents of his mind and some necessary truths that could be proven with certainty, he sets out to prove the truth of propositions about the external world. Following his geometric method of proof, he derives, through deductive reasoning, the existence of the external world from the empirical certainties about the contents of his mind and some other necessary truths.

Descartes argues that, for example, when he is having a certain sort of visual experience, he is certain of the proposition that I seem to see something red. From this certainty and some additional logical certainties, among them the proposition that God exists and is not a deceiver (for which he provides deductive proofs), he derives, using deductive reasoning, propositions about the external world, for example, that there is something red. In this manner, Descartes purports to have built knowledge about the external world from foundational certainties using only deductive reasoning, a method which cannot go wrong. This epistemological program, its methodology, and its associated philosophy of mind earned Descartes, and his European continental successors Baruch Spinoza and Gottfried Wilhelm Leibniz, the title of Continental Rationalists.

John Locke, the progenitor of British Empiricism, claims that all knowledge of the external world is based on experience. He argues that Descartes’ demonstrations are flawed, claiming that knowledge of the external world cannot be justified by applying deductive reasoning to the foundational propositions. Locke argues that a fundamental mistake in Descartes’ program is setting the standard of certainty for knowledge and epistemic justification too high. On Locke’s view, avoiding skepticism requires only probability of truth, rather than certainty, to account for the transfer of justified belief from the contents of the mind to propositions about the external world.

To account for the transfer of justification, Locke appeals to his empiricist philosophy of mind, according to which the mind is a blank slate, a tabula rasa, upon which sense-data are deposited. The data provided by the senses are the source of knowledge about the world. Knowledge of the external world is ultimately justified by the experience of the senses. Locke, allowing for fallible epistemic justification, claims that one can be completely justified in believing a proposition that is not entailed by the evidence or reasons that one has for believing it.

Locke claims that the move from one’s sensing something to be red to the proposition that there is something that is red is justifiable because of the resemblance between the contents of one’s mind (sensations and ideas) and the objects in the external world which cause these sensations and ideas. Thus, for example, we are justified in believing the proposition that I am actually seeing something red because our idea or mental representation of the red thing resembles the object which caused it. Thus, according to Locke, a proposition about the external world is justified only if the mental representation resembles the corresponding physical object.

George Berkeley, Locke’s successor, finds Locke’s justification of beliefs about physical objects by their resemblance to the contents of one’s mind problematic. The inference is justifiable if, and only if, we have reason to think that the external world resembles our ideas, and to be justified in believing that there is a resemblance one would have to be able to compare the ideas with the physical objects. However, we cannot possibly compare the two. On Locke’s view we only have epistemic access to our ideas; we can never get “outside our minds” to observe a physical object and compare it to our mental image of it, and thus we can have no reason to think that the one resembles the other. Berkeley concludes that Locke’s view entails skepticism about the external world, that is, the view that no empirical beliefs about the external world could ever be justified.

To avoid these skeptical consequences while maintaining that knowledge of the external world is based on sense-data, Berkeley advocates phenomenalism, the view captured by his slogan “esse est percipi” (“to be is to be perceived”). (Berkeley 1710, Part I Section 3). Berkeley’s phenomenalism claims that physical objects are made up of sense-data which are the permanent possibility of sensation. There are no physical objects over and above sense-data. Propositions about the physical world are to be reduced to propositions about mental experiences of perceivers, that is, phenomenological propositions.

Berkeley’s position can be clarified with an example. The proposition that the ball is red may be analyzed in terms of (and thus entails) propositions about perceivers’ sensations, for example, that the ball appears red, spherical, and so forth. Common sense propositions about the physical objects composing the external world are justified on the basis of an inductive inference from the propositions, entailed by the external world proposition, describing how things appear to perceivers. The ball’s being red is confirmed by the phenomenal or mental sensations of red that perceivers have: one’s having certain red, spherical sense-data confirms, via induction, the proposition that the ball is red, which entails those sense-data.

The objection that Berkeley raises to Locke’s theory is a problem endemic to empiricism. It leaves open a gap that needs to be bridged to account for knowledge on the basis of the evidence of the senses: how do sense-data, that is, propositions about one’s own mental states, justify one’s beliefs in propositions about objects in the external world? Berkeley avoids the problem of explaining why one’s sensations should resemble real physical objects by adopting phenomenalism. In Berkeley’s metaphysics, Idealism, there are no non-phenomenal entities; physical objects are just the permanent possibility of sensation. The difference between veridical and non-veridical sense-data is that the veridical perceptions are sustained by God’s continuously having these sense-data in His mind. Thus, for Berkeley, the meaning of statements about physical objects may be captured by statements referring only to sense experience; physical objects just are reducible to sense-data. Berkeley proposes that God’s perceptions “hold” the physical universe together.

David Hume, the next luminary of British Empiricism, finds the dependence of epistemic justification on God’s perceptions unacceptable. Hume’s answer to the explanatory gap in Locke’s theory is that we naturally make the connection. One way of understanding Hume is by pointing to the fact that he adopts the view that in the 20th century would become known as naturalized epistemology, which relegates the explanation of the inference to science. Others have understood Hume as embracing skepticism with respect to the external world. Thomas Reid, Hume’s contemporary, invokes common sense to explain the inference. He argues that we have as good a reason to think that these common sense inferences are sufficient justification for knowledge as we have for thinking that deductive reasoning is sufficient justification for the derivation of knowledge.

Bertrand Russell rejects Berkeley’s view that there are really no physical objects, only bundles of perceptions. Russell accounts for perceptual knowledge by claiming that we have direct access to sense-data, and these sense-data serve as the basis of empirical knowledge. Russell, in The Problems of Philosophy, admits that “in fact ‘knowledge’ is not a precise conception: it merges into ‘probable opinion’.” (Russell 1912 pg. 134).

C. I. Lewis (Lewis 1946) proposes a pragmatic version of phenomenalism to bridge the explanatory epistemological gap between sense-data and the external world. Lewis agrees with the British Empiricists that sense-data are what is ‘directly’ experienced and, moreover, serve as the given element in empirical knowledge. Berkeley’s phenomenalism attempts to bridge the explanatory gap with metaphysical Idealism; Lewis agrees with Russell that this is problematic and proposes a version of phenomenalism that is compatible with metaphysical Realism. On this view, external world propositions entail an infinite number of conditional propositions stating that if one initiates a certain type of action (a test of the empirical proposition in question), then one experiences a certain type of sense-data. Thus, the explanatory gap is bridged by the rules of inductive inference.

Lewis’s example helps to explain his phenomenalistic account. Consider the external world proposition that there is a doorknob. Lewis claims that this proposition logically entails an unlimited number of conditional propositions expressing tests that could be undertaken to confirm that there really is a doorknob causing the experience. One such conditional would be: “if I were to appear to reach out in a certain way, I would have a sensation of grasping a doorknob-shaped object.” One’s undertaking to appear to reach out and grab the doorknob, followed by the tactile experience of doorknob sense-data, provides confirmation, hence pragmatic justification, for believing the proposition that there is a doorknob. Thus, according to Lewis, the justification of empirical beliefs is based on an inductive inference, having confirmed sufficiently many such tests.

a. Chisholm’s Solution to the Problem of Empiricism

Chisholm enters the fray arguing (Chisholm 1948) that Lewis’s view is defective and thus fails to solve the problem of empiricism, that is, it cannot account for the inference from the mind to the external world. According to Lewis’s phenomenalism, statements about the external world, for example (call it P):

This is red;

entail propositions only referring to mental entities, sense-data, for example (call it R):

Redness will appear.

Chisholm argues that certain facts about perceptual relativity demonstrate that physical world propositions like P do not entail any propositions which refer exclusively to mental entities like R.

Chisholm explains that P, when conjoined with:

This is observed under normal conditions; and if this is red and observed under normal conditions, redness will appear;

entails R, that redness will appear. But P in conjunction with the following proposition (call it S):

This is observed under normal conditions except for the presence of blue lights; and if this is red and observed under conditions which are normal except for the presence of blue lights, redness will not appear;

entails that R is false. If P conjoined with some other proposition S that is consistent with P entails that R is false, then P cannot entail R. Perceptual relativity, the way that things appear in any circumstance being relative to the conditions under which the object is observed, makes it clear that S is consistent with P. However, P and S entail that redness will not appear, for red things do not appear red under blue light. Therefore, P does not entail R. Similarly, no physical object statement (like P) entails any proposition that is only about sense-data (like R). Chisholm concludes that Lewis’s phenomenalism is untenable.
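The logical skeleton of this refutation can be set out semantically (reading the turnstile as entailment; this is a reconstruction, not Chisholm’s own notation):

```latex
% Chisholm's perceptual-relativity argument against phenomenalist entailment
\begin{align*}
&1.\quad P \wedge S \models \neg R
  &&\text{(red things under blue light do not appear red)}\\
&2.\quad P \wedge S \text{ is consistent}
  &&\text{(perceptual relativity)}\\
&3.\quad \text{Suppose, for reductio, } P \models R.\\
&4.\quad \text{Any model of } P \wedge S \text{ satisfies } R \text{ (by 3) and } \neg R \text{ (by 1),}\\
&\qquad\text{contradicting 2.}\\
&5.\quad \therefore\; P \not\models R.
\end{align*}
```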

The problem of empiricism, justifying beliefs about physical objects on the basis of perception, which Chisholm embarks on solving, requires a plausible account of perceptual knowledge. The Lockean account claims that sense-data (mental ideas) justify or make evident beliefs about physical objects because the sense-data resemble the physical objects. However, on Locke’s account we can never “get outside” the mind to observe the resemblance between the sense-data and the physical objects causing them. Thus, it fails to provide the requisite reason for thinking that physical objects resemble the mental images. Phenomenalism proposes to avoid this problem by claiming that propositions about physical objects entail purely psychological propositions about apparent experience, sense-data. Thus, inductive reasoning (or the hypothetico-deductive method) from the psychological propositions could provide justification for the physical propositions that entail them, thereby justifying beliefs about the physical object.

But, Chisholm argues, propositions about physical objects do not entail any purely psychological propositions, that is, phenomenalism is false. Thus, if empiricism is to be salvaged, there must be another account or explanation of how propositions about appearances provide evidence that justifies beliefs about the physical world. Chisholm answers this problem with his account of the Indirectly Evident.

b. Epistemic Principles—Perception and Memory, Confirmation and Concurrence

Chisholm adopts three methodological assumptions in developing his epistemic principles or rules of evidence to account for empirical knowledge:

  • We know, more or less, what (upon reflection) we think we know (an anti-skeptical assumption);
  • On occasions when we do have knowledge, we can know what justifies us in believing what we know;
  • There are general principles of evidence that can be formulated to capture the conditions which must be satisfied by the things that we know.

Carneades of Cyrene (ca. 213-129 B.C.E.), the Ancient Greek Academic Skeptic who was a successor to Plato as the leader of the Academy, developed three skeptical epistemic principles to explain the justification that perception provides for beliefs about the external world. On Carneades’ view, while ‘perceiving’ something to be a cat is not sufficient for knowing that there is a cat, nonetheless, it makes the proposition that there is a cat acceptable. While disagreeing with Carneades’ skeptical conclusion, Chisholm thinks that Carneades’ approach to developing epistemic principles is correct.

Chisholm formulates Carneades’ first epistemic principle related to perception as:

C1. “Having a perception of something being F tends to make acceptable the proposition that something is an F.” (Chisholm 1977 pg. 68).

Having identified these acceptable propositions, Carneades notes that the set of perceptual propositions that are “hanging together like the links of a chain”, and which are “uncontradicted and concurring” (Chisholm 1977 pg. 69), including the perceptions of color, shape, and size, are also acceptable. The concurrence of these propositions makes all of them acceptable, hence his second principle:

C2. Confirmed propositions that concur and do not contradict each other are more reasonable than those that do not.

Finally, the acceptable propositions that remain after close scrutiny and testing are even more reasonable than merely acceptable. This is captured in a third epistemic principle:

C3. “Concurrent propositions that survive ‘close scrutiny and test’ are more reasonable than those that do not.” (Chisholm 1977 pg. 70).

Pyrrho, Carneades’ Ancient Greek skeptic predecessor, believed that our common sense beliefs about the world were completely irrational and that it was therefore irrational to act upon them. This extreme form of skepticism is, on its face, unacceptable to most; for example, people in general do not try to walk off cliffs, nor do they think that doing so would be rational. Carneades’ approach to epistemology, and his development of principles explaining how our beliefs can be rational or acceptable without amounting to knowledge, by contrast, seems intuitively plausible. Thus, Chisholm adopts Carneades’ common sense approach in developing his account of the indirectly evident.

Chisholm points out that there are three possible ways in which the indirectly evident may be justified:

  1. By the relationship they bear to what is directly evident;
  2. By the relationship they bear to each other; and
  3. By their own nature, independent of any relationship they bear to anything else.

He notes that theories that account for the justification of beliefs in the first way are considered foundationalist, and that those that account for the justification of beliefs in the second way are versions of coherentism, or coherence theories. Chisholm claims that the indirectly evident is justified in all three ways. This is consistent with his view that focusing on the ‘isms’ is not conducive to solving philosophical problems.

Chisholm claims that just about every indirectly evident proposition is justified, at least in part, by some relationship it bears to the foundation of empirical knowledge, that is, to the directly evident. The foundation cannot serve to justify anything else unless we are justified in believing the foundational propositions themselves. To remind us of how the directly evident is justified, Chisholm presents the following epistemic principle:

(A) S’s being F is such that, if it occurs, then it is self-presenting to S that he is F. (Chisholm 1977 pg. 73).

The epistemic principles are formulated as schemata, that is, abbreviations for infinitely many principles. In (A), ‘F’ may be replaced by any predicate that would be picked from “a list of various predicates, each of such a sort as to yield a description of a self-presenting state of S.” (Chisholm 1977 pg. 73). This principle asserts that, for example, if I am appeared to redly, then it is self-presenting to me that I am appeared to redly. Moreover, whenever I am in such a state (for example, the state of being appeared to redly), the proposition that I am appeared to redly is evident, indeed directly evident, for me.

Chisholm’s next two epistemic principles concern perception. These principles are intended to show how the indirectly evident is justified by the combination of their relationship to the foundation and by their own nature, that is, the nature of perception and memory. Some clarification of the terminology used in these principles will aid in understanding the theory. Chisholm uses ‘believes he/she perceives that’ to assert a relationship between a person and a proposition, for example, that Jones believes that she perceives that there is a cat. This can be true even when it is false that there is a cat in front of her, for example, even when she is hallucinating. An alternative way Chisholm sometimes expresses this is as ‘Jones takes there to be a cat’. Chisholm prefers to use ‘believes that she perceives’ in place of ‘takes’ as it makes the ‘that’-clause explicit, noting that we will assume that ‘believes that she perceives’ means simply that the person has a spontaneous nonreflective experience, one that she would normally express by saying, “I perceive that…”.

Chisholm observes that, to account for the justification of perceptual propositions, one might be inclined (as he was in the first edition of Theory of Knowledge) to formulate a principle along the lines of Carneades’ first epistemic principle: that if a person believes that he perceives that there is a sheep, then the person is justified in believing that there is a sheep.

However, such a principle must be qualified because of cases like the following. Suppose that a person seems to see an animal that looks like a sheep, but also knows that (i) there are no sheep in the area, and (ii) many dogs in the area look like sheep. Contrary to what this principle implies, the person is not justified in believing that it is a sheep (but rather a dog).

Chisholm qualifies this principle to exclude cases like this one, in which the person has ground for doubting the proposition in question. To state the qualification he defines a term, ground for doubt, which in turn depends on there being no conjunction of propositions, each acceptable for the person, that tends to confirm that the proposition in question is false.

Chisholm defines the requisite notion of confirmation as

D4.1 e tends to confirm h =Df Necessarily, for every S, if e is evident for S and if everything that is evident for S is entailed by e, then h has some presumption in its favor for S.

He explains that confirmation is both a logical and an epistemic relation. If it obtains between two propositions, that is, if e confirms h, then it obtains between them necessarily (it is a matter of logic that it obtains). Furthermore, if e confirms h, and one knew that e was true, one would thereby have reason for thinking that h was true. Chisholm cautions that from the fact that e confirms h, it does not follow that the conjunction of e and another proposition, g, also confirms h. What we assert in saying that e confirms h may also be expressed by saying that h has a certain (high) probability in relation to e.

Armed with this notion of confirmation, he now defines what it is for something to be believed without ground for doubt as:

D4.3 S believes, without ground for doubt, that p =Df (i) S believes that p, and (ii) no conjunction of propositions that are acceptable for S tends to confirm the negation of the proposition that p. (Chisholm 1977 pg. 76)

Chisholm qualifies the epistemic principle under consideration as:

(B) For any subject S, if S believes without ground for doubt that he perceives something to be F, then it is beyond reasonable doubt for S that he perceives something to be F. (Chisholm 1977 pg. 76).

While this principle justifies beliefs about the external world to a higher degree than did Carneades’ principle, it falls short of rendering them sufficiently justified for knowledge. Chisholm’s reason for this is that ‘F’ in the schema can be replaced by a property like being a sheep, and, as the sheep/dog case shows, the person may have mistakenly classified the object he is looking at.

Chisholm proposes a third epistemic principle, which yields knowledge-level justification of propositions by restricting the properties that one perceives the object to have to the ‘proper objects’ of perception. These are sensible characteristics: visual characteristics like colors and shapes, auditory characteristics like loud and soft, tactile characteristics like smooth and rough, olfactory characteristics like spicy and burnt, gustatory characteristics like salty and sweet, and ‘common sensibles’ like movement and number. This principle states:

(C) For any subject S and any sensible characteristic F, if S believes, without ground for doubt, that he is perceiving something to be F, then it is evident for S that he perceives something to be F. (Chisholm 1977 pg. 78)

This principle accounts for justifying the indirectly evident on the basis of the directly evident. Consider how it works in a case in which Jones is looking at a red object. Assume that she is appeared to redly (the object appears red to her) and that she has no evidence that would count against her actually perceiving something red (that is, nothing acceptable to her tends to confirm that she is not actually perceiving something to be red). In such a case it is evident to her both that (i) she actually perceives something to be red, and that (ii) there is something red. Moreover, assuming that she believes those propositions, she knows them.

These epistemic principles account only for the justification of our beliefs concerning what we are perceiving at any given moment. To account for the justification of beliefs about the past, Chisholm proposes epistemic principles concerning memory. We should note that “‘memory’ presents us with a terminological difficulty analogous to that presented by ‘perception’.” (Chisholm 1977 pg. 79). Chisholm proposes that the expression ‘believes that he remembers’ be used in a way analogous to the way he uses ‘believes that he perceives’; that is, ‘S believes that he remembers that p’ implies neither the truth nor the falsity of p.

Chisholm notes that “[s]ince both our memory and perception can play us false, we run a twofold risk when we appeal to the memory of a perception.” (Chisholm 1977 pg. 79). If we are justified in believing a proposition based on our seeming to remember that we perceived it to be true, we can go wrong in two ways: our perception or our memory (or possibly both) may mislead us. For this reason, Chisholm formulates principles for remembering that we perceived, principles along the same lines as those concerning perception, but ones that take into account that the evidence provided by the memory of a perception is weaker than that provided by perception itself. The principles about memory of perceptions, therefore, justify propositions only to a lower epistemic level than do those of perception. Corresponding to (B), Chisholm proposes:

(D) For any subject S, if S believes, without ground for doubt, that he remembers perceiving something to be F, then the proposition that he does remember perceiving something to be F is one that is acceptable for S. (Chisholm 1977 pg. 80).

Restricting the range of ‘F’ to sensible predicates, he presents the following analogue to (C):

(E)     For any subject S, if S believes, without ground for doubt, that he remembers perceiving something to be F, then it is beyond a reasonable doubt for S that he does remember perceiving something to be F. (Chisholm 1977 pg. 80).

Chisholm points out that if our memory of perceptions is reasonable, then our memory of self-presenting states must be reasonable as well, and thus proposes:

(F)     For any subject S and any self-presenting property F, if S believes, without ground for doubt, that he remembers being F, then it is beyond a reasonable doubt for S that he remembers that he was F. (Chisholm 1977 pg. 81).

Although Chisholm has now explained the justification of some empirical beliefs, the epistemic principles presented thus far do not account for knowledge of ordinary common-sense propositions (for example, ‘There is a cat on the roof’). The principles proposed to this point account for these propositions being acceptable or beyond reasonable doubt, but not for their being evident, that is, justified to the level required for knowledge.

Chisholm appeals to the coherence of propositions about memory and perception, that is, their being mutually confirming and concurring, to explain how beliefs can be justified to the level required for knowledge. A proposition justified to a certain level is also justified to all lower levels; for example, an evident proposition is also beyond a reasonable doubt, acceptable, and has some presumption in its favor. Some propositions justified according to the epistemic principles of perception and memory are evident, others beyond reasonable doubt, and others merely acceptable, but all propositions justified by these principles are at least acceptable.

Chisholm proposes that if the conjunction of all propositions that are acceptable for someone tends to confirm another proposition, then this latter proposition has some presumption in its favor, that is:

(G)    If the conjunction of all those propositions e, such that e is acceptable for S at t tends to confirm h, then h has some presumption in its favor for S at t. (Chisholm 1977 pp. 82-83).

He then defines the concept of a concurrent set of propositions, which may be thought of as coherence of propositions or beliefs, as follows:

D4.4   A is a set of concurrent propositions =Df A is a set of two or more propositions each of which is such that the conjunction of all the others tends to confirm it and is logically independent of it. (Chisholm 1977 pg. 83).

He proposes the following, somewhat bold, epistemic principle describing how concurring propositions raise their level of epistemic justification:

(H)      Any set of concurring propositions, each of which has some presumption in its favor for S, is such that each of its members is beyond reasonable doubt for S.

To explain how this principle works, Chisholm asks us to consider the following propositions:

    1. There is a cat on the roof today;
    2. There was a cat on the roof yesterday;
    3. There was a cat on the roof the day before yesterday;
    4. There is a cat on the roof almost every day.

He asks us to suppose that (1) expresses a perceptual belief and, therefore, is beyond reasonable doubt; that (2) and (3) express what one seems to remember perceiving and, thus, are acceptable; and that (4) is confirmed by the conjunction of these acceptable statements. As this set of propositions is concurrent, each is beyond reasonable doubt (according to (H)).
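The interaction of levels in this example can be made vivid with a small sketch. The following is only a toy model, not Chisholm’s own formalism: the representation of levels, the function names, and the stand-in `confirms` relation are all illustrative assumptions.

```python
# Toy model of Chisholm's epistemic levels and principle (H).
# Illustrative only: level ordering, names, and the stand-in
# `confirms` relation are assumptions, not Chisholm's formalism.

# Levels of positive epistemic status, weakest to strongest.
LEVELS = ["some presumption", "acceptable", "beyond reasonable doubt", "evident"]

def at_least(level, required):
    """A proposition justified to a level is justified to all lower levels."""
    return LEVELS.index(level) >= LEVELS.index(required)

def is_concurrent(props, confirms):
    """D4.4 (simplified): each member is confirmed by the conjunction of all
    the others. (Chisholm also requires logical independence, ignored here.)"""
    return len(props) >= 2 and all(
        confirms([q for q in props if q != p], p) for p in props
    )

def apply_H(levels, confirms):
    """Principle (H): each member of a concurrent set whose members all have
    some presumption in their favor becomes beyond reasonable doubt
    (propositions already justified more highly are left untouched)."""
    props = list(levels)
    if is_concurrent(props, confirms) and all(
        at_least(levels[p], "some presumption") for p in props
    ):
        for p in props:
            if not at_least(levels[p], "beyond reasonable doubt"):
                levels[p] = "beyond reasonable doubt"
    return levels

# The cat-on-the-roof example: (1) is perceptual, (2)-(3) are remembered
# perceptions, (4) is confirmed by the conjunction of (2) and (3).
levels = {
    "cat on the roof today": "beyond reasonable doubt",      # (1)
    "cat on the roof yesterday": "acceptable",               # (2)
    "cat on the roof the day before": "acceptable",          # (3)
    "cat on the roof almost every day": "some presumption",  # (4)
}

def mutually_confirming(rest, p):
    # Assume, with Chisholm, that these four propositions concur.
    return True

levels = apply_H(levels, mutually_confirming)
# Each of the four propositions is now (at least) beyond reasonable doubt.
```

On this toy picture, principle (I) would then let the perceptual member of the set rise from beyond reasonable doubt to evident; modelling that further step would require representing ‘believes without ground for doubt’, which the sketch omits.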

Chisholm completes his theory of evidence by accounting for knowledge gained via perception with the following principle:

(I)    If S believes, without ground for doubt, that he perceives something to be F, and if the proposition that there is something F is a member of a set of concurrent propositions each of which is beyond reasonable doubt for S, then it is evident for S that he perceives something to be F. (Chisholm 1977 pg. 84).

This completes the account that Chisholm promised of how the indirectly evident may be justified in three ways. The first is, at least partially, by its relationship to the foundation: the epistemic principles accounting for the justification of propositions based on perception and memory depend in part on the self-presenting states of being appeared to in a certain way, that is, on seeming to perceive and seeming to remember perceiving. The second is by their own nature: perception and memory provide prima facie evidence that they represent the way things are in the world. The third is by the coherence, that is, the mutual confirmation and concurrence, of propositions that are justified to a level lower than that required for knowledge. Chisholm’s account is thus not only a version of foundationalism but also incorporates elements of the coherence theory.

Chisholm admits that these epistemic principles provide, at best, an outline of a full-blown account of our knowledge of the external world, and a very rough one at that. They do not give an account of our knowledge of many of the very common things that we know about the past and about the complex world that we live in. However, this was all that Chisholm promised to provide. Moreover, it remains one of the most complete and comprehensive accounts of empirical knowledge yet developed.

7. Postscript

This presentation of Chisholm’s epistemology is largely based on the version of his theory from the second edition of his book Theory of Knowledge (Chisholm 1977). Chisholm continuously revised and improved his theory based on counterexamples and objections raised by his colleagues and students. His subsequent works provided his answers to the Gettier Problem, as well as a more detailed account of the epistemic principles accounting for the Indirectly Evident. In 1989 Chisholm published his third, and what became his final, edition of his Theory of Knowledge. This article is intended to introduce Chisholm’s theory of knowledge to lay the groundwork for the reader to undertake a detailed examination of the final version of his theory. It is left as an exercise to the reader to decide whether Chisholm’s principles do what he intends them to do.

8. References and Further Reading

  • Ayer, A. J., 1940. The Foundations of Empirical Knowledge, New York, NY: St. Martins Press.
  • Berkeley, George, 1710. The Principles of Human Knowledge, (http://www.earlymoderntexts.com/assets/pdfs/berkeley1710.pdf)
  • Chisholm Roderick, 1942, “The Problem of the Speckled Hen,” Mind 51(204): 368-373.
  • Chisholm, Roderick, 1946, Monograph: “Theory of Knowledge” in Roderick M. Chisholm, Herbert Feigl, William H. Frankena, John Passmore, and Manley Thomson (eds.), Philosophy: The Princeton Studies: Humanistic Scholarship in America, 233-344, reprinted in Chisholm 1982.
  • Chisholm, Roderick, 1948. “The Problem of Empiricism,” The Journal of Philosophy 45, 512-517.
  • Chisholm, Roderick, 1957. Perceiving: A Philosophical Study, Ithaca: Cornell University Press.
  • Chisholm, Roderick, 1965. “‘Appear’, ‘Take’, and ‘Know’”, reprinted in Robert J. Swartz (ed.), Perceiving, Sensing, and Knowing, (University of California, Berkeley, California, 1965).
  • Chisholm, Roderick, 1966. Theory of Knowledge, Englewood Cliffs, NJ: Prentice-Hall.
  • Chisholm, Roderick, 1977. Theory of Knowledge, 2nd edition. Englewood Cliffs, NJ: Prentice-Hall.
  • Chisholm, Roderick, 1979. “The Directly Evident,” in George Pappas (ed.), Justification and Knowledge, Dordrecht, D. Reidel Publishing Co.
  • Chisholm, Roderick, 1982. The Foundations of Knowing, Minneapolis: University of Minnesota Press.
  • Chisholm, Roderick, 1989. Theory of Knowledge, 3rd edition. Englewood Cliffs, NJ: Prentice-Hall.
  • Chudnoff, Elijah, 2021. Forming Impressions: Expertise in Perception and Intuition, Oxford, Oxford University Press.
  • Clifford, W. K., 1877. “The Ethics of Belief”, Contemporary Review (177), reprinted in Clifford’s Lectures and Essays (London, MacMillan, 1879).
  • Conee, Earl and Feldman, Richard, 2011. Evidentialism: Essays in Epistemology, Oxford, Oxford University Press.
  • Descartes, Rene, 1641, Meditations on First Philosophy (Third Edition), edited by Donald J. Cress, Indianapolis, IN, Hackett Publishing Company, 1993.
  • Feldman, Richard, 1974. “An Alleged Defect in Gettier-Counterexamples,” Australasian Journal of Philosophy 52(1), 68-69.
  • Feldman, Richard, 2003. Epistemology, Upper Saddle River, NJ, Prentice Hall.
  • Foley, R., 1997. “Chisholm’s Epistemic Principles,” in Hahn 1997, pp. 241–264.
  • Gettier, Edmund, 1963. “Is Justified True Belief Knowledge?” Analysis 23, 121-123.
  • Goldman, Alvin, 1967. “A Causal Theory of Knowing.” Journal of Philosophy, 64, 357-372.
  • Hahn, Lewis Edwin 1997. The Philosophy of Roderick M. Chisholm, (The Library of Living Philosophers: Volume 25), Lewis Edwin Hahn (ed.), Chicago, La Salle: Open Court.
  • Heidelberger, H., 1969. “Chisholm’s Epistemic Principles,” Noûs, 3(1), 73–82.
  • Hume, David, A Treatise of Human Nature, (London, Oxford Clarendon Press, 1973).
  • Hume, David, Enquiries Concerning Human Understanding and Concerning the Principles of Morals (3rd Edition), (London, Oxford Clarendon Press, 1976).
  • James, William, 1896. “The Will to Believe”, in The Will to Believe, and Other Essays in Popular Philosophy, and Human Immortality, New York, Dover Publications, 1960.
  • Kyburg, Henry E. Jr., 1970. “On a Certain Form of Philosophical Argument,” American Philosophical Quarterly, 7,  229-237.
  • Legum, Richard A. 1980. “Probability and Foundationalism: Another Look at the Lewis-Reichenbach Debate.” Philosophical Studies 38(4), 419–425.
  • Lehrer, Keith, 1997, “The Quest for the Evident” in Hahn 1997, pp. 387–401.
  • Leibniz, Gottfried Wilhelm, 1705, New Essays Concerning Human Understanding, (LaSalle, Il: Open Court Publishing Company, 1916), http://www.earlymoderntexts.com/assets/pdfs/leibniz1705book4.pdf.
  • Lewis, C.I., 1929. Mind and the World Order. New York, Dover Publications.
  • Lewis, C.I., 1946. An Analysis of Knowledge and Valuation. La Salle, IL, Open Court.
  • Lewis, C.I., 1952. “The Given Element in Empirical Knowledge.” The Philosophical Review 61, 168-175.
  • Locke, John, 1690, An Essay Concerning Human Understanding, edited by Peter H. Niddich (Clarendon Press; 1st Clarendon edition (2 Aug. 1979)).
  • Mavrodes George, 1973. “James and Clifford on ‘The Will to Believe’,” in Keith Yandell (ed.), God, Man, and Religion (McGraw-Hill, New York, 1973).
  • Pollock, J and Cruz, J. 1999. Contemporary Theories of Knowledge, 2nd edition. New York, Rowman & Littlefield.
  • Pryor, J. 2001. “Highlights of Recent Epistemology,” The British Journal for the Philosophy of Science 52, 95-124. (Stresses that modest foundationalism looks better in 2001 than it looked ca. 1976).
  • Quine. W.V.O. 1951. “Two Dogmas of Empiricism.” The Philosophical Review 60, 20-43.
  • Reichenbach, Hans, 1952. “Are Phenomenal Reports Absolutely Certain?” The Philosophical Review 61, 147-159.
  • Russell, Bertrand, 1912, The Problems of Philosophy, Hackett Publishing Company, Indianapolis.
  • Sellars, Wilfrid, 1956, “Empiricism and the Philosophy of Mind,” in H. Feigl and M. Scriven (eds.), Minnesota Studies in the Philosophy of Science, Vol. I, (Minneapolis: University of Minnesota Press, 1956), pp. 253-329.
  • Sosa, Ernest, 1980. “The Raft and The Pyramid: Coherence versus Foundations in the Theory of Knowledge,” in French, Uehling, and Wettstein (eds.), Midwest Studies in Philosophy, Volume V, Studies in Epistemology, (University of Minnesota Press, Minneapolis, 1980).
  • Sosa, Ernest, 1997, “Chisholm’s Epistemology and Epistemic Internalism,” in The Philosophy of Roderick M. Chisholm (The Library of Living Philosophers: Volume 25), Lewis Edwin Hahn (ed.), Chicago, La Salle, Open Court, pp. 267–287.
  • van Cleve, James. 1977. “Probability and Certainty: A Reexamination of the Lewis-Reichenbach Debate,” Philosophical Studies 32(4), 323-34.
  • van Cleve, James. 2005. “Why Coherence is Not Enough: A Defense of Moderate Foundationalism,” in Contemporary Debates in Epistemology, edited by Matthias Steup and Ernest Sosa. Oxford, Blackwell, pp. 168-80.
  • Vogel, Jonathan. 1990. “Cartesian Skepticism and Inference to the Best Explanation.” The Journal of Philosophy 87, 658-666.

 

Author Information

Richard Legum
Email: richard.legum@kbcc.cuny.edu
City University of New York
U. S. A.

Meaning and Communication

Communication is crucial for us as human beings. Much of what we know or believe we learn through hearing or seeing what others say or express, and part of what makes us human is our desire to communicate our thoughts and feelings to others. A core part of our communicative activity concerns linguistic communication, where we use the words and sentences of natural languages to communicate our ideas. But what exactly is going on in linguistic communication and what is the relationship between what we say and what we think? This article explores these issues.

A natural starting point is to hold that we use words and sentences to express what we intend to convey to our hearers. In this way, meaning seems to be linked to a speaker’s mental states (specifically, to intentions). Given that this idea is at the heart of Paul Grice’s hugely influential theory of meaning and communication, this article begins by spelling out in detail, in §1, how Grice makes the connection between communication and thought. The Intentionalist approach exemplified by Grice’s model has been endorsed by many theorists, and it has provided a very successful paradigm for empirical research; however, it is not without its problems. §2 surveys a number of problems faced by Grice’s specific account, and §3 considers challenges to the core Intentionalist claim itself, namely, that meaning and communication depend on the intentions of the speaker. Given these concerns, the article closes in §4 with a sketch of two alternative approaches: one which looks to the function expressions play (teleology), and one which replaces the Intentionalist appeal to mental states with a focus on the social and normative dimensions of language and communication.

Table of Contents

  1. The Intentionalist Stance: Grice’s Theory of Meaning and Communication
    1. Grice’s Theory of Meaning
      1. Natural and Non-Natural Meaning
      2. Speaker-Meaning and Intention
      3. Speaker-Meaning as Conceptually Prior to Sentence-Meaning
    2. Grice’s Theory of Conversation
  2. Problems with Grice’s Theory of Meaning
    1. Problems with the First Condition of Grice’s Analysis
    2. Problems with the Third Condition of Grice’s Analysis
    3. The Insufficiency of Grice’s Analysis
    4. Problems with Conventional Speech Acts
    5. Problems with Explaining Sentence-meaning in Terms of Speaker-Meaning
  3. Are Intentionalist Approaches Psychologically Implausible?
    1. Is Grice’s Model Psychologically Implausible?
      1. Grice’s Response to the Challenge: Levels of Explanation
      2. A Post-Gricean Response: Relevance Theory
    2. Is the Intentionalist Assumption of the Priority of Thought over Language
  4. Rejecting Intentionalism
    1. Teleological Approaches: Millikan
    2. Normative Social Approaches
  5. Conclusion
  6. References and Further Reading

1. The Intentionalist Stance: Grice’s Theory of Meaning and Communication

a. Grice’s Theory of Meaning

Paul Grice’s seminal work has had a lasting influence on philosophy and has inspired research in a variety of other disciplines, most notably linguistics and psychology. His approach to meaning and communication exemplifies a general thesis which came to be called ‘Intentionalism’. It holds that what we mean and communicate is fixed by what we intend to convey. This idea is intuitively compelling but turns out to be hard to spell out in detail; this section thus offers a fairly detailed account of Grice’s view, divided into two subsections. §1.a summarises the core claims and concepts of Grice’s theory of meaning as well as Grice’s proposed definition of communication. §1.b. briefly summarises a different strand of Grice’s theory, namely, his theory of conversation, which concerns the role that the assumptions of cooperation and rationality play in communication. In setting out the different elements of Grice’s theory, the distinctions between his definition of communication and his theory of conversation become clear. References to other important Intentionalist accounts are provided throughout the article as well.

i. Natural and Non-Natural Meaning

Grice (1957/1989, pp. 213-215) starts by distinguishing two different kinds of meaning: natural and non-natural. The former occurs when a given state of affairs or property naturally indicates a further state of affairs or property, where “naturally indicates” entails standing in some kind of causal relation. So, for instance, we might say “Smoke means fire” or “Those spots mean measles”. This kind of natural meaning relation is very different, however, from non-natural meaning. For non-natural meaning, the relationship between the sign and what is signified is not straightforwardly causal. Examples of non-natural meaning include:

  1. “Three rings on the bell mean the bus will stop”;
  2. “By pointing at the chair, she meant you should sit down”;
  3. “In saying ‘Can you pass the salt?’ he meant Pass the salt”.

It is non-natural meaning (of which linguistic meaning forms a central case) that is the main explanatory target of Grice’s theory of meaning and it is therefore also the focus of this article.

In order to provide a systematic theory of non-natural meaning, Grice distinguishes two main types of non-natural meaning: speaker-meaning and sentence-meaning (this has become the standard terminology in the literature, though it should be noted that Grice himself often preferred a different terminology and made more fine-grained distinctions). As the terms suggest, speaker-meaning is what a speaker means in uttering a sentence (or using a type of gesture, and so forth) on a particular occasion, while sentence-meaning is what sentences mean, where ‘sentences’ are understood as abstract entities that can be used by speakers on different occasions. Importantly, what a speaker means when using a sentence on a particular occasion need not correspond to its sentence-meaning. For instance, in (3), it seems the speaker means something (pass the salt) which differs from the sentence-meaning (can you pass the salt?). This point is taken up again in §1.b.

Having distinguished speaker-meaning from sentence-meaning, Grice (1957/1989) advances his central claim that speaker-meaning grounds and explains sentence-meaning, so that we can derive sentence-meaning from the more basic notion of speaker-meaning. This claim is crucial to Grice’s overarching aim of providing a naturalistically respectable account of non-natural meaning. That is to say, Grice’s aim is to give an account of non-natural meaning which locates it firmly in the natural world, where it can be explained without appeal to any strange abstract entities (see the article on Naturalism for further details on naturalistic explanations). To do this, Grice was convinced that gestural and conventional meaning needed to be reduced to claims about psychological content—the things that individual gesturers and speakers think and intend. In order to explain how this reduction comes about, Grice’s analysis of speaker-meaning is considered in §1.a.ii, and Grice’s explanation of sentence-meaning in §1.a.iii.

ii. Speaker-Meaning and Intention

According to Grice, speaker-meaning is to be explained in psychological terms, more specifically in terms of a speaker’s intentions. Grice’s analysis thus lines up with the intuition that what a speaker means when producing an utterance is what she intends to get across. Starting with this intuition, one might think that speaker-meaning simply occurs when a speaker intends:

(i) to produce an effect in the addressee.

For instance, imagine that Anne says, “I know where the keys are”. We might think that Anne means Anne knows where the keys are by this utterance because this is the belief she intends her audience to form (that is, the effect she intends to have on her audience is that they form that belief). However, Grice (1957/1989, p. 217) argues that this condition is not sufficient. To see this, imagine that Jones leaves Smith’s handkerchief at a crime scene in order to deceive the investigating detective into believing that Smith was the murderer. Jones intends the detective to form a belief (and so satisfies condition (i)), but it seems incorrect to say that Jones means that Smith was the murderer by leaving the handkerchief at the crime scene. In this situation, it does not seem right to think of Jones as non-naturally meaning or communicating anything by leaving the handkerchief (this intuition is reinforced by recognising that if the detective knew the origin of the sign—that it had been left by Jones—that would fundamentally change the belief the detective would form, suggesting that the belief the detective does form is not one that has been communicated by Jones).

To address this worry, Grice suggests adding a second condition to the definition of speaker-meaning, namely that the speaker also intends:

(ii) that the addressee recognises the speaker’s intention to produce this effect.

This condition avoids the problem with Jones and the handkerchief by demanding that the intentions involved in communication be overt rather than hidden, in the sense that the speaker must intend the addressee to recognise the speaker’s communicative aim. Although Grice (1957/1989, pp. 218-219) believes that these two conditions are indeed necessary for speaker-meaning, he argues that one more condition is required for a sufficient analysis. This is because of cases such as the following. Imagine that Andy shows Bob a photograph of Clyde displaying undue familiarity with Bob’s wife. According to Grice, we would not want to say in such a case that Andy means that Bob’s wife is unfaithful, although Andy might well fulfil our two conditions (he might intend Bob to form such a belief and also intend Bob to recognise his intention that Bob form such a belief). The reason this does not count as a genuine case of communication, Grice claims, is that Bob would have acquired the belief that his wife is unfaithful just by looking at the photo. Andy’s intentions do not then, Grice contends, stand in the right relation to Bob’s belief, because Bob coming to have the belief in question is independent of Andy’s intentions. So, according to Grice, the speaker must not only intend to produce an effect (condition (i)) and intend this intention to be recognised (condition (ii)), but the recognition specified in (ii) must also play a role in the production of the effect.

Grice holds that these three conditions are necessary and jointly sufficient for a speaker to mean something by an utterance. His account of speaker-meaning (for assertions) can be thus summarised as follows:

A speaker, S, speaker-means that p by some utterance u if and only if for some audience, A, S intends that:

(a) by uttering u, S induces the belief that p in A;
(b) A should recognise that (a);
(c) A’s recognition that (a) should be the reason for A forming the belief that p.
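For readers who find a symbolic gloss helpful, the analysis can also be displayed as a single schematic biconditional. The rendering below is purely illustrative; the operators Int_S, Bel_A, and Rec_A (for “S intends”, “A believes”, and “A recognises”) are expository devices introduced here and are not Grice’s own notation:

```latex
% Schematic rendering of Grice's analysis of speaker-meaning for
% assertions. Int_S, Bel_A, and Rec_A are expository glosses
% ("S intends", "A believes", "A recognises"), not Grice's notation.
\begin{align*}
S \text{ speaker-means that } p \text{ by } u \iff \exists A \;\, \mathrm{Int}_S \big[
  & \; u \text{ induces } \mathrm{Bel}_A(p)                       && \text{(a)} \\
  \wedge & \; \mathrm{Rec}_A\text{(a)}                            && \text{(b)} \\
  \wedge & \; \mathrm{Rec}_A\text{(a)} \text{ is } A\text{'s reason for } \mathrm{Bel}_A(p) \,\big] && \text{(c)}
\end{align*}
```

The crucial structural point the formula makes vivid is that all three clauses fall inside the scope of the speaker’s intention: it is one complex intention with three parts, not three separate intentions.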

Grice further specifies that what the speaker means is determined, that is, established, by the intended effect. Conditions (a)-(c) define what is needed for an assertion, as the intended effect is for the audience to form a certain belief—for example, in the example above, Anne intends her audience to form the belief that Anne knows where the keys are. In cases where the speaker means to direct the behaviour of the addressee, on the other hand, Grice claims that the intended effect specified in (a)-(c) should instead be that the audience performs a certain action—for example, a person pointing at a chair means that the addressee should sit down.

iii. Speaker-Meaning as Conceptually Prior to Sentence-Meaning

As noted above, Grice held that speaker-meaning is more basic than sentence-meaning. Note that Grice’s analysis of speaker-meaning does not contain any reference to sentence-meaning. The analysis refers only to utterances and, although these can be productions of a sentence or of some other abstract entity (such as a recurrently used type of gesture), this does not have to be the case. To illustrate this, consider an example from Dan Sperber and Deirdre Wilson (1995, pp. 25-26) in which a speaker responds to the question “How are you feeling?” by pulling a bottle of aspirin out of her handbag. Here, the speaker means that she is not feeling well, but she does not produce anything appropriately considered an utterance of a sentence or some other type of behaviour that has an established or conventionalised meaning. This illustrates that speaker-meaning can be explained in psychological terms without reference to sentence-meaning.

Sentence-meaning, on the other hand, does (according to Grice) require appeal to speaker-meaning. The idea here is that the sentences of a language do not have their meaning in virtue of some mind-independent property but in virtue of what speakers of the language do with them. Hence, Grice claims that “what sentences mean is what (standardly) users of such sentences mean by them; that is to say what psychological attitudes toward what propositional objects such users standardly intend […] to produce by their utterance.” (Grice, 1989, p. 350). For example, the sentence “snow is white” means snow is white because this is what speakers usually intend to get across by uttering this sentence. In this way, Grice explains non-natural meaning in purely psychological terms.

One reservation one might have about this approach is that it looks like Grice gets the relationship between speaker-meaning and sentence-meaning ‘backwards’ because sentence-meaning seems to play an important role in how speakers get across what they intend to convey. For example, Anne’s intention to convey that she knows where the keys are by uttering “I know where the keys are” seems to depend on the meaning of the sentence. So how can speaker-meaning be conceptually prior to sentence-meaning? This worry can be addressed, however, by highlighting a distinction between constitutive and epistemic aspects of speaker-meaning. On Grice’s account, what a speaker means is constituted by the kind of complex intention described in (a)-(c) above. However, speakers can nevertheless exploit an established sentence-meaning when they try to get their audiences to recognise what they mean (Grice, 1969/1989, p. 101). The idea here is that an efficient way for a speaker such as Anne to get across that she knows where the keys are will simply be to say, “I know where the keys are”. Although what she means is determined by her intention, the use of the respective sentence allows her to convey this intention to her audience in a straightforward way. At the same time, an established use of a sentence will also inform (or constrain) the formation of communicative intentions on the side of the speaker. For example, a speaker who knows the meaning of the sentence “I know where the keys are” will not utter it intending to convey that ostriches are flightless birds, say, because—in the absence of a very special context—it would be irrational for the speaker to think that this intention would be correctly recognised by the addressee. Importantly, this does not entail that speaker-meaning cannot diverge from sentence-meaning at all and, as is discussed in §1.b, in fact, such divergences are not uncommon.

This section concludes with a note on communication. As has been noted by Intentionalists such as Peter Strawson, for Grice the kind of complex intention that he described seems to play two roles. First, it is supposed to provide an analysis of what it takes for a speaker to mean something. In addition, however, it is also “undoubtedly offered as an analysis of a situation in which one person is trying, in a sense of the word ‘communicate’ fundamental to any theory of meaning, to communicate with another.” (1964, p. 446). Put somewhat differently, on Grice’s account, for an agent to communicate something she will need to mean what she tries to get across, and this meaning is analysed in terms of the complex intention that Grice provided as an analysis of speaker-meaning. The communicative attempt will be successful—that is, communication will occur—if and only if this intention is recognised by the addressee. For this reason, such intentions have come to be called “communicative intentions” in the literature following Grice (he himself preferred to speak of “m-intentions”).

b. Grice’s Theory of Conversation

The basis of Grice’s theory of conversation lies in his distinction between what is said and what is implicated by a speaker, both of which are part of the content that a speaker communicates (and therefore part of the speaker’s communicative intention). Roughly speaking, what is said lines up with sentence-meaning, with adjustments being made for context-sensitive expressions, such as “I” or “tomorrow”, and ambiguous terms (such as “bank” or “crane”; see the article on Meaning and Context-Sensitivity). What is implicated, by contrast, is what a speaker communicates without explicitly saying it. To illustrate this distinction, consider the following examples:

(4)         Alice: Do you want to go to the cinema?

Bill: I have to work.

(5)         Professor: Some students have passed the exam.

In an exchange such as (4), it is clear that Bill does not merely communicate what he literally says, which is that he has to work. He also communicates that he cannot go to the cinema. Similarly, in (5) the professor does not merely communicate that some students have passed the exam (which is what she explicitly says) but that not all students have passed. Such implicitly or indirectly communicated contents are what Grice calls “implicatures” and his theory of conversation attempts to explain how they are conveyed.

At the basis of his explanation lies a crucial assumption about communication, namely that communication is a rational and cooperative practice. The main idea is that communicative interactions usually serve some mutually recognised purpose—such as the exchange of information—and that participants expect everyone to work together in pursuing this purpose. This assumption is captured in Grice’s Cooperative Principle:

Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged. (1975/1989, p. 26)

Grice further specifies what communicative cooperation involves by distinguishing four central conversational categories: quantity, quality, relation, and manner. All categories come with one or more conversational maxims that are supposed to be observed by cooperative communicators. Roughly, maxims under the category of quantity require interlocutors to provide the right amount of information, whereas quality maxims require speakers to provide only information that they believe to be true (or for which they have sufficient evidence). The only maxim under the category of relation is that interlocutors must make relevant contributions. Finally, maxims of manner require that interlocutors’ contributions are perspicuous.

How do the Cooperative Principle and the associated maxims help to explain how implicatures are communicated? Consider again the exchange in (4). In order to figure out what Bill implicated when he said, “I have to work”, Alice can reason as follows: I assume that Bill is being cooperative, but what he said, on its own, is not a relevant response to my question. I should infer, then, that Bill intended to communicate something in addition to what he merely said, something that does provide a relevant response to my question. This further information is probably that he cannot come to the cinema because he will be at work. A similar explanatory method is available for (5).

More generally, Grice (1975/1989, p. 31) proposes the following as a general pattern for inferring implicatures:

  • S has said that p;
  • there is no reason to suppose that S is not observing the maxims, or at least the Cooperative Principle;
  • S could not be doing this unless he thought that q;
  • S knows […] that I can see that the supposition that S thinks that q is required;
  • S has done nothing to stop me thinking that q;
  • S intends me to think […] that q;
  • and so S has implicated that q.

Notice that this pattern is not supposed to be deductively valid but rather to provide a template for an inference to the best explanation. In the first two decades of the twenty-first century it became common for theorists to hold that inferences of this kind play a role not only in the process of interpreting implicatures but also in many other types of linguistic phenomena, such as resolving ambiguous and vague terms. Because such inferences rely not just on sentence-meaning but also on considerations of cooperation, context, and speaker intention, they are often referred to as “pragmatic inferences” or instances of “pragmatic reasoning”.

2. Problems with Grice’s Theory of Meaning

Grice’s theory of meaning has been subject to numerous challenges. This section briefly outlines the standard objections and explains how Grice and other Intentionalists have tried to deal with them. A further problem for Grice’s account, concerning its psychological plausibility, is considered separately in §3.

a. Problems with the First Condition of Grice’s Analysis

As noted, the first condition of Grice’s analysis states that in order to communicate, a speaker must have the intention to bring about an effect in her audience. There are two main types of objection to this claim: first, there are cases in which the intended effects are not the ones that Grice claims they are and, second, there are cases in which speakers do not intend to have any effect on an audience and yet still communicate something.

To appreciate the first worry, it is useful to focus on assertions, where Grice claims that the intended effect is that the audience forms a belief. Reminders, however, seem to be a counterexample to this claim. For instance, imagine a scenario in which the addressee knows that a woman’s name is ‘Rose’ but cannot recall this piece of information in a particular situation. In such a case, the speaker might hold up a rose or simply say the following to remind the addressee:

(7) Her name is Rose.

It seems that the speaker does not want to make the addressee believe that the woman’s name is Rose because the addressee already believes this. It is equally clear, however, that the speaker means or communicates that the woman’s name is Rose. A second type of counterexample concerns examination cases. Imagine that a teacher asks a student when the Battle of Waterloo was fought to which the student replies:

(8) The Battle of Waterloo was fought in 1815.

Again, the student is not intending that the teacher forms the respective belief because the teacher already knows when the Battle of Waterloo occurred. But the student still clearly communicates that the Battle of Waterloo was fought in 1815.

Turning to the second objection, some cases suggest that a speaker need not have any audience-directed intention whatsoever. For instance, sometimes a speaker lacks an audience, such as when writing in a personal diary or practicing a speech in an empty room. However, we still want to say that the speaker means something in such cases. In other cases, speakers do have an audience, but they do not intend their utterances to have any effect on the audience. For example, imagine that an employee at a train station is reading out a list of departures over the loudspeaker. They utter:

(9) The next London train departs at 4 pm.

It is possible that the employee does not care in the least if—and therefore also does not intend that—any passenger thereby comes to believe that the train leaves at 4 pm. Rather the employee is just complying with the requirements of her job. Nonetheless, it seems highly counterintuitive to claim that the employee does not communicate that the train departs at 4 pm. These are just a few illustrations of the numerous examples that, theorists have argued, challenge the fundamental Intentionalist claim that communication requires intentions to cause certain effects in an audience.

Grice (1989) and other scholars who are sympathetic to his theory (for example, Neale 1992, Schiffer 1972) have responded to such examples in numerous ways. A first strategy is to try to accommodate the problem cases by modifying Grice’s original analysis. For example, Grice suggests that reminders could be dealt with by specifying that the intended effect for assertions is not that a belief is formed but that an activated belief is formed (1969/1989, p. 109). Or again, to deal with cases such as examinations Grice proposes further modifying the first condition so that it requires not that the addressee forms an activated belief, but rather that the addressee forms the activated belief that the speaker has the respective belief (1969/1989, pp. 110-111). Hence, in the examination case, the student would intend his utterance to have the effect that the teacher forms the activated belief that the student believes that the Battle of Waterloo was fought in 1815. To deal with other counterexamples of this kind, Grice (1969/1989) proposed further (and increasingly complex) refinements of his original analysis.

Another strategy to deal with alleged counterexamples is to argue that they are in fact compatible with Grice’s original analysis. For example, in cases in which no audience is directly present one might argue that the utterances are still made with the intention of having an effect on a future or possible audience (Grice, 1969/1989, pp. 112-115; Schiffer, 1972, pp. 73-79). Finally, a third strategy is to argue that the definitions of meaning provided by Grice and his followers capture speaker-meaning in its primary sense and that the counterexamples only involve cases of meaning in “an extended or attenuated sense, one derived from and dependent upon the primary sense” (Schiffer, 1972, p. 71). The idea seems to be that the counterexamples are in a sense parasitic upon more standard cases of meaning. For example, it might be argued that an utterance such as (9) is a case of meaning in an extended sense. The utterance can be taken to mean that the next London train departs at 4 pm because this is what speakers usually mean by uttering this sentence (standard meaning), and this is captured by the Gricean analysis.

However, counterexamples to Grice and his followers have been numerous and varied and not everybody has been convinced that the responses proposed can successfully deal with them all. For instance, William Alston (2000, pp. 45-50) has presented a critical examination of Intentionalist defences and pointed out—among other things—that it is far from clear that examples such as (8) can be treated as extended cases of meaning. Further, Alston asks the more general question of whether such a treatment would be attractive from a methodological point of view. In the face of these challenges, then, it remains an open question whether Grice’s fundamental claim—that communication necessarily involves intentions to cause effects on one’s audience (captured in the first clause of his analysis)—can be defended.

b. Problems with the Third Condition of Grice’s Analysis

A less fundamental but nonetheless important worry concerns the third condition of Grice’s analysis, according to which recognition of the speaker’s intention must be (at least part of) the reason that the intended effect comes about. As noted, Grice introduced this condition to deal with cases in which the first two conditions are fulfilled but in which the recognition of the speaker’s intention plays no role in producing the effect (because the speaker presents evidence that is already sufficient to bring about the intended effect, as in the photograph case discussed in §1.a.ii).

Theorists have worried that this condition might be too strict because it excludes cases that do intuitively involve communication. A first concern is that intuitions are far from clear even for Grice’s own photograph example. Is Grice right to hold that Andy failed to mean that Bob’s wife is unfaithful when showing the photograph to Bob? In addition, however, there are also counterexamples in which the third condition is not fulfilled but where there is a rather clear intuition that the speaker communicated something. Consider reminders again. It seems that reminders might pose a problem not only for Grice’s first condition but also for the third, because they usually have their effects by prompting the addressee to remember something rather than because the addressee recognises that the speaker intended this effect. For example, a speaker who holds up a rose or utters (7) to remind the addressee that a certain woman’s name is ‘Rose’ might well intend the addressee to remember this not because the addressee recognises this intention but because the prompt itself is sufficient to jog the addressee’s memory. Nonetheless, one might still want to allow that by producing the utterance the speaker meant and communicated that the woman’s name is ‘Rose’.

Such examples undermine the necessity of Grice’s third condition and, despite his insistence to the contrary (1969/1989, p. 109), several Intentionalists have suggested that it might be dropped or at least weakened (Neale, 1992, pp. 547-549; Schiffer, 1972, pp. 43-48; Sperber and Wilson, 1995, pp. 29, 50-54). One such weakening strategy is considered in §3.a.ii, when the main tenets of Relevance Theory are discussed.

Another strategy that has been proposed is to move the third clause outside the scope of the communicator’s intentions (Moore, 2017a). The idea is that, in general, communicative acts are efficacious when they are properly addressed, and the recipient produces the intended response partly because she recognises that she is being addressed. However, the communicator need not intend that the recipient recognises the overtness of a communicative act as a reason to produce the intended response. This strategy is part of a wider attempt at de-intellectualising the cognitive and motivational requirements for engaging in Gricean communication (as §3.b shows).

c. The Insufficiency of Grice’s Analysis

Although the first two objections claim that certain aspects of Grice’s analysis are not necessary, one can also object by claiming that Grice’s conditions are insufficient to account for communication. This objection, first raised by Strawson (1964, pp. 446-447) and further developed, among others, by Schiffer (1972, pp. 17-27), maintains that there are cases in which all three of Grice’s conditions are satisfied but which do not count as cases of communication. These examples are complicated, but Coady (1976, p. 104) nicely summarises the clearest one:

The most intelligible of such examples is one we owe to Dennis Stampe (although it is not cited by Schiffer) in which a man playing bridge against his boss, and anxious to curry favour, wants his boss to win and to know that the employee wants him to win. He has reason to believe that the boss will be pleased to see that the employee wants him to win but displeased at anything as crude as a signal or other explicit communication to the effect that now he has a good hand. Hence, when he gets a good hand the employee smiles in a way that is rather like but just a bit different from a spontaneous smile of pleasure. He intends the boss to detect the difference and argue (as Grice puts it): ‘That was not a genuine give-away smile, but the simulation of such a smile. That sort of simulation might be a bluff (on a weak hand), but this is bridge, not poker, and he would not want to get the better of me, his boss, by such an impropriety. So probably he has a good hand, and, wanting me to win, he hoped I would learn that he has a good hand by taking his smile as a spontaneous give-away. That being so, I shall not raise my partner’s bid.’

What cases of this kind suggest, then, is that Grice’s original analysis is insufficient to ensure the overtness required for communication, that is, that the relevant intentions of speakers are transparent to their interlocutors. As noted above, Grice’s second condition was introduced to rule out deceptive intentions (as in the case of Jones, who left Smith’s handkerchief at the crime scene). However, the example of the bridge players shows that the second condition is insufficient to exclude deception at higher levels. In response to this worry, Strawson (1964, p. 447) proposed adding a fourth condition to Grice’s analysis, but Schiffer (1972, pp. 18-23) argued that in fact five conditions would be needed. However, as Schiffer himself highlights, the problem with all such moves to add additional clauses seeking to rule out certain orders of deceptive intentions is that they will always be open to the construction of ever more complex counterexamples in which all the conditions are fulfilled but where a deceptive intention at a still higher level causes problems. Hence, there is a threat of an indefinite regress of conditions.

Grice himself discusses two main strategies for responding to the concern about deceptive intentions. The first is to insist that complexity has an upper bound, so the regress stops at some point and no further conditions are needed (1969/1989, pp. 98-99). His claim is that at some point the intention that a speaker would need to have for it to constitute a further counterexample would just be too complex to count as psychologically real for the speaker or addressee. However, Grice himself (1969/1989, p. 99) and other Intentionalists (for example, Schiffer 1972, pp. 24-26) raised doubts about this response, objecting both that it fails to specify exactly where the cut-off point is and that it fails to provide the philosophically rigorous analysis of communication that Grice set out to deliver (since it fixes the cut-off point on the basis of contingent facts about the cognitive capabilities of current interlocutors rather than on the basis of the nature of communication).

The second strategy—the one that Grice prefers—is simply to rule out deceptive or hidden intentions (1969/1989, pp. 99-100). However, as Grice (1982/1989, p. 303) realises, there seems to be a worry that the introduction of these conditions is ad hoc and, related to this, other theorists such as Kent Bach and Robert Harnish (1979, p. 153) have wondered why it would be appropriate to introduce a condition against these complex forms of deception but not against simple forms of deception such as lying. Further, Schiffer (1972, p. 26) claims that Grice’s condition might be incapable of accounting for some of the more intricate counterexamples that he constructs. Despite these worries, Grice (1982/1989, pp. 302-303) and some other theorists such as Neale (1992, p. 550) have maintained that a condition against hidden intentions provides the best remedy for deception cases.

Schiffer himself proposes dealing with deceptive cases by appealing to what he calls “mutual knowledge” (1972, p. 30). (David Lewis (1969) is generally credited with introducing the related notion of common knowledge; Schiffer’s notion of mutual knowledge is also clearly related to other so-called ‘common ground’ views, such as those of Stalnaker (2002; 2014), Bach and Harnish (1979), and Clark (1992; 1996).) Roughly, two or more people have mutual knowledge of some state of affairs if they all know that the state of affairs obtains, know that the others know that the state of affairs obtains, know that the others know that they know that the state of affairs obtains, and so on indefinitely. For example, two people who are facing each other while sitting around a table with a candle on it will have mutual knowledge that there is a candle on the table because each of them knows that there is a candle on the table, knows that the other one knows that there is a candle on the table, knows that the other knows that she knows that there is a candle on the table, and so on. Schiffer’s (1972, p. 39) proposal is then to build into Grice’s analysis of speaker-meaning a condition that requires a speaker to be intending to bring about a state of affairs that makes it mutual knowledge that the speaker is intending to cause an effect in the audience by means of the audience’s recognition of this intention. In other words, for Schiffer, communication consists in contributing to the set of mutually known things by making one’s intention known to one’s audience. This, according to Schiffer, handles counterexamples that involve deception because in these cases the interlocutors lack mutual knowledge of the speaker’s intentions.
For instance, in the example above, it is not mutual knowledge that the employee intends the boss to believe that he has a good hand by means of the boss’ recognition of this intention because the boss does not know that the employee intends his fake smile to be recognised as such.
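Schiffer’s layered formulation can be set out schematically. The following is an illustrative rendering only; the epistemic operator K_x (“x knows that”) and the label MK are introduced here for exposition and are not Schiffer’s own notation:

```latex
% Mutual knowledge between two agents A and B of a proposition p,
% written with an epistemic operator (K_x q = "x knows that q"):
\begin{align*}
\mathrm{MK}_{A,B}(p) \iff {} & K_A\, p \wedge K_B\, p \\
                   \wedge {} & K_A K_B\, p \wedge K_B K_A\, p \\
                   \wedge {} & K_A K_B K_A\, p \wedge K_B K_A K_B\, p \wedge \cdots
\end{align*}
% Each row adds one further level of iteration, so the definition
% unfolds into the endless series of knowledge states described above.
```

In the later literature on common knowledge following Lewis (1969), an equivalent fixed-point formulation is often preferred, on which the infinite hierarchy is generated by a single, finitely stated condition; this is one way of spelling out the claim, discussed below, that the regress involved is harmless.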

Now, it is an essential feature of mutual knowledge that it involves an endless series of knowledge states and one might wonder why the introduction of a condition that appeals to mutual knowledge would therefore not also invite a problematic regress; Grice (1982/1989, p. 299) himself expressed such a worry about Schiffer’s account. Responding to this, Schiffer claims that the infinite series of knowledge states required for mutual knowledge is a “harmless regress” (1972, p. 30) because such an infinite series is also required for individual knowledge. For example, it might be thought that if a person knows that Berlin is in Germany, she will also know that she knows that Berlin is in Germany, know that she knows that she knows that Berlin is in Germany, and so on. Hence, he argues that appeals to mutual knowledge do not create any special problem.

The literature on mutual knowledge contains several attempts at spelling out Schiffer’s insight that the notion of mutual knowledge, understood in terms of dispositions to draw an indefinite number of inferences, is not problematic from a psychological point of view. One such attempt is discussed in §3.a.ii, when the notion of mutual manifestness is introduced. Another strategy, proposed and popularised in the first two decades of the twenty-first century, though not discussed in detail in this article, is to conceptualise mutual knowledge as a relational mental state, namely, as a mental state that two or more individuals have if there is a ternary relation that holds between them and a certain proposition (Wilby, 2010). Relational accounts of mutual knowledge, as well as of other cognate notions, have also been criticised on several fronts (see, for example, Battich and Geurts (2020)).

d. Problems with Conventional Speech Acts

Another challenge to the idea that Grice’s model is sufficient to give us a general account of communication has to do with a class of speech acts which are usually referred to as “conventional” (in the following, “speech act” is used to refer to what is usually called “illocutionary speech acts” in speech act theory; for further details, see the relevant sections of the article on John Langshaw Austin). Ever since Strawson’s (1964) introduction of Grice’s analysis to speech act theory, Intentionalists have distinguished conventional speech acts from what they deem to be communicative speech acts. According to them, communicative speech acts are speech acts that must be performed with a communicative intention and they will only be successful if the audience recognises this communicative intention. Types of speech acts that are standardly claimed to be communicative in this way are assertions and directives. Conventional speech acts, on the other hand, neither require any communicative intention on the side of the speaker nor any recognition by the audience for their successful performance. Instead, conventional speech acts depend on the existence of certain conventional or institutional background rules.

Here are two examples. First, consider checking in poker. In order to check in poker, all that one needs to do is to say “check” when it is one’s turn and certain conditions are satisfied (for example, no other player has yet made a bet in that round, and so forth). Importantly, no intention is required for a player to check, as illustrated by the fact that sometimes players manage to check by saying “check” although they intended to bet. For this and other reasons, it is also not necessary that the player’s intention to check is recognised. Second, consider pronouncing a couple husband and wife. For a priest to do so, she only needs to say “I hereby pronounce you husband and wife” at the right point during a marriage ceremony. Again, for the speech act to work (that is, for the couple to be married) it is irrelevant whether the priest has a particular intention or whether anyone recognises this intention. All that is necessary is that the speech act is performed according to the rules of the church. Of course, these are only two examples, and it should be clear that there are many other conventional speech acts, including pronouncing a verdict, naming a ship, declaring war, and so on.

The problem that conventional speech acts pose for Grice’s account is that they have to be classified as non-communicative acts because for Grice speech acts are communicative only if they depend on the existence and recognition of communicative intentions. However, this result is unattractive because it seems false to claim that a speaker who checks in poker does not communicate that she is checking, or that a priest who is marrying a couple does not communicate this. Although theorists in the Gricean tradition usually recognise and accept that analysing communication in terms of certain complex intentions has this result (Bach and Harnish, 1979, p. 117), a common defence is that it does not undermine Grice’s analysis because “such speech acts as belong to highly conventionalised institutions are, from the point of view of language and communication, of marginal interest only” (Schiffer, 1972, p. 93), and so belong to the study of institutions rather than communication (Sperber and Wilson, 1995, p. 245). It is unclear, however, why their conventional or institutional nature should make conventional speech acts less significant for the study of communication or be a reason to declare them non-communicative. One might claim instead that the Gricean theory is too narrow to include conventional speech acts and therefore defective as a general theory of communication.

e. Problems with Explaining Sentence-Meaning in Terms of Speaker-Meaning

Although the objections discussed in §2.a-d challenged the necessity and sufficiency of Grice’s proposed analysis of meaning and communication, the final set of objections challenges Grice’s claim that sentence-meaning can be reduced to speaker-meaning. One of the most well-known of these objections comes from Mark Platts (1997, pp. 89-90). (Another famous objection to this claim has been presented by John Searle (1965); Searle’s objection has been addressed by Grice (1969/1989, pp. 100-105) and Schiffer (1972, pp. 27-30).) The problem with Grice’s analysis, Platts argues, is that a language allows for the formation of infinitely many sentences, most of which have never been used by any speaker, so there are no intentions that are usually associated with them. On Grice’s account, this would have the absurd consequence that most sentences lack meaning.

A possible response to this objection would be to slightly modify Grice’s account and claim that a sentence does not have meaning in virtue of the intentions that speakers actually have when using the sentence but in virtue of the intentions they would have if they were to use the sentence. However, this raises the question why speakers would have such intentions when using the sentence and it seems hard to explain that without making some reference to the meaning of the sentence itself. A more promising starting point in accounting for unuttered sentences is to note that sentences are complex entities that are composed of more basic elements, for example, words (see the article on Compositionality in Language). Taking this into account, a Gricean might argue as follows: there is a finite set of sentences that are regularly used by speakers and which thus have communicative intentions associated with them. Following Grice, one can claim that the meanings of this fixed set of sentences are determined by the intentions that speakers usually have when using them. But once the meaning of these sentences is fixed in this way, the meaning of the constituent words must be fixed, too (by how they are used by speakers in constructing sentences). And once the meaning of words is fixed in this way, they can be combined in novel ways to form new sentences and thereby restrict the possible intentions that speakers may have when using these new sentences and, therefore, also fix the meanings of these new sentences (such a move might be compatible with holding that speakers tacitly know a Davidsonian-style T-theory for their language, though the Intentionalist would claim that the basic meanings deployed in such a theory are generated via speaker intentions rather than via Davidson’s own notion of ‘radical interpretation’; see Davidson: Philosophy of Language). 
Grice (1968/1989) seems to make a similar proposal when arguing that not only sentence-meaning but also word-meaning should be explained by reference to speaker intentions (for a similar proposal, see Morris (2007, 262)). However, the success of this strategy will depend to a large extent on whether word-meaning can indeed be explained in terms of speaker intentions. After the first two decades of the twenty-first century, a detailed account of how exactly this might be done is yet to come.

3. Are Intentionalist Approaches Psychologically Implausible?

Although §2 looked at specific objections to some of the elements of the Gricean model, an objection can also be levelled at Intentionalism more generally, concerning its psychological implausibility. This objection deserves to be taken seriously because it targets the core feature of the Intentionalist approach, namely, the overarching ambition of explaining semantic notions in psychological terms.

The objection from psychological implausibility takes two forms: first, there is a worry that the specific Intentionalist model which Grice gives us fails to fit with the cognitive processes which underlie the grasp of meaning and communication. Second, some have objected to the Intentionalist claim per se along the lines that, from a developmental and evolutionary point of view, it gets the relationship between meaning and thought the wrong way round. These challenges are explored in this section.

a. Is Grice’s Model Psychologically Implausible?

To get an intuitive sense of the concern, consider a speaker, S, who has the communicative intention of informing her hearer, H, that coffee is ready. One way of spelling out what it is for S to have an informative intention is to say that:

(i) S intends H to form a certain belief, in this case, that coffee is ready.

As already discussed, if S’s act is to count as communicative in Grice’s sense, it must be overt. So, at least (ii) must also hold:

(ii) S intends that H comes to believe that S intends to inform H.

Correlatively, understanding a communicative act seems to require the hearer to recognise the speaker’s intention as specified in (ii), the content of which comprises a complex embedding of intentions and beliefs. However, the current objection goes, in ordinary communicative exchanges speakers and hearers do not seem to go through complex inferences about each other’s mental states. Indeed, linguistic communication appears to be, by and large, an ‘automatic’ process. So, the model proposed by Grice seems not to fit with the actual psychological processes involved in language understanding and communication.

This concern was often voiced in Grice’s lectures (see, for instance, Warner (2001, p. x)). A prominent presentation of the argument can be found in the work of Ruth Millikan (1984, Chapter 3; other authors who have voiced similar concerns include Alston, 2000; Apperly, 2011; Azzouni, 2013; Gauker, 2003; Pickering and Garrod, 2004). In the remainder of this section, this objection is considered on its own terms, together with some of the replies that have been offered. Millikan’s teleological approach is taken up again in §4.a.

i. Grice’s Response to the Challenge: Levels of Explanation

Grice’s response to the allegation of psychological implausibility was to stress that his account was meant as a rational explanation of communicative interactions, and not as an attempt to capture the psychological reality of everyday communicative exchanges (Warner, 2001). Another proposal for how to understand Grice’s claim has been offered by Bart Geurts and Paula Rubio-Fernandez (2015), drawing on David Marr’s (1982) distinction between the computational and algorithmic levels of explanation. A computational level explanation captures the task or function a system aims to perform (that is, what mapping from inputs to outputs the system is designed to implement), while an algorithmic explanation captures the actual rules or states of the system which realise that function. So, supposing that the function we want to compute is ‘2x²’ (this is the computational level description of the system), there are then two different algorithms that a system could use to compute this function: the first computes x · x and then multiplies the result by 2; the second computes x + x and then multiplies the result by x.

So, two systems can have the same computational description while making use of different algorithms.
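Marr’s distinction can be made concrete with a minimal sketch (function names are illustrative, not from the literature): both functions below satisfy the same computational-level description, mapping x to 2x², while implementing different algorithms.

```python
def double_square_v1(x):
    # Algorithm 1: square first, then double the result.
    return 2 * (x * x)

def double_square_v2(x):
    # Algorithm 2: double first, then multiply the result by x.
    return x * (x + x)

# Same computational-level description: identical input-output mapping,
# despite the different internal steps.
assert all(double_square_v1(x) == double_square_v2(x) for x in range(-10, 11))
```

On Geurts and Rubio-Fernandez’s proposal, Gricean pragmatics specifies only the input-output mapping; which “algorithm” the mind uses to realise it is a separate, processing-level question.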

In the context of linguistic communication, Geurts and Rubio-Fernandez (2015, pp. 457-459) argue, Gricean pragmatics is pitched at the computational level, specifying the desired mappings between inputs and outputs. Processing theories of communication, on the other hand, are pitched at the algorithmic level. Only processing theories need to reflect the psychological or cognitive reality of the minds of speakers since they need to explain how the model provided by Gricean pragmatics is implemented. According to Geurts and Rubio-Fernandez, then, the objection from psychological implausibility rests on the tacit assumption that Gricean pragmatics is not only a computational theory of pragmatics, but also an algorithmic theory of processing. If the distinction between these two levels of explanation can be maintained, and Gricean pragmatics is understood as being only a computational theory, then the objection from psychological implausibility is undermined.

Finally, if one wants to argue that Gricean pragmatics is psychologically implausible, one needs to provide reliable empirical evidence. Importantly, allegations of psychological implausibility often seem to rely on evidence from introspection. As Geurts and Rubio-Fernandez (2015, pp. 459-466) point out, even assuming that introspection is always reliable (which it is not), ascribing propositional attitudes need not be a conscious or consciously accessible process. It might very well be a process that, by and large, occurs unconsciously (even though, on reflection, communicators can easily and consciously access the outputs of this process). Therefore, evidence from introspection alone is not enough to support the argument that the Gricean model is psychologically unrealistic.

ii. A Post-Gricean Response: Relevance Theory

The concern to capture the psychological reality of our everyday communicative exchanges is also at the heart of a highly influential post-Gricean approach known as ‘Relevance Theory’, which aims at providing a theory of communication that is not only plausible but also explanatory from a cognitive point of view. Proposed by Dan Sperber and Deirdre Wilson (1995), Relevance Theory has been highly influential not only in philosophy but also in linguistics and psychology (Noveck, 2012).

It is important to note that Relevance Theory aims at being, primarily, a theory of communication and not a theory of meaning, although it is suggestive of what is known as a ‘contextualist’ approach with respect to meaning (see Meaning and Context-Sensitivity, section 2). Furthermore, advocates of Relevance Theory stress that, even if Gricean insights have inspired much of their theorising, their approach differs in crucial respects from Grice’s own theory of communication (Sperber and Wilson, 2012, Ch. 1). To introduce the reader to this alternative approach, this section starts by presenting the notions of mutual manifestness and ostensive-inferential communication, which are meant, respectively, to replace the notion of mutual knowledge and expand Grice’s notion of communication (as anticipated in §§2.b and 2.c).

According to Sperber and Wilson (1995, pp. 15-21), the notion of mutual knowledge is not psychologically plausible because it requires interlocutors to have infinitely many knowledge states, and this is just not possible for creatures with limited cognitive resources. Although the argument against mutual knowledge is not perfectly clear, it seems that, according to Sperber and Wilson, when one knows (in the ‘occurrent’ sense of the term) that p, one must have formed a mental representation that p, and no cognitively limited being can form infinitely many representations. It is worth noticing that this further assumption might be misleading. As the early proponents (see §2.c) of the notion of mutual knowledge pointed out, the infinite series of knowledge states is to be understood as a series of inferential steps, which need not directly reflect individuals’ representational states. Therefore, there might not be anything psychologically improper about the notion of mutual knowledge. Setting this point to one side, however, it might still be that mutual manifestness proves more plausible from a cognitive perspective, and thus the notion is still worthy of exploration.

Sperber and Wilson begin their account by defining what they call ‘an assumption’. This is a thought that the individual takes to be true, and it is manifest to an individual if that individual has the cognitive resources to mentally represent it (in a certain environment at a certain time). Importantly, an assumption can be manifest without in fact being true. In this respect, the notion of manifestness is weaker than that of knowledge. Moreover, an assumption can be manifest even if the corresponding representation is neither currently entertained nor formed by the individual. Indeed, an assumption can be manifest simply if the individual has the cognitive resources to infer it from other assumptions, where the notion of inference is meant to cover deductive, inductive, and abductive inferences alike.

With the notion of manifestness in place, the definition of ostensive-inferential communication is next to be discussed, starting with the notion of informative intention. An informative intention is an intention to make a set of assumptions manifest (or more manifest) to the audience. This definition is meant to capture the fact that sometimes we intend to communicate something vague, like an impression. If one intends to make a set of assumptions more manifest to an audience, one might represent that set of assumptions under some description, without thereby representing any of the individual propositions in the set (Sperber and Wilson, 1995, pp. 58-60).

Often enough, the communicator also intends to make the fact that she has an informative intention manifest or more manifest to an audience. According to Sperber and Wilson (1995, pp. 60-62), when the communicator intends to make the informative intention mutually manifest between themselves and the audience, they thereby have a communicative intention. A communicative act in this sense is successful when the communicative intention is fulfilled, namely, when it is mutually manifest between the interlocutors that the communicator has an informative intention.

Finally, Sperber and Wilson (1995, p. 63) define ostensive-inferential communication:

The communicator produces a stimulus which makes it mutually manifest to communicator and audience that the communicator intends, by means of this stimulus, to make manifest or more manifest to the audience a set of assumptions.

Importantly, Sperber and Wilson do not take Grice’s third clause to be necessary for ostensive-inferential communication (see §2.b). In other words, they do not think that the fulfilment of the informative intention must be based on the fulfilment of the communicative intention. As a reminder: Grice’s third clause was meant to exclude, inter alia, cases of ‘showing’ from cases of genuine non-natural meaning. According to Sperber and Wilson (1995, pp. 59-60), however, it is useful to think that there is a continuum of cases from showing something to ‘meaning something’ (in Grice’s sense) in which, at one end of the spectrum, the third clause does not hold (showing), while at the other end of the spectrum the third clause does hold and the informative intention could not be retrieved without having retrieved the communicative intention.

The core of the explanation of pragmatic reasoning proposed by Sperber and Wilson hinges on the idea that any ostensive act comes with a presumption of relevance, and ultimately it is this assumption that guides the recipient in interpreting the utterance. This explanation is based on two principles, which Sperber and Wilson (1995, p. 261) call the cognitive and the communicative principles of relevance. The cognitive principle is meant to capture a general feature of human cognition, namely that it ‘tends to be geared to the maximisation of relevance’ (Sperber and Wilson, 1995, pp. 260-266); Sperber and Wilson emphasise that, in this context, the term ‘relevance’ is used in a technical sense. In this technical sense, the relevance that a representation has for an individual at a given time is a function that varies positively with cognitive benefits and negatively with cognitive costs, or effort required to access the representation (via either perception, memory or inference). This use of the term is then close to, but might not coincide exactly with, the use of the term in ordinary language.

The communicative principle of relevance (Sperber and Wilson, 1995, pp. 266-273), which applies to communication specifically, states that ‘every utterance communicates a presumption of its own optimal relevance’. The presumption of optimal relevance means that, other things being equal, the hearer will consider the utterance as worth the interpretive effort. The interpretive enterprise is thus aimed at finding the most relevant interpretation of an utterance that is compatible with what the hearer believes about the speaker’s abilities and preferences. The range of possible interpretations is constrained by the fact that the hearer will look for the most relevant interpretation that can be achieved whilst minimising cognitive effort.
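As a purely illustrative formalisation (not part of Relevance Theory’s official apparatus, and with all numbers hypothetical), one might model relevance as a score that increases with cognitive benefit and decreases with processing effort, with the hearer selecting the highest-scoring candidate interpretation:

```python
def relevance(benefit, effort):
    # Toy relevance score: varies positively with cognitive benefit and
    # negatively with processing effort. The ratio is one arbitrary way
    # to capture that trade-off.
    return benefit / effort

# Hypothetical candidate interpretations of an ambiguous utterance,
# each paired with (benefit, effort) estimates chosen for illustration.
candidates = {
    "literal reading": (3.0, 1.0),
    "ironic reading": (5.0, 4.0),
    "metaphorical reading": (6.0, 1.5),
}

# The hearer settles on the interpretation with the best
# benefit-to-effort balance.
best = max(candidates, key=lambda c: relevance(*candidates[c]))
print(best)  # → metaphorical reading
```

The sketch only dramatises the idea that interpretation is constrained jointly by expected cognitive gain and by the effort needed to derive it; Sperber and Wilson do not claim that hearers compute explicit numerical scores.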

Contra Grice, this heuristic mechanism for utterance interpretation does not presuppose that communication is a cooperative enterprise oriented toward the achievement of a common goal. Therefore, it might provide a more straightforward explanation of communicative interactions that are not prima facie cooperative (for example, examinations, adversarial communication, and so on). On the other hand, one might argue, the principles of relevance are so general and vague that they can hardly be falsified, and therefore might lack the desired explanatory power. Wilson (2017, pp. 84, 87) has responded to this objection by pointing out that the principles would be falsified if, for instance, utterance interpretation were systematically driven by considerations of informativeness, where the information derived from utterance interpretation is informative but not relevant.

Relevance Theory sees the heuristic mechanism as explaining how speakers and hearers come to select appropriate assumptions among the many that are manifest to them, which is something that Grice left unexplained (although Levinson 1989 queries the extent to which Relevance Theory itself offers an adequate explanation in this regard). Importantly, proponents of Relevance Theory take the heuristic mechanism to be at play not only in the derivation of implicatures, but more generally in any sort of pragmatic reasoning that leads communicators to retrieve the communicated content, where this stretches from reference assignment and ambiguity resolution to the adjustment of lexical meaning (for an overview, see Wilson (2017, section 4.4); for a critique, see Borg (2016)).

Finally, according to Relevance Theory, pragmatic reasoning is fundamentally an exercise in mindreading, which they understand as attributing and reasoning about mental states (Sperber and Wilson, 2012, Ch. 11). In this respect, Relevance theorists in general tend to take a radically mentalistic stance on how to best interpret Gricean explanations of pragmatic reasoning. However, as the next section shows, this mentalistic stance has itself come under attack.

b. Is the Intentionalist Assumption of the Priority of Thought over Language Plausible?

The first objection from psychological implausibility (§3.a) was that the reasoning about mental states posited in the Gricean model was too complex to play a central explanatory role in linguistic exchanges. Even if this objection is dismissed, however, a related but distinct concern emerges from the fields of developmental and comparative psychology. Infants and (more controversially) some non-human primates appear to be proficient communicators, yet it is unclear to what extent, if at all, they can reason about mental states. This concern threatens one of the central assumptions of the Intentionalist approach, namely that reasoning about mental states comes before the grasp of linguistic meaning. In other words, this second objection from psychological implausibility holds that the Intentionalist approach gets the priority between language and mental state concepts the wrong way around.

A first reason for thinking that pre-linguistic or non-linguistic creatures lack the ability to attribute mental states is that most of what we come to know or believe about other minds is based on our understanding of what others say. It is thus not obvious that children could acquire mental state concepts without already having access to this source of information (for a rich elaboration of this point from an empirical point of view, see Astington and Baird (2005), as well as Heyes (2018, Ch. 8)). Correlatively, the youngest age at which children apparently manifest an understanding of false beliefs, which is often regarded as the hallmark of the capacity for mental state attribution, is between 3 and a half and 4 years of age, namely when they master a language that contains mental state terms (see Wellman, Cross, and Watson (2001), but see also Geurts and Rubio-Fernandez (2013); for a general overview of the field, see Theory of Mind). If one goes for the lowest threshold and assumes that the critical age for false-belief understanding is 3 and a half years of age, there is still an important gap between the time at which infants start communicating flexibly and effectively (after their first birthday), and the time at which they reason about others’ mental states.

Moreover, evidence from atypical development is suggestive of the same developmental trajectory, with linguistic competency developing prior to mental state attribution. For instance, deaf children born of hearing parents, who are delayed in their acquisition of a mental state vocabulary, also manifest a delay in their ability to reason about others’ mental states. Importantly, as soon as they learn the relevant mental state terms, their performance on mindreading tasks improves, even on tasks which do not involve significant use of language (see, for example, Pyers and Senghas (2009)). Or again, at least some individuals on the autistic spectrum successfully acquire language, even though this condition is standardly held to involve impairment to the ability to attribute mental states to others (see, for example, references in Borg (2006, fn. 6)).

The challenge to the Intentionalist approach is thus that from an empirical point of view, linguistic competency seems to precede, and possibly to support, the development of abilities to reason about mental states (rather than the reverse picture assumed by Intentionalists). However, not all researchers working in developmental or comparative psychology accept this claim. In response to the challenge, advocates of an Intentionalist model have argued, first, that there are reasons to think pre-linguistic communication must itself be Gricean in nature (so that an ability to do Gricean reasoning must be present prior to language acquisition; see, for example, Tomasello (2008; 2019)). Second, they have argued that (contrary to the suggestion above) typically developing infants do show sensitivity to others’ mental states prior to the acquisition of language (see, for example, Onishi and Baillargeon (2005); for a useful overview and further references, see Rakoczy and Behne (2019)). While the literature on child development in this area is fascinating, it is also extensive and full consideration of it would unfortunately take us too far afield. This section closes, then, just by noting that the exact nature of infants’ mindreading abilities is still hotly debated and thus whether infants and animals prove to be a direct challenge to Intentionalism remains to be seen. One final point is worth noting, however: following up on the considerations presented in §3.a.i, it seems that a model of communication need not, in general, determine a univocal cognitive underpinning.
Therefore, in principle, there could be room for different specifications of the cognitive/motivational mechanisms that underlie communication, even if one grants that the relevant forms of communication must be Gricean in nature (this idea has been pursued by Richard Moore (2017b), who has proposed downplaying the cognitive requirements of Gricean communication in a way that might insulate the approach from developmental evidence of the kind alluded to in this section).

4. Rejecting Intentionalism

Although the previous two sections looked at a range of problems both for the Gricean model for understanding meaning and communication, and for the more general Intentionalist approach which Grice’s framework exemplifies, this article closes by touching briefly on two possible alternatives to the Intentionalist approach.

a. Teleological Approaches: Millikan

As noted above, Millikan’s work provides a clear statement of the objection that the Gricean model is psychologically implausible, but it is important to note that her objection to the Gricean programme emerges from her wider theory of meaning. Millikan’s ‘teleological semantics’ aims (like Grice’s approach) at giving a naturalistic explanation of meaning and communication. However, Millikan seeks to couch this in evolutionary terms which do not (contra Grice) presuppose mastery of mental state concepts. Instead, Millikan tries to show that both meaning and speakers’ intentions should be accounted for in terms of the more fundamental, teleological notion of ‘proper function’. (An alternative, equally important, variety of the teleological approach can be found in the work of Fred Dretske.) In very broad strokes, the proper function of a linguistic device, be it a word or a syntactic construction, is held to be the function that explains why that word or syntactic construction is reproduced and acted upon in certain ways. For instance, the proper function of a word like “dog” is “to communicate about or call attention to facts that concern dogs” (Millikan, 2004, p. 35) and it is the fact that speakers do use the word in this way, and hearers do come to think about such facts when they hear this word, that explains the continued use of the word. Or again, according to Millikan (1984, pp. 53-54), the indicative grammatical mood, which can take several different forms within and across languages, has the proper function of producing true beliefs, while the imperative mood has the proper function of producing compliance. When speakers utter a sentence in the imperative mood, they typically intend to produce compliance in their hearers, and this linguistic device has proliferated because often enough hearers do comply with imperatives.

Given this background, Millikan argues that communicative intentions are, by and large, not necessary for explaining linguistic communication. If you state that p, in Normal circumstances (where ‘Normal’ is a term of art that Millikan has tried repeatedly to specify) I will come to believe that p without needing to represent that you overtly intend me to believe that p. Indeed, according to Millikan (1984, pp. 68-70) language use and understanding happen, first and foremost, ‘automatically’, as ways to express (or act upon others’ expression of) beliefs and intentions. (In a later work, Millikan (2017) suggests that, although the phenomenology of the grasp of meaning may be that of an automatic process, there may nevertheless be some underlying inferential work involved.)

According to Millikan, we engage in the sort of mentalistic reasoning envisaged by Grice only when we somehow inhibit or exploit parts of the automatic processes for language production and understanding. A crucial aspect of Millikan’s argument is that only the mental states that we represent, and which are instantiated in some region of our brains, can be causally relevant to the production and understanding of linguistic utterances. In her view, mental states that we can easily and readily come to have on reflection, but that we do not use in performing a certain task, do not play any direct causal role and she argues that communicative intentions are, most of the time, dispositional in this sense.

b. Normative Social Approaches

A second major alternative to the Intentionalist approach has been offered by Robert Brandom (1994), who suggests that we explain linguistic communication, as well as semantic features of thought and talk, in terms of skilful participation in norm-governed social practices. The core idea is that it is possible to define practices that involve the use of linguistic signs (which Brandom terms ‘discursive practices’) as a species of norm-governed social practices, and that this definition can be given in non-semantic terms. Thus, Brandom argues that we can translate the normative dimension of basic discursive practices (which dictate what is permissible and what is not) into the inferential role of sentences in a language (which dictate which sentences are derivable, or compatible, or incompatible with which other sentences). For instance, a sentence like ‘New York is East of Pittsburgh’ entails, together with other background assumptions, the sentence ‘Pittsburgh is West of New York’. Brandom holds, then, that we can define the meaning of a sentence in terms of its inferential role within the discursive practices in which that sentence appears.

As Loeffler (2018, pp. 26-29) points out, Brandom does not offer a direct critique of Intentionalist accounts. However, one of the main motivations for exploring Brandom’s view is that it offers an account of how utterances can come to have conceptually structured content without presupposing that this content derives from the conceptually structured content of mental states.

Unlike Millikan’s approach, Brandom’s proposed explanation of content is non-naturalistic. Indeed, it makes crucial use of normative notions, chiefly those of commitment and entitlement. According to Brandom, the normativity of these twin notions cannot be accounted for naturalistically. Several theorists see the anti-naturalistic strand of Brandom’s proposal as highly objectionable (for further elaboration of the concerns underlying the objection, see Reductionism). However, this objection reflects general difficulties with naturalising normativity, and these difficulties are not specific to Brandom’s project. In fact, one might argue, Gricean conceptions of communication face an analogous problem, in that they explain linguistic communication as reasoning about mental states, an activity that also has an essentially normative dimension.

A key element of Brandom’s account (and of later, Brandom-inspired approaches, such as Geurts (2019); see also Drobnak (2021)) is the notion of a commitment. Commitments can be conceived of as ternary relations between two individuals and a proposition. To use one of Geurts’ examples, when Barney promises Betty that he will do the dishes, he becomes committed to Betty to act consistently with the proposition that Barney will do the dishes. On the other hand, on accepting Barney’s commitment, Betty becomes entitled to act on the proposition that Barney will do the dishes. As above, the notion of commitment is a normative one (if I commit myself to you to do the dishes, you are entitled to act on the proposition that I will do the dishes, and I can be held responsible if I fail to do the dishes). Moreover, the notion is non-mentalistic. Commitments can be undertaken either implicitly or explicitly, and one can undertake a commitment without knowing or believing that one has done so. Correlatively, one is entitled to act on someone’s commitments whether or not the agent knows that she is so entitled. Therefore, if we coordinate our actions by relying on the commitments that we undertake, we need not always attribute, or reason about, psychological states (for further elaboration and defence of this point, see Geurts (2019, pp. 2-3, 14-15)).

An important upshot of these considerations is that the commitment-sharing view of communication has the potential to account for pre-linguistic communicative interactions without presupposing much, if anything, regarding mindreading capacities, and this might constitute an explanatory advantage over Gricean approaches. In this respect, a promising line of inquiry would be to consider the notion of ‘sense of commitment’ elaborated by John Michael and colleagues (2016). Roughly, the idea is that an individual manifests a sense of commitment when, in the context of a joint action, that individual is motivated to act partly because she believes that other participants in the joint action expect her to do so. Michael and colleagues have proposed several strategies to spell out the notion of sense of commitment in detail, and to experimentally track the emergence and modulation of this aspect of human psychology from infancy. The unexplored potential of this framework for studying pre-linguistic communication becomes apparent if one considers communicative acts, like pointing gestures, as aimed at coordinating contributions to the joint action. Normative inferential approaches also hold out the promise of other advantages. For instance, theorists such as Geurts (2019) and Kukla and Lance (2009) argue that it can give us an attractive explanation of different speech act types (such as assertions, promises, directions or threats). Furthermore, Geurts (2019) emphasises that analysing speech acts in terms of commitments allows one to give a unified treatment of both conventional and non-conventional speech acts. This is an advantage over traditional Gricean pictures, in which conventional speech acts such as “I hereby pronounce you husband and wife” were considered non-communicative (see §2.d).

Finally, the approach might also be thought to yield a good account of the notion of common ground. Geurts’ (2019, pp. 15-20) proposal is to preserve the iterative structure of mutual knowledge (see §2.c), but to redefine common ground in terms of shared commitment. In a nutshell, all that is required for a commitment to be in place is that it is accepted, and when it is accepted, it thereby enters the common ground. If I commit to you to do the dishes, and you accept my commitment, we both become committed to act in a way that is consistent with the proposition that I will do the dishes. In other words, you too become committed to the proposition that I will do the dishes and as a result, for instance, you yourself will not do the dishes. Now, if I accept this commitment that you have as a result of accepting mine, I thereby undertake a further, higher-order commitment, that is, a commitment to you to the proposition that I am committed to you to do the dishes, and so on, as in the iterations that constitute mutual knowledge.

If the analysis of the notion of common ground in terms of shared commitments is tenable, it seems that there are good prospects for explaining pragmatic reasoning and linguistic conventions (on the subject of conventions, see Geurts (2018)). Regarding implicatures, Geurts observes that the same pragmatic reasoning that was proposed by Grice (see §1.b) can be cast in terms of commitments rather than psychological states:

It is common ground that:

(1) the speaker has said that p;
(2) he observes the maxims;
(3) he could not be doing this unless he was committed to q;
(4) he has done nothing to prevent q from becoming common ground;
(5) he is committed to the goal that q become common ground.

And so he has implicated that q. (Geurts, 2019, p. 21)

Although the schema is rather sketchy, it seems that it has the same explanatory capability as its Gricean counterpart. Of course, such a schema will be genuinely non-mentalistic only if all the elements in it have non-mentalistic descriptions and one might wonder whether the conversational maxims themselves can be rephrased in non-psychological terms. Geurts contends that the only maxim for which the reformulation might be problematic is the first maxim of quality (‘do not assert what you believe to be false’), since it is the only maxim that is cast in explicitly psychological terms. However, even in this case, he argues that the notions of belief and intention can be replaced without loss by appeal to the notion of commitments.

Of course, normative inferential approaches face independent challenges as well (for instance, many theorists have questioned whether commitments themselves can be spelt out without prior appeal to semantic content, while Fodor and Lepore (1992; 2001) famously objected to the holistic nature of such approaches). There may also be significant challenges to be faced by commitment-based accounts concerning how hearers identify exactly which commitments speakers have undertaken by their utterances, where this might be thought to require a prior grasp of linguistic meaning (in which case, to get off the ground, the normative inferential approach would need an independent account of how children acquire knowledge of word-meanings). However, if the problems for Intentionalist approaches which have been discussed here are ultimately found to hold good, then alternative approaches such as the normative inferential model will clearly deserve much further exploration.

5. Conclusion

This article began by asking how a gesture or utterance could have a meaning and how that meaning might come to be communicated amongst interlocutors. The starting point was the intuitively appealing idea that meaning and communication are tied to thought: an utterance (u) by a speaker (S) might communicate some content (p) to an audience (H) just in case p was the content S intended H to grasp. As §1 made explicit, spelling out this simple (Intentionalist) idea turns out to be rather complex, leading Grice, the most famous advocate of the Intentionalist approach, to a three (or more) clause definition of speaker-meaning which posited a complex set of (recursively specified) mental states. Grice’s model faces a range of objections. Opponents might query the necessity of Grice’s clauses (§§2.a-2.b) or argue that they are insufficient (§2.c); they might worry that the account fails to accommodate conventional speech acts (§2.d); or they might object to Grice’s proposed reduction of sentence-meaning to speaker-meaning (§2.e). Above and beyond these worries, however, it might also be objected that the starting premise—the idea that meaning and communication link inherently to thought—is mistaken, perhaps because such models can never be psychologically realistic (§3.a) or because they fail to cohere with developmental evidence about the relative priority of language acquisition over mental state reasoning (§3.b). Finally, two alternative approaches were surveyed, ones that seek to explain linguistic meaning and communication without assigning a constituent role to the content of thoughts: the teleological model advocated by Millikan (§4.a) and the normative-inferential model advocated by Brandom (§4.b). However, it is an open question whether these approaches can provide viable alternatives to the well-established Intentionalist account, since they may well face significant problems of their own.
How we should understand meaning and communication, then, remains unsettled.

6. References and Further Reading

  • Alston, William. 2000. Illocutionary Acts & Sentence Meaning. Ithaca, NY: Cornell University Press.
  • Apperly, Ian A. 2011. Mindreaders. The Cognitive Basis of “Theory of Mind”. Hove: Psychology Press.
  • Astington, Janet W., and Jodie A. Baird. 2005. Why Language Matters for Theory of Mind. Oxford: Oxford University Press.
  • Azzouni, Jody. 2013. Semantic Perception. How the Illusion of a Common Language Arises and Persists. Oxford: Oxford University Press.
  • Bach, Kent, and Robert Harnish. 1979. Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
  • Battich, Lucas, and Bart Geurts. 2020. “Joint Attention and Perceptual Experience.” Synthese.
  • Borg, Emma. 2006. “Intention-Based Semantics.” In The Oxford Handbook to the Philosophy of Language, by Ernie Lepore and Barry Smith, 250-266. Oxford: Oxford University Press.
  • Borg, Emma. 2016. “Exploding Explicatures.” Mind & Language 31 (3): 335-355.
  • Brandom, Robert. 1994. Making it Explicit: Reasoning, Representing and Discursive Commitment. Cambridge, MA: Harvard University Press.
  • Clark, Herbert. 1992. Arenas of Language Use. Chicago, IL: University of Chicago Press.
  • Clark, Herbert. 1996. Using Language. Cambridge: Cambridge University Press.
  • Coady, Cecil. 1976. “Review of Stephen R. Schiffer, Meaning.” Philosophy 51: 102-109.
  • Drobnak, Matej. 2021. “Normative Inferentialism on Linguistic Understanding.” Mind & Language.
  • Fodor, Jerry, and Ernest Lepore. 1992. Holism: A Shopper’s Guide. Oxford: Blackwell.
  • Fodor, Jerry, and Ernest Lepore. 2001. “Brandom’s Burdens: Compositionality and Inferentialism.” Philosophy and Phenomenological Research LXIII (2): 465-481.
  • Gauker, Christopher. 2003. Words Without Meaning. Cambridge, MA: MIT Press.
  • Geurts, Bart. 2018. “Convention and Common Ground.” Mind & Language 33: 115-129.
  • Geurts, Bart. 2019. “Communication as Commitment Sharing: Speech Act, Implicatures, Common Ground.” Theoretical Linguistics 45 (1-2): 1-30.
  • Geurts, Bart, and Paula Rubio-Fernandez. 2013. “How to Pass the False-Belief Task Before Your Fourth Birthday.” Psychological Science 24 (1): 27-33.
  • Geurts, Bart, and Paula Rubio-Fernandez. 2015. “Pragmatics and Processing.” Ratio XXVIII: 446-469.
  • Grice, Paul. 1957/1989. “Meaning.” In Studies in the Way of Words, 213-223. Cambridge, MA: Harvard University Press; first published in The Philosophical Review 66(3).
  • Grice, Paul. 1968/1989. “Utterer’s Meaning, Sentence-Meaning, and Word-Meaning.” In Studies in the Way of Words, 117-137. Cambridge, MA: Harvard University Press; first published in Foundations of Language 4.
  • Grice, Paul. 1969/1989. “Utterer’s Meaning and Intention.” In Studies in the Way of Words, 86-116. Cambridge, MA: Harvard University Press; first published in The Philosophical Review 78(2).
  • Grice, Paul. 1975/1989. “Logic and Conversation.” In Studies in the Way of Words, 22-40. Cambridge, MA: Harvard University Press; first published in Syntax and Semantics, vol.3, P. Cole and J. Morgan (eds.).
  • Grice, Paul. 1982/1989. “Meaning Revisited.” In Studies in the Way of Words, 283-303. Cambridge, MA: Harvard University Press; first published in Mutual Knowledge, N.V. Smith (ed.).
  • Grice, Paul. 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
  • Heyes, Cecilia. 2018. Cognitive Gadgets. The Cultural Evolution of Thinking. Cambridge, MA: Harvard University Press.
  • Kukla, Rebecca, and Mark Lance. 2009. ‘Yo!’ and ‘Lo!’: The Pragmatic Topography of the Space of Reasons. Cambridge, MA: Harvard University Press.
  • Levinson, Stephen. 1989. “A Review of Relevance.” Journal of Linguistics 25 (2): 455-472.
  • Lewis, David. 1969. Convention. A Philosophical Study. Cambridge, MA: Harvard University Press.
  • Loeffler, Ronald. 2018. Brandom. Cambridge: Polity Press.
  • Marr, David. 1982. Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. New York: W. H. Freeman and Company.
  • Michael, John, Natalie Sebanz and Günter Knoblich. 2016. “The Sense of Commitment: A Minimal Approach.” Frontiers in Psychology 6: 1968.
  • Millikan, Ruth G. 1984. Language, Thought and Other Biological Categories. New Foundations for Realism. Cambridge, MA: MIT Press.
  • Millikan, Ruth G. 2004. Varieties of Meaning. Cambridge, MA: MIT Press.
  • Millikan, Ruth G. 2017. Beyond Concepts: Unicepts, Language, and Natural Information. Oxford: Oxford University Press.
  • Moore, Richard. 2017a. “Convergent Minds: Ostension, Inference and Grice’s Third Clause.” Interface Focus 7(3).
  • Moore, Richard. 2017b. “Gricean Communication and Cognitive Development.” The Philosophical Quarterly 67: 303-326.
  • Morris, Michael. 2007. An Introduction to the Philosophy of Language. Cambridge: Cambridge University Press.
  • Neale, Stephen. 1992. “Paul Grice and the Philosophy of Language.” Linguistics and Philosophy 15: 509-559.
  • Noveck, Ira. 2012. Experimental Pragmatics: The Making of a Cognitive Science. Cambridge: Cambridge University Press.
  • Onishi, Kristine H., and Renée Baillargeon. 2005. “Do 15-Month-Old Infants Understand False Beliefs?” Science 308: 255-258.
  • Pickering, Martin J., and Simon Garrod. 2004. “Toward a Mechanistic Psychology of Dialogue.” Behavioral and Brain Sciences 27: 169-190.
  • Platts, Mark. 1997. Ways of Meaning: An Introduction to a Philosophy of Language. Cambridge, MA: MIT Press.
  • Pyers, Jennie E., and Ann Senghas. 2009. “Language Promotes False-Belief Understanding: Evidence from Learners of a New Sign Language.” Psychological Science 20 (7): 805-812.
  • Rakoczy, Hannes, and Tanya Behne. 2019. “Commitment Sharing as Crucial Step Toward a Developmentally Plausible Speech Act Theory?” Theoretical Linguistics 45 (1-2): 93-97.
  • Schiffer, Stephen. 1972. Meaning. Oxford: Oxford University Press.
  • Searle, John. 1965. “What Is a Speech Act?” In Philosophy in America, by Max Black, 221-240. London: George Allen & Unwin Ltd.
  • Sperber, Dan, and Deirdre Wilson. 1995. Relevance: Communication and Cognition. London: Blackwell.
  • Sperber, Dan, and Deirdre Wilson. 2012. Meaning and Relevance. Cambridge: Cambridge University Press.
  • Stalnaker, Robert. 2002. “Common Ground.” Linguistics and Philosophy 25: 701-721.
  • Stalnaker, Robert. 2014. Context. Oxford: Oxford University Press.
  • Strawson, Peter. 1964. “Intention and Convention in Speech Acts.” The Philosophical Review 73 (4): 439-460.
  • Tomasello, Michael. 2008. Origins of Human Communication. Cambridge, MA: MIT Press.
  • Tomasello, Michael. 2019. Becoming Human. A Theory of Ontogeny. Cambridge, MA: Harvard University Press.
  • Warner, Richard. 2001. “Introduction.” In Aspects of Reason, by Paul Grice, vii-xxviii. Oxford: Oxford University Press.
  • Wellman, Henry M., David R. Cross, and Julanne Watson. 2001. “Meta-Analysis of Theory-of-Mind Development: the Truth about False Belief.” Child Development 72(3): 655-684.
  • Wilby, Michael. 2010. “The Simplicity of Mutual Knowledge.” Philosophical Explorations 13(2): 83-100.
  • Wilson, Deirdre. 2017. “Relevance Theory.” In The Oxford Handbook of Pragmatics, by Yan Huang, 79-101. Oxford: Oxford University Press.


Author Information

Emma Borg
Email: e.g.n.borg@reading.ac.uk
University of Reading
United Kingdom

and

Antonio Scarafone
Email: A.Scarafone@pgr.reading.ac.uk
University of Reading
United Kingdom

and

Marat Shardimgaliev
Email: m.shardimgaliev@pgr.reading.ac.uk
University of Reading
United Kingdom

Giordano Bruno (1548—1600)

Giordano Bruno was an Italian philosopher of the later Renaissance whose writings encompassed the ongoing traditions, intentions, and achievements of his times and transmitted them into early modernity. Taking up the medieval practice of the art of memory and of formal logic, he focused on the creativity of the human mind. Bruno criticized and transformed the traditional Aristotelian theory of nature and helped revive atomism. His advocacy of Copernicanism and his claim that there is an infinite number of worlds were innovative. In metaphysics, he elevated the concepts of matter and form to absolutes so that God and creation coincide. Bruno also advocated a version of pantheism, and he probed the powers that shape and develop reality, including occult forces that traditionally belong to the discipline of magic. Most of his theories were rendered obsolete in their details by the rise of early modern empiricism; nevertheless, modern rationalism, which explored the relation between mind and world, and the modern critique of dogmatic theology were both influenced by Bruno’s philosophy.

Bruno was born in 1548 in southern Italy. He was educated in Naples, first by freelance teachers, then at the Dominican convent of San Domenico Maggiore. After giving early indications of a provocative and critical attitude to Church teachings, he started the life of a migrant scholar that led him to Switzerland, France, England, and Germany. Throughout his travels, he continually tried to secure a position at a university, and he was frequently supported by monarchs and princes. In Venice he was denounced as a heretic by his host, a Venetian patrician, and was interrogated by the Inquisition, first in Venice and then in Rome, where he was burned as an unrepentant heretic in 1600.

Bruno’s death at the will of the Catholic Church was immediately perceived as emblematic of freedom of thought in the face of dogmatic intolerance, especially owing to an eyewitness report from the stake that spread via Protestant circles. John Toland, a freethinker himself, made Bruno a hero of anti-Christian propaganda. Bruno probably influenced Baruch Spinoza in his alleged pantheism, if not his atheism. He thus aroused the interest of defenders and critics of pantheism in the 18th and 19th centuries, until he was rediscovered as a critical thinker in his own right, one who broke with medieval traditions and paved the way to modern idealism.

Table of Contents

  1. Life and Works
    1. Life
    2. Works
  2. Major Philosophical Themes
    1. Epistemology, Art of Memory, Theory of Spirit
    2. Physics and Cosmology
    3. Metaphysics and Mathematics
    4. Religion and Politics
  3. Reception
  4. References and Further Reading
    1. Bruno Editions
    2. Other Primary Sources
    3. Secondary Sources
    4. Online Resources

1. Life and Works

a. Life

Bruno was born in February 1548 in the historic town of Nola, a cultural center in Campania, Southern Italy (about 25 km northeast of Naples). The following summary of his life aims to introduce philosophical themes as they developed throughout his career. Most details of his early life are known through Bruno’s depositions at the Inquisition trial (Firpo 1993; Mercati 1988; Spampanato 2000; Canone 1992; 2000). He was given the name Filippo. His father Giovanni Bruno was in military service, and his mother was Fraulisa Savolino. Bruno began his education in his home town of Nola, and at age 15 he went to Naples, where he studied with private and public teachers. One of them was Giovanni Vincenzo Colle, known as Il Sarnese (d. 1574); another was Theophilus Vairanus, a friar of the order of the Augustinians, who later taught at an Augustinian school in Florence, became a professor at the University of Rome, and died in Palermo in 1578. Colle, who defended the philosophy of Averroes against Hieronymus Balduinus (Destructio destructionum dictorum Balduini, Naples 1554), might have introduced the student to Averroism. Vairanus probably taught the young student Augustinian and Platonic approaches to wisdom and might have inspired him to name some interlocutors in his Italian dialogues Teofilo.

At this early stage, Bruno also started studying the art of memory by reading Peter of Ravenna’s book Foenix, which likens the art of memory to the combination of paper and letters: images are ‘placed’ on an ideal chart so that the memorized content can be recalled in a quasi-mechanical way. Probably in order to finance his further education, Bruno entered the convent of the Dominicans in Naples, San Domenico Maggiore, a center of the order where Thomas Aquinas had once resided and an early stronghold of Thomism. Bruno acquired the name Giordano, which he maintained, with few exceptions, for the rest of his career in spite of his conflicts with the Church. After being consecrated a priest and enrolling in the study of theology, Bruno concluded his studies in 1575 with two theses that confirm the scholastic/Thomist character of the curriculum: “It is all true what Thomas Aquinas teaches in the Summa contra Gentiles” and “It is all true what the Magister Sententiarum says [Peter Lombard in his Sentences]”. Early during his novitiate, Bruno raised suspicion by giving away images of saints, specifically one of St. Catherine of Siena and perhaps one of St. Antonino, while keeping only a crucifix. Around the same time, he scolded a fellow novice for reading a pious book in praise of the Virgin Mary. In view of his later productions, this incident can be read as indicating Bruno’s critique of the cult of saints, which makes him appear sympathetic to Protestantism. The accusation, however, was soon dropped. Nevertheless, about ten years later, in 1576, a formal process was opened that returned to the earlier incident and added new accusations concerning the authority of the Church Fathers and the possession of prohibited books by Chrysostom and Jerome containing commentaries by Erasmus of Rotterdam. This amounted to his excommunication.
As a student in Naples, Bruno might have learned of Erasmus through a group of local heretics, the Valdensians, who adhered to the teachings of Juan de Valdes (d. 1541), who questioned the Christian doctrine of the Trinity. Bruno argued with another friar about scholastic argumentation and the possibility of expressing theological themes in other forms of argumentation. He adduced as an example Arius (an ancient heretic who denied the dual nature of Jesus as man and God), and the resulting investigation touched upon the essentials of Catholic theology. Bruno traveled to Rome, probably defending his case at the Dominican convent Santa Maria sopra Minerva (the center of the Inquisition, in charge of approving or prohibiting books), and then left the Dominican order, thus starting his career as a wandering scholar.

Bruno traveled through northern Italy (Genoa, Turin, Savona, and other places) and allegedly published in Venice a book, De segni de tempi (Signs of the times), now lost, which might have dealt with astronomy and meteorology in Italian, for he reported having lectured around that time on the popular astronomy textbook Sphaera by the 13th-century author Johannes de Sacrobosco. In 1579 Bruno arrived in Geneva, a preferred destination of religious refugees, albeit a fortress of Calvinism. After working at a printing press, Bruno enrolled at the university and published a pamphlet against Antoine de la Faye, then professor of philosophy and a protégé of Theodore Beza, the eminent theologian who was John Calvin’s successor. The content of the pamphlet is unknown, but as the result of a formal trial, Bruno was excommunicated from the Reformed Church and had to apologize. He left Geneva and moved to Toulouse, where he stayed from 1579 through 1581, again lecturing on the Sphaera and also on Aristotle’s doctrine of the soul. In Toulouse, Bruno also met the Portuguese philosopher Francisco Sanchez (d. 1623), who dedicated to Bruno his new book Quod nihil scitur (Nothing is known), a manifesto of modern skepticism that upends traditional scholastic reliance on logical argument. Bruno shared Sanchez’s critique of scholastic Aristotelian logic but trusted the potency of the human intellect; he therefore despised Sanchez, a sentiment confirmed by a note in his copy, where he calls him a “wild ass.” France was troubled by confessional struggles between Huguenots (Reformed) and Catholics; when these tensions erupted, Bruno had to leave Toulouse and moved to Paris, where he hoped to impress King Henry III.

In Paris in 1582, Bruno published and dedicated to the King two of his first works, which treated the art of memory in his peculiar way; the subject seems to have interested the monarch. De umbris idearum (The shadows of ideas) is a theory of mind and reality; the annexed Ars memoriae (Art of memory) applies it to the construction of mental contents. At the same time, Bruno dedicated to Henri d’Angoulême the Cantus Circaeus (Circe’s chant), Circe being a mythological figure who draws humanity out of animal appearance and who, again, practices memory. In political terms, with these dedications the philosopher opted for the Catholic faction. The King, interested in the theory of memory, offered the guest a provisional lectureship. Also in Paris, Bruno published a comedy, Il candelaio (Chandler). With letters from the French King, Bruno came to the embassy of Michel de Castelnau in London, where he stayed, close to the court of Queen Elizabeth, from 1583 through 1585.

England was in a complex political situation, given the tensions between the Protestant Queen Elizabeth of England and the Catholic King Philip II of Spain, each of whom had to deal with religious dissension in his own kingdom, with France mediating between them. Bruno, who was not the only Italian dwelling in England at the time, befriended courtiers and intellectuals such as Philip Sidney and John Florio and, vying for recognition and stability in London, produced six works in the Italian language (as was fashionable at court), commonly called the Italian Dialogues, the best-known part of his literary and philosophical legacy. Nevertheless, when a Polish diplomat, Albert Laski, was sent to Oxford to visit the university, Bruno joined him with the intent of finding an academic position. He debated and gave lectures on various topics but eventually was ridiculed and accused of plagiarism. He thus left Oxford and returned to London, before heading to Paris when Castelnau was recalled to France.

In Paris, Bruno befriended Fabrizio Mordente and first promoted, then chastised, his geometrical work; this was the first time Bruno entered the field of mathematics. His interest was directed towards the ontological truth and practical application of geometry, which he was to discuss in his works on physics and epistemology. He also published a summary of Aristotle’s Physics arranged according to mnemotechnical principles (Figuratio physici auditus), thus showcasing his competence in scholastic philosophy. At the Collège de Cambrai, the institution for teachers sponsored by the King, Bruno arranged a public disputation in May 1586. As was customary at the time, a student of Bruno’s presented a number of the teacher’s theses, which were directed against Aristotle. These were printed at the same time and dedicated to King Henry III as Centum et viginti articuli de natura et mundo adversus Peripateticos (One hundred and twenty theses on nature and the world, against the Aristotelians), and reprinted in 1588 as Camoeracensis acrotismus (Cambrai lecture). The debate became tumultuous; Bruno left the lecture hall immediately and was next seen in Germany at the Calvinist University of Marburg.

Still in 1586, Bruno started as a private lecturer at Wittenberg University, the center of Lutheran education, where, among others, Philip Melanchthon had taught. The lectures covered mostly Aristotelian philosophy; some of them were published in the 19th century based on transcripts by a student. Bruno also published several works that apply the Lullian method to principles of research under the heading lampas (lamp) and composed a major work, Lampas triginta statuarum (Torch of thirty statues), a cosmology according to Lullism and the art of memory, in which every part of reality is scrutinized by thirty categories.

In 1588, Bruno left Wittenberg on account of the rising influence of Calvinists. He delivered a programmatic “Farewell Lecture” (Oratio valedictoria) and subsequently sought a position in Prague, where Rudolf II of Habsburg entertained a court of scholars and scientists, later to include the astronomers Tycho Brahe (d. 1601) and Johannes Kepler (d. 1630). In Prague, Bruno published a Lullian study and dedicated to the Emperor a critique of mathematics (Articuli adversus mathematicos), without professional success. He next moved on to Helmstedt, again a Lutheran university, where he stayed from January 1589 through mid-1590. Here, Bruno garnered a third excommunication, because the Lutheran pastor appears to have detected some heresy in his thought (Omodeo 2011). While in Helmstedt, Bruno worked on his trilogy, soon to be published in Frankfurt, and on several works dealing with the occult sciences and magic.

In 1590 Bruno traveled to Frankfurt, where he published a trilogy of poems in hexameter (on the model of Lucretius) with prose commentaries that encompass his philosophy: De minimo, on the infinitely small; De monade, a theory of monads that are both metaphysical and physical minimal parts or atoms; and De immenso, the theory of the infinity of magnitude and of innumerable worlds. From Frankfurt, Bruno traveled for a short time to Zurich, where he delivered lectures on the principles of metaphysics, later published as a compendium of metaphysical terms (Summa terminorum metaphysicorum). While in Frankfurt, Bruno received letters from a patrician in Venice, Giovanni Mocenigo, inviting him to give private lectures on the art of memory. Soon after he arrived in Venice in 1591, Bruno proceeded to the Venetian university in Padua to lecture on geometry, hoping to find a permanent academic position. This attempt failed (Galileo Galilei later obtained the position), and he returned to Venice. His sponsor Mocenigo, however, was dissatisfied with Bruno’s services (most likely, he expected some magical practice) and denounced him to the Inquisition as a heretic in May 1592. In early 1593, Bruno was transferred to the Inquisition in Rome, where interrogations and investigations continued. His books were censured for heretical content. Among other things, the accusations regarded the eternity of creation, the equivalence of divine and created powers, the transmigration of the human soul, the soul as the form of the human body, the motion of the earth in relation to the teaching of the Bible, and the multitude of worlds. Pope Clement VIII ordered that torture not be used, as the case of heresy was proven, and Cardinal Robert Bellarmine, the author of a history of heresies (De controversiis, 1581-1593), presented a list of eight heretical propositions the defendant had to recant.
Only two of the propositions are known: one questioning the sacrament of reconciliation, the other the theory of the soul as the helmsman of the body. Both tenets challenge the Christian doctrine of the individual soul and its afterlife. Bruno was declared a heretic, formally “unrepentant, pertinacious, and obstinate”, and was thus delivered to the secular authorities, who burned him at the stake on Campo de’ Fiori in Rome on 17 February 1600.

b. Works

The standard editions of Bruno’s works are the 19th-century collection of his Latin writings initiated by Francesco Fiorentino (Bruno 1962) and the Italian dialogues edited by Giovanni Gentile and Giovanni Aquilecchia (Bruno 1958). These texts are also available online (see References below). Besides many separate text editions, a commented collection of the Latin works with Italian translations is in progress under the direction of Michele Ciliberto (Bruno 2000a; 2001; 2009; 2012). Bilingual editions of the Italian works with extensive commentaries were published with French translation under the direction of Giovanni Aquilecchia (Bruno 1993-2003) and with German translation under the direction of Thomas Leinkauf (Bruno 2007-2019).

Bruno’s works have unusual but meaningful titles. They are listed here in chronological order of publication or – for works published posthumously – of composition. No original manuscripts are extant. The list will help readers find them as they are mentioned in the following text.

  • De umbris idearum (The shadows of ideas) 1582
  • Cantus Circaeus ad memoriae praxim ordinatus (Circe’s chant applied to the practice of memory) 1582
  • De compendiosa architectura et complemento artis Lullii (Comprehensive construction and complement to the art of Lull) 1582
  • Candelaio (Chandler) 1582
  • Ars reminiscendi (Art of memory; reprint of dialogue II of Cantus Circaeus) 1583
  • Explicatio triginta sigillorum et Sigilli sigillorum (Unfolding of the 30 sigils and the sigil of sigils) 1583
  • London dialogues in Italian:
    • La cena de le ceneri (The Ash Wednesday supper) 1584
    • De la causa, principio e uno (Cause, principle, and one) 1584
    • De l’infinito, universo e mondi (The infinite, universe, and worlds) 1584
    • Spaccio de la bestia trionfante (Expulsion of the triumphant beast) 1584
    • Cabala del cavallo Pegaseo con l’aggiunta dell’Asino Cillenico
      (Cabal of the Pegasus horse with the ass of Cyllene) 1585
    • De gli eroici furori (Heroic frenzies) 1585
  • Figuratio aristotelici physici auditus (Arrangement of the Physics of Aristotle) 1586
  • Dialogi duo de Fabricii Mordentis Salernitani prope divina adinventione (Two dialogues on Fabrizio Mordente’s almost divine invention) 1586
  • Dialogi. Idiota triumphans. De somnii interpretatione. Mordentius. De Mordentii circino (Dialogues: The triumphant idiot; Interpretation of a dream; Mordente; Mordente’s compass) 1586
  • Centum et viginti articuli de natura et mundo adversus Peripateticos (One hundred and twenty theses on nature and world, against the Aristotelians) 1586
  • De lampade combinatoria lulliana (Torch of Lullian combinatorics) 1587
  • De progressu et lampade venatoria logicorum (Procedure and searching torch of logicians) 1587
  • Artificium perorandi (The art of persuasion) 1587
  • Animadversiones circa lampadem lullianam (Advice regarding the Lullian torch) 1587
  • Lampas triginta statuarum (Torch of thirty statues) 1587
  • Oratio valedictoria (Farewell speech) 1588
  • Camoeracensis Acrotismus seu rationes articulorum physicorum adversus Peripateticos (Cambrai lecture, or arguments for the theses in physics against the Aristotelians) 1588
  • De specierum scrutinio et lampade combinatoria Raymundi Lullii (Investigation of species and Lullian combinatory torch) 1588
  • Articuli centum et sexaginta adversus huius tempestatis mathematicos atque philosophos (One hundred and sixty theses against mathematicians and philosophers of our times) 1588
  • Libri physicorum Aristotelis explanati (Aristotle’s Physics explained) 1588
  • Oratio consolatoria (Funeral speech) 1589
  • De rerum principiis, elementis et causis (Principles, elements, and causes of things) 1589-1590
  • De magia (Magic) 1589-1590
  • De magia mathematica (Mathematical magic) 1589-1590
  • Medicina lulliana (Lullian medicine) 1590
  • Frankfurt Trilogy
    • De triplici minimo et mensura (The threefold minimum and measure) 1591
    • De monade, numero et figura (Monad, number, and shape) 1591
    • De innumerabilibus, immenso et infigurabili (The innumerable, immense, and shapeless) 1591
  • De imaginum, signorum et idearum compositione (The composition of images, signs, and ideas) 1591
  • Summa terminorum metaphysicorum (Compendium of metaphysical terms) 1591
  • Theses de magia (Theses on magic) 1591
  • De vinculis in genere (Bonds in general) 1591
  • Praelectiones geometricae (Lectures in geometry) 1591
  • Ars deformationum (Art of geometrical forms) 1591

2. Major Philosophical Themes

Giordano Bruno’s philosophical production spanned only ten years. One is therefore not likely to detect major phases or turns in his development. While it is certainly possible to differentiate certain subdisciplines of philosophy in his work (epistemology, physics, metaphysics, mathematics, natural theology, politics), what is typical of his many writings is that almost all themes are present in all the treatises, dialogues, and poems. It is therefore feasible to outline his theories through individual works in which one aspect is prevalent, provided we remain aware of the interconnection with the rest.

a. Epistemology, Art of Memory, Theory of Spirit

Bruno entered the professional stage with his De umbris idearum (The Shadows of Ideas), which contains his lectures on mnemotechnic, introduced by a dialogue between Hermes and a philosopher that explains its underlying philosophy. Hermes Trismegistus was a legendary Egyptian sage, the alleged source of Plato and others, whose spurious writings became fashionable in the Renaissance for seemingly reconciling Christian with pagan wisdom. Bruno was one of the promoters of Hermeticism. The book explains the purpose of mnemotechnic, or the art of memory. In this tradition, memory was not merely the subjective experience of remembering things but a specific faculty of the human mind that can be trained and directed. The art of memory consists in creating an internal writing that represents ideas, as though internal cognition were the shadow of the absolute ideas. Ideas are only worth pursuing if they are real. Bruno endorses strains of the Neoplatonic tradition according to which the Platonic Forms are not more real than the visible world but equally present in that world. In that vein, Bruno explains that the metaphor of shadow refers not to darkness (as one might think) but to the vestige of light and, vice versa, the trace of reality in the ideal. Shadow is open to light and makes light accessible. Consequently, while human understanding is only possible by way of ‘shadows,’ that is, wavering between truth and falsehood, all knowledge is apt to be either deformed or improved towards the good and true. In an analogy from physics: just as matter is always informed by some form, form can take on various kinds of matter; in this sense, the intellect, including memory, can be informed by any knowledge. If it is true that whatever is known is known by way of ideas, then any knowledge is mediated by something else. This mediation is the task of the art of memory.
Bruno elaborates on these principles in repeated sequences of thirty chapters, headed as ‘intentions’ and ‘concepts,’ which reiterate the pattern that everything is, in a way, a shadow of everything else. Based upon the universal and unbreakable concordance of things (horizontally and across layers), memory does nothing other than connect contents that on the surface are distinct. To make this approach plausible, Bruno invokes the ancient and Renaissance Platonists, as well as Kabbalah, which have in common that they see any one thing as referencing everything else and truth as such. One example, borrowed from Nicholas of Cusa, is a line erected at a right angle upon a base: when the line inclines towards the base, it not only makes the angle acute, it creates at the same time an obtuse angle, so that both imply each other mutually, and in that sense the different is equal. In arguing this way, Bruno expressly abandons the traditional philosophical standards of determination and classification of the Aristotelian school; he dismisses them as ‘merely logical’ while postulating, and claiming to construe, a harmonious unity and plurality that is at once conceptually correct and governs reality.

All this gives a philosophical explanation of the technique of memorizing that had dominated Bruno’s reputation throughout his career, from his early lectures up to his invitation to Venice. Memory is the constructive power of the soul inherited from the life-giving principle of the world, and thus it orders, understands, and senses reality. Artifice and nature coincide in their mutual principles; the internal forces and the external perception correlate. Traditional mnemotechnic construed concentric wheels with letters of the alphabet, which represented images and contents; such wheels provided means to memorize contents quasi-mechanically. Bruno also endorses this technique with the philosophical argument that it applies the universal harmony in diversity that structures the world and recreates its intelligibility. The psychological experience of memorizing content consists of concepts, triggers, passive and active evocation of images, and ways of judgment. This art of memory claims to conform with both metaphysical truth and the creative power of the mind. At the same time, on the concentric circles are tokens that actualize the conversion of anything intended into anything, because, as stated, imagination is not plainly abstract but vivid and alive. Creating schemata (‘figuring out’) for memorization is an art and yet not artificial in the negative sense; it is the practical execution or performance of reality. Here is an example from De umbris idearum: suppose we need to memorize the word numeratore (numberer). We take from the concentric wheels the pairs NU ME RA TO RE. Each pair is represented by an image, and together they form a sentence: NU=bee, ME=on a carpet, RA=miserable, and so on. This produces a memorizable statement describing an image: ‘A bee weaves a carpet, dressed in rags, feet in chains; in the background a woman holding out her hands, astride a hydra with many heads.’ (Bruno 1991, LXV).
This appears far-fetched, but less so if we consider that any single change of one of the pairs yields a different statement supported by the same imagined picture. Such a picture, though artificial, allows for a smooth transition from concept to concept; and such concepts capture the complexity of reality.
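The pairing scheme of this example can be rendered as a short sketch, purely for illustration. The lookup table below is hypothetical (only the three pairs glossed above are filled in), and the function names are ours, not Bruno’s; only the decomposition of a word into two-letter pairs follows the text.

```python
# Illustrative sketch of the wheel-based encoding described above.
# NOTE: the pair-to-image table is hypothetical; only the three pairs
# glossed in the text (NU, ME, RA) are filled in. Bruno's actual wheels
# assigned an image to every pair in a full combinatorial grid.

def split_pairs(word):
    """Break a word into two-letter pairs, as the concentric wheels do."""
    w = word.upper()
    return [w[i:i + 2] for i in range(0, len(w), 2)]

# Hypothetical lookup standing in for the wheels' image assignments.
IMAGE_WHEEL = {
    "NU": "a bee",
    "ME": "on a carpet",
    "RA": "miserable (dressed in rags)",
}

def encode(word):
    """Return the pairs and the image assigned to each pair."""
    pairs = split_pairs(word)
    return pairs, [IMAGE_WHEEL.get(p, "<unassigned image>") for p in pairs]

pairs, images = encode("numeratore")
# pairs -> ['NU', 'ME', 'RA', 'TO', 'RE']
```

Replacing any single pair changes only its image, leaving the rest of the composite picture intact, which is what makes the scheme systematically extensible.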

Without making any claims in this direction, Bruno practices what today is called semiotics, namely, the discipline that approaches understanding reality from the perspective of designating and intellectually processing. That is clear from the title of his last book on memory, De imaginum, signorum et idearum compositione (The composition of images, signs, and ideas). Although this discipline deals with signification and its methods, it still relies on depicting reality as though it were ‘digesting’ it. A key concept in Bruno’s epistemology and metaphysics of memory is conversion (convertere, conversio). The purposefully arranged images that support remembrance are effective because one sign must be converted into a referent, and images convert according to schemata into new images or signs and representations. This is exercised in all of Bruno’s mnemotechnic works. Such transformations might appear arbitrary but are based on the constant transformation of psychic states and physical reality and on the intellectual activity of turning attention to an object. Love is an example of this conversion: profiting from Marsilio Ficino’s theory of love, Bruno claims that love not only dominates the relations between humans, and between God and humans, but is also the force that organizes the living and non-living world. This is only possible because love means that any of these can take on the essence of any other so that the loving and the beloved convert mutually into each other (Sigillus Sigillorum n. 158; Bruno 2009, 2:256, 472).  Bruno’s interest in the art of memory was fueled by his Platonizing metaphysics, which seeks the convergence of the universe in one principle and the convertibility of truth into understanding. For this purpose, he also invoked the (late ancient) tradition of Hermeticism as visible in Hermes as the messenger of the philosophical principles and in making the sorceress Circe speaker of a treatise on memory. 
During his time in Germany, Bruno produced texts and excerpts on magic, on spellbinding (De vinculis), universal principles, mathematical and medical application of Lullism, and on the cosmology of the soul (Lampas triginta statuarum). All of them have the idea of transformation at their basis. Before modern empiricism and science based on mathematical calculus and projections, natural magic was a reputable discipline, which investigated the invisible ‘occult’ forces that drive events in nature, including spiritual powers. Bruno contributed to magical theories by raising the question of how those forces are related to empirical and metaphysical certainties. In his notes on magic, Bruno likens the interconnectedness of things to the ‘conversation’ in human languages by way of translation as transformation: in an analogous way, magical signs and tokens are hints that make understanding divinity possible (De magia, Bruno 2000a, 192–94 [1962 vol. III, 412]).

Bruno became best known for his Copernicanism and his end as a heretic. However, Bruno’s epistemology of memory, his cosmology, and his interest in magic all converge in the project of a universal theory of everything that, by definition, posits the unity of thought, existence, and objectivity. This can be seen in two literary descriptions of subjectivity and objectivity. In the Heroic Frenzies, Bruno narrates the myth of Actaeon, who, in search of the goddess Diana, is turned into a deer and eaten by his own dogs. The dogs allegorize the human intellect and the convergence of knowledge and object, a dissolution in which knowledge achieves its end. The other example can be found in Bruno’s De immenso: Bruno remembers his childhood experience that from the mountain Cicala of his home town Vesuvius looked menacing, but from Vesuvius the familiar mountain looked alien. From this he concluded that the center of the world is wherever one stands and that the world has no physical boundaries (Bruno 1962, vol. I 1, 313-317). His cosmology is based on an epistemology that aims at overcoming the divide between theory, practice, and objective truth.

b. Physics and Cosmology

Bruno’s fame as a heretic was closely linked to his opposition to Aristotelian physics and his endorsement of Copernicanism, which was particularly pronounced in three of the Italian dialogues and in the Frankfurt trilogy. Since Nicolaus Copernicus had introduced a planetary system in which the earth orbits the sun—as opposed to the astronomy of Ptolemy, which explained the movement of planets with circles and epicycles around the earth—Bruno discusses the question of whether this is only a mathematical model or reality. He points out that both models are mutually convertible, with the sun taking the place of the earth. However, what is preferable is not what is easier to calculate, more plausible, or more traditional (indeed, all this is the case in the Copernican model, including its reliance on ancient philosophy) but what is compelling on all fronts. If it is true that the earth moves, it must be so because it is “possible, reasonable, true, and necessary” (Ash Wednesday Supper III, Bruno 2018, 153; 1958, 131). The emphasis lies on the combination of possibility and truth; it does not suffice for a theory to be, in a way, plausible and true; it also has to be necessary, so that what is possible is also real. If the planetary orbs are not mere hypothetical objects of mathematical calculation, philosophy has to accept them as real movements and explain how this motion comes about. Whatever changes is driven by an effective principle, which is necessarily internal; and that maxim applies to the stars as well as to magnetism and animal sex (Ash Wednesday Supper III, Bruno 2018, 123; 1958, 109). Copernicanism, for Bruno, is not only one segment of knowledge about reality; it is symptomatic of how human understanding of the powers of the world works.
Astronomy exemplifies that mathematics is more than calculus; it is the real structure of the world (as the Pythagoreans had taught), and in being intellectual it has to be a reality that transcends the material and is inherent in everything.

In this context, Bruno equates the world to a machine and to an animal: as any living being is composed of distinct parts, which would not exist without the whole, so is the world one diverse organism that is composed of distinct parts. When he returns to Copernicus and discusses astronomy in great detail and with some modifications in his De immenso, Bruno reiterates that the universe is one, to the effect that there cannot be any first mover (beyond the ultimate sphere of Aristotelian astronomy); rather, the earth and everything else is animated, with the soul as the center of every part (De immenso III 6, Bruno 1962, vol. I 1, p. 365). In Aristotelian natural philosophy, the soul was the incorporeal principle of movement of living bodies. Bruno transfers and applies this notion to the universe. Hence it follows for him that there is a world soul, that the heavenly spheres are animated, that all planets are alike (that is, the earth is as much a planet as any other), that the number of spheres and suns is innumerable or even infinite, and that nature and God are identical insofar as God is omnipresent. This is the reason why Bruno famously extended the Copernican ‘closed world’ to an open and infinite universe. Copernicus had made the sun the center of the universe and assigned the earth and the planets their orbits accordingly, but he did not expressly deny the Aristotelian theory that the world is finite in extension. Bruno went a step further and inferred from the equivalence of all planets and the infinite power of God that the universe must be infinite. God, causation, principles, elements, active and passive potencies, matter, substance, form, etc. are all parts of the One and are distinguished only by logical conceptualization, as is inevitable in human discourse (De immenso VIII 10, Bruno 1962, vol. I 2, p. 312).
These cosmological ideas have led later readers to the interpretation that Bruno was a pantheist, identifying God and nature, both being the whole of the universe; readers could also speak of atheism if he meant that God is nothing but a name for natural mechanisms. The terms ‘atheism’ and ‘pantheism’ were coined later, but these possible interpretations did in fact dominate the reception of Bruno from the mid-18th century onward in relation to Baruch Spinoza, while others insisted that God’s absolute immanence admits of some sort of transcendence and distinction from the finite world (see below, section 3).

c. Metaphysics and Mathematics

To consolidate this novel approach to traditional themes, Bruno had to rearrange philosophical terminology and concepts. In his De la causa, he addressed the scholastic philosophy of cause and principle, matter and form, substance and accident, and also one and many. In Aristotelian causality, finality was the dominating force, and, in Christian thought, it had been identified with God, who governs the world. Bruno correlated universal finality with the internal living power and controlling reason in all things. Accordingly, if God is usually understood as beyond the world and is now identified as the internal principle, the distinction between internal and external causation vanishes. Bruno uncovers the conceptual problems of Aristotelian causality, which includes matter and form as two of the principles: if they are only descriptors of things, they are not real, but if they are supposed to be real, they must correspond to the extent that there is no matter without form, no form without matter, and both are co-extensive. Prime matter in school philosophy is either nothing (prope nihil, for lack of form) or everything and the receptacle of all forms. What logically must be kept distinct, such as form and matter or the whole and its parts, is metaphysically one and also as infinite as all potentialities. Bruno closes his dialogue on Cause, Principle, and the One with an encomium of the One. Being, act, potency, maximum, minimum, matter and body, form and soul – all are one, which harkens back to Neoplatonist themes. However, in the fifth dialogue, Bruno challenges this praise of unity by raising the question of how it is at all possible to have individual or particular items under the canopy of oneness. He pronounces the adage “It is profound magic to draw contraries from the point of union”; in other words: how is plurality at all possible if all is one?

In his Frankfurt trilogy, Bruno unfolds the interconnection of nature, understanding, metaphysics, and mathematics. In his dedication to Duke Henry Julius, Bruno announces its contents: De minimo explains the principles of understanding as the foundational project while relying on sensation; it belongs to mathematics and deals with minimum and maximum in geometrical figurations. De monade, numero, et figura traces imagination and experience at the basis of research, which conforms to language and is quasi-divine; its theme is the monad as the substance of things and the basis of unity, diversity, and relationality. De immenso, innumerabilibus et infigurabili shows the order of the worlds with proofs that factually nature is visible, changing, and composed of elements, heaven, and earth, and yet an infinite whole (Bruno 1962, vol. I 1, 193-199; 2000b, 231–39). The theory of monads encompasses three elements: the geometrical properties of points, the physical reality of minimal bodies (atoms), and the cognitive method of starting with the ultimate simple when understanding complexity. In this function, the monad is the link between the intellectual and the physical realms and provides the metaphysical foundation of natural and geometrical investigations. Thus the monad is what makes everything particular as something singular; at the same time it constitutes the wholeness of the universe, which is made up of infinitely many singular things and necessarily infinite. With his monadology, and the research into the minimal constituents of thought and being, Bruno revived ancient atomism, as adopted from Lucretius. There is no birth nor death in the world, nothing is truly new, and nothing can be annihilated since all change is but a reconfiguration of minimal parts, monads, or atoms (depending on the context of the discourse). 
The concept of mathematical, geometrical, and physical atoms is the methodical channel to relate things and ideas with one another and to explain the existence of distinctions out of the One, thus turning geometrical and physical theories into metaphysics.

Mathematics, for Bruno, is geometry in the first place, because arithmetic performs quantitative calculations of measurements of figures that are defined geometrically. Therefore, in his Articuli adversus mathematicos of 1588, he establishes the methodical and ontological chain of concepts that leads from the mind – via ideas, order, definitions, and more – down to geometrical figures, the minimum, and the relation of larger and smaller. Bruno’s geometry, inspired by Euclid and Nicholas of Cusa, becomes the paradigmatic discipline that encompasses the intelligibility and reality of the world. Precisely for the sake of intelligibility, Bruno does admit infinitude in the extension of the world, but not in the particulars: the infinitely small would be unintelligible, and therefore there is a minimum or atom that terminates the process of ever smaller division. Not only is the earth one star among many, but the sphere in a geometrical sense becomes the infinite as such if taken to be universal; ‘universal’ now means ubiquitous, logically referring to everything, and physically encompassing the universe. Since such infinity escapes human measurement, if not intelligence, any quantity has to originate from the minimal size, which is not merely below the threshold of the senses; rather, it is the foundational quality that constitutes any finite measurement. Without the minimum, there is no quantity, by analogy to the thought that unity is the substance of number and the essence of anything existing. It is the minimum that constitutes and drives everything.

One consequence of this line of thinking is Bruno’s understanding of geometry in the technical sense. In his De minimo and also in his Padua lectures of 1591, he explains that all geometrical figures, from lines to planes and bodies, are composed of minima. A line, then, is the continuation of a point, which is defined as its first part or as its end, and the line is the fringe of a plane. This entails that angles are also composed of minimal points, which then build up the diverging lines. Bruno suggests that angles are generated by gnomons (figures that, added to a figure, yield a larger figure of the same shape), namely minimal circles surrounding a minimal circle; the gnomons thus create a new circle, and the tangents at the points of contact generate the lines that spread out. Invoking the ancient atomist Democritus, Bruno created an atomistic geometry and claimed to have found a mathematical science that is not merely arithmetic but conforms with the essence of the world (Bruno 1962, vol. I 3, 284 and 183-186; 1964).

Bruno’s scholarly language abounds with analogies, parallels, repetitions, metaphors, and specifically with serial elaboration of properties; after all, the mnemotechnic and Lullist systems were permutations of terms, too. Bruno also never misses a chance to play with words. For instance, he spells the Italian word for reason as ‘raggione’ with a duplicated letter g: in this spelling, reason and ray (ragione and raggio) become associated and suggest that thought is a searchlight (notice the recurrence of the term lampas – torchlight) into reality, illuminated by truth. This poetic and ingenious handling of language parallels his understanding of the geometry that makes up reality: visible reality is structured by measurable, construable, and retraceable proportions and is intelligible only with recourse to the perfection of geometrical relations. Thus, all understanding consists in the appropriation of the unfolding of the minimum, which is a creative process that reconstructs what there is. This also applies to human life. In his Heroic Frenzies, he uses the motto Amor instat ut instans (Love is an insistent instant): the momentary experience of love lasts and keeps moving; it is like the minimal point in time that lasts forever and keeps pushing (Bruno 1958, 1066–69).

d. Religion and Politics

The question Bruno had to face throughout his life, until his condemnation as a heretic, was in what sense he agreed with the basic tenets of Christianity. He advocated some kind of natural theology, that is, an understanding of God that is revealed not only in sacred texts but first and foremost in what is theologically called creation, and philosophically the contingent and finite world. Reality and human understanding contain clues to the ultimate meaning and demand means of explanation (such as the absolute, infinity, minimum, creativity, power). Revelation such as that of the Bible competes with literary and mythological sources and with hidden powers as investigated in magic (in that regard, Bruno is an heir to Renaissance humanism). Much of this can be found in his depositions during the Inquisition trials. His major work on the question of religion was the Expulsion of the Triumphant Beast, a staged conversation among the Greek gods about virtues and religion. Bruno sets religion into a historical and relativist context: the Christian, Egyptian, Greek, and Jewish traditions have in common that they try to represent the divine in the world and to edify the souls of humans. Thus, religious symbolism works like mnemotechnic imagery that may and must be permutated according to the circumstances of the community. The altar (that is, the place of the eucharistic mystery) represents the core of religious meaning and is, in Bruno’s age of Protestantism and Counter-Reformation, the relic (as he terms it) of the “sunken ship of religion and divine cult” (Bruno 1958, 602). Since “nature is God in things” (Bruno 1958, 776), the divine is as accessible through nature as the natural is through the divine. Consequently, worship means the continuous reformation of the individual towards better understanding of the world and the self.
Established religious cults have two valid purposes: inquiry into the ultimate principle of reality and communal order for all members of society who lack the means of true philosophy. Inspired by the debates among humanists and reformers (for instance, Thomas More, Erasmus of Rotterdam, Martin Luther, John Calvin, Juan Valdès), Bruno inaugurated two branches of religion, namely, natural religion and political theology. It was probably the discovery of this dialogue of the Triumphant Beast by the inquisitors in 1599 that sealed Bruno’s fate as a heretic (Firpo 1993, 100, 341).

3. Reception

Bruno’s afterlife as a character and as a philosopher (Canone 1998; Ricci 1990; 2007) shows a great variety of interpretations, not mutually exclusive, of his philosophy; the history of his fame encompasses many possible readings of his philosophical aims. During the trial of Bruno in Rome, the recent German convert Kaspar Schoppe wrote a witness report of the testimony, which was first published in print by the Hungarian Calvinist Peter Alvinczi to vilify the Catholics (Alvinczi 1621, 30–35; Schoppe 1998). With his fate, Bruno began to symbolize the struggle between religious persecution and freedom of thought, thus inaugurating philosophical modernity. Bruno was heralded as the individualist “knight errant of philosophy” (Bayle 1697, 679) and associated with Baruch Spinoza’s alleged atheism, according to which all things are attributes of God. Already in 1624, Marin Mersenne included Bruno in his polemic against deists, libertines, and atheists. He reported that Bruno taught that circle and straight line, point and line, surface and body are all the same thing; advocated the transmigration of souls; denied God’s freedom for the sake of infinite worlds, while also holding that God could have made a different world; and reduced miracles to natural data (Mersenne 1624, pts. I, 229–235; II, passim). To this, Charles Sorel responded by emphasizing the literary style of Bruno’s Latin poems. Both interpretations inspired an anonymous treatise with the title J. Brunus redivivus (Bruno revived, 1771) that discussed whether Bruno’s cosmology was atheistic (Sorel 1655, 238–43; Del Prete 1995; Schröder 2012, 500–504). The historian Jacob Brucker (Brucker 1744, IV 2:12–62) presented Bruno as the first thinker to promote modern eclecticism while also pointing to the Neoplatonic elements of his thought.
Around 1700, John Toland identified the Spaccio as a manifesto against the Church and published Schoppe’s report and an English translation of the introductory letter to De l’infinito; in correspondence with Gottfried Wilhelm Leibniz, he branded Bruno a pantheist (Begley 2014). A debate in German Protestant circles in the late 18th century on Spinozism as the epitome of atheism prompted Friedrich Heinrich Jacobi to publish letters to Moses Mendelssohn, to which, in 1789, he added excerpts from Bruno’s De la causa with the intent of proving thereby that Spinoza was an atheist. These excerpts were actually free paraphrases of dialogues II to V that culminated in the praise of the infinite essence that is cause and principle, one and all. As a consequence of the fascination with Bruno, Nicholas of Cusa, who was among his philosophical sources, was rediscovered. Friedrich Schelling, contrary to Jacobi’s intentions, incorporated Bruno into the program of German Idealism in a book of 1802 with the title Bruno, presenting Bruno’s purported pantheism as a step towards the idealist philosophy of unity and identity. Using excerpts from Jacobi’s paraphrase, he termed the above-quoted adage about “drawing contraries from the point of union” the symbol of true philosophy. From there, Georg Wilhelm Friedrich Hegel discovered Bruno’s Lullism and art of memory as a “system of universal determinations of thought,” or the coincidence of nature and creative mind (Hegel 1896, II 3 B 3, p. 130), which prompted the Catholic philosopher Franz Jakob Clemens to present Bruno as a precursor of Hegel. In England, Bruno’s brief sojourn left an impression that was less academically philosophical and more literary and legendary, which fits his disastrous appearance in Oxford. There is an ongoing debate about the extent to which he might have influenced Christopher Marlowe and William Shakespeare.
Henry Percy, Earl of Northumberland, collected Bruno’s works and wrote essays that appear to be inspired by Bruno’s literary style and themes, while members of his circle appreciated Bruno in the context of the rise of modern science. It was the philosopher-poet Samuel Taylor Coleridge who later, in the wake of the Spinoza debate, brought Bruno back from Germany to England. Rita Sturlese discovered that the reception of Bruno can be retraced through the fate of single copies of his early editions (Sturlese 1987). The 19th century inaugurated philological and historical research on Bruno and his Renaissance context. Italian scholars and intellectuals made Bruno a hero of national pride and anticlericalism (Rubini 2014; Samonà 2009): Vincenzo Gioberti and Bertrando Spaventa claimed that the German idealism from Spinoza to Hegel had its antecedent in Bruno, the Italian renegade (Molineri 1889; Spaventa 1867, 1:137–267). The edition of his Latin works was started in 1879 as a national project, and a monument was erected on Campo de’Fiori in Rome. In the second half of the 20th century, Bruno was perceived both as the turning point into modernity and as the heir of ancient occultism (Blumenberg 1983 [originally 1966], pt. 4; Yates 2010 [originally 1964]).

4. References and Further Reading

a. Bruno Editions

  • Bruno, Giordano. 1950. “On the Infinite Universe and Worlds [De l’infinito Universo et Mondi, English].” In Giordano Bruno, His Life and Thought, by Dorothea Waley Singer, 225–380. New York: Schuman.
  • Bruno, Giordano. 1957. Due dialoghi sconosciuti e due dialoghi noti: Idiota triumphans – De somnii interpretatione – Mordentius – De Mordentii Circino. Edited by Giovanni Aquilecchia. Roma: Ed. di Storia e Letteratura.
  • Bruno, Giordano. 1958. Dialoghi italiani. Edited by Giovanni Gentile and Giovanni Aquilecchia. Firenze: Sansoni.
  • Bruno, Giordano. 1962. Jordani Bruni Nolani opera latine conscripta. Edited by Francesco Fiorentino and Felice Tocco. [Napoli / Firenze, 1879-1891]. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
  • Bruno, Giordano. 1964. Praelectiones geometricae, e Ars deformationum. Edited by Giovanni Aquilecchia. Roma: Edizioni di Storia e letteratura.
  • Bruno, Giordano. 1991. De umbris idearum. Edited by Maria Rita Pagnoni-Sturlese. Firenze: Olschki.
  • Bruno, Giordano. 1993-2003. Œuvres complètes = Opere complete. Edited by Giovanni Aquilecchia. 7 vols. Paris: Les Belles Lettres.
  • Bruno, Giordano. 1998. Cause, Principle, and Unity and Essays on Magic. Translated by Robert De Lucca and Richard J. Blackwell. Cambridge, U.K.: Cambridge University Press.
  • Bruno, Giordano. 2000a. Opere magiche. Edited by Simonetta Bassi, Elisabetta Scapparone, and Nicoletta Tirinnanzi. Milano: Adelphi.
  • Bruno, Giordano. 2000b. Poemi filosofici latini. Ristampa anastatica delle cinquecentine. Edited by Eugenio Canone. La Spezia: Agorà.
  • Bruno, Giordano. 2001. Corpus iconographicum: le incisioni nelle opere a stampa. Edited by Mino Gabriele. Milano: Adelphi.
  • Bruno, Giordano. 2002. The Cabala of Pegasus. Translated by Sidney L. Sondergard and Madison U. Sowell. New Haven: Yale University Press.
  • Bruno, Giordano. 2004. Opere mnemotecniche. Edited by Marco Matteoli, Rita Sturlese, and Nicoletta Tirinnanzi. Vol. 1. 2 vols. Milano: Adelphi.
  • Bruno, Giordano. 2007-2019. Werke [Italienisch – Deutsch]. Edited by Thomas Leinkauf. 7 vols. Hamburg: Meiner.
  • Bruno, Giordano. 2009. Opere mnemotecniche. Edited by Marco Matteoli, Rita Sturlese, and Nicoletta Tirinnanzi. Vol. 2. 2 vols. Milano: Adelphi.
  • Bruno, Giordano. 2012. Opere lulliane. Edited by Marco Matteoli. Milano: Adelphi.
  • Bruno, Giordano. 2013. On the Heroic Frenzies. Degli Eroici Furori. Translated by Ingrid D. Rowland. Toronto: University of Toronto Press.
  • Bruno, Giordano. 2018. The Ash Wednesday Supper. Translated by Hilary Gatti. (Italian/English). Toronto: University of Toronto Press.

b. Other Primary Sources

  • [Alvinczi, Péter]. 1621. Machiavellizatio, qua unitorum animos Iesuaster quidam dissociare nititur. Saragossa [Kassa]: Ibarra.
  • Bayle, Pierre. 1697. Dictionaire historique et critique: par Monsieur Bayle. Tome premier premiere partie. A-B. Rotterdam: Leers.
  • Brucker, Jakob. 1744. Historia critica philosophiae: A tempore resuscitatarum in occidente litterarum ad nostra tempora. Vol. IV 2. Leipzig: Breitkopf.
  • Clemens, Franz Jakob. 2000. Giordano Bruno und Nicolaus von Cusa. Eine philosophische Abhandlung [1847]. Edited by Paul Richard Blum. Bristol: Thoemmes Press.
  • Firpo, Luigi. 1993. Il processo di Giordano Bruno. Edited by Diego Quaglioni. Roma: Salerno.
  • Hegel, Georg Wilhelm Friedrich. 1896. Lectures on the History of Philosophy: Medieval and Modern Philosophy. Translated by E. S. Haldane and Frances H. Simson. Vol. 3. London: Paul, Trench, Trübner.
  • J. Brunus redivivus, ou traité des erreurs populaires, ouvrage critique, historique & philosophique, imité de Pomponace: Premiere partie. 1771.
  • Jacobi, Friedrich Heinrich. 2000. Über die Lehre des Spinoza in Briefen an den Herrn Moses Mendelssohn. Edited by Irmgard Maria Piske and Marion Lauschke. Hamburg: Meiner.
  • Mercati, Angelo, ed. 1988. Il sommario del processo di Giordano Bruno; con appendice di documenti sull’ eresia e l’Inquisizione a Modena nel secolo XVI. Reprint of the 1942 edition. Modena: Dini.
  • Mersenne, Marin. 1624. L’ impiété des deistes, et des plus subtils libertins découverte, et refutee par raisons de theologie, et de philosophie. Paris: Billaine.
  • Molineri, G. C., ed. 1889. Vincenzo Gioberti e Giordano Bruno. Due lettere inedite. Torino: Roux.
  • Schelling, Friedrich Wilhelm Joseph. 1984. Bruno, or, On the Natural and the Divine Principle of Things. Translated by Michael G. Vater. [Originally 1802]. Albany: State University of New York Press.
  • Schelling, Friedrich Wilhelm Joseph. 2018. Bruno oder über das göttliche und natürliche Princip der Dinge: Ein Gespräch. [2nd. ed., Berlin: Reimer, 1842]. Berlin: De Gruyter.
  • Schoppe, Kaspar. 1998. “Brief an Konrad Rittershausen.” Edited by Frank-Rutger Hausmann. Zeitsprünge. Forschungen zur frühen Neuzeit 2 (3-4: Kaspar Schoppe): 459–64.
  • Sorel, Charles. 1655. De la perfection de l’homme. Paris: de Nain.
  • Spaventa, Bertrando. 1867. Saggi di critica filosofica, politica e religiosa. Vol. 1. Napoli: Ghio.
  • Toland, John. 1726. A Collection of Several Pieces of Mr. John Toland: Now Just Published from His Original Manuscripts, with Some Memoirs of His Life and Writings. 2 vols. London: Peele.

c. Secondary Sources

  • Begley, Bartholomew. 2014. “John Toland’s On the Manner, Place and Time of the Death of Giordano Bruno of Nola.” Journal of Early Modern Studies, Bucharest 3 (2): 103–15.
  • Blum, Elisabeth. 2018. Perspectives on Giordano Bruno. Nordhausen: Bautz.
    • On religion, politics and language.
  • Blum, Paul Richard. 2012. Giordano Bruno: An Introduction. Translated by Peter Henneveld. Amsterdam: Rodopi.
  • Blum, Paul Richard. 2016. Giordano Bruno Teaches Aristotle. Translated by Peter Henneveld. Nordhausen: Bautz.
    • The specific reception of Aristotle’s philosophy.
  • Blumenberg, Hans. 1983. The Legitimacy of the Modern Age. Translated by Robert M. Wallace. [Originally 1966]. Cambridge, Mass.: MIT Press.
    • Cusanus and Bruno as landmarks of modernity.
  • Bönker-Vallon, Angelika. 1995. Metaphysik und Mathematik bei Giordano Bruno. Berlin: Akademie Verlag.
    • The importance of mathematics.
  • Bruniana & Campanelliana (Journal since 1995)
  • Canone, Eugenio, ed. 1992. Giordano Bruno: Gli anni napoletani e la “peregrinatio” europea: immagini, testi, documenti. Cassino: Università degli studi. [Excerpts at “Archivio Giordano Bruno – Studi e Materiali”].
    • An edition of biographic documents.
  • Canone, Eugenio, ed. 1998. Brunus Redivivus: Momenti della fortuna di Giordano Bruno nel XIX secolo. Pisa: IEPI.
    • The reception of Bruno in the 19th century.
  • Canone, Eugenio, ed. 2000. Giordano Bruno, 1548-1600: Mostra storico documentaria. Firenze: Olschki.
    • Documents of an exhibition in memory of Bruno.
  • Canone, Eugenio, and Germana Ernst, eds. 2006. Enciclopedia bruniana e campanelliana. 3 vols. Pisa – Roma: IEPI – Serra.
    • A dictionary of concepts.
  • Carannante, Salvatore. 2018. Unigenita Natura. Edizioni di Storia e Letteratura.
    • Nature and God in Bruno.
  • Catana, Leo. 2008. The Historiographical Concept “System of Philosophy”: Its Origin, Nature, Influence and Legitimacy. Leiden: Brill.
  • Catana, Leo. 2017. The Concept of Contraction in Giordano Bruno’s Philosophy. Routledge.
  • Ciliberto, Michele. 2002. L’occhio di Atteone: Nuovi studi su Giordano Bruno. Ed. di Storia e Letteratura.
    • Epistemological questions in Bruno.
  • De Bernart, Luciana. 2002. Numerus quodammodo infinitus: Per un approccio storico-teorico al dilemma matematico nella filosofia di Giordano Bruno. Roma: Edizioni di storia e letteratura.
    • The role of mathematics.
  • Del Prete, Antonella. 1995. “L’univers Infini: Les Interventions de Marin Mersenne et de Charles Sorel.” Revue Philosophique de La France et de l’Étranger 185 (2): 145–64.
    • Bruno’s reception in France.
  • Eusterschulte, Anne, and Henning S. Hufnagel. 2012. Turning Traditions Upside Down: Rethinking Giordano Bruno’s Enlightenment. Budapest: Central European University Press.
  • Faracovi, Ornella Pompeo, ed. 2012. Aspetti della geometria nell’opera di Giordano Bruno. Lugano: Agorà.
    • Geometry in Bruno’s philosophy.
  • Gatti, Hilary. 1999. Giordano Bruno and Renaissance Science. Ithaca: Cornell University Press.
  • Gatti, Hilary, ed. 2002. Giordano Bruno: Philosopher of the Renaissance. Aldershot: Ashgate.
  • Gatti, Hilary. 2011. Essays on Giordano Bruno. Princeton, N.J.: Princeton University Press.
  • Gatti, Hilary. 2013. The Renaissance Drama of Knowledge: Giordano Bruno in England. London: Routledge.
  • Granada, Miguel Ángel, and Dario Tessicini, eds. 2020. Giordano Bruno, De immenso. Letture critiche. Pisa – Roma: Serra.
    • Detailed interpretations of the single books of De immenso.
  • Granada, Miguel Angel, Patrick J. Boner, and Dario Tessicini, eds. 2016. Unifying Heaven and Earth: Essays in the History of Early Modern Cosmology. Barcelona: Edicions de la Universitat de Barcelona.
  • Kodera, Sergius. 2020. “The Mastermind and the Fool. Self-Representation and the Shadowy Worlds of Truth in Giordano Bruno’s Candelaio (1582),” Aither – Journal for the Study of Greek and Latin Philosophical Traditions 23 (7): 86–111.
  • Matteoli, Marco. 2019. Nel tempio di Mnemosine: L’arte della memoria di Giordano Bruno. Pisa: Edizioni della Normale.
    • The study of memory in Bruno.
  • Mendoza, Ramon G. 1995. The Acentric Labyrinth: Giordano Bruno’s Prelude to Contemporary Cosmology. Shaftesbury: Element Books.
  • Mertens, Manuel. 2018. Magic and Memory in Giordano Bruno. Leiden: Brill.
  • Omodeo, Pietro Daniel. 2011. “Helmstedt 1589: Wer exkommunizierte Giordano Bruno?” Zeitschrift für Ideengeschichte 5 (3): 103–14.
    • On the excommunication of Bruno by Lutherans.
  • Omodeo, Pietro Daniel. 2014. Copernicus in the Cultural Debates of the Renaissance: Reception, Legacy, Transformation. Leiden: Brill.
  • Ordine, Nuccio. 1996. Giordano Bruno and the Philosophy of the Ass. New Haven: Yale University Press.
    • On the metaphor of ‘asininity’ and its history.
  • Ricci, Saverio. 1990. La fortuna del pensiero di Giordano Bruno, 1600-1750. Firenze: Le Lettere.
    • Bruno’s reception in the 17th and 18th centuries.
  • Ricci, Saverio. 2007. Giordano Bruno nell’Europa del Cinquecento. Milano: Il Giornale.
    • Historic contexts of Bruno’s activities.
  • Rowland, Ingrid D. 2008. Giordano Bruno: Philosopher/Heretic. New York: Farrar, Straus and Giroux.
  • Rubini, Rocco. 2014. The Other Renaissance: Italian Humanism between Hegel and Heidegger. Chicago: University of Chicago Press.
  • Saiber, Arielle. 2005. Giordano Bruno and the Geometry of Language. Aldershot: Ashgate.
    • Language and mathematics.
  • Samonà, Alberto, ed. 2009. Giordano Bruno nella cultura mediterranea e siciliana dal ’600 al nostro tempo: Atti della Giornata nazionale di studi, Villa Zito, Palermo, 1 marzo 2008. Palermo: Officina di Studi Medievali.
  • Schröder, Winfried. 2012. Ursprünge des Atheismus: Untersuchungen zur Metaphysik- und Religionskritik des 17. und 18. Jahrhunderts. Stuttgart: Frommann-Holzboog.
    • A history of atheism from 1600 through 1900.
  • Spampanato, Vincenzo. 2000. Vita di Giordano Bruno con documenti editi ed inediti. Edited by Nuccio Ordine. [First ed. 1921]. Paris/Torino: Les Belles Lettres, Aragno.
    • Most complete biography of Bruno.
  • Sturlese, Rita. 1987. Bibliografia censimento e storia delle antiche stampe di Giordano Bruno. Firenze: Olschki.
    • Bibliography of every single copy of the first prints of Bruno’s works.
  • Tessicini, Dario. 2007. I dintorni dell’infinito: Giordano Bruno e l’astronomia del Cinquecento. Pisa: Serra.
    • Bruno’s Copernicanism in 16th-century context.
  • Traversino, Massimiliano. 2015. Diritto e teologia alle soglie dell’età moderna: Il problema della potentia Dei absoluta in Giordano Bruno. Napoli: Editoriale scientifica.
    • Bruno in the context of juridical and theological debates.
  • Yates, Frances. 2010. Giordano Bruno and the Hermetic Tradition. [First 1964]. London: Routledge.
    • Bruno’s reading of occultist sources related to Pseudo-Hermes Trismegistus.

d. Online Resources

  • “Archivio Giordano Bruno – Studi e Materiali.” http://www.iliesi.cnr.it/AGB/.
  • “Bibliotheca Bruniana Electronica: The Complete Works of Giordano Bruno.” The Warburg Institute. https://warburg.sas.ac.uk/research/research-projects/giordano-bruno/download-page.
  • “Enciclopedia Bruniana e Campanelliana.” http://www.iliesi.cnr.it/EBC/entrate.php?en=EB.
  • “La biblioteca ideale di Giordano Bruno. L’opera e le fonti.” http://bibliotecaideale.filosofia.sns.it.


Author Information

Paul Richard Blum
Email: prblum@loyola.edu
Palacký University Olomouc
Czech Republic
and
Loyola University Maryland
U.S.A.

The Bhagavad Gītā

The Bhagavad Gītā occurs at the start of the sixth book of the Mahābhārata—one of South Asia’s two main epics, formulated at the start of the Common Era (C.E.). It is a dialog on moral philosophy. The lead characters are the warrior Arjuna and his royal cousin, Kṛṣṇa, who has offered to be his charioteer and who is also an avatar of the god Viṣṇu. The dialog amounts to a lecture by Kṛṣṇa delivered on their chariot, in response to the fratricidal war that Arjuna is facing. The symbolism employed in the dialog—a lecture delivered on a chariot—ties the Gītā to developments in moral theory in the Upaniṣads. The work begins with Arjuna articulating three objections to fighting an impending battle, drawing on two teleological theories of ethics, Virtue Ethics and Consequentialism, as well as on Deontology. In response, Kṛṣṇa motivates Arjuna to engage in battle by arguments from procedural ethical theories—specifically his own form of Deontology, which he calls karma yoga, and a radically procedural theory unique to the Indian tradition, Yoga, which he calls bhakti yoga. This is supported by a theoretical and metaethical framework called jñāna yoga. While originally part of a work of literature, the Bhagavad Gītā was influential among medieval Vedānta philosophers. Since the formation of a Hindu identity under British colonialism, the Bhagavad Gītā has increasingly been seen as a separate, stand-alone religious book, which some Hindus treat as their analog to the Christian Bible for ritual, oath-swearing, and religious purposes. The focus of this article is historical and pre-colonial.

Table of Contents

  1. Introduction
  2. The Eighteen Chapters of the Gītā
  3. Just War and the Suppression of the Good
  4. Historical Reception and the Gītā’s Significance
  5. Vedic Pre-History to the Gītā
  6. Mahābhārata: Narrative Context
  7. Basic Moral Theory and Conventional Morality
  8. Arjuna’s Three Arguments Against Fighting
  9. Kṛṣṇa’s Response
  10. Gītā’s Metaethical Theory
    1. Moral Realism
      1. Good and Evil
      2. Moral Psychology
    2. Transcending Deontology and Teleology
  11. Scholarship
  12. References and Further Reading

1. Introduction

The Bhagavad Gītā (Gītā) occurs at the start of the sixth book of the Mahābhārata —one of South Asia’s two main epics. Like the Rāmāyaṇa, it depicts the god Viṣṇu in avatāra form. In the Rāmāyaṇa, he was Rāma; in the Mahābhārata he is Kṛṣṇa. This time, Viṣṇu is not the protagonist of the whole epic, but unlike the Rāmāyaṇa, here he shows awareness of his own identity as Īśvara or Bhagavān: Sovereignty. While moral theory is a topic of discussion in both epics, the Bhagavad Gītā is a protracted discourse and dialog on moral philosophy. The text itself, as an excerpt from an epic, was received variously in South Asian traditions. To some philosophers, such as those who grounded their theorizing on the latter part of the Vedas, a position known as Vedānta, the Bhagavad Gītā, though a smṛti (a historical document) and not a śruti (revealed text like the Vedas or scripture), nevertheless plays a prominent role in constituting a source of argument and theory. The major Vedānta philosophers, Śaṅkara, Rāmānuja and Madhva, all wrote commentaries on the Gītā. Importantly, the Bhagavad Gītā is very much part of South Asia’s history of popular philosophy explored in literature, which unlike the Vedas, was widely accessible. It informs South Asian understandings of Kṛṣṇa, the warrior philosopher, who is a prominent incarnation of Viṣṇu. What is unique about this exploration of philosophy is that it happens on a battlefield, prior to a fratricidal war, and it addresses the question of how we can and should make tough decisions as the infrastructure of conventions falls apart.

2. The Eighteen Chapters of the Gītā

The Bhagavad Gītā contains eighteen chapters (books), which were originally untitled. Hence, editions and translations frequently include title headings that are created for publication. The Śrī Vaiṣṇava philosopher, Yāmunācārya (916-1041 CE), in his Summary of the Import of the Gītā (Gītārtha-saṅgraha), divides the Gītā into three parts, each with six chapters. The first hexad concerns, on his account, an emphasis on karma yoga (a deontological perfection of duty) and jñāna yoga (the Gītā’s metaethics, or elucidation of the conditions of ethical reasoning). The middle hexad emphasizes bhakti yoga, the Gītā’s label for the position also called Yoga in the Yoga Sūtra and other philosophical texts: The right is action in devotion to the procedural ideal of choice (Sovereignty), and the good is simply the perfection of this practice. This engagement in bhakti yoga, according to Yāmunācārya’s gloss on the Gītā, is brought about by karma yoga and jñāna yoga (v.4). The last hexad, which “subserves the two preceding hexads,” concerns metaphysical questions related to the elaboration of Yoga. Specifically, it explores and contrasts nature (prakṛti), or explanation by causality, and the self (puruṣa), or explanation by way of responsibility. Īśvara, or sovereignty, is the proper procedural ruler of both concerns. The last hexad summarizes earlier arguments for karma yoga, bhakti yoga, and jñāna yoga. What follows below summarizes the chapters.

Chapter 1 concerns Arjuna’s lament: Here, we hear Arjuna’s three arguments against fighting the impending war, each based on one of the three theories of conventional morality: Virtue Ethics, Consequentialism, and Deontology.

Chapter 2 initiates Kṛṣṇa’s response. Kṛṣṇa extols a basic premise of Yoga: Selves (persons) are eternal abstractions from their lives, and hence cannot be confused with the empirical contingencies that befall them. This is meant to offset Arjuna’s concern for the welfare of those who would be hurt as a function of the war. Here we hear of the first formulations of karma yoga and bhakti yoga.

Kṛṣṇa here articulates the idea that blameless action is done without concern for further outcome, and that we have a right to do what we ought to do, but not to the further outcomes of activity (Gītā 2.46-67). This radical procedural frame for moral reasoning Kṛṣṇa defines as “yoga” (Gītā 2.48), which is skill in action (2.50).

Chapter 3 introduces karma yoga in further detail. The chapter begins with Arjuna concerned about a contradiction: Kṛṣṇa apparently prefers knowledge and wisdom, and yet advocates fighting, which produces anxiety and undermines clarity. Kṛṣṇa’s response is that action is unavoidable: no matter what, we are choosing and doing (even if we choose to sit out a fight). Hence, the only way to come to terms with the inevitability of choice is to choose well, which is minimally to choose to do what one ought to do, without further concern for outcome. This is karma yoga. Here we learn the famous formula of karma yoga: better one’s own duty poorly performed than someone else’s performed well (Gītā 3.35). Kṛṣṇa, the ideal of action (Sovereignty), is not exempt from this requirement either. Rather, the basic duty of Kṛṣṇa is to act to support a diversity of beings (Gītā 3.20-24). This too is the philosophical content of all duty: Our duty constitutes our contribution to a diverse world and a pedagogic example to others to follow suit. Chapter 4 focuses on bhakti yoga, or the practice of devotion. As Kṛṣṇa is the ideal of right action, whose activity is the maintenance of a diverse world of sovereign individuals responsible for their own actions, the very essence of right action is devotion to this ideal of Sovereignty. Chapter 5 introduces jñāna yoga, or the metaethical practice of moral clarity as a function of the practice of karma yoga. Chapter 6 picks up threads in previous comments on yoga, bringing attention to practices of self-regulation that support the yogi, or one engaging in skillful action.

Chapter 7 shifts to a first-person account of Sovereignty by Kṛṣṇa and the concealment of this procedural ideal in a world that is apparently structured by nonnormative, causal relations. Chapter 8 distinguishes between three classes of devotees. Chapter 9 explores the primacy of the ideal of Sovereignty and its eminence, while Chapter 10 describes the auspicious attributes of this ideal. Chapter 11 presents Arjuna’s dramatic vision of these excellences, a vision showing that the moral excellence of the procedural Ideal of the Right is not reducible to the Good and is logically consistent with both the Good and the Bad. Chapter 12 returns to the theme of bhakti yoga and its superiority.

Chapter 13 turns to the body as a tool, and to the seat of responsibility: the self. Chapter 14 explores a cosmological theory closely associated with Yoga, namely the idea that nature (prakṛti) is composed of three empirical properties—sattva (the cognitive), rajas (the active), and tamas (the inert)—and that these empirical characteristics of nature can conceal the self. In chapter 15, the supreme Self (Sovereignty) is distinguished from the contents of the natural world. Chapter 16 contrasts praiseworthy and vicious personality traits. Chapter 17 focuses on the application and misapplication of devotion: Outcomes of devotion are a direct function of the procedural excellence of what one is devoted to. Devotion to Sovereignty, the ultimate Self, is superior to devotion to functionaries. Chapter 18 concludes with the excellence of renouncing a concern for outcomes via Yoga. Kṛṣṇa, speaking as the ideal, exhorts Arjuna not to worry about the content of ethics (dharma): He should focus instead on approximating the procedural ideal as the means of avoiding all fault.

3. Just War and the Suppression of the Good

The Gītā and the Mahābhārata have garnered attention for their contribution to discussions of Just War theory (compare Allen 2006). Yet, as most accounts of South Asian thought are fuelled by an interpretive approach that attempts to understand the South Asian contribution by way of familiar examples from the Western tradition, the clarity of such accounts leaves much to be desired (for a review of this phenomenon in scholarship, see Ranganathan 2021). Explicated, with a focus on the logic of the arguments and theories explored as a contribution to philosophical disagreement—and not by way of substantive beliefs about plausible positions—we see that the Mahābhārata teaches us that the prospects of just war arise when moral parasites inflict conventional morality on the conventionally moral as a means of hostility. Parasites effect this hostility by acting belligerently against the conventionally moral, while relying on the goodness of the conventionally moral to protect them from retaliation in response to their belligerence. Any such retaliation would be contrary to the goodness of conventional morality and hence out of character for the conventionally moral. The paradox here is that, from the perspective of the conventionally moral, this imposition of conventional moral standards is not wrong and is good. However, it is the means by which moral parasites exercise their hostility to the disadvantage of the conventionally moral. Prima facie, it would be just for the conventionally moral to retaliate as moral parasites act out of the bounds of morality. However, the moment that the conventionally moral engage such parasites in war, they have departed from action as set out by conventional morality, and it would appear that they thereby lack justification. This standing relative to conventional moral expectations is the same as the parasite’s. This was Arjuna’s problem at the start of the Gītā. 
Arjuna indeed explicitly laments that fighting moral parasites would render him no better (Gītā 1.38-39).

A procedural approach to ethics, such as we find in the Gītā, transcends conventional morality, especially as it deprioritizes the importance of the good (karma yoga). Indeed, it rejects the good as a primitive moral notion in favour of the right (bhakti yoga) and thereby provides an account of the justice of those who wage war on moral parasites: The justice of the war of Arjuna and other devotees of Sovereignty should be measured by their fidelity to procedural considerations of the right, and not to considerations of the good. Arjuna and other just combatants fight as part of their devotion to Sovereignty and hence conform their behavior to an ultimate ideal of justice: that all concerned should be sovereign and thus made whole. Hence, just war (jus ad bellum) and just conduct in war (jus in bello) come to the same thing: For the just cause is devotion to the ideal, and right action is the same. In contrast, those who are not devoted to the regulative ideal fail to have a just cause, or just action in war. Jeff McMahan’s conclusion in his Killing in War (2009), that those who fight an unjust cause do wrong by fighting those whose cause is just, is entailed by bhakti yoga. However, McMahan appears to claim that the justice of a war is accounted for not by a special set of moral considerations that come into effect during war, but by the same considerations we endorse during times of peace. Yet in times of peace it appears that conventional morality wins the day, militates against war, and all parties depart from it when they wage war—or at least, this seems to be the analysis of the Mahābhārata. It is because there are two competing moral frames—conventional morality of the good and the proceduralism of the right, or Yoga/Bhakti—that we can continue to monitor the justice of war past the breakdown of conventional moral standards (for more on the just war theory here, see Ranganathan 2019). 
It is because of the two standards that Yoga/Bhakti can constitute an ultimate standard of moral criticism of the right even as the conventional moral standards of the good that characterize peace deteriorate under the malfeasance of parasites.

With respect to success, we see that the Gītā also has a story to tell about which side wins the war. As the bhakti yogi is committed to a process of devotion to sovereignty, their behavior becomes sovereign in the long run and hence their success is assured. Moral parasites, in contrast, are not engaged in an activity of self-improvement. Their only means of survival—taking advantage of the conventionally moral—now lacking (as the conventionally moral have renounced conventional morality to become devotees of Sovereignty), renders them vulnerable to defeat by devotees of Sovereignty. Moral parasites only have the one trick of taking advantage of the conventionally moral, and the transition to bhakti yoga on the part of the formerly conventionally moral deprives parasites of their victims and source of sustenance.

4. Historical Reception and the Gītā’s Significance

The relationship of the Gītā to what is known as Hinduism, and to what we understand as religion, is more complicated and problematic than a straightforward philosophical study of the Gītā. In a world dominated by Western imperialism, it is common to take religious designations at face value, as though they are dispositive of the “religious” traditions and not an artifact of colonialism. An historical claim commonly made, as we find in the Encyclopedia of World Religions, is that the “Bhagavad-Gītā” is “perhaps the most widely revered of the Hindu scriptures.” The expectation that the Gītā is a religious work leads to the notion that there is some type of thematic religious development in the text that is distinct from the philosophy it explores. So, for instance, the same entry suggests that the religious theme of the opening lines of the Gītā is to be found when Arjuna (the protagonist) is faced with a fratricidal war. “The problem for Arjuna is that many other revered figures, such as Arjuna’s teacher, are fighting for his cousins. Seeing in the ranks of the enemy those to whom he owes the utmost respect, Arjuna throws down his bow and refuses to fight” (Ellwood and Alles 2008: 49-50). That is not at all how events unfold, however. Arjuna, upon arriving at the battlefield, provides three distinct arguments based on three prominent ethical theories that comprise what we might call conventional morality (Virtue Ethics, Consequentialism, and Deontology) and then concludes on the strength of these objections that he should not fight. Expecting to distinguish the thematic development from the philosophy in the Gītā is like attempting to distinguish the thematic development in a Platonic dialogue from the philosophy: It cannot be done without great violence—and the fact that we might expect this as possible in the case of South Asian philosophy but not in the case of Plato is inconsistent. 
Moreover, the gloss that the Gītā is scripture is mistaken on points of history. Historically, and in the South Asian tradition, the Gītā was not thought of as scripture. Indeed, “scripture” is often reserved to designate texts that are thought to have a revelatory character, like the Vedas, and are called śruti (what is heard). The Gītā, in contrast, was thought to be an historical or commemorative document, or smṛti (what is remembered), as the Mahābhārata, of which it is a part, was regarded as such historical literature. Calling it scripture is ahistorical. The motivation to regard the Gītā as a religious text is no doubt derivable from the uncritical acceptance of the Gītā as a basic text of Hinduism. By analogy to other religions with central texts, the Gītā would apparently be like a Bible of sorts. In this case, the confusion arises because of the ahistorical projection of the category, “Hinduism,” on to the tradition.

As Western powers increased their colonial hold on South Asia, there was pressure to understand the South Asian traditions in terms of a category of understanding crucial to the West’s history and methodology of alterity: religion (Cabezón 2006). Historical research shows that it was under the British rule of South Asia that “Hindu”—originally a Persian term meaning “Indus” or “India”—was drafted to identify the indigenous religion of South Asia, in contrast to Islam (Gottschalk 2012). By default, hence, anything South Asian that is not Islam is Hinduism. Given its baptismal context that fixes its referent (compare Kripke 1980), “Hinduism” is a class category, the definition of which (something like “South Asian, no common founder”) need not be instantiated in its members, and the function of which is rather to describe Hindu things at the collective level. “Hinduism” as a class category is much like the category “fruit salad”: Fruit salad is a collection of differing pieces of fruit, but members of a collection that is fruit salad need not be, and would not be, a collection of different pieces of fruit. Indeed, it would be a fallacy of composition to infer from the collective definition of “fruit salad” that there is something essentially fruit salad about pieces of fruit salad. Similarly, at the collective level, we might include the Gītā among Hindu texts because the collection is definable as being South Asian but with no common founder. It would be a fallacy of composition, though, to infer that the Gītā bears defining traits of being Hindu, or even religious, for that matter, as these characterize the collection, not the members. 
If, as history shows, the only thing that world religious traditions share is that they have non-European origins, that the philosophical diversity across all things religious is equivalent to philosophical diversity as such, and that religious identity was manufactured as a function of the Western tradition’s inability to explain and ground non-Western philosophical positions in terms of the Western tradition (Ranganathan 2018b), then Hindu texts would be treated as essentially religious and not primarily philosophical because of their South Asian origins. This depiction of texts such as the Gītā as religious, however, like the historical event of defining Hinduism, is a straightforward artifact of Western colonialism, and not a trait of the texts being studied under the heading of Hinduism.

Historically, to be Hindu is apparently to share nothing except philosophical disagreements on every topic: One can be an evolutionary materialist and atheist, as we find in the Sāṅkhya Kārikā, or take a deflationary view about the reality of the Gods while endorsing Vedic texts, as we find in Pūrva Mīmāṃsā works, and be a most orthodox Hindu merely because one’s philosophical views are South Asian and because they can be grouped in the category of South Asian, with no common founder (Ranganathan 2016a, 2018b). Yet, the common expectation is that religions are kinds, not classes, that specify criteria of inclusion that are instantiated by their members, as is true of virtually every other religion. Under this particular set of expectations—that examples of Hindu things must exemplify something distinctly Hindu—the Bhagavad Gītā has come to be valued not merely as a popular contribution to moral philosophy, but as the Hindu equivalent to the Christian Bible, something one can swear oaths on, and can look to for religious advice (compare Davis 2015). Attempting to project this colonial development back onto the tradition, though commonplace, is mistaken. It generates the perception that what we have in the Gītā is not primarily philosophy, a perception sustained only by our decision to ignore the philosophy. The depiction of the Gītā as essentially religious, and not contingently religious given the colonial artifact of religious identity, is a self-fulfilling prophecy that arises when we do not pay attention to the history of South Asian philosophy as relevant to understanding its texts because we have assumed, as a function of the colonial history that makes up religious identity, that such texts are religious.

5. Vedic Pre-History to the Gītā

While it is tempting to read the Gītā in a vacuum, knowing something about the development of moral theory in South Asian traditions sheds light on many aspects of the Gītā. It constitutes a response to the Jain (Virtue Ethics), Buddhist (Consequentialism), and Pūrva Mīmāṃsā (Deontology) options (Ranganathan 2017a), but it also constitutes a slight criticism of Deontology, which it provisionally endorses (explored at greater length in section 7, Basic Moral Theory and Conventional Morality). The Jain and Buddhist options, as options of Virtue Ethics and Consequentialism, are versions of teleology: They prioritize the Good over the Right in moral explanation. Deontology and Yoga/Bhakti are versions of proceduralism: They prioritize the Right over the Good in moral explanation. The critical move away from teleology to proceduralism constitutes the history of moral reasoning in the Vedic tradition.

The very earliest portions of the Vedic tradition begin with the Mantra (chants) and Brāhmaṇa (sacrificial instruction manuals) sections, along with forest books (Āraṇyaka) that provide theoretical explanations of the sacrifices. All, and especially the Mantra section, speak of and to an Indo-European, nomadic culture. Like all early Indo-European cultures, whether in ancient Persia or Greece, there is evidence of the worship of nature gods as a means of procuring benefits. The logic of this paradigm is teleological: The good ends of life, such as victory over enemies, safety for one’s kin and self, as well as the material requirements for thriving (food and land) are the goal, and the worship of gods of nature is hypothesized as the means. One section of the Aitareya Āraṇyaka is revealing of a proto-empirical hypothesis: that the need for eating is generated by fire, and it is fire that is the consumer of food (I.1.2.ii). The sacrificial offering just is food (I.1.4.vii). If it is ultimately fire that is hungry, and the sacrifice is how we enact feeding our debt to fire, then the sacrifice is the ritualization of metabolism: the burning of calories.

The key to actualizing this flourishing according to the Aitareya Brāhmaṇa is a distinction between sacrifice and victim. This distinction requires a certain moral sensitivity. Hence, the presiding priests at the end of an animal sacrifice mutter, “O Slaughterers! may all good you might do abide by us! and all mischief you might do go elsewhere.” This knowledge allows the presiding priest to enjoy the flourishing made possible by the sacrifice (Aitareya Brāhmaṇa 2.1.7, p. 61).

One of the curious features of the worldview that acknowledges that it is forces of nature that create such requirements is that, in feeding them, we are really transferring an evil that would befall us onto something else. Hence, in order to avoid being undermined by the forces of nature ourselves, we need to find a sacrificial victim, such as an animal, and visit that evil on it: That allows us to pay our debt to the forces of nature and thrive. It is no longer the forces of nature and their propitiation that lead us to the good life: It is rather a matter of the ritual of feeding natural requirements that secures the good life. In this equation, one element that is not reducible is evil itself. Indeed, the very rationale for the ritual is to avoid an evil of scarcity. The Brāhmaṇa quoted above notes that during the course of a sacrifice, the blood of the victim should be offered to evil demons (rākṣasas). This is because, by offering blood to the demons, we keep the nourishing portion of the sacrifice for ourselves (Aitareya Brāhmaṇa 2.1.7, pp. 59-60). This is an admission that appeasing the gods of nature is part of a system of ressentiment, where we must understand the goods in life as definable in relation to evils we want to avoid (for further exploration of these texts and themes, see Ranganathan 2018c).

Taken together, the goods of life that we wish to achieve, the pressure to achieve them by way of natural forces, and the desire to appease such forces in order to gain the goods leave much to be desired. The system creates a crisis that is managed by feeding it. Furthermore, as the system is teleological, it organizes moral action around the good, which unlike the right (what we can do) is outside of our control.

What we find in the Upaniṣads (dialogues)—the latest installation to the corpus of Vedic texts—is a radical reorientation of practical explanation. Whereas the earlier part was concerned primarily with the good as a source of justifying right procedure, we find a switch to the focus on the center of agency and practical rationality, the Self or ātmā, but also a related substance that it is often identified with: Development, Growth, Expansion (Brahman). Interpreted from a Eurocentric backdrop, Brahman is like a theistic God, for Brahman appears to play a role similar to a theistic God in the belief system of theists. Explicated—that is, if we understand this theoretical entity as a contribution to philosophical disagreement—its identification with the Self entails a theory where the primary explanation of reality is not by way of a good, but a procedure (of Development) that is identifiable with the paradigm Self, or what it is to be an agent. While the Upaniṣads do not all agree or say exactly the same thing about the self and Brahman (often they seem to speak of many selves related to Brahman, sometimes of only one paradigm self and Brahman), the topic is often raised in relationship to ideas we find central to yoga, or meditation, such as the concept of breath, itself a procedure internal to animal agency.

One of the more revealing dialogues in the Upaniṣads that sheds light on this procedural shift is the Kaṭha Upaniṣad, specifically the dialogue concerning the young boy Nachiketa.

In the famous Kaṭha Upaniṣad, the young boy Nachiketa is condemned to death by his father (conducting a solemn sacrifice to the gods) in response to the boy’s pestering question: “To whom will you sacrifice me?” “To death,” his father utters in irritation. Uttered in an official ritual context, the words take effect: The boy is sacrificed and travels to the abode of the God of Death, Yama, who is absent. Upon returning after three days, Yama offers the young boy three boons to make up for his lack of hospitality. Two boons are readily granted: The first is returning to his father, and the second is knowledge of a sacrifice that leads to the high life. Last, Nachiketa wants to know: What happens to a person after they die—do they cease to exist, or do they exist? Yama tries to avoid answering this question by offering wealth—money, progeny, and the diversions of privilege. Nachiketa rejects this, on the grounds that “no one can be made happy in the long run by wealth,” and “no one can take it with them when they come to you [that is, Death].” He objects that such gifts are short-lived. Death is inevitable, so he wants the answer. The boy is persistent, and Yama relents. He begins his response by praising the boy for understanding the difference between the śreya (control) and preya (literally “advance-movement,” that is, utility, the offering for or gain of the sacrifice): the foolish are concerned with the preya (what Yama tried to give the boy), but the wise with control.

Yama continues with his allegory of the chariot. According to Yama, the body is like a chariot in which the Self sits. The intellect (buddhi) is like the charioteer. The senses (indriya) are like horses, and the mind (mānasa) is the reins. The Enjoyer is the union of the self, senses, mind, and intellect. The objects of the senses are like the roads that the chariot travels. People of poor understanding do not take control of their horses (the senses) with their minds (the reins). Rather, they let their senses draw them to objects of desire, leading them to ruin. According to Yama, the person with understanding reins in the senses with the mind and intellect (Kaṭha Upaniṣad I.2). This is explicitly called Yoga (Kaṭha Upaniṣad 2.6). Those who practice yoga reach their Self in a final place of security—Viṣṇu’s abode. This is the place of the Great Self (Kaṭha Upaniṣad I.3). There is no evil here.

What we have in the Kaṭha Upaniṣad is a very early articulation of the philosophy of Yoga as we find it in Patañjali’s Yoga Sūtra and the Gītā’s defense of bhakti yoga. In Patañjali’s Yoga Sūtra (a central, systematic formulation of Yoga philosophy), we find no mention of Viṣṇu. However, we do find that Patañjali defines Yoga in two ways. First, he defines it as an end: the (normative, or moral) stilling of external mental waves of influence (YS I.2). This involves bringing one’s senses and mind under one’s rational control. Second, Patañjali identifies yoga as accomplished by an approximation to Sovereignty, which is analyzable into unconservativism and self-governance (compare Ranganathan 2017b). This fits the pattern of the theory of Bhakti/Yoga, which identifies and defines right action as the approximation to a procedural ideal. When Patañjali moves to describe yoga not as an end (the stilling of external waves of influence) but as a practice, he further analyzes the project of Yoga into three procedural ideals: Īśvara praṇidhāna (approximating Sovereignty, unconservativism, and self-governance), tapas (unconservativism), and svā-dhyāya (self-governance) (YS II.1). Rarely noted, these three procedural ideals are celebrated in a popular South Asian model of Ādi Śeṣa (the cosmic snake) floating over a sea of external waves of influence, depicted as the Milk Ocean (the ends of yoga). Ādi Śeṣa is devoted not only to Viṣṇu, a deity depicted as objectifying himself as harmful manifestations such as the disk and mace, which do not constrain him, thereby showing himself to be untouched by his own choices and thereby unconservative, but also to Viṣṇu’s partner Lakṣmī: the goddess of intrinsic value and wealth, shown as a lotus, sitting on herself, and holding herself, thereby self-governing.
Thus devotion to Sovereignty (Ādi Śeṣa) analyzes Sovereignty into two further procedural ideals—unconservativism (Viṣṇu) and its partner, self-governance (Lakṣmī)—all the while floating over receding waves of influence. What this common tableau of South Asian devotional practice literally depicts is the absolute priority of the right procedure (the three procedural ideals floating) over the good outcome (the stilling of waves of external influence).

In the model we find from Death in the Kaṭha Upaniṣad, there is no explicit reference to Lakṣmī on her own, but much is made of self-governance as something geared toward a realm controlled by Viṣṇu. Hence, already in the Vedas, we have a theory of radical procedural ethics, governed by an approximation to a procedural ideal of Viṣṇu (tapas, self-challenge, unconservativism), and such a model is implicit in the other great work of Yoga of South Asian traditions, the Yoga Sūtra.

One of the outcomes of Death’s argument, as he explicitly states, is that a life lived wisely dispenses with teleological reasoning and replaces it with a procedural emphasis on self-governance and control. Looking back on the very beginnings of the Vedic literature, a dialectic becomes apparent, which takes us from teleological reasoning to procedural reasoning. The motivation to move to a procedural approach is to remove luck from the moral equation and replace it with the ideals of unconservativism and self-governance. The Kaṭha Upaniṣad then represents a trend in the Vedic tradition to treat teleological considerations—practical arguments focused on the good—as a foil for a procedural approach to practical rationality. Ranganathan has called this dialectic the Moral Transition Argument (MTA): the motivation of a procedural approach to practical rationality on the basis of a dissatisfaction with a teleological approach. Freedom, mokṣa, is an ambiguous condition of this process, but a certain outcome of perfecting a procedural approach to life. Brahman, Development, is the metaphysical formalization of this idea that reality is not an outcome or a good, but a process to be perfected (Ranganathan 2017c).

There are of course further problems that arise from the MTA, such as the paradox of development. We need to be free to engage in a procedural approach to life, for such practice is a matter of self-determination, and yet, as people who have not mastered a procedural ethic, we are less than free to do as we choose. By analogy, we can consider the challenge of learning an art, such as playing the guitar. We need some degree of freedom to approximate the procedural ideal of playing a guitar, and this approximation constitutes practicing the guitar, but in a state of imperfection, we cannot play any tune or composition on the guitar we wish: The freedom to engage in this craft and art is something that is the outcome of much practice. Practice is, all things considered, an undertaking with a low expected utility: Even if we do practice regularly, there is no guarantee that we will become as proficient as Jimi Hendrix or Pat Metheny. The movement to a procedural metaphysics, of understanding reality not as a good outcome but as a work in progress (Brahman), gives some reason for optimism: It is in the very nature of reality to be dynamic, and so we should not assume that our current state of incapacity is a metaphysical necessity. However, and more practically, Yoga provides an additional response: It is commitment to the regulative ideal of a practice—the Lord—that makes our freedom to do as we choose possible, but this freedom is not a good that we can organize our practice around: It is rather a side effect of our commitment to the ideal.

The authors of the Mahābhārata and especially the Gītā, which appears to be an interpolation in the wider epic, must have been quite conscious of the Kaṭha Upaniṣad (Jezic 2009); hence, the deliberate use of a chariot as a scene for the discourse of the Gītā, where Kṛṣṇa (Viṣṇu) delivers arguments reminiscent of Death’s lecture to Nachiketa, is no accident. Yet, whereas the Kaṭha Upaniṣad depicts Viṣṇu as a ruler of a distant realm, which we attain when we have mastered the rigors of yoga, here in the Gītā itself, Viṣṇu delivers not only the lecture but also the advice that he himself ought to be sought after, as the ideal to be approximated and emulated. Also, whereas in the Kaṭha Upaniṣad the charioteer is the intellect, here Kṛṣṇa’s assumption of the role of the charioteer furthers the role he plays in the Gītā as the voice of reason in the face of adversity and peril. In using the Kaṭha Upaniṣad as the metaphorical backdrop of the dialogue, the authors of the Gītā script Kṛṣṇa to elaborate Death’s lesson to the boy Nachiketa. Death’s argument was that in facing the possibility of danger as something to be avoided, we survive Death, not as a personal misfortune, but as a potential public mishap that we avoid by taking a procedural approach to life. Life after Death is not brought about by avoiding struggle or danger, but by mastering oneself. Just as in the Kaṭha Upaniṣad there is a criticism of the earlier teleological goals of the Vedas, so too in the Gītā do we find Kṛṣṇa persistently criticizing the language and goading rationality of the Vedas, which motivates by way of selfish desires. But in the case of the Gītā, the authors use these elements to bring into the picture not only the teleological considerations of the earlier Vedic tradition, but also Buddhist and Jain arguments, not to mention a refined Pūrva Mīmāṃsā Deontology—as seen in Arjuna’s three arguments for not fighting.
What these arguments have in common is that they appeal to the good in some form, and together they mark out the scope of conventional morality—morality that can be conventionalized in so far as it is founded around a moral outcome, the good. What follows after Arjuna’s recitation of these arguments is a sustained argument from Kṛṣṇa to the effect that moral considerations that appeal to outcomes and ends are mistaken, and that one should adopt a procedural—yogic—approach to practical rationality. Hence, the Bhagavad Gītā from start to finish is the MTA as a dialectic that goes from teleological considerations, through Deontology (karma yoga), to the extreme proceduralism of Bhakti (yoga) via a metaethical bridge it calls jñāna yoga. It hence serves as both a summary of the teleological precursors to a procedural approach to morality and their refutation. It serves also as a historical summary of the dialectic of the Vedic tradition, but in argument form: with the radical proceduralism of bhakti yoga being the conclusion.

6. Mahābhārata: Narrative Context

The Bhagavad Gītā is itself a dialogue, but one of a philosophical character. That is, there are no plot or thematic developments of the text apart from the dialectic it explores, couched in argument. This is quite easy to miss if one does not begin a reading of the text with attention to the arguments provided by its protagonists, Arjuna and Kṛṣṇa, and if one expects that there is some uniquely religious content to the text that is distinct from the philosophy. Ignoring the philosophy certainly generates a reading of the text that is mysterious, not founded in reason, and opaque, which could be taken as evidence of its religious significance—but that would be an artifact of ignoring the philosophy and not anything intrinsic to the text. The text begins at the battlefield of the fratricidal war that is itself the climax of the Mahābhārata. Hence, to understand the motivation for the arguments explored in the Gītā, one needs to understand the events that unfold in the epic prior to the fateful conversation between Kṛṣṇa and Arjuna.

The Mahābhārata (the “Great” war of the “Bhāratas”) focuses on the fratricidal tensions and all-out war of two groups of cousins: the Pāṇḍavas, numbering five, the most famous of these brothers being Arjuna, all sons of Pāṇḍu; and the Kauravas, numerous, led by the oldest brother, Duryodhana, all sons of Dhṛtarāṣṭra. Dhṛtarāṣṭra, though older than Pāṇḍu and hence first in line for the throne, was born blind and hence sidelined in royal succession, as it was reasoned that blindness would prevent Dhṛtarāṣṭra from ruling. Pāṇḍu, it so happens, was the first to have a son, Yudhiṣṭhira, rendering the throne all but certain to be passed down via Pāṇḍu’s descendants. Yet Pāṇḍu dies prematurely, and Dhṛtarāṣṭra becomes king as the only appropriate heir to the throne, since the next generation are still children.

As the sons of Pāṇḍu and Dhṛtarāṣṭra grow up, Pāṇḍu’s sons distinguish themselves as excellent warriors, and also virtuous individuals, who are not without their flaws. The Kauravas, in contrast, are less able in battle, but mostly without moral virtues or graces. The rivalry between the two sets of cousins is ameliorated only by the Pāṇḍavas’ inclination to compromise and be deferential to their cousins—this despite attempts on the Pāṇḍavas’ lives by the Kauravas. Matters turn for the worse when the Pāṇḍavas accept a challenge to wager their freedom in a game of dice, rigged by the Kauravas. The Pāṇḍavas seem unable to restrain themselves from participating in this foolish exercise, as it is consistent with conventional pastimes of royalty. After losing everything, and even wagering their common wife, Draupadī, who is thereby publicly sexually harassed, their freedom is granted back by Dhṛtarāṣṭra, who caves in to Draupadī’s lament. Once the challenge of the wager—taking a chance—is brought up again, the Pāṇḍavas again lose everything and must subsequently spend thirteen years in exile, the final year incognito, and if exposed must repeat the years of exile. They complete it successfully and return to reclaim their portion of the kingdom, at which point the Kauravas refuse to allow the Pāṇḍavas even a minimal territory from which they might eke out a livelihood as rulers. Despite repeated attempts by the Pāṇḍavas at conciliation, mediated by their mutual cousin Kṛṣṇa, the Kauravas adopt a position of hostility, forcing the Pāṇḍavas into a corner where they have no choice but to fight. Alliances, loyalties, and obligations are publicly reckoned and distinguished, and the two sides agree to fight it out on a battlefield with their armies.

What is noteworthy about the scenario described in the Mahābhārata is that the Pāṇḍavas, but for imprudent decisions, conform their actions to standards of conventional moral expectations for people in their station and caste—including rising to the occasion of risky public challenges, as is the lot of warriors: engaging in activities that follow from good character traits (including courage—a Virtue Theoretic concern), engaging in activities with a promise of a good outcome (such as winning at dice—a Consequentialist concern), and agreeing to be bound by good rules of procedure (such as those that condition the game of dice—a Deontological concern). Spelled out, even the imprudence of the Pāṇḍavas is an outcome of their conventional moral practice (of Virtue Ethics, Consequentialism, and Deontology). This self-constraint by the Pāṇḍavas, characteristic of conventional moral practice, renders them vulnerable to the Kauravas, who are moral parasites: People who wish others to be restrained by conventional moral expectations so that they may be abused, but have no expectation of holding themselves to those standards. Ever attempting both compromise and conciliation, the Pāṇḍavas are in a predicament explained not by their imprudent decisions but by the hostility of the Kauravas. But for this hostility, exemplified by the rigged game of dice and the high stakes challenge they set, the Pāṇḍavas would have lived a peaceful existence and would never have been the authors of their own misfortune.

With all attempts at conciliation dashed by the Kauravas’ greed and hostility, war is a fait accompli. Kṛṣṇa agrees to be Arjuna’s charioteer in the fateful battle. What makes the impending war especially tragic is that the Pāṇḍavas are faced with the challenge of fighting not only tyrannical relatives that they could not care less for, but also loved ones and well-wishers, who, through obligations that arise out of patronage and professional loyalty to the throne, must fight with the tyrants. Bhīṣma, the granduncle of the Pāṇḍavas and the Kauravas, and an invincible warrior (gifted, or cursed, with the freedom to choose when he will die), is an example of one such well-wisher. He repudiates the motives of the Kauravas and sympathizes with the Pāṇḍavas, but due to an oath that precedes the birth of his tyrannical grandnephews (the Kauravas), he remains loyal to the throne on which the Kaurava father, Dhṛtarāṣṭra, presides. Arjuna, who looks upon Bhīṣma and others like him as loving elders, must subsequently fight him. The conflict and tender feelings between these parties were on display when, prior to the war, Arjuna’s eldest brother, Yudhiṣṭhira, sought the blessings of Bhīṣma on the battlefield to commence the war, and Bhīṣma, his enemy, and leader of the opposing army, blessed him with victory (Mahābhārata 6.43).

What follows prior to the battle are two important philosophical moves. First, Arjuna provides three arguments against fighting, each based on three basic ethical theories that comprise conventional morality: Virtue Ethics, Consequentialism, and Deontology. The essence of this package is the importance and centrality of the good (outcome) to an articulation and definition of the right (procedure). This is then followed by Kṛṣṇa’s prolonged response that consists in making a case for three philosophical alternatives: karma yoga (a form of Deontology), bhakti yoga (a fourth ethical theory, more commonly called Yoga, which does not define the Right by the Good), and jñāna yoga (a metaethical theory that provides a justification for the previous two). Indeed, Kṛṣṇa’s exploration of the three options constitutes the dominant content of the 18 chapters of the Gītā. The preoccupation with the good, characteristic of conventional morality, allows moral parasites to take advantage of conventionally good people, for conventionally good people will not transcend the bounds of the good to retaliate against moral parasites. Kṛṣṇa’s arguments, in contrast, are focused not on the Good, which characterizes conventional moral expectation, but the Right. With an alternate moral framework of Yoga that does not define the Right in terms of the Good, Kṛṣṇa is able to counsel Arjuna and the Pāṇḍavas to victory against the Kauravas: For as Arjuna and the Pāṇḍava brothers abandon the good of conventional morality, they are no longer sitting targets for the malevolence of the Kauravas. Moreover, this frees the Pāṇḍavas to take preemptive action against the Kauravas, resorting to deception and treachery to win the war. At the end of the war, the Pāṇḍavas are accused by the surviving Kauravas of immorality in battle at Kṛṣṇa’s instigation (Mahābhārata 9.60.30–34)—and indeed, the Pāṇḍavas do resort to deception and what might be thought of as treachery, given conventional moral practice.
Kṛṣṇa responds that there would have been no prospect of winning the war if constrained by conventional moral expectations (Mahābhārata 9.60.59). This seems like a shocking admission until we remember that war is the very dissolution of such conventions, and it is the Pāṇḍavas’ capacity to pivot to an alternate moral paradigm (Yoga) that defines the Right without respect to the good which allows for their victory both with respect to the battle and with respect to Just War. A new dharma, or a new ethical order, is the perfection of the practice, not the means. Unlike the Kauravas, who had no moral code and were parasites, the Pāṇḍavas do have an alternate moral code to conventional morality, which allows them to re-establish a moral order when the old one is undermined by moral parasitism. The kernel of the Gītā Just War Theory is hence indistinguishable from its arguments for Yoga.

7. Basic Moral Theory and Conventional Morality

To understand the Gītā is to understand its contribution to South Asian and world philosophy. This contribution consists in a criticism of conventional morality that prioritizes the Good in a definition of the Right. Conventional morality is comprised of three familiar theories: Virtue Ethics, Consequentialism, and a conventional version of Deontology. The Gītā’s unique contribution is completed by the defense of two procedural ethical theories that prioritize the Right choice over the Good outcome. The first of the two normative theories is the Gītā’s version of Deontology, called karma yoga, a practice of one’s natural duty that contributes to a world of diversity. The second of the two normative theories, and the fourth in addition to the three theories of conventional ethics, is a radically procedural option unique to the South Asian tradition, namely Yoga (compare Ranganathan 2017b), which the Gītā calls bhakti yoga. Yoga/Bhakti is distinguished for defining the Right without reference to the Good: The right thing to do is defined as devotion to the ideal of the Right—Īśvara, Bhagavān—Sovereignty (played by Kṛṣṇa in the dialogue and epic), and the Good is the incidental perfection of this devotion. The Gītā also includes a meta-ethical theory, jñāna yoga, that renders the epistemic and metaphysical aspects of the two normative ethical theories clear. The Gītā’s main aim with these procedural ethical theories is to provide an alternate moral framework for action and choice, which liberates the conventionally moral Arjuna—the other protagonist of the Gītā, in addition to Kṛṣṇa—from being manipulated and harassed by moral parasites. Moral parasites have no expectation of holding themselves accountable to conventional moral expectations of good behavior and choice but wish others, like Arjuna, to abide by such expectations so that they are easy to take advantage of.
At the precipice of moral conventions, undermined by moral parasites, the Bhagavad Gītā recommends bhakti yoga, devotion to Sovereignty—played by Viṣṇu’s avatāra, Kṛṣṇa—as the means of generating a new moral order free of parasites (compare Gītā 4.8), supported by an attention to the practice of a duty that allows one to contribute to a world of diversity. The key transition to this radical proceduralism of bhakti yoga is an abandonment of concerns for Good outcomes in favor of the ideal of the Right. By abandoning a concern for the Good, one is no longer self-constrained to act in good ways and will hence no longer be the easy target of parasites who take advantage of the conventionally good because of their goodness. Unlike moral parasites, the devotee of Sovereignty is not amoral or merely in life for themselves. As devotees of Sovereignty, they act in a manner that is in the interest of those devoted to sovereignty and are hence able to engage in just relationships of friendship and loyalty and to cut away relationships of manipulation and parasitism. This shift to the Right and away from the Good constitutes the kernel of the Gītā’s important contribution to Just War Theory. The just cause is the cause waged as part of a devotion to Sovereignty. The unjust cause is steeped in moral parasitism. As devotion to Sovereignty constitutes a practice of transforming one’s behavior into sovereign behavior, success is assured in the long run.

Much confusion about the Gītā persists because basic normative ethical options are not spelled out as theories about the Right or the Good: confusion both about its argument for bhakti yoga (with the expectation that there is some type of essentially religious theme afoot) and about the connection of bhakti yoga, as a response, to the previous moral and political concerns brought about by social conflict.

The question of the Right or the Good is central to the rigged game of dice, which includes engaging in activities that follow from good character traits (including courage), engaging in activities with a promise of a good outcome (such as winning at dice), and agreeing to be bound by good rules of procedure (such as those that condition the game of dice). Spelled out, even the imprudence of the Pāṇḍavas is an outcome of their conventional moral practice, which was motivated by a concern for the good. These three aspects of the game of dice exemplify the concerns of three prominent ethical theories: Virtue Ethics, which prioritizes a concern for good character, Consequentialism, which prioritizes good outcomes, and Deontology, which prioritizes good rules. Arjuna, at the start of the Gītā, provides three arguments against fighting, each based on three basic ethical theories that comprise conventional morality: Virtue Ethics, Consequentialism, and Deontology. Arjuna’s philosophical intuitions are indistinguishable from those that motivated him and his brothers to participate in the rigged game of dice. Kṛṣṇa’s response involves a sustained criticism of teleology, which includes Virtue Ethics and Consequentialism, and a rehabilitation of Deontology. Properly spelled out, we also see why the common view that bhakti yoga is a case for theism is mistaken. Clarity is had by defining and spelling out these theories as positions on the Right or the Good.

The Good causes the Right: Rosalind Hursthouse identifies Virtue Ethics as the view that the virtues, or states of goodness, are the basic elements in moral theory, and that they support and give rise to right action (Hursthouse 1996, 2013). Hence, on this account, Virtue Ethics exemplifies this first moral option. A reason for objecting to this characterization of Virtue Ethics is that the prioritization of virtue does not entail that right action is what follows from the virtues: An appropriate omission or non-action may be the proper outcome of virtue. This is consistent with the idea of Virtue Ethics that treats the virtues as the primary element in moral explanation. Yet, Virtue Ethical theories credit states of goodness (the virtues) with living well, which is in a broad sense right, so in this case, any theory that prioritizes virtue in an account of the life well lived endorses some version of the notion that the right procedure follows from good character. In South Asian traditions, the paradigm example of Virtue Ethics—one that denies that right action follows from the virtues while holding that a well-lived life does—is the ancient tradition of Jainism. According to the Jains, an essential feature of each sensory being (jīva) is virtue (vīrya), and this is clouded by action (karma). We ought to understand ourselves in terms of virtue, which is benign and unharmful, and not in terms of action, which intrudes on the rights of others. Jains are historically the staunchest South Asian advocates of strict vegetarianism and veganism as a means of implementing non-harm. As Jains identify all action as harmful, they idealize a state of non-action, sallekhanā, which (accidentally) results in death: This is the fruit of Jain moral observance (Soni 2017).

The Good justifies the Right: This category analyzes what is essential to Consequentialist theories. Accordingly, the right action or omission of action only has an instrumental value relative to some end, the good, and hence the good serves the function of justifying the right. Hence, an action and an omission of an action can be morally equivalent in so far as they are equally justified by some end. The right can be a rule or a specific action, but either way, it is justified by the ends. When the end is agent neutral, we could call the theory Utilitarianism. The most famous example of this type of theory in South Asian traditions is Buddhism, which takes the welfare of sentient beings as the source of obligation (Goodman 2009). In its classical formulation in the Four Noble Truths, it is duḥkha—discomfort or disutility—that is to be minimized by the elimination of agent-relative evaluation and desire, by way of a scripted program, the Eight-Fold Path: a project justified by agent-neutral utility. Yet, interpreted by common beliefs, the classical Buddhist doctrine seems like a hodgepodge of ethical commitments. For instance, the Buddha is recorded in the Aṅguttara Nikāya-s (I 189–190) as distinguishing between two kinds of dharmas, or ethical ends—those that are wholesome, such as moral rules, and those that are not, such as pathological emotions. It thus seems that dharma has more than one unrelated theoretical sense here.
By explicating the reasons that comprise Buddhist theory and entail its controversial claims, we see how this is part of the project of Consequentialism: Basic to all dharma is the end of harm reduction or welfare, but whereas some such dharmas, such as agent-neutral moral teachings, justify themselves as means to harm reduction, some dharmas, such as pathological emotions, that appear agent relative justify the meditational practice of mindfulness, thereby relieving us of having to treat these dharmas as possessing emulative or motivational force.

The Right justifies the Good: This is the inverse of the previous option, and while it may not be a popular way to think about the issue, it sheds light on the role of Deontological theories in moral disagreement. The goods of moral theory, on this account, may be actions or omissions: the former are often called duties, the latter, rights—these are moral choices. Whatever counts as a moral choice is something good and worth preserving in one’s moral theory. Yet, the reason that they are theoretically worth endorsing has to do with procedural criteria that are distinct from the goods of moral theory. Hence, this category makes use of a distinction between the definition of moral choices (by definition, good), and their justification: “Deontological theories judge the morality of choices by criteria different from the states of affairs those choices bring about” (Alexander and Moore 2012). The right (procedure) is hence prior not only to the goods of moral theory (moral choices) but also to their further consequences. This way of thinking about moral theory lays to rest a confusion: that if Deontologists consider the good outcomes in identifying duties or rights, they are thereby Consequentialists. This is a mistake that rests on a failure to distinguish between the substance of moral choices and the prior criteria that justify them.

In South Asian traditions, famous deontologists abound, including the Pūrva Mīmāṃsā tradition, and the Vedānta tradition. Pūrva Mīmāṃsā is a version of Deontological particularism (Clooney 2017), but also moral nonnaturalism, that claims that moral precepts are defined in terms of their beneficial properties but are justified by intuition (śruti), which, on its account, is the ancient body of texts called the Vedas (Ranganathan 2016b). Authors in the Vedānta tradition also often endorse Deontology and a procedural approach to ethics by way of criticizing problems with teleology, namely, that it apparently makes moral luck an irreducible element of the moral life (Ranganathan 2017c). The Pūrva Mīmāṃsā is the tradition in which ancient practices of animal sacrifices, largely abandoned and criticized, are defended as part of the content of good actions that we engage in for procedural considerations, though here too there is often an appreciation of the superiority of nonharmful interactions with nonhuman animals.

The Bhagavad Gītā is a prominent source of deontological theorizing, especially for the Vedānta tradition that draws upon it. As we shall see, in the Gītā, a deontological approach to ethical reasoning is formulated as karma yoga—good practice, which is to be endorsed for procedural reasons, namely that its perfection is a good thing. A distinguishing feature of karma yoga as a form of deontology is that it understands good action as something suited to one’s nature, which one can perfect, that contributes to a world of diversity (Gītā 3.20-24). One’s reason for endorsing the duty is not its further outcome, but rather that it is appropriate for one to engage in. The rationale for why something counts as one’s duty, though, has everything to do with its place within a tapestry of diverse action, of diverse beings, contributing to a world of diversity.

The Right causes the Good: This is a fourth moral option that is radically procedural. Whereas the previous options in moral theory define the things to be done, or valued, by the good, this fourth defines it by the Right. It is also the mirror opposite of the first option, Virtue Ethics. The salient example of this theory is Yoga, as articulated in Patañjali’s Yoga Sūtra, or Kṛṣṇa’s account of Bhakti yoga in the Gītā. Accordingly, the right thing to do is defined by a procedural ideal—the Lord of the practice—and approximating the ideal brings about the good of the practice, namely its perfection. Yoga acknowledges a primary Lord of practices as such: Sovereignty. This is Īśvara (the Lord, Sovereignty) or Bhagavān (especially in the Gītā), and it is defined by two features: It is untouched by past choice (karma) and is hence unconservative, and it is externally unhindered and is hence self-governing. In the Yoga Sūtra, these two features are further analyzed into two procedural ideals of disciplined practice—tapas (heat-producing, going against the grain, unconservativism) and svā-dhyāya (literally “self-study” which, in the context of the Yoga Sūtra that claims that to know is to control objects of knowledge, amounts to self-control or self-governance). Devotion to Sovereignty—what Patañjali calls Īśvara praṇidhāna, or approximating Sovereignty, called “bhakti yoga” in the Gītā—hence takes on the two further procedural ideals of unconservativism and self-governance (Ranganathan 2017b). According to the Yoga Sūtra, the outcome of such devotion is our own autonomy (kaivalya) in a public world. In the language of the Gītā, the outcome of such devotion is freedom from evil (Gītā 18:66).

Failing to be transparent about the four possible, basic ethical theories causes two problems. First, the possibilities of the Gītā are then understood in terms of the three ethical theories familiar to the Western tradition (Virtue Ethics, Consequentialism and Deontology), which are ironically the positions that are, together, constitutive of the conventional morality that the Gītā is critical of. One outcome of this interpretive orientation that treats the Gītā as explainable by familiar ethical beliefs in the Western tradition is that arguments for Yoga are understood as Consequentialist arguments (compare Sreekumar 2012), as though Yoga/Bhakti is an exhortation to be devoted for the sake of a good outcome. This is ironic, as the argument for Yoga is largely predicated on a criticism of Consequentialism in the Gītā itself, where action done for the sake of consequences is repeatedly criticized. The second problem, no less ironic, is the re-presentation of the argument as one rooted in theism—itself playing a role in the depiction of the text as religious, if by religion one means theism.

While Yoga/Bhakti seems superficially like theism, it is not. Theists regard God as the paradigmatic virtuous agent, which is to say Good. Right action and teaching emanate from God the Good. Theism is hence a version of Virtue Ethics, with God playing the paradigm role of the virtuous agent. For Yoga/Bhakti, the Lord is Right: goodness follows from our devotion to Sovereignty. Hence, our role is not to obey the instructions of God on this account, but to approximate Sovereignty as our own procedural ideal. The outcome is ourselves as successful, autonomous individuals who do not need to take direction from others. The Good of life is hence an outcome of the perfection of the practice of devotion to the procedural ideal of being a person: Sovereignty.

With the four basic normative ethical options transparent, we are in a position to examine the beginning of the Gītā, and specifically Arjuna’s three arguments against fighting in the impending war, and Kṛṣṇa’s response.

8. Arjuna’s Three Arguments Against Fighting

Prior to the commencement of the battle, on the very battlefield where armies are lined up in opposition, and with Kṛṣṇa as his charioteer, Arjuna entertains three arguments against fighting.

First, if he were to fight the war, it would result in death and destruction on both sides, including the death of loved ones. Even if he succeeds, there would be no joy in victory, for his family will largely have been decimated as a function of the war (Gītā 1.34-36). This is a Consequentialist and more specifically Utilitarian argument. In the South Asian context, this would be a prima facie classical Buddhist argument, in so far as Buddhist theory seeks the minimization of duḥkha (suffering) and the maximization of nirvāṇa—freedom from historical constraints that lead to discomfort. According to such arguments, the right thing to do is justified by some good (harm reduction, or the maximization of happiness), and here Arjuna’s reasoning is that he should refrain from fighting so as to ensure the good of avoiding harm.

Second, if the battle is between good and evil, his character is not that of the evil ones (the Kauravas), yet fighting a war would make him no better than his adversaries (Gītā 1.38-39). This is a Virtue Ethical argument. According to such arguments, the right thing to do is the result of a good, the virtues, or strength of character. Not only is this a Virtue-Ethical argument, it is a classical Jain position: The correct response to sub-optimal outcomes is not more action, but restraint in conformity to the virtues.

Third, war results in lawlessness, which undermines the virtue and safety of women and children (Gītā 1.41). This might be understood as an elaboration of the first Consequentialist argument: Not only does war end in suffering, which should be avoided, but it also leads to undermining the personal safety of women and children, and as their safety is good, we ought to avoid war so as to protect it. The argument can also be understood as a version of Kantian style Deontology.

An essential feature of Deontology is the identification of goods, whether these are actions (duties) or freedoms (rights), as what require justification on procedural grounds. A duty is hence not only something that is good to do, and a right not only something good to have, but something we have reason to do or allow. Such goods, duties, and rights constitute the social fabric and are justified, as Kant reasoned, in so far as they help us relate to each other in a Kingdom of Ends. Deontology is hence the inverse of Consequentialism: Whereas Consequentialism holds that the good outcome justifies the procedure, the Deontologist holds that some good state of affairs (actions, freedoms) is justified by a procedural consideration. This way of clarifying duty is wholly in keeping with the Pūrva Mīmāṃsā position that ethics (dharma) is a command distinguished by its value (Mīmāṃsā Sūtra I.2).

What Consequentialism, Virtue Ethics and Deontology have in common is the idea that the good—the valuable outcome—is an essential feature of making sense of the right thing to do. Morality defined or explained by way of the good is something that can be established as an outcome of reality, and can hence be conventionalized. Thinking about morality by way of the good helps us identify an area of moral reasoning we might call conventional morality: consisting of actions that are justified in so far as they promise to maximize the good (Consequentialism), lifestyle choices motivated by good character (Virtue Ethics), and good actions that we have reason to engage in (Deontology). War disrupts conventional morality as it disrupts the good. This is indeed tragic, in so far as conventional morality is organized around the Good.

But there is indeed another side to the story rendered clear by the narrative of the Mahābhārata. It was conventional morality that made it possible for the Kauravas to exercise their hostility against the Pāṇḍavas by restricting and constraining the Pāṇḍavas. The Pāṇḍavas could have rid themselves of the Kauravas by killing them at any number of earlier times when they had the chance in times of peace, and everyone who survived would be better off for having been rid of moral parasites as rulers and having the benevolent Pāṇḍavas instead. They could have accomplished this most easily by assassinating the Kauravas in secret or perhaps openly when they were not expecting it, for the Kauravas never worried about or protected themselves from such a threat because they counted on the virtue of the Pāṇḍavas. Yet, the Pāṇḍava fidelity to conventional morality created a context for the Kauravas to ply their trade of deceit and hostility. The game of dice that snared the Pāṇḍavas is a metaphor for conventional morality itself: a social practice that promises a good outcome (Consequentialism), constituted by good rules that all participants have reason to endorse (Deontology), and laudable actions that follow from the courage and strength of its participants to meet challenges head-on (Virtue Ethics).

The lesson of the Mahābhārata generalizes: Conventional morality places constraints on people who are conventionally moral, and this enables the maleficence of those who act so as to undermine conventional morality by undermining those who bind themselves with it. The only way to end this relationship of parasitism is for the conventionally moral to give up on conventional morality and engage moral parasites in war. This would be a just war—dharmyaṃ yuddham—and a just war in its very essence, for the cause would be to rid the world of moral parasites (Ranganathan 2019). Yet, from the perspective of conventional morality, which encourages mutually accommodating behavior, this departure is wrong and bad. Indeed, relying purely on conventional standards that encourage social interaction for the promise of a good, an argument for pacifism, such as the Jain argument, is more easily constructed than an argument for war. Hence, Arjuna’s three arguments against war.

9. Kṛṣṇa’s Response

Before the serious arguments, which Kṛṣṇa pursues to the end of the Gītā, come considerations that are by contrast less decisive, and on which he does not dwell, except sporadically, through the dialogue. Kṛṣṇa responds immediately by mocking Arjuna for his loss of courage. Indeed, if maintaining his virtue is a worry, appealing to Arjuna’s sense of honor is to motivate him via Virtue Ethical concern (Gītā 2.2-3, 2.33-7), intimating that Virtue Ethics is not uniquely determinative (justifying both the pacifist and activist approaches to war). He also makes the claim that paradise ensues for those who fight valiantly and die in battle (Gītā 2.36-7). This would be a Consequentialist consideration, intimating that Consequentialist considerations are not uniquely determinative (justifying both arguments to fight and to not fight). He also appeals to a Yogic metaphysical view: As we are all eternal, no one kills anyone, and so there are no real bad consequences to avoid by avoiding a war (Gītā 2.11-32). The last thesis counters the third and last of Arjuna’s arguments: If good practice that entrenches the welfare of women and children is in order, then the eternality of all of us should put an end to any serious concern about war on these grounds.

These three considerations serve the purpose of using the very same theoretical considerations that Arjuna relies on to argue against war to motivate fighting, or at least deflate the force of the original three arguments. The last claim, that we are eternal, is perhaps the most serious of the considerations. This is indeed a very basic thesis of a procedural approach to ethics for the following reason. People cannot be judged as outcomes; they are rather procedural ideals themselves—ideals of their own lives—and as such they are not reducible to any particular event in time. Hence, moving to a procedural approach to ethics involves thinking about people as centers of practical rationality that transcend and traverse the time and space of particular practical challenges. Buddhists are famous for arguing, as we find in the Questions of King Milinda, that introspection provides no evidence for the self: All one finds are subjective experiences and nothing that is the self. Indeed, the self as a thing seems to be reducible out of the picture—it seems like a mere grouping for causally related and shifting bodily and psychological elements. Such a Buddhist argument is aligned with a Consequentialist Ethic, geared toward minimizing discomfort. However, if we understand the self as charged with the challenges of practical rationality, and the challenge of morality as one of reining in the procedural aspects of our life, we have no reason to expect that the self is an object of our own experiences: It is rather an ideal relative to which we judge our practical challenge. It is like the charioteer, who is conscious not of him- or herself, but is engaged in driving. For the charioteer to look for evidence of him- or herself from experience would be to look in the wrong direction, but as the one responsible for the experiences that follow, the charioteer is in some sense always outside of the contents of his or her experience, transcending times and places.

Kṛṣṇa, as the driver of Arjuna’s battle-ready chariot, has the job of supporting Arjuna in battle, and so his arguments that aim at motivating Arjuna to fight are an extension of his literal role as charioteer in the battle, but also his metaphorical role as the intellect of the chariot, as set out in the Kaṭha Upaniṣad. One of the problems with the frame of conventional moral expectations that Arjuna brings to the battlefield is that it frames the prospects of war in terms of the good, but war is not good: It is bad. Even participants in a war do not desire the continuity of the war: They desire victory, which is the cessation of a war. So thinking about war in terms of the good gives us no reason to fight. Moreover, war is a dynamic, multiparty game that no one person can uniquely determine. The outcome depends upon the choices of many players and many factors that are out of the control of any single player. Game theoretically, this is debilitating, especially if we are to choose a course of action with the consequences in view. However, if the practical challenge can be flipped, then ethical action can be identified on procedural grounds, and one has a way by which to take charge of a low-expected-utility challenge via a procedural simplification: The criterion of moral choice is not the outcome; rather, it is the procedure. This might seem unconvincing. If I resort to procedure, it would seem imprudent because then I am letting go of winning (the outcome). However, there are two problems with this response. First, the teleological approach in the face of a dynamic circumstance results in frustration and nihilism—or at least, this is what Arjuna’s monologue of despondency shows. Thus, focusing upon a goal in the face of challenge is not a winning strategy. 
Indeed, when one thinks about any worthwhile pursuit of distinction (whether it is the long road to becoming an award-winning scientist or recovering from an illness), the a priori likelihood of success is low, and for teleological reasons, this gives one reason to downgrade one’s optimism, which in turn depletes one’s resolve. Focusing on outcomes ultimately curtails actions that can result in success in cases where the prospects of success are low. Call this the paradox of teleology: Exceptional and unusual outcomes that are desirable are all things considered unlikely, and hence we have little reason to work toward such goals given their low likelihood. Rather, we would be better off working toward usual, mundane outcomes with a high prospect of success, though such outcomes have a lower utility than the unusual outcomes. Second, if we can distinguish between the criterion of choice and the definition of duty—Deontology—then we have a way to choose duties that result in success, defined by procedural reasons. This insulates the individual from judging the moral worth of their action in terms of the outcome and hence avoids the paradox of teleology while pursuing a winning strategy (Gītā 2.40). The essence of the strategy, called yoga (discipline), is to discard teleology as a motivation (Gītā 2.50). Indeed, to be disciplined is to abandon the very idea of good (śubha) and bad (aśubha) (Gītā 12.17). In effect, practical rationality moves away from assessing outcomes.
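The decision-theoretic shape of the paradox of teleology described above can be illustrated with a toy expected-utility model. The numbers, the `expected_utility` helper, and the assumption that sustained practice raises the chance of eventual success are all stipulations of this sketch, not anything drawn from the Gītā or its commentators.

```python
# Illustrative sketch of the paradox of teleology. All values below
# are assumed for the example; nothing here comes from the source text.

def expected_utility(p_success, value):
    """Expected utility of pursuing a goal outright."""
    return p_success * value

# An exceptional goal: very valuable, but a priori unlikely.
exceptional = expected_utility(0.05, 100)   # 5.0
# A mundane goal: modest value, near-certain success.
mundane = expected_utility(0.95, 10)        # 9.5

# Outcome-focused (teleological) reasoning prefers the mundane goal,
# so the exceptional pursuit is never begun.
assert mundane > exceptional

def p_after_practice(years, p0=0.05, gain=0.9):
    """Chance of succeeding at least once over `years` of practice,
    assuming (as a stipulation of this model) that the per-year
    success chance grows linearly from p0 toward a ceiling `gain`."""
    p_fail = 1.0
    for y in range(years):
        p_year = p0 + (gain - p0) * (y / (years - 1)) if years > 1 else p0
        p_fail *= (1.0 - p_year)
    return 1.0 - p_fail

# After a decade of committed practice the exceptional outcome
# dominates, even though no single early year justified the pursuit
# on outcome-focused grounds.
assert expected_utility(p_after_practice(10), 100) > mundane
```

The point of the sketch is only that an agent who evaluates each step by its expected outcome never starts the low-probability, high-value pursuit, whereas an agent committed to the procedure itself accumulates the practice that makes success likely.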

To this end, Kṛṣṇa distinguishes between two differing normative moral theories and recommends both: karma yoga and bhakti yoga. Karma yoga is Deontology formulated as doing duty without the motive of consequence. Duty so defined might have beneficial effects, and Kṛṣṇa never tires of pointing this out (Gītā 2.32). However, the criterion of moral choice on karma yoga is not the outcome, rather it is the fittingness of the duty as the thing to be done that justifies its performance: Hence, better one’s own duty poorly performed than someone else’s well performed (Gītā 2.38, 47, 18.47). Yet, one’s duty properly done is good, so one can have confidence in the outcomes of one’s struggles if one focuses on perfecting one’s duty. Bhakti yoga in turn is Bhakti ethics: performance of everything as a means of devotion to the regulative ideal that results in one’s subsumption by the regulative ideal (Gītā 9.27-33). Metaphorically, this is described as a sacrifice of the outcomes to the ideal. Ordinary practice geared toward an ideal of practice, such as the practice of music organized around the ideal of music, provides a fitting example: The propriety of the practice is not to be measured by the quality of one’s performance on any given day, but rather by fidelity to the ideal that motivates a continued commitment to the practice and ensures improvement over the long run. In the very long run, one begins to instantiate the regulative ideal of the practice: music. Measuring practice in terms of the outcome, especially at the start of ventures like learning an instrument, is unrewarding as one’s performance is suboptimal. At the start, and at many points in a practice, one fails to instantiate the ideal. 
Given Bhakti, one finds meaning and purpose through the continuity of one’s practice, through difficulty and reward, and one’s enthusiasm and commitment to the practice remain constant, as it is not measured by outcomes but by fidelity to the procedural ideal—a commitment that is required to bring about a successful outcome.

Kṛṣṇa also famously entertains a third yoga: jñāna yoga. This is the background moral framework of bhakti yoga and karma yoga: What we could call the metaethics of the Gītā. Jñāna yoga, for instance, includes knowledge of Kṛṣṇa himself as the moral ideal, whose task is to reset the moral compass (Gītā 4.7-8, 7.7). It involves asceticism as an ancillary to ethical engagement—asceticism here is code, quite literally, for the rejection of teleological considerations in practical rationality. The proceduralist is not motivated by outcomes and hence attends to their duty as an ascetic would if they took up the challenge of action. What this procedural asceticism reveals is that the procedural ideal (Sovereignty) subsumes all of us, and hence, jñāna yoga yields an insight into the radical equality of all persons (Gītā 5.18).

Kṛṣṇa, Sovereignty, sets himself up as the regulative ideal of morality in the Gītā in two respects. First, he (Kṛṣṇa) describes his duty as lokasaṃgraha, the maintenance of the welfare of the world, and all truly ethical action as participating in this function (Gītā 3.20-24). To this extent, he must get involved in life to re-establish the moral order, if it diminishes (Gītā 4.7-8). Second, he acts as the regulative ideal of Arjuna, who is confused about what to do. The outcome of devotion (bhakti) to the moral ideal—Kṛṣṇa here—is freedom from trouble and participation in the divine (Gītā 10.12), which is to say, the regulative ideal of ethical practice—the Lord of Yoga (Gītā 11.4). This, according to Kṛṣṇa, is mokṣa—freedom for the individual. Liberation so understood is intrinsically ethical, as it is about participation in the cosmic regulative ideal of practice—what the ancient Vedas called Ṛta.

In identifying his own Sovereignty with his function as protector of the world, Kṛṣṇa allows for a way of thinking about Deontological action, karma yoga, as not disconnected from the fourth ethical theory: Bhakti.

Given such considerations, it is not surprising that to some commentators, such as M. K. Gandhi, a central concept of the Gītā is niṣkāmakarma—acting without desire. This in turn is closely related to sthitaprajña—literally “still knowing” (Gandhi 1969: vol. 37, 126). Gandhi goes so far as to claim that these doctrines imply that we should not even be attached to good works (Gandhi 1969: vol. 37, 105). While this sounds dramatic and perhaps paradoxical, it is a rather straightforward outcome of procedural ethical thinking. Even in the case of Deontology, where duty is defined as a good thing to be done, one endorses such actions for procedural reasons, and not merely because they are good. Hence, clarity with respect to the procedural justification for duty deprives us of the possibility of being motivated by a desire for the duty in question, for such a desire treats the goodness of the action as a motivating consideration, and this is exactly what Deontology denies: there may be many competing good things to do, but not all count as our duty, and our duty is what has a procedural stamp of approval. In the case of Bhakti, however, the distance between the desire for good works and moral action is even sharper, for goodness is an outcome of right procedure and not an independent moral primitive that one could use to motivate action.

Given the initial challenge of motivating Arjuna to embrace war, Kṛṣṇa’s move to a radically procedural moral framework as we find in jñāna yoga undermines the motivational significance of the various arguments from conventional morality and against fighting, which give pride of place to the good. Yet, in shifting the moral framework, Kṛṣṇa has not abandoned dharma as such, but has rather proceduralized it.

Hence, the morality of engaging in the war, and engaging in any action, can be judged relative to such procedural considerations. To this end, he leaves Arjuna with two competing procedural normative theories to draw from: karma yoga (Deontology) and bhakti yoga (Yoga/Bhakti).

10. Gītā’s Metaethical Theory

The previous section reviewed the obvious normative implications of the Gītā, as it provides two competing theories that Kṛṣṇa endorses: karma yoga (Deontology) and bhakti yoga (Bhakti/Yoga). Both are procedural theories, but Deontology identifies what is to be done as a good, and bhakti dispenses with the good in understanding the right: It is devotion to the procedural ideal that defines the right. They are united in a common metaethical theory. Metaethics concerns the assumptions and conditions of normative ethics, and the metaethics of the Gītā is jñāna yoga: the discipline of knowledge (jñāna). One of the entailments of the Gītā, as one can find in almost all South Asian philosophy, is that morality is not a fiction: Rather, there are facts of morality that are quite independent of our perspective, wishes, hopes, and desires. This is known as Moral Realism.

a. Moral Realism

In Chapter 4, Kṛṣṇa specifies himself as the moral ideal whose task is to reset the moral compass (Gītā 4.7-8). This section discusses the characteristics of the moral ideal, but also its relationship to ordinary practice. What follows in the fifth chapter and beyond seems increasingly esoteric but morally significant. The fifth chapter discusses the issue of renunciation, which in the Gītā amounts to a criticism of Consequentialism and teleology on the whole. The various ascetic metaphors in the Gītā are in short ethical criticisms of teleology. One of the implications of this criticism of teleology is the equality of persons understood properly: All of us are equal in moral potential if we renounce identifying morality with virtue (Gītā 5.18). The sixth chapter ties the broader practice of Yoga into the argument. The continuity between Devotion (Bhakti) and Discipline (Yoga) as a separate philosophy is seamless, and if we were to study the Yoga Sūtra, we would find virtually the same theory as the Bhakti yoga account in the Gītā. Here in chapter six, we learn about the equanimity that arises from the practice of yoga (Gītā 6.9). Indeed, as we abandon the paradox of teleology, we ought to expect that ethical practice results in stability: No longer allowing practical rationality to be destabilized by a desire for unlikely exceptional outcomes, or likely disappointing outcomes, we settle into moral practice geared towards the stability of devotion to our practice and the regulative ideal. Chapter seven returns to the issue of Bhakti in full, with greater detail given to the ideal (Gītā 7.7). Chapter eight brings in reference to Brahman (literally “Development”) and ties it with the practice of yoga. Moral Realism has many expressions (Brink 1989, 1995; Shafer-Landau 2003; Sayre-McCord 2015; Copp 1991), but one dominant approach holds that moral value is real.
Chapter nine introduces the element of Moral Realism: All things that are good and virtuous are subsumed by the regulative ideal (Gītā 9.5). The ideal is accessible to anyone (Gītā 9.32).

i. Good and Evil

Chapter ten contends with the outward instantiation of the virtues of the ideal. It is claimed that the vices too are negative manifestations of the ideal (Gītā 10.4-5). This is an acknowledgment of what we might call the moral responsibility principle, the opposite of the moral symmetry principle, which claims that two actions are of the same moral worth if they have the same outcome. The moral responsibility principle claims that different outcomes share a procedural moral value if they arise from devotion to the same procedural ideal. As outcomes, vices are a consequence of a failure to instantiate the moral ideal; hence the moral ideal is responsible for them as well. This only shows that devotion to the ideal is preferable (Gītā 10.7).

It may seem counterintuitive that we should not understand the moral ideal only in terms of good outcomes. To define the procedural ideal by way of good outcomes, as though the good outcome is a sign of the procedural ideal, reverses the explanatory order of Bhakti by treating the Good as a primitive mark of the right and of morality as such; hence, to avoid this rejection of Bhakti, we have to accept that doing the right thing can result in further consequences that are bad. For instance, practicing the violin every day will likely yield lots of bad violin performances, especially in the beginning, and this is no accident but a function of one’s devotion to the procedural ideal of practicing the violin. In the Western tradition, the notion that dutiful action can result in further consequences that may be bad is often known as the Double Effect, and it has been used as a defense of scripted action in the face of suboptimal outcomes—according to this Doctrine of Double Effect, one is only responsible for the primary effect, and not the secondary negative effects. Yet, as noted, this doctrine can be used to justify any choice, for one could always relegate the negative effects of a choice to the secondary category and the primary effect to one’s intended effect (Foot 2007). The moral responsibility principle is in part a response to such a concern: One cannot disown the effects of one’s actions as though they were secondary to one’s intention. The good and the bad are a function of devotion to the ideal and must be affirmed as right, though not necessarily good. This parsing rules out analyzing choice into a primary effect and a secondary one: There is the primary choice of the action, which is right and not an outcome, and then there is the outcome, good or bad.

With the karma yoga of the Gītā too, however, double effect drops out of the picture. Good action that we endorse for procedural reasons (karma yoga) might result in bad secondary outcomes, for which we are responsible; yet in this case we have not perfected our duty, and when it is perfected, there are no deleterious secondary outcomes. In so far as double effect is a sign of failure to execute one’s duty properly, one cannot take credit for a primary effect while disavowing a secondary effect. Double effect enters the picture precisely when we fail in some way to do our duty. In so far as such failure is a function of one’s devotion to one’s duty, as something to be perfected, one must be responsible for all the effects of one’s actions.

Kṛṣṇa in the Gītā recommends renouncing outcomes as such, and this may seem to militate against the notion that we are responsible for the outcomes of our choices. On a procedural approach to action, though, we renounce the outcomes of actions precisely because they are not, on the whole, anything to calibrate moral action to, as they may be good or bad.

Chapter eleven refers to the empirical appreciation of the relation of all things to the regulative ideal. Here, in the dialog, Kṛṣṇa gives Arjuna special eyes to behold the full outcome of the regulative ideal—his cosmic form, which Arjuna describes as awe-inspiring and terrifying. Good and bad outcomes of reality are straightforwardly acknowledged as outcomes of the regulative ideal.

Chapter twelve focuses on the traits of those who are devoted to the regulative ideal. They are friendly and compassionate and do not understand moral questions from a selfish perspective (Gītā 12.13). Importantly, they renounce teleological markers of action: good (śubha) and evil (aśubha) (Gītā 12.17). Yet they are devoted to the welfare of all beings (Gītā 12.4). The Bhakti theory suggests that these are not inconsistent: If the welfare of all beings is the duty of the regulative ideal, Kṛṣṇa (Gītā 3.24), then ethical practice is about conformity to this duty. And this is not arbitrary: If the procedural ideal (the Lord) of unconservativism and self-governance accounts for the conditions under which a being thrives, then the welfare of all beings is the duty of the ideal. The outcome is not what justifies the practice; the good outcome is the perfection of the practice.

ii. Moral Psychology

Moral psychology in the Western tradition is often identified with the thought processes of humans in relation to morality. In the South Asian context, mind is often given an external analysis: The very content of mind is the content of the public world. Hence, psychology becomes coextensive with a more generally naturalistic analysis of reality. To this extent, moral psychology is continuous with a general overview of nature.

Chapter thirteen emphasizes the distinction of the individual from their body—this follows from the procedural analysis of the individual as morally responsible yet outside of the content of their experiences. Chapter fourteen articulates the tri-guṇa theory that is a mainstay of Sāṅkhya and Yoga analyses. Accordingly, aside from persons (puruṣa), nature (prakṛti) is composed of three characteristics: sattva (the cognitive), rajas (activity), and tamas (inertia). Nature so understood is relativized to moral considerations and plays an explanatory role that ought to be downstream from regulative choices. Chapter fifteen articulates the success of those who adhere to Bhakti: “Without delusion of perverse notions, victorious over the evil of attachment, ever devoted to the self, turned away from desires and liberated from dualities of pleasure and pain, the undeluded go to that imperishable status” (Gītā 15.5).

Chapter sixteen is an inventory of personalities relative to the moral ideal. Chapter seventeen returns to the issue of the three qualities of nature, but this time as a means of elucidating moral character. Most importantly, it articulates the bhakti theory in terms of śraddhā (commitment), often also identified with faith: “The commitment of everyone, O Arjuna, is in accordance with his antaḥ karaṇa (inside helper, inner voice). Everyone consists in commitment. Whatever the commitment, that the person instantiates” (Gītā 17.3). Here, we see the theory of bhakti universalized in a manner that abstracts from the ideal. Indeed, we are always making ourselves out in terms of our conscience—what we identify as our moral ideal—and this warrants care, as we must choose the ideal we seek to emulate carefully. The three personality types, following the three characteristics of nature, choose differing ideals. Only the illuminated choose deities as their ideals. Those who pursue activity as an ideal worship functionaries in the universe (yakṣas and rākṣasas), while those who idealize recalcitrance worship the departed and inanimate things (Gītā 17.4).

b. Transcending Deontology and Teleology

In the final chapter, Kṛṣṇa summarizes the idea of renunciation. Throughout the Gītā, this has been a metaphor for criticizing teleology. The practical reality is that action is obligatory as a part of life, and yet those who can reject being motivated by outcomes as the priority in ethical theory are true abandoners (Gītā 18.11). Unlike those who merely choose the life of the recluse (being a hermit, or perhaps joining a monastery), the true renunciate has rid themselves of teleology. A new paradox ensues. Those who operate under the regulative ideal are increasingly challenged to account for their action in terms of the ideal. This means that it becomes increasingly difficult to understand oneself as deliberating and thereby choosing. For example, the musically virtuous person, who has cultivated this virtue by devotion to the ideal of music, which abstracts from particular musicians and performances, no longer needs to explain their performance by the entry-level rules and theory taught to beginners. Often one sees this narrative used to motivate Virtue Ethics (Annas 2004), but in the case of Bhakti, the teacher is not a virtuous person but rather our devotion to the regulative ideal: This yields knowledge, and the ideal is procedural, not actual dispositions or strengths. This explains how it is that virtuosos can push their craft and continually reset the bar for what counts as good action—for in these cases there is no virtuous teacher to defer to; rather, leaders in their fields set the standards for others. Their performance constitutes the principles that others emulate. Bhakti is the generalization of this process: Devotion to the procedural ideal leads to performances that reset one’s own personal standards and, in the long run, everyone’s standards. Thus, Bhakti is not obviously a version of moral particularism, which often denies the moral importance of principles (Dancy 2017).
Principles are surely important, but they are generated by devotion to the ideal of the Lord: unconservativism and self-governance. One of the implications of this immersion in procedural practical rationality is the utter disavowal of teleological standards to assess progress. In this light, claims like “He who is free from the notion ‘I am the doer,’ and whose understanding is not tainted—slays not, though he slays all these men, nor is he bound” (Gītā 18.17) are explicable by the logic of devotion to a procedural ideal. On the path of devotion, individuals cannot themselves take credit, lest they confuse the morality of their action with the goodness of their performance. Hence, we find that “that agent is said to be illuminated (sattvika) who is free from attachment, who does not make much of himself, who is imbued with steadiness and zeal and is untouched by successes and failure” (Gītā 18.26).

At this juncture, Kṛṣṇa introduces an explicitly deontological account of duty, which cashes out duty in terms of the goodness of what is to be done, relative to one’s place in the moral order. Duty, caste duty specifically, is duty suited to one’s nature (Gītā 18.41). He also recalls the procedural claim: better one’s own duty poorly done than another’s well done (Gītā 18.47). Kṛṣṇa claims that the right thing to do is specified by context-transcendent rules that take into account life capacities and situate us within a reciprocal arrangement of obligations and support (Gītā 12.5-13, 33-35). These are moral principles. He further argues that good things happen when people stick to their duty (Gītā 2.32). Deontologically, this is to be expected if duty is good action. Yet Kṛṣṇa has also defended a more radical procedural ethic, that of Bhakti, and this is the direction of his dialectic, which allows the individual to be subsumed by the moral ideal (Gītā 18.55). However, this subsumption leads not only to renouncing outcomes to the ideal; in the final analysis, it should also lead to giving up moral principles—good rules—as a sacrifice to the ideal (Gītā 18.57). Indeed, especially if good actions and rules are themselves a mere outcome of devotion, moral progress would demand that we abandon them as standards for assessing our own actions as we pursue devotion.

In the Western tradition, going beyond duty in service of an ideal of morality is often called supererogation (for a classic article on the topic, see Urmson 1958). Here, Kṛṣṇa appears to be recommending the supererogatory as a means of embracing Bhakti. This leads to excellence in action that surmounts all challenges (Gītā 18.58). This move, however, treats Deontology and its substance—moral rules, principles, and good actions—as matters to be sacrificed for the ideal. Hence, in light of the tension between bhakti and karma yoga, and given that bhakti yoga is the radically procedural option, which does away with teleological considerations altogether—the very considerations that lead to Arjuna’s despondency in the face of evil (the moral parasitism of the Kauravas)—Kṛṣṇa recommends bhakti: Arjuna should abandon all ethical claims (dharmas) and come to him, and he (the procedural ideal, the Lord) will relieve Arjuna of all evil (Gītā 18.66). In the abstract, this seems like an argument for moral nihilism, but it is consistent with the logic of the moral theory of bhakti, which Kṛṣṇa has defended all along.

This conclusion is exactly where the Moral Transition Argument (MTA) should end if it is a dialectic that takes us from teleological considerations to a radical proceduralism. The MTA takes us to a procedural ethics on the premise of the failures of teleological approaches. In the absence of this overall dialectic, the concluding remarks of Kṛṣṇa seem both counterintuitive and contradictory to the main thrust of his argument. He has, after all, spent nearly sixteen chapters of the Gītā motivating Arjuna to stick to his duty as a warrior, and here he recommends abandoning that for the regulative ideal: himself. If right action is ultimately defined by the regulative ideal, however, then devotion to this ideal would involve sacrificing conventional moral expectations.

When assessing the moral backdrop of the Bhagavad Gītā in the Mahābhārata, it is quite apparent that it is conventional morality, organized around the good, that creates the context in which the Pāṇḍavas are terrorized. Hence Kṛṣṇa’s recommendation that Arjuna should simply stop worrying about all moral considerations and approximate him as the regulative ideal is salutary, insofar as Kṛṣṇa represents the regulative ideal of yoga: unconservativism united with self-governance. It is also part of a logic that pushes us to embrace a radical procedural moral theory at the very breakdown of conventional morality: war.

After the Bhagavad Gītā ends, and the Pāṇḍavas wage war on the Kauravas, Kṛṣṇa in the epic the Mahābhārata counsels the Pāṇḍavas to engage in acts that violate conventional moral expectations. Viewed through the lens of conventional morality, this seems bad, and wrong. However, it was conventional morality that the Kauravas used as a weapon against the Pāṇḍavas, so Kṛṣṇa’s counsel was not only expedient; it also permitted an alternative moral standard to guide the Pāṇḍavas at the breakdown of conventional morality. Kṛṣṇa himself points out that winning against the Kauravas would not have been possible had the Pāṇḍavas continued to play by conventional morality (compare Harzer 2017). The argument for the alternative, though, made no appeal to success or the good as a moral primitive. It appealed to the radical proceduralism of Bhakti.

11. Scholarship

Influential scholarship on the Bhagavad Gītā begins with famous Vedānta philosophers, who at once acknowledged the Gītā as a smṛti text—a remembered or historical text—but treated it on par with the Upaniṣads of the Vedas: a text with intuited content (śruti). In the context of the procedural ethics and Deontology of the later Vedic tradition, as we find in the Pūrva Mīmāṃsā and Vedānta traditions, the Vedas are treated as a procedural justification for the various goods of practical rationality. For Vedānta authors, concerned with the latter part of the Vedas, the Gītā is an exceptional source, as it not only summarizes the teleological considerations of the earlier part of the Vedas but also pursues the Moral Transition Argument (from teleology to proceduralism) to a conclusion that we find expressed in the latter part of the Vedas, while ostensibly also endorsing a caste and Brahmanical frame that made their scholarship and activity possible. Yet the commentaries on the Gītā differ significantly.

Competing commentaries on the Gītā are interesting independently of their fidelity to the text (compare Ram-Prasad 2013). Yet, as readings, they differ in accuracy and relevance. If interpretation—explanation by way of what one believes—is the default method of reading philosophical texts, then we should expect the various commentaries on the Gītā from such philosophers to differ in accordance with the beliefs of the interpreter. Interpreted, there are as many accounts of the Gītā as there are belief systems of interpreters. The standard practice in philosophy (explication), however, employs logic to tease out reasons for controversial conclusions, so that contributions can be placed within a debate. This allows philosophers who disagree to converge on philosophical contributions as contributions to a disagreement. Put another way, in allowing us to understand a text such as the Gītā in terms of its contribution to philosophical debate, an explicatory approach allows us to formulate divergent opinions about the substantive claims of such a text. (For more on the divergence between interpretation and explication, see Ranganathan 2021.) In the South Asian tradition, the term often used to capture the challenge of philosophical understanding of texts is mīmāṃsā: investigation, reflection. This idea parts ways with interpretation, as investigation is not an explanation of what one already believes, and shares much more with the explicatory activity of philosophy. Not all traditional scholars were equally skilled in differentiating their beliefs from the challenge of accounting for the Gītā, however.

Śaṅkara, in his famous preamble to his commentary on the Brahma Sūtra, argues in general that ordinary reality as we understand it is a result of a superimposition of subjectivity and the objects of awareness, resulting in an ersatz reality, which operates according to definite natural regularities (compare the preamble, Śaṅkara [Ādi] 1994). Śaṅkara’s view is hence an interpretive theory of ersatz reality: It is by this confusion of the perspective of the individual and what they experience that we get the world as we know it. According to Śaṅkara, hence, laudable teachings on dharma, from Kṛṣṇa too, help desiring individuals regulate their behavior within this ersatz reality—a position defended as “desireism” (Marks 2013). However, in his commentary on the Gītā, Śaṅkara claims that those who are interested in freedom (mokṣa) should view dharma as an evil (commentary at 4.21, Śaṅkara [Ādi] 1991), for dharma brings bondage. On the whole, Śaṅkara argues that the point of the Gītā is not a defense of engaged action (which is what Kṛṣṇa defends) but rather non-action—renunciation. The moment we have knowledge, the individual will no longer be able to engage in action (Gītā Bhāṣya 5)—a position reminiscent of the Sāṅkhya Kārikā (67). The Gītā’s theme, in contrast, is that jñāna yoga provides insight that is ancillary to karma yoga and especially bhakti yoga. When Kṛṣṇa argues at the end (Gītā 18.66) that we should abandon all dharmas and approach him, this seems like a vindication of Śaṅkara’s gloss. If Śaṅkara had adopted the policy of explication, Gītā 18.66 and other controversial claims in the text would have to be explained by general theories entailed by what is said everywhere else in the Gītā.
Interpreters, in contrast, seize on claims in isolation—the ones that reflect their doxographic commitments—and use these as a way to make sense of a text, and that indeed seems to be Śaṅkara’s procedure: If he were to elucidate a controversial claim such as we find at Gītā 18.66 by way of the rest of the text, he would have to take at face value Kṛṣṇa’s various positive declarations and endorsements of action. In short, it is unclear whether one could arrive at Śaṅkara’s reading of the Gītā if one were not already committed to the position. Explicated (that is, explained as an entailment of other theses expressed or entailed by the text), the controversial claim at Gītā 18.66 would not have the status of a first principle for reading the Gītā, but would rather be something to be logically explained by theories defended elsewhere in the text.

Rāmānuja, perhaps the most influential philosopher in India in the early twenty-first century, though not popularly thought to be so (Potter 1963: 252-253), endorses Kṛṣṇa’s arguments for karma yoga and bhakti yoga, but is challenged at the end when Kṛṣṇa recommends that we abandon all dharmas and seek out Kṛṣṇa, for this contradicts the idea of Kṛṣṇa as the moral ideal, whom we please by way of dutiful action—an idea that Kṛṣṇa has some part in encouraging in the Gītā. So Rāmānuja argues in his Gītā Bhāṣya (commentary) on 18.66 (Rāmānuja 1991) that there are two possible ways of reading this ironic ending, given everything that has preceded it in the Gītā: (a) Kṛṣṇa is recommending the abandonment of Deontology and the fixation on good rules (as we find set out in the secondary, Vedic literature) in favor of an ethics of devotion; or (b) Kṛṣṇa is providing encouragement to his friend, who is overwhelmed by an ethic of devotion, an encouragement that is itself consistent with the ethic of devotion. Rāmānuja’s two readings are outcomes of attempting to apply the theories of karma yoga and bhakti yoga to this last claim. If one assumes karma yoga as filling out duties, then indeed, Kṛṣṇa seems to be rejecting this in some measure. If one assumes bhakti, then 18.66 seems to be a straight entailment of the theory. However, Rāmānuja does insist, in keeping with what is said elsewhere, that this last controversial claim cannot be read as an invitation to abandon all ethical action as such, as this does not follow from the preceding considerations. Doing one’s duty is itself a means of devotion—an argument Kṛṣṇa delivers at the start of the Gītā—and so to this extent, duty cannot be abandoned without abandoning a means of devotion; not to mention that one’s duty is something suitable to oneself that one can perfect. One can and should, however, abandon conventional moral expectations, also called dharmas.
This criticism of conventional dharma is at the root of the motivation for the Gītā.

Modern commentators on the Gītā largely continue the tradition in the literature of interpreting South Asian thought: To interpret South Asian thought is to use one’s beliefs in explaining the content of a text, and these beliefs are often derived from the interpreter’s exposure to Western theories—unsurprising if the theory of interpretation is generated by the historical account of thought in the Western tradition (Ranganathan 2021, 2018b, 2018a). Scholars who interpret South Asian philosophy and the Gītā, given beliefs developed within the historical context of Western philosophy, will hence be inclined to read the Gītā in terms of the familiar options of the Western tradition. Here, Bhakti/Yoga is absent as a moral theory, and the main options are instead Virtue Ethics, Consequentialism, and Deontology. If we had to interpret the Gītā with these options, Kṛṣṇa’s encouragement that those who stick to their duty and are devoted to him will meet a good outcome sounds rather like Rule Consequentialism—the idea that we should adopt a procedure that promises to bring a good result in the long term (Sreekumar 2012; Theodor 2010). Deontological interpretations, by authors such as Amartya Sen, have been floated and criticized (Anderson 2012). Explicated, we look to the perspectives themselves for a theory that entails the various controversial claims, and we thereby bypass trying to assess the Gītā by way of the familiar options of the Western tradition. One of the outcomes, of course, is the acknowledgment of a fourth moral theory: Bhakti/Yoga.

A further difficulty with seeing the Gītā via interpretation is the trouble it causes for making sense of various claims Kṛṣṇa makes. For instance, explicated, the Gītā is a push for a procedural approach to morality that undermines the importance of the good, and of outcomes in general, for thinking about practice. One of the results of this dialectic is that Kṛṣṇa, as the regulative ideal, takes responsibility for outcomes as something that is out of the control of the individual. In the absence of the context of practical rationality, this seems like an argument for hard determinism (Brodbeck 2004). Yet, when we bring Bhakti/Yoga back into the picture, this is consistent with the devaluation of outcomes that comes with thinking about morality as a matter of conforming to the procedural ideal: The ideal accounts for the outcomes, not our effort, so the closer we are to a procedural approach and bhakti, the better the outcomes, but the less one will be able to call upon personal effort as the explanation for the outcomes.

All things considered, reading the Gītā via interpretation renders it controversial, not merely in scope and topic (for all philosophy is controversial in this way) but also in terms of content—it is unclear what the text has to say, for the reading is determined in large measure by the beliefs of the interpreter. Yet, ironically, interpretation deprives us of the capacity to understand disagreement, as we can thereby only understand in terms of what we believe (and disagreement involves what we do not believe), so the controversy of the conflicting interpretations of the Gītā remains opaque. Explication—explanation by way of the logic that links reasons with controversial conclusions—renders the content of controversy clear, while also allowing us to converge on a reading even when we substantively disagree with its content. The Gītā itself displays such explicatory talent, as it constitutes an able exploration of moral theoretical disagreement. Students of the text benefit from adopting its receptivity to dissent, both in being able to understand its contribution to philosophy and in terms of the inculcation of philosophical thinking.

12. References and Further Reading

  • The Aitareya Brahmanam of the Rigveda. 1922. Translated by Martin Haug. The Sacred Books of the Hindus, vol. 2. Allahabad: Sudhindra Nath Vas, M.B.
  • Alexander, Larry, and Michael Moore. 2012. “Deontological Ethics.” In The Stanford Encyclopedia of Philosophy (Winter 2012 Edition), edited by Edward N. Zalta. http://plato.stanford.edu/archives/win2012/entries/ethics-deontological/.
  • Allen, Nick. 2006. “Just War in the Mahābhārata.” In The Ethics of War: Shared Problems in Different Traditions, edited by Richard Sorabji and David Rodin, 138-49. Hants, England: Ashgate Publishing Limited.
  • Anderson, Joshua. “Sen and the Bhagavad Gita: Lessons for a Theory of Justice.” Asian Philosophy 22, no. 1 (2012): 63-74.
  • Annas, Julia. “Being Virtuous and Doing the Right Thing.” Proceedings and Addresses of the American Philosophical Association 78, no. 2 (2004): 61-75.
  • Brink, David Owen. 1989. Moral Realism and the Foundations of Ethics. Cambridge Studies in Philosophy. Cambridge; New York: Cambridge University Press.
  • Brink, David Owen. 1995. “Moral Realism.” In The Cambridge Dictionary of Philosophy, edited by Robert Audi, 511-512. Cambridge; New York: Cambridge University Press.
  • Brodbeck, Simon. “Calling Krsna’s Bluff: Non-Attached Action in the Bhagavadgītā.” Journal of Indian Philosophy 32, no. 1 (2004): 81-103.
  • Cabezón, José Ignacio. “The Discipline and its Other: The Dialectic of Alterity in the Study of Religion.” Journal of the American Academy of Religion 74, no. 1 (2006): 21-38.
  • Clooney, Francis X. 2017. “Toward a Complete and Integral Mīmāṃsā Ethics: Learning with Mādhava’s Garland of Jaimini’s Reasons.” In The Bloomsbury Research Handbook of Indian Ethics, edited by Shyam Ranganathan, 299-318. Bloomsbury Research Handbooks in Asian Philosophy. London: Bloomsbury Academic.
  • Copp, David. “Moral Realism: Facts and Norms.” Ethics 101, no. 3 (1991): 610-624.
  • Dancy, Jonathan. 2017. “Moral Particularism.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta. https://plato.stanford.edu/archives/win2017/entries/moral-particularism/.
  • Davis, Richard H. 2015. The Bhagavad Gita: a Biography. Princeton: Princeton University Press.
  • Ellwood, R.S., and G.D. Alles. 2008. “Bhagavad-Gita.” In The Encyclopedia of World Religions. Facts on File.
  • Foot, Philippa. 2007. “The Problem of Abortion and the Doctrine of the Double Effect.” In Ethical Theory: An Anthology, edited by Russ Shafer-Landau, 582-589. Blackwell Philosophy Anthologies. Malden, MA: Blackwell Pub.
  • Fritzman, J. M. “The Bhagavadgītā, Sen, and Anderson.” Asian Philosophy 25, no. 4 (2015): 319-338.
  • Gandhi, M. K. 1969. The Collected Works of Mahatma Gandhi. New Delhi: Publication Division.
  • Goodman, Charles. 2009. Consequences of Compassion an Interpretation and Defense of Buddhist Ethics. Oxford: Oxford University Press.
  • Gottschalk, Peter. 2012. Religion, Science, and Empire: Classifying Hinduism and Islam in British India. Oxford: Oxford University Press.
  • Harzer, Edeltraud. 2017. “A Study in the Narrative Ethics of the Mahābhārata.” In The Bloomsbury Research Handbook of Indian Ethics, edited by Shyam Ranganathan, 321-340. London: Bloomsbury Academic.
  • Hursthouse, Rosalind. 1996. “Normative Virtue Ethics.” In How Should One Live?, edited by Roger Crisp, 19–33. Oxford: Oxford University Press.
  • Hursthouse, Rosalind. 2013. “Virtue Ethics.” In The Stanford Encyclopedia of Philosophy (Fall 2013 Edition), edited by Edward N. Zalta. http://plato.stanford.edu/archives/fall2013/entries/ethics-virtue/.
  • Jezic, Mislav. 2009. “The Relationship Between the Bhagavadgītā and the Vedic Upaniṣads: Parallels and Relative Chronology.” In Epic Undertakings, edited by R.P. Goldman and M. Tokunaga, 215-282. Motilal Banarsidass Publishers.
  • Kripke, Saul A. 1980. Naming and Necessity. Cambridge, Mass.: Harvard University Press.
  • Marks, J. 2013. Ethics Without Morals: In Defence of Amorality. Routledge.
  • McMahan, Jeff. 2009. Killing in War. Oxford: Oxford University Press.
  • Potter, Karl H. 1963. Presuppositions of India’s Philosophies. Prentice-Hall Philosophy Series. Englewood Cliffs, N.J.: Prentice-Hall.
  • Ram-Prasad, C. 2013. Divine Self, Human Self: The Philosophy of Being in Two Gita Commentaries. London: Bloomsbury.
  • Rāmānuja. 1991. Śrī Rāmānuja Gītā Bhāṣya (edition and translation). Translated by Svami Adidevanada. Madras: Sri Ramakrishna Math.
  • Ranganathan, Shyam. 2016a. “Hindu Philosophy.” In Oxford Bibliographies Online, edited by Alf Hiltebeitel. http://www.oxfordbibliographies.com/.
  • Ranganathan, Shyam. 2016b. “Pūrva Mīmāṃsā: Non-Natural, Moral Realism (4.14).” In Ethics 1, edited by S. Ranganathan. Philosophy, edited by A. Raghuramaraju.
  • Ranganathan, Shyam (Ed.). 2017a. The Bloomsbury Research Handbook of Indian Ethics. Bloomsbury Research Handbooks in Asian Philosophy. London: Bloomsbury Academic.
  • Ranganathan, Shyam. 2017b. “Patañjali’s Yoga: Universal Ethics as the Formal Cause of Autonomy.” In The Bloomsbury Research Handbook of Indian Ethics, edited by Shyam Ranganathan, 177-202. London: Bloomsbury Academic.
  • Ranganathan, Shyam. 2017c. “Three Vedāntas: Three Accounts of Character, Freedom and Responsibility.” In The Bloomsbury Research Handbook of Indian Ethics, edited by Shyam Ranganathan, 249-274. London: Bloomsbury Academic.
  • Ranganathan, Shyam. 2018a. “Context and Pragmatics.” In The Routledge Handbook of Translation and Philosophy, edited by Philip Wilson and J. Piers Rawling, 195-208. Routledge Handbooks in Translation and Interpreting Studies. New York: Routledge.
  • Ranganathan, Shyam. 2018b. Hinduism: A Contemporary Philosophical Investigation. In Investigating Philosophy of Religion. New York: Routledge.
  • Ranganathan, Shyam. 2018c. “Vedas and Upaniṣads.” In The History of Evil in Antiquity 2000 B.C.E. – 450 C.E., edited by Tom Angier, 239-255 Of History of Evil, edited by C. Taliaferro and C. Meister. London: Routledge.
  • Ranganathan, Shyam. 2019. “Just War and the Indian Tradition: Arguments from the Battlefield.” In Comparative Just War Theory: An Introduction to International Perspectives edited by Luis Cordeiro-Rodrigues and Danny Singh, 173-190. Lanham, MD: Rowman & Littlefield.
  • Ranganathan, Shyam. 2021. “Modes of Interpretation.” In Encyclopedia of Religious Ethics, edited by William Schweiker, David A. Clairmont and Elizabeth Bucar. Hoboken NJ: Wiley Blackwell.
  • Śaṅkara (Ādi). 1991. Bhagavadgita with the commentary of Sankaracarya. Translated by Swami Gambhirananda. Calcutta: Advaita Ashrama.
  • Śaṅkara (Ādi). 1994. The Vedānta Sūtras with the Commentary by Śaṅkara (Brahma Sūtra Bhāṣya). Translated by George Thibaut. In Sacred books of the East 34 and 38.2 vols.: http://www.sacred-texts.com/hin/sbe34/sbe34007.htm.
  • Sayre-McCord, Geoff. Spring 2015 Edition. “Moral Realism.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta: http://plato.stanford.edu/archives/spr2015/entries/moral-realism/.
  • Shafer-Landau, Russ. 2003. Moral Realism: a Defence. Oxford: Clarendon.
  • Soni, Jayandra. 2017. “Jaina Virtue Ethics: Action and Non-Action” In The Bloomsbury Research Handbook of Indian Ethics edited by Shyam Ranganathan, 155-176. London: Bloomsbury Academic.
  • Sreekumar, Sandeep. “An Analysis of Consequentialism and Deontology in the Normative Ethics of the Bhagavadgītā.” Journal of Indian Philosophy 40, no. 3 (2012): 277-315.
  • Theodor, Ithamar. 2010. Exploring the Bhagavad Gitā: Philosophy, Structure, and Meaning. Farnham: Ashgate.
  • Urmson, J.O. 1958. “Saints and Heroes.” In Essays in Moral Philosophy, edited by A. I. Melden, 198-216. Seattle,: University of Washington Press.
  • Yāmunācārya. 1991. “Gītārtha-saṅgraha.” In Śrī Rāmānuja Gītā Bhāṣya (edition and translation), translated by Svami Adidevanada, 1-8. Madras: Sri Ramakrishna Math.

 

Author Information

Shyam Ranganathan
Email: shyamr@yorku.ca
York University
Canada

Humility Regarding Intrinsic Properties

The Humility Thesis is a persistent thesis in contemporary metaphysics. It is known by a variety of names, including, but not limited to, Humility, Intrinsic Humility, Kantian Humility, Kantian Physicalism, Intrinsic Ignorance, Categorical Ignorance, Irremediable Ignorance, and Noumenalism. According to the thesis, we human beings, and any knowers that share our general ways of knowing, are irremediably ignorant of a certain class of properties that are intrinsic to material entities. It is important to note that the term ‘humility’ is unrelated to humility as a moral virtue; rather, it refers to the Humility theorist’s concession that our epistemic capabilities have certain limits, and that certain fundamental features of the world therefore lie beyond them. According to many Humility theorists, our knowledge of the world does not extend beyond the causal, dispositional, and structural properties of things. However, things have an underlying nature that lies beyond these knowable properties: properties that are intrinsic to the nature of things, and which ground both their existence and the causal-cum-structural features that we can know about. If any such properties exist, they do not fall within any class of knowable properties, and so it follows that human beings are unable to acquire any knowledge of them.

There are at least six questions regarding the Humility Thesis: (a) What exactly is the relevant class of properties? (b) Do such properties really exist? (c) Why are we incapable of knowing of them? (d) Is it true that we are incapable of knowing of them? (e) Even if we are incapable of knowing them, is this claim true only about our general ways of knowing things, with certain idiosyncratic exceptions? (f) How can this thesis be applied to other areas of philosophy and the history of ideas? This article explores some responses to these questions.

Table of Contents

  1. The Properties Concerned
    1. Terminological Variety
    2. The Existence and Characteristics of the Properties Concerned: An Elementary Introduction
        1. The Relational Argument
        2. The Argument from Abstractness
        3. The Modal Argument
    3. Extending the Humility Thesis
  2. The Scope of the Missing Knowledge
  3. A Brief History of the Humility Thesis
    1. Religious and Philosophical Mysticism
    2. Hume
    3. Kant
    4. Russell
    5. Armstrong
    6. Contemporary Metaphysics
  4. Arguments for Humility
    1. The Receptivity Argument
    2. The Argument from Our Semantic Structure
      1. Global Response-Dependence
      2. Ramseyan Humility
    3. The Multiple Realisability Argument
  5. Arguments against Humility
    1. The Objection from Reference-Fixing
    2. The Objection from Vacantness
    3. The Objection from Overkill
  6. Alternative Metaphysical Frameworks
    1. Ontological Minimalism: Appealing to Phenomenalism, Dynamism, or Dispositionalism
    2. Physics and Scientific Eliminativism
    3. Rationalist Categoricalism
  7. The Humility Thesis and Russellian Monism
    1. Constitution: from Intrinsic Properties to Qualia
    2. Introspection: from Qualia to Intrinsic Properties
  8. The Humility Thesis and Physicalism
  9. Conclusion
  10. References and Further Reading

1. The Properties Concerned

To begin with, the question of immediate concern in regard to the Humility Thesis is the nature of the relevant class(es) of properties. Any subsequent discussion is impossible without a proper characterisation of the subject matter. Furthermore, in order to understand the nature of these properties, a rough idea of why some philosophers believe in their existence is also required, for this helps us to understand the role these properties play in the ontological frameworks of those who believe in their existence.

a. Terminological Variety

There is great terminological variety in the literature on Humility. Different authors discuss different properties: intrinsic properties (Langton 1998), categorical properties (Blackburn 1990; Smith & Stoljar 1998), fundamental properties (Lewis 2009; Jackson 1998), and so on. Very roughly, for our current purposes, the three mentioned kinds of properties can be understood as follows:

Intrinsic properties: Properties that objects have of themselves, independently of their relations with other things (for example, my having a brain).

Categorical properties: Properties that are qualitative, and not causal or dispositional; that is, not properties merely concerning how things causally behave or are disposed to behave (for example, shape and size).

Fundamental properties: Properties that are perfectly basic in the sense of not being constituted by anything else. (Contemporary physics tells us that mass, charge, spin, and the like are so far the most fundamental properties we know of, but it is an open question as to whether current physics has reached the most fundamental level of reality and whether it could ever reach it.)

Some authors also use the term ‘quiddities’ (Schaffer 2005; Chalmers 2012), which is taken from scholastic philosophy. The term historically stood for properties that made the object ‘what it is’, and was often used interchangeably with ‘essence’. In the contemporary literature on the Humility Thesis, it roughly means:

Quiddities: Some properties—typically intrinsic properties, categorical properties, or fundamental properties—that individuate the objects that hold them, and which ground the causal, dispositional, and structural properties of those objects.

At first glance, looking at the above characterisations of properties, the claim that the Humility Thesis concerns them may seem confusing to some non-specialists. For the list above gave examples of intrinsic properties and of categorical properties, and clearly we have knowledge of these examples. Furthermore, it may seem possible that properties like mass, charge, and spin are indeed fundamental as current physics understands them, and it at least seems conceivable that physics may uncover more fundamental levels of reality in the future and thus eventually reach the most fundamental level. A Humility theorist will answer that we are not irremediably ignorant of all conceivable intrinsic, categorical, or fundamental properties but only of a particular class of them. For example, Langton distinguishes comparatively intrinsic properties from absolutely intrinsic properties: comparatively intrinsic properties are constituted by causal, dispositional, or structural properties, whereas absolutely intrinsic properties are not so constituted. Her thesis concerns absolutely intrinsic properties, not comparatively intrinsic properties (Langton 1998, pp. 60-62). When Lewis discusses our ignorance of fundamental properties, he explicitly states that in his own view fundamental properties are intrinsic and not structural or dispositional (Lewis 2009, pp. 204, 220-221n13) (though he also thinks that Humility spreads to structural properties—see Section 1c for discussion).

With this in mind, despite the terminological variety, one possible way to understand the literature is that the main contemporary authors are in fact concerned with roughly the same set of properties (Chan 2017, pp. 81-86). That is, what these authors describe is often the same set of properties under different descriptions. More specifically, when authors discuss our ignorance of properties which they may describe as intrinsic, categorical, or fundamental, the relevant properties are most often properties that belong to all three families, not only some of them—even though our ignorance of these properties may spread to non-members of the families when they constitute them (see Section 1c for further discussion). Henceforth, for the sake of simplicity, this article will only use the term ‘intrinsic properties’, unless the discussion is about an author who is discussing some other kind of property. But the reader should be aware that the intrinsic properties concerned are of a specific narrower class.

b. The Existence and Characteristics of the Properties Concerned: An Elementary Introduction

A further question is whether and why we should believe in the existence of the relevant intrinsic properties. Answering this question also allows us to understand the role that these properties have in the ontological frameworks of those who believe in their existence. This question concerns the debate between categoricalism (or categorialism) and dispositionalism in the metaphysics of properties. The categorialist believes that all fundamental properties are categorical properties, which are equivalent to the kind of intrinsic properties discussed in this article (see Section 1a). By contrast, the dispositionalist believes that all fundamental properties are dispositional properties, without there being any categorical properties that are more fundamental. This section surveys some common, elementary motivations for categoricalism.

Importantly, the reader should be aware that the full scope of this debate cannot be encompassed within this article; the survey below is therefore elementary, covering only three common and simple arguments. There are some further, often more technically sophisticated arguments for and against the existence of categorical properties, some of which are influential in the literature (see the article on ‘properties’).

The three surveyed arguments are interrelated. Many philosophers believe that the most fundamental physical properties discovered by science, such as mass, charge, and spin, are dispositional properties: the measure of an object’s mass is a measure of how the object is disposed to behave in certain ways (such as those observed in experiments). The three arguments are, then, all attempts to show that dispositional properties lack self-sufficiency and cannot exist in their own right, and thereby require some further ontological grounds, which the categorialist takes to be categorical properties. Note, though, that there are also some categorialists who do not posit categorical properties as something additional to and distinct from causal, dispositional, and structural properties. Rather, they take the latter properties to be property roles which have to be filled in by realiser properties, in this case categorical properties (Lewis 2009).

i. The Relational Argument

Causal and dispositional properties appear to be relational. Specifically, when we say that an object possesses certain causal and dispositional properties, we are either describing (1) the way that the object responds to and interacts with other objects, or (2) the way that the object transforms into its future counterparts. Both (1) and (2) are relational because they concern the relation between the object and other objects or its future counterparts. The problem is whether an object can merely possess such relational properties. Some philosophers do not think so. For them, such objects would be a mere collection of relations, with nothing standing in the relevant relations. This means that there are brute relations without relata; and this seems implausible to them (Armstrong 1968, p. 282; Jackson 1998, p. 24; compare Lewis 1986, p. x). Hence, they argue, objects involved in relations must have some nature of their own, independent of their relations to other objects, if they, or their natures, are to serve as adequate relata. The candidate nature that many philosophers have in mind for what could exist independently is categorical properties. It is important to note, though, that some dispositionalists believe that dispositions could be intrinsic and non-relational, and thus reject this argument (Borghini & Williams 2008; Ellis 2014). There are also philosophers who accept the existence of brute relations (Ladyman & Ross 2007).

ii. The Argument from Abstractness

The causal and dispositional properties we find in science are often considered geometrical and mathematical, and hence overly abstract. On the one hand, Enlightenment physics is arguably all about the measure of extension and motion of physical objects: extension is ultimately about the geometry of an object’s space-occupation, and motion is ultimately the change in an object’s location in space. On the other hand, contemporary physics is (almost) exhausted by mathematical variables and equations which reflect the magnitudes of measurements. These geometrical properties and mathematical properties have seemed too abstract to many philosophers. For these philosophers, the physical universe should be something more concrete: there should be something more qualitative and robust that can, to use Blackburn’s phrase, ‘fill in’ the space and the equations (Blackburn 1990, p. 62). If this were not the case, these philosophers argue, there would be nothing to distinguish the relevant geometrical and quantitative properties from empty space or empty variables that lack actual content (for examples of the empty space argument, see Armstrong 1961, pp. 185-187; Blackburn 1990, pp. 62-63; Langton 1998, pp. 165-166; for examples of the empty variable argument, see Eddington 1929, pp. 250-259; Montero 2015, p. 217; Chalmers 1996, pp. 302-304). In this case too, the candidate that these philosophers have in mind to ‘fill in’ the space and the equations is categorical properties. It is important to note, though, that some structuralist philosophers and scientists believe that the world is fundamentally a mathematical structure, and would presumably find this argument unappealing (Heisenberg 1958/2000; Tegmark 2007; cf. Ladyman & Ross 2007).

iii. The Modal Argument

Causal and dispositional properties appear to be grounded in counterfactual affairs. Specifically, it appears that objects could robustly possess their causal and dispositional properties, even when those properties do not manifest themselves in virtue of the relevant behaviours. Consider the mass of a physical object. We may regard it as a dispositional property whose manifestations are inertia and gravitational attraction. Intuitively speaking, it seems that even when a physical object exhibits no behaviours related to inertia and gravitational attraction, it could nonetheless possess its mass. The question that arises is the following: what is the nature of such a non-manifest mass? One natural response is that its existence is grounded in the following counterfactual: in some near possible worlds where the manifest conditions of the dispositional property are met, and in which the manifest behaviours are found, the dispositional property manifests itself. But some philosophers find it very awkward and unsatisfactory that something actual is grounded in non-actual, otherworld affairs (see, for example, Blackburn 1990, pp. 64-65; Armstrong 1997, p. 79). A more satisfactory response, for some such philosophers, is that dispositional properties are grounded in some further properties which are robustly located in the actual world. Again, the candidate that many philosophers have in mind is categorical properties (but see Holton 1999; Handfield 2005; Borghini & Williams 2008).

c. Extending the Humility Thesis

Before continuing, it is worth noting that our irremediable ignorance of the above narrow class of intrinsic properties may entail our irremediable ignorance of some further properties. Lewis, for example, holds a view that he calls ‘Spreading Humility’ (2009, p. 214). He argues that almost all structural properties supervene on fundamental properties, and since we cannot know the supervenience bases of these structural properties, we cannot have perfect knowledge of them either. That is, most properties that we are ignorant of are not fundamental properties. Lewis concludes that we are irremediably ignorant of all qualitative properties, regardless of whether they are fundamental or structural, at least under ‘a more demanding sense’ of knowledge (Lewis 2009, p. 214) (for further discussion, see Section 2). Of course, the Spreading Humility view requires a point of departure before the ‘spreading’ takes place. In other words, the basic Humility Thesis, which concerns a narrower class of properties, must first be established before one can argue that any ‘spreading’ is possible.

2. The Scope of the Missing Knowledge

Throughout the history of philosophy, it has never been easy to posit irremediable ignorance of something. For the relevant theorists seem to know both that such things exist and how they relate to the known world, as in the case of unknowable intrinsic properties (see Section 1). Specifically, to say that there is a part of the world of which we are ignorant, we at least have to say that the relevant things exist. Furthermore, we only say that the relevant things exist because they bear certain relations to the known parts of the world, and thus help to explain the nature of the latter. But this knowledge appears inconsistent with the alleged irremediable ignorance of such things—this problem traces back to Friedrich Heinrich Jacobi’s famous objection to Kant’s idea of unknowable things in themselves (for the latter, see Section 3c) (Jacobi 1787/2000, pp. 173-175; see also Strawson 1966, pp. 41-42). What adds to this complexity is that some Humility theorists go on to debate the metaphysical nature of the unknowable intrinsic properties, such as whether they are physical or fundamentally mental (see Sections 7a and 8). In order to avoid the above inconsistency, the Humility Thesis should be carefully framed. That is, the scope of our ignorance of intrinsic properties should be made precise.

There is at least one influential idea among contemporary Humility theorists: that the Humility Thesis concerns knowledge-which, to use Tom McClelland’s (2012) term (Pettit 1998; Lewis 2009; Locke 2009; McClelland 2012). More precisely, under Humility we are unable to identify a particular intrinsic property: when facing, say, a basic dispositional property D, we would not be able to tell which precise intrinsic property grounds it. For we are unable to distinguish the multiple possibilities in which different intrinsic properties do the grounding, and to tell which possibility is actual. For example, if there are two possible intrinsic properties I1 and I2 that could do the job, we would not be able to tell any difference and thereby identify the one that actually grounds D. This idea is based upon the multiple realisability argument for Humility, which is discussed in detail in Section 4c. By contrast, the sort of intrinsic knowledge discussed in the previous paragraph concerns only the characterisation, not the identification, of intrinsic properties, and is thus not the target of the Humility Thesis under this understanding.

Nonetheless, while the knowledge-which understanding of Humility may offer the required precision, it is by no means conclusive. Firstly, it leads to some objections to Humility which seek to show that the relevant knowledge-which, and so the knowledge-which understanding, are trivial (see Sections 5a and 5b). Secondly, many Humility theorists believe that intrinsic properties possess some unknowable qualitative natures apart from their very exact identities (Russell 1927a/1992; Heil 2004). It remains unclear whether the knowledge-which understanding can fully capture the kind of Humility they have in mind (for further discussion, see Section 5b). Note that such unknowable qualitative natures are especially important to those Humility theorists who want to argue that certain intrinsic properties constitute human consciousness (see Sections 3d and 7) or other mysterious things (see Section 3a). Thirdly, the Humility theorists Rae Langton and Christopher Robichaud (2010, pp. 175-176) hold an even more ambitious version of the Humility Thesis. They argue that we cannot even know the metaphysical nature of intrinsic properties, such as whether or not they are fundamentally mental. Thus, they dismiss the knowledge-which understanding as too restricted and conservative (for further discussion, see Section 8).

In sum, the scope of Humility remains controversial, even among its advocates, and has led to certain criticisms. In the following discussion, a number of problems surrounding the scope of Humility are explored.

3. A Brief History of the Humility Thesis

Like many philosophical ideas, the Humility Thesis has been independently developed by many thinkers from different traditions over the course of history. This section briefly explores some representative snapshots of its history.

a. Religious and Philosophical Mysticism

Ever since ancient times, the Humility Thesis and similar theories have played an important role in religious and mystical thought. However, most of the relevant thinkers did not fully embrace the kind of ignorance described by the Humility Thesis: they believed that such an epistemic limit is only found in our ordinary perceptual and scientific knowledge, but that it can be overcome by certain meditative or mystical knowledge.

A certain form of Hindu mysticism is paradigmatic of this line of thought. According to the view, there is an ultimate reality of the universe, which is called the Brahman. The Brahman has been understood in a variety of ways within Hinduism, but a common line of interpretation, found for example in the Upanishads, takes it to be the single immutable ground and the highest principle of all beings. The Brahman is out of reach of our ordinary sensory knowledge. However, since we, like all other beings in the universe, are ultimately grounded in and identical to the Brahman, certain extraordinary meditative experiences—specifically the kind in which we introspect the inner, fundamental nature of our own self—allow us to grasp it (Flood 1996, pp. 84-85; Mahony 1998, pp. 114-121).

Arguably, the Brahman may be somewhat analogous to the unknowable intrinsic properties described by the Humility Thesis, for both are candidates for a fundamental, non-dynamic nature of things that is out of reach of our ordinary knowledge. Moreover, as we shall see, the idea that we can know our own intrinsic properties via introspection of our own consciousness has been independently developed by many philosophers, including a number of those working in the analytic tradition (see Sections 3d and 7). Of course, despite the possible similarities between the Brahman and intrinsic properties, their important differences should not be ignored: the former is unique and singular, and is also described by Hindu mystics as the cause of everything, rather than being non-causal. Furthermore, as mentioned above, Hindu understandings of the Brahman are diverse, and the aforementioned understanding is only one of them (see Deutsch 1969, pp. 27-45; see also the article on ‘Advaita Vedanta’).

There are certain Western theologies and philosophical mysticisms that resemble the above line of Hindu thought, such as those concerning the divine inner nature of the universe (for example, Schleiermacher 1799/1988) and the Schopenhauerian Will (Schopenhauer 1818/1966). More precisely, according to these views, the ultimate nature of the universe, whatever it is, is also out of reach of our ordinary knowledge, but it can be known via some sort of introspection. Of course, the ultimate nature concerned may or may not be intrinsic, non-relational, non-causal, non-dynamic, and so on; this often depends on one’s interpretation. Nonetheless, some similarities with the Humility Thesis seem to remain.

b. Hume

The 18th-century Scottish philosopher David Hume is a notable advocate of the Humility Thesis in the Enlightenment period. Hume is not a Humility theorist per se, because he is sceptical of the existence of external objects—namely, objects that are mind-independent—let alone their intrinsic properties (T 1.4.2). Even so, he takes the Humility Thesis to be a necessary consequence of the early modern materialist theory of matter, which he therefore rejects due to the emptiness of the resultant ontological framework (T 1.4.4).

Hume’s argument is roughly as follows. Early modern materialism takes properties like sounds, colours, heat, and cold to be subjective qualities which should be attributed to the human subject’s sensations, rather than to the external material objects themselves. This leaves material objects with only two kinds of basic properties: extension and solidity. Other measurable properties like motion, gravity, and cohesion, for Hume, are only about changes in the two kinds of basic properties. However, extension and solidity cannot be ‘possest of a real, continu’d, and independent existence’ (T 1.4.4.6). This is because extension requires simple and indivisible space-occupiers, but the theory of early modern materialism offers no such things (T 1.4.4.8). Solidity ultimately concerns relations between multiple objects rather than the intrinsic nature of a single object: it is about how an object is impenetrable by another object (T 1.4.4.9). Hume concludes that under early modern materialism we are in fact unable to form a robust idea of material objects.

c. Kant

Like Hume, the 18th-century German philosopher Immanuel Kant is another notable advocate of the Humility Thesis in the Enlightenment period. He makes the famous distinction between phenomena and things-in-themselves in his transcendental idealism. The idea of transcendental idealism is, very roughly, that all laws of nature, including metaphysical laws, physical laws, and special science laws, are in fact cognitive laws that rational human agents are necessarily subject to. Since things-in-themselves, which are the mind-independent side of things, must not be attributed any such subjective cognitive features, their nature must be unknowable to us (CPR A246/B303). We can only know of things as they appear to us subjectively as phenomena, under our cognitive laws such as space, time, and causality (CPR A42/B59).

It is important to note that Kant intends transcendental idealism to be a response to some philosophical problems put forward by his contemporaries, and that these philosophical problems are often not the concerns of contemporary Humility theorists. Examples include the subject-object problem and the mind-independent external reality problem put forward by Hume and Berkeley (CPR B274). Furthermore, it is also worth noting that Kant’s views have a variety of interpretations, for interpreting his views is never an easy task—his transcendental idealism is no exception (see the article on ‘Kant’). However, if the nature of things-in-themselves, being free from extrinsic relations to us the perceivers and other extrinsic relations we attribute to them (for example, spatiotemporal relations with other things), can be considered as the intrinsic properties of things, then transcendental idealism entails the Humility Thesis. In addition, no matter what the correct interpretation of Kant really is, Kant as he is commonly understood plays a significant and representative role in the history of the Humility Thesis from his time until now. Unlike Hume, who takes the Humility Thesis to be a reason for doubting the metaphysical theories that imply it, Kant takes the Humility Thesis to be true of the world—even though his German idealist successors like Fichte and Hegel tend to reject this latter part of his philosophy.

Finally, it is noteworthy that one of the most important texts in the contemporary literature on the Humility Thesis, Langton’s book Kantian Humility: Our Ignorance of Things in Themselves (1998), is an interpretation of Kant. In the book, Langton develops and defends the view that Kant’s Humility Thesis could be understood in terms of a more conventional metaphysics of properties, independently of Kant’s transcendental idealism. Specifically, she argues that Kantian ignorance of things-in-themselves should be understood as ignorance of intrinsic properties. The book and the arguments within are discussed in Sections 3f and 4a.

d. Russell

The pioneer of the Humility Thesis in analytic philosophy is one of the founding fathers of the tradition, Bertrand Russell. Historical studies of Russell’s philosophy show that Russell kept revising his views, and hence, like many of his ideas, his Humility Thesis only reflects his views during a certain period of his very long life (Tully 2003; Wishon 2015). Russell’s version of the Humility Thesis is found in and popularised by his book The Analysis of Matter (1927). Like the Hindu mystics mentioned above, Russell is best described as a partial Humility theorist, for he also believes that some of those intrinsic properties which are unknowable by scientific means constitute our phenomenal experiences, and can thereby be known through introspecting such experiences.

Russell proposes a theory of the philosophy of mind which he calls psycho-cerebral parallelism. According to the theory, (1) physical properties are ‘causally dominant’, and (2) mental experiences are a part of the physical world and are ‘determined by the physical character of their stimuli’ (Russell 1927a/1992, p. 391). Despite this, our physical science has its limits. Its aim is only ‘to discover what we may call the causal skeleton of the world’ (p. 391, emphasis added); it cannot tell us the intrinsic character of matter. Nevertheless, some such intrinsic character can be known in our mental experiences because those experiences are one such character. As Russell remarks in a work published in the same year as The Analysis of Matter, ‘we now realise that we know nothing of the intrinsic quality of physical phenomena except when they happen to be sensations’ (1927b, p. 154, emphasis added).

Russell’s view that scientifically unknowable intrinsic properties constitute what we now describe as qualia is an influential solution to the hard problem of consciousness in the philosophy of mind, known today as ‘Russellian monism’. Before the mid-1990s, this view had already attracted some followers (see, for example, Maxwell 1978; Lockwood 1989, 1992) and sympathisers (see, for example, Feigl 1967), but it was often overshadowed by the dominant physicalist theories of mind (like the identity theory and functionalism). This situation ended with the publication of Chalmers’s book The Conscious Mind (1996), which has effectively promoted Russellian monism to a more general audience. Further discussion of contemporary Russellian monism is in Section 7.

e. Armstrong

Among the next generation of analytic philosophers after Russell, members of the Australian materialist school developed an interest in the problem of Humility as they inquired into the nature of material entities (Armstrong 1961, 1968; Smart 1963; Mackie 1973); and among them, David Armstrong is a representative advocate of the Humility Thesis (Armstrong 1961, 1968). Armstrong begins with the assumption that physical objects are different from empty space, and then investigates what sort of intrinsic properties of physical objects make the difference between them and empty space (1961, p. 185). He then makes use of an argument which, by his own acknowledgement, largely resembles Hume’s (Armstrong 1968, p. 282; see the argument in Section 3b) to conclude that no properties posited in the physicist’s theory can make the difference between physical objects and empty space. Unlike Hume, however, who is sceptical of the existence of physical objects, Armstrong is no sceptic: he believes that what makes the difference must be some properties additional to the physicist’s list of properties. It follows that these properties lie outside the scope of current physics, and thus that we have no knowledge of them.

It is important to note, though, that Armstrong accepts the Humility Thesis rather reluctantly. The thesis follows from his theory of the nature of physical objects, and he sees this as a difficulty facing that theory, one to which he admits he has no solution (1961, p. 190, 1968, p. 283). Hence, despite his belief that intrinsic properties are currently unknown, Armstrong does not go as far as accepting the now popular full-blown version of the Humility Thesis, according to which intrinsic properties are in principle unknowable (1961, pp. 189-190).

f. Contemporary Metaphysics

Here is a sketch of how the debate has panned out in the more recent literature. For a few decades, the Humility Thesis was often an epistemic complaint made by dispositionalists against categoricalism, such as the version of the view offered by Lewis (1986). For these philosophers, who take it that all fundamental properties are dispositional, the idea that there are in addition fundamental intrinsic properties implies that we are irremediably ignorant of those properties. They argue that we should not posit the existence of things we cannot ever know about. Therefore, we should not posit the existence of intrinsic properties (see, for example, Shoemaker 1980, pp. 116-117; Swoyer 1982, pp. 204-205; Ellis & Lierse 1994).

Since the 1990s, there has been a trend among categorialists to respond positively to the problem of Humility: it has become their mainstream view that while the existence of intrinsic properties is necessary for the existence of matter, we cannot ever know about them. Blackburn’s short article (1990) pioneered this trend; it inspired Langton’s book Kantian Humility: Our Ignorance of Things in Themselves (1998), from which the term ‘Humility’ originated (Langton acknowledges this in her 2015, p. 106). While the book is meant to be an interpretation of Kant, Langton defends the view that Kant’s Humility Thesis can be understood independently of, and is perhaps even incompatible with, his transcendental idealism (Langton 1998, p. 143n7, 2004, p. 129). In addition, Langton argues that the thesis is highly relevant to contemporary analytic metaphysics. While her interpretation of Kant is controversial and is often called ‘Langton’s Kant’, it is often treated as an independent thesis, and has attracted many sympathisers and engendered many discussions in the metaphysics of properties. Examples include discussions of Jackson’s (1998) ‘Kantian physicalism’, Lewis’s (2009) ‘Ramseyan Humility’, and Philip Pettit’s (1998) ‘noumenalism’. As Lewis remarks, ‘my interest is not in whether the thesis of Humility, as she conceives it, is Kantian, but rather in whether it is true’ (Lewis 2009, p. 203), and he thinks that it is true.

4. Arguments for Humility

Some historically significant arguments for Humility were surveyed above; this section offers an introduction to the most influential arguments for Humility in the contemporary literature. While the arguments are discussed in turn, it is important to note that they are often taken to be interrelated. Furthermore, some influential authors, as discussed below, combine several of these arguments and deny that the combined arguments would work if disassembled into separate arguments.

a. The Receptivity Argument

The receptivity argument is perhaps the most famous argument for Humility (see, for example, Russell 1912/1978; Langton 1998; Jackson 1998; Pettit 1998). Langton (1998) offers a particularly detailed formulation of it. The argument begins with the assumption that we know about things only through receptivity, in which the relevant things causally affect us (or our experimental instruments) and thus allow us to form adequate representations of them. For instance, Langton remarks that ‘human knowledge depends on sensibility, and sensibility is receptive: we can have knowledge of an object only in so far as it affects us’ (Langton 1998, p. 125). An upshot of this assumption is that we can have knowledge of whatever directly or indirectly affects us (p. 126). In light of this, since things affect us in virtue of their causal and dispositional properties, we can know of these properties.

However, the proponents of the receptivity argument continue, such a condition of knowledge would also impose an epistemic limitation on us: we will be unable to know of things that cannot possibly affect us. While things causally affect us in virtue of their causal and dispositional properties, as long as their intrinsic properties are another class of properties, there is a question as to whether we can know them. To answer this question, we must determine the nature of the relationship between things’ causal and dispositional properties and their intrinsic properties, and whether such a relationship allows for knowledge of intrinsic properties in virtue of the relevant causal and dispositional properties. If this is not the case, we need to determine whether this leads to an insurmountable limit on our knowledge. Jackson, for example, believes that the receptivity argument in the above form is incomplete. He argues that we may have knowledge of intrinsic properties—or, in his work, fundamental properties—via the causal and dispositional properties they bear, and that the receptivity argument in the above form can be completed by supplementing it with the multiple realisability argument (Jackson 1998, p. 23). This is discussed in detail in Section 4c.

For Langton, knowledge of intrinsic properties is impossible because causal and dispositional properties are irreducible to intrinsic properties in the sense that any of the former does not supervene on any of the latter (Langton 1998, p. 109). Nonetheless, the irreducibility thesis does not spell an end to this discussion. On the one hand, Langton elsewhere points out that the receptivity argument still works if there are instead necessary connections between the relevant properties—as long as they remain different properties (Langton & Robichaud 2010, p. 173). On the other hand, James Van Cleve worries that Langton’s argument from irreducibility is nevertheless incomplete, for a non-reductive relationship alone does not imply the impossibility of intrinsic knowledge (Van Cleve 2002, pp. 225-226). In sum, regardless of whether Langton’s irreducibility thesis is correct, there are some further questions as to whether or not we are receptive to intrinsic properties.

b. The Argument from Our Semantic Structure

The second argument for Humility appeals to the ways in which the terms and concepts in our language are structured (see, for example, Blackburn 1990; Pettit 1998; Lewis 2009). Depending on particular formulations of the argument, the language concerned could be the language of our scientific theories or all human languages. Nonetheless, all versions of this argument share the common argumentative strategy according to which all terms and/or concepts found in the relevant language(s) capture only causal properties, dispositional properties, and structural properties but not intrinsic properties. The idea is that if our knowledge of the world is formulated by the language(s) concerned, then there will be no knowledge of intrinsic properties.

i. Global Response-Dependence

One version of this argument is developed by Pettit (1998). Note that his commitment to a Humility Thesis under the name of ‘noumenalism’ is also a reply to Michael Smith and Daniel Stoljar’s (1998) argument that his view implies noumenalism. In response to the argument, Pettit accepts noumenalism as an implication of his view (Pettit 1998, p. 130).

Pettit advocates a thesis called global response-dependence (GRD), which he considers to be an a priori truth about the nature of all terms and concepts in our language. According to GRD, all terms and concepts in our language either (1) are defined ostensively by the ways that their referents are disposed to causally impact on normal or ideal subjects in normal or ideal circumstances, or (2) are defined by other terms and concepts which eventually trace back to those of the former kind (Pettit 1998, pp. 113-114). If this is so, then it follows that all terms and concepts are in effect defined dispositionally with reference to their referents’ patterns of causal behaviour. If there are any non-dispositional properties that ground the dispositions, then, as Pettit remarks, ‘we do not know them in their essence; we do not know which properties they are’ (pp. 121-122).

Of course, there is a question as to whether GRD is an attractive thesis. It is controversial, and its validity is an independent open question that goes beyond the scope of this article. In Pettit’s case, he commits himself to an epistemology (Pettit 1998, p. 113) that is very similar to Langton’s receptivity thesis discussed in Section 4a.

ii. Ramseyan Humility

The most famous version of the argument from our semantic structure is developed by Lewis (2009), even though Blackburn offers an earlier rough sketch of the argument which appeals to the Lewisian semantic theory (Blackburn 1990, p. 63), and Pettit anticipates that such a theory would imply the Humility Thesis just as his GRD does (Pettit 1998, p. 128). The argument is based on the Ramsey-Lewis method of defining theoretical terms in scientific theories, which Lewis develops in his early article ‘How to define theoretical terms’ (1970), and which is in turn inspired by Frank Ramsey—this is why Lewis calls his version of the Humility Thesis ‘Ramseyan Humility’.

Lewis is a scientific realist. He asks us to suppose that there is a final scientific theory T about the natural world. In his view, theory T, like all other scientific theories, consists of O-terms and T-terms. O-terms are the terms that are used in our older and ordinary language, which is outside theory T; T-terms are theoretical terms that are specifically defined in theory T. Each T-term has to be defined holistically in relation to other T-terms by O-terms. The relevant relations include nomological and locational roles in theory T (Lewis 2009, p. 207). Some such nomological and locational roles named by T-terms would be played by fundamental properties, while Lewis assumes that none of these properties will be named by O-terms. He writes, ‘The fundamental properties mentioned in T will be named by T-terms. I assume that no fundamental properties are named in O-language, except as occupants of roles’ (p. 206). Although Lewis in his 2009 article does not make it clear why he assumes so, in his other work (1972) he argues that the use of O-terms is to name and define nomological and locational relations.
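
The Ramsey–Lewis method just described can be sketched schematically (an illustrative reconstruction, not Lewis’s own notation). Write the postulate of theory T with its T-terms made explicit; the Ramsey sentence of T then replaces each T-term with an existentially bound variable:

```latex
% Postulate of the final theory, with its T-terms \tau_1, ..., \tau_n
% made explicit (all remaining vocabulary is O-language):
%   T(\tau_1, \ldots, \tau_n)
% The Ramsey sentence says only that the roles are occupied:
\[ \exists x_1 \ldots \exists x_n \; T(x_1, \ldots, x_n) \]
% Each T-term is then defined as the occupant of its role, for example:
\[ \tau_1 \;=\; \text{the } x_1 \text{ such that }
   \exists x_2 \ldots \exists x_n \; T(x_1, x_2, \ldots, x_n) \]
```

On this picture, the theory asserts that its roles are occupied by something or other; it is silent on which fundamental properties occupy them.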

With the assumption that the roles played by intrinsic properties are identified solely by relational means, Lewis makes the following argument. While theory T is uniquely realised by a particular set of fundamental properties in the actual world, theory T is incapable of identifying such properties, namely individuating the exact fundamental properties that realise it. This is because, for theory T, fundamental properties are mere occupants of the T-term roles defined by O-terms (Lewis 2009, p. 215), which are, in turn, all about their nomological and locational roles. But then theory T is unable to tell exactly which fundamental property occupies a particular role (p. 215)—as Lewis remarks, ‘To be the ground of a disposition is to occupy a role, but it is one thing to know that a role is occupied, another thing to know what occupies it’ (p. 204). Lewis has much more to say about his argument in relation to the multiple realisability argument, which he takes to be another indispensable core part of his argument, and which is discussed in detail in Section 4c.

Before we go on to the multiple realisability argument, there is again the further question as to why we should accept the underlying semantic theory of the argument—in this case the Ramsey-Lewis model of scientific theories. Indeed, some critics of Lewis’s Ramseyan Humility target the conceptual or scientific plausibility of the semantic theory (Ladyman & Ross 2007; Leuenberger 2010). Rather than a defence of an independent thesis, Lewis’s 2009 article seems to be an attempt to develop the Ramseyan Humility Thesis as a consequence of his systematic philosophy, which he has been developing for decades. In any case, taking into account the influence of the Lewisian systematic philosophy in contemporary analytic philosophy, its entailment of the Humility Thesis is of considerable philosophical significance.

c. The Multiple Realisability Argument

The multiple realisability argument is a particularly popular argument for Humility, and is endorsed by a number of Humility theorists regardless of whether they also offer independent arguments for Humility (see, for example, Lewis 2009; Jackson 1998; Yates 2018; see also Russell 1927a/1992, p. 390; Maxwell 1978, p. 399; Pettit 1998, p. 117). The basic idea is that the causal, dispositional, and structural properties of things with which we are familiar are roles. We can at best know that such roles have some intrinsic properties as their realisers, but we have no idea which intrinsic properties actually do the realising job. For these roles can also be realised by some alternative possible sets of intrinsic properties, and we cannot distinguish the relevant possibilities from the actual ones.

As mentioned above, some authors such as Jackson and Lewis believe that their receptivity arguments or arguments from our semantic structure are themselves incomplete and have to be supplemented with the multiple realisability argument. For example, Jackson believes that our receptive knowledge is multiply realisable by different sets of fundamental properties (Jackson 1998; see also Section 4a); and Lewis believes that our final scientific theory is multiply realisable by different sets of fundamental properties (Lewis 2009; see also Section 4b). Multiple realisability is for them the reason why we cannot possibly know of intrinsic properties via our receptive knowledge or via the final scientific theory. Here we see that the multiple realisability argument is often considered as an indispensable component of more complex arguments.

Whereas certain formulations of the multiple realisability argument appeal to metaphysical possibilities (Lewis 2009), Jonathan Schaffer—a critic of the argument—argues that epistemic possibilities alone suffice to make the argument work, since its aim is to determine the nature of our knowledge (Schaffer 2005, p. 19). Hence, the argument cannot be blocked by positing a metaphysically necessary link between intrinsic properties and their roles that eliminates the metaphysical possibilities suggested by the proponents of the argument.

Lewis and Jackson offer detailed discussion of how some forms of multiple realisation are possible. Three corresponding versions of the multiple realisability argument are briefly surveyed in turn below.

The permutation argument is offered by Lewis (2009). It begins with the assumption that the laws of nature are contingent (p. 209). Lewis argues that a scenario in which the realisers of two actual dispositional roles are swapped would not change anything else, including the nomological roles they play and the locations they occupy. Hence, a permutation of realisers is another possible realisation of our scientific theory. Since our science cannot distinguish between the actual realisation of the nomological roles and its permutations, we do not know which properties realise these roles.
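
The point can be put schematically (again an illustrative reconstruction, not Lewis’s own notation). Suppose the final theory T is actually realised by fundamental properties F and G occupying its two roles; swapping the occupants yields another possible realisation, and both verify exactly the same Ramsey sentence:

```latex
% Actual realisation: F and G occupy the two roles.
%   T(F, G)
% Permuted realisation: the occupants are swapped.
%   T(G, F)
% Both entail the same Ramsey sentence:
\[ T(F, G) \;\Rightarrow\; \exists x_1 \exists x_2 \; T(x_1, x_2)
   \qquad
   T(G, F) \;\Rightarrow\; \exists x_1 \exists x_2 \; T(x_1, x_2) \]
```

Since all the evidence our science delivers is captured at the level of the Ramsey sentence, that evidence cannot discriminate between the actual realisation and its permutation.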

The replacement argument is also offered by Lewis (2009). Unlike the permutation argument, this argument does not appeal to an exchange of roles. Instead, it begins with the assumption that the realisers of dispositions are replaceable by what Lewis calls idlers and aliens. Idlers are among the fundamental properties within the actual world, but they play no nomological role; and aliens are fundamental properties that only exist in nonactual possible worlds (p. 205). Multiple realisability then follows. Again, Lewis argues that replacing the realisers of the actual nomological roles with idlers and aliens will not change anything else; what we have is simply other possible realisations of our scientific theory. And again, since our science cannot distinguish between the actual realisation of nomological roles and its replacements, we do not know which properties realise these roles in the actual world.

The succession argument is offered by Jackson (1998). The argument appeals to the possibility of there being two distinct fundamental properties realising the same nomological role in our science in succession (Jackson 1998, pp. 23-24). For Jackson, it is impossible for our science to distinguish whether or not this possibility is actualised—specifically, it is impossible for our science to tell whether the nomological role is actually realised by one property or by two in succession. This reveals that we do not know which property actually plays the nomological role.

5. Arguments against Humility

We have seen above some influential arguments for Humility in the literature. In what follows, the main arguments against the thesis will be surveyed.

a. The Objection from Reference-Fixing

An immediate objection considered by Pettit and Lewis in their defence of Humility is the objection from reference-fixing (Pettit 1998, p. 122; Lewis 2009, p. 216; but see Whittle 2006, pp. 470-472). The idea is that we can refer to an intrinsic property as the bearer of a dispositional property, and thereby identify it and know of it. For example, when asked what the bearer of dispositional property D is and whether we have knowledge of it, we may respond in the following way: ‘The bearer of D is whatever bears D; and we know that it bears D.’

Unsurprisingly, Pettit and Lewis are not convinced. Pettit responds, ‘under this picture it is no surprise that we are represented as knowing those very properties, not in their essence, but only their effects’ (Pettit 1998, p. 122). Lewis, on the other hand, dismisses the objection as ‘cheating’ (Lewis 2009, p. 216). Consider the answer concerning the bearer of dispositional property D above. On Lewis’s view, while that answer is undoubtedly true, we simply have no idea which particular proposition is expressed by the answer. Some of the relevant issues are discussed in Section 2.

Ann Whittle, an advocate of the objection from reference-fixing, argues that Humility theorists like Lewis set an unreasonably high bar for the condition of identification (Whittle 2006, pp. 470-472; but see Locke 2009, p. 228). For it seems that, in the case of our ordinary knowledge, we typically identify things in virtue of their effects and connections to us. For example, when we have knowledge of the historical figure Napoleon, we identify him via the great things he did and the spatiotemporal connections he has with us. By contrast, it is difficult for our knowledge to single out a particular person across possible worlds as Lewis’s condition requires, for someone else might have done the same things as Napoleon did. And if we allow for knowledge about Napoleon according to our ordinary conditions of identification, there seems to be no reason for not allowing for knowledge of intrinsic properties under the same consideration.

b. The Objection from Vacantness

The objection from vacantness is developed by Whittle (2006, pp. 473-477; but see Locke 2009, p. 228). This objection specifically targets Humility theorists like Lewis and Armstrong. According to Whittle, Lewis and Armstrong hold the background belief that fundamental intrinsic properties are so simple and basic that they are featureless in themselves, the sole exception being their bare identities. On this picture, the only interesting nature of these properties is their being bearers of causal, dispositional, or structural properties, and nothing else. If this is so, we are actually not going to miss out on anything even if we grant that the Humility Thesis is true. Lewis’s and Armstrong’s Humility theses, then, at best imply that we would be ignorant of the bare identities of intrinsic properties. Hence, ‘there is no reason to regard it as anything more than a rather esoteric, minimal epistemic limitation’ (p. 477).

While Whittle’s charge of esotericism is debatable, it is noteworthy that her interpretation of Lewis’s and Armstrong’s metaphysical frameworks is shared by some other philosophers (Chalmers 2012, p. 349; Stoljar 2014, p. 26)—Chalmers, for example, calls them a ‘thin quiddity picture’ (Chalmers 2012, p. 349). Nonetheless, it is also important to note that, as these philosophers point out, there are some alternative versions of the Humility Thesis which count as ‘thick quiddity pictures’, and according to which intrinsic properties have substantial qualities (for example, Russell 1927a/1992; Heil 2004).

c. The Objection from Overkill

The Humility Thesis is an attempt to draw a very specific limit to our knowledge: its aim is to show that knowledge of intrinsic properties is impossible, despite the fact that other knowledge remains possible. Specifically, if we can know of intrinsic properties, then the thesis fails; but by contrast, if the purported ignorance goes too far and applies equally to our ordinary knowledge, then the resultant scepticism would render the thesis trivial and implausible. For one thing, if we are ignorant of everything, then it would be very surprising that knowledge of intrinsic properties is an exception. For another, scepticism seems to be an unacceptable conclusion which should be avoided.

The objection from overkill, then, is that this specific boundary cannot be drawn: the claim is that there are no good arguments that favour the Humility Thesis but exclude scepticism of some other kind; a possible further claim is that there is no way to avoid this wider scepticism without rendering the Humility Thesis weak or erroneous (Van Cleve 2002; Schaffer 2005; Whittle 2006; Cowling 2010; cf. Langton 2004; but see Locke 2009). For example, Van Cleve argues that Langton’s argument from receptivity and irreducibility is too strong and must have something wrong with it. For if Hume is correct that causal laws are not necessary, then nothing necessitates the effects things have on us—namely, these effects are irreducible to the relevant things. But if we follow Langton’s argument, such irreducibility means that we know nothing (Van Cleve 2002, pp. 229-234). Schaffer argues that Lewis’s appeal to the distinction between appearance and reality, and to the multiple realisability of appearance, is shared by external world sceptics (Schaffer 2005, p. 20). In addition, Schaffer argues that the standard responses to external world scepticism such as abductionism, contextualism, deductionism, and direct realism apply equally to the Humility Thesis (pp. 21-23).

In response, a possible counter-strategy would be to argue that standard responses to scepticism do not apply to the Humility Thesis (Locke 2009). For example, Dustin Locke argues that when we do abductions, we identify the distinguishing features of competing hypotheses, and thereby pick out the best hypothesis among them. But the different intrinsic realisations of our knowledge considered by the multiple realisation argument exhibit no such distinguishing features (Locke 2009, pp. 232-233).

6. Alternative Metaphysical Frameworks

Rather than offering straightforward arguments against Humility, some critics of Humility instead develop alternative metaphysical frameworks to the Humility Thesis and the kind of categoricalism that underlies it (see Section 1b). These alternative frameworks, if true, undercut the possibility of there being unknowable intrinsic properties. In what follows, some such metaphysical frameworks are surveyed.

a. Ontological Minimalism: Appealing to Phenomenalism, Dynamism, or Dispositionalism

Philosophers have a very long tradition of avoiding ontological commitments to unobservables and unknowables, such as substance, substratum, Kantian things in themselves, the intrinsic nature of things, and the divine ground and mover of everything. Among these philosophers, phenomenalists and idealists have taken the perhaps most extreme measure: with a few exceptions, anything beyond immediate mental phenomena is eliminated (Berkeley 1710/1988; Hume 1739/1978; Mill 1865/1996; Clifford 1875/2011; Mach 1897/1984). Among such approaches to ontology, a phenomenalism that rejects matter is indeed Hume’s response to the Humility problem: he is altogether sceptical about the existence of matter, together with its unknowable intrinsic nature (Hume 1739/1978; see Section 3b). In other words, while Hume agrees that if matter exists then we are ignorant of its intrinsic nature, he does not believe there is such a thing in the world for us to be ignorant of.

Although many other philosophers regard phenomenalism and idealism as far too radical, the ontological minimalist attitude is nonetheless available to philosophers with a more realist and naturalist stance. The idea is that if the dynamics of things—their motions, forces, dynamic processes, relational features, and so forth—are their only scientifically accessible features, then we should attribute to them only such features and no further mysterious features. Moreover, we should identify these former features as their sole natures. This minimalist dynamist line of thought is not uncommonly found in the thoughts of the modern scientific naturalists—philosophers and scientists alike (see, for example, Diderot 1770/1979; d’Holbach 1770/1820, Pt. I, Ch. 2; Faraday 1844; Nietzsche 1887/2006, Ch. 1.3; Schlick 1925b/1985, Pt. III.A; see also a discussion of Michael Faraday’s dynamism and its contemporary significance in Langton & Robichaud 2010, pp. 171-173).

The most prominent incarnation of dynamism in contemporary metaphysics is dispositionalism—the idea that all fundamental properties are dispositional properties (see Section 1b). Contemporary dispositionalists have independently discovered the ontological minimalist attitude in their debates with their rivals, the categorialists, who believe that all fundamental properties are intrinsic, categorical properties. The interesting fact here is that the mainstream dispositionalists and categorialists in contemporary metaphysics actually share an agreement regarding Humility: many from both sides agree that if intrinsic properties of the kind described by categoricalism exist, then we are irremediably ignorant of them (Shoemaker 1980, pp. 116-117; Swoyer 1982, pp. 204-205; Ellis & Lierse 1994, p. 32; Hawthorne 2001, pp. 368-369; Black 2000, pp. 92-95; Bird 2005, p. 453; Ney 2007, pp. 53-56; see also Whittle 2006, pp. 485-490 for a related argument). However, whereas the categorialists concede such an ignorance, the dispositionalists argue that we should not believe in the existence of something we simply cannot know about. Put simply, much like the categorialists, the dispositionalists too agree that categoricalism implies the Humility Thesis, but they take this as good reason for rejecting categoricalism.

There are at least two issues here related to a dispositionalism that grounds such an ontological minimalist attitude. The first issue can be considered in light of Lewis’s question: ‘Why should I want to block [the Humility argument]? Why is Humility “ominous”? Whoever promised me that I was capable in principle of knowing everything?’ (Lewis 2009, p. 211). The minimalist dispositionalists need some epistemic principle to justify their minimalist attitude. Some of them appeal to a priori epistemic principles according to which we should not posit anything that cannot contribute to our knowledge (Shoemaker 1980, pp. 116-117; Swoyer 1982, pp. 204-205; Black 2000, pp. 92-95; Bird 2005, p. 453). Others hold a more scientific attitude according to which our ontological posits should not go beyond science, together with the assumption that all properties mentioned in science are dispositional (Hawthorne 2001, pp. 368-369; cf. Ellis & Lierse 1994, p. 32; Ney 2007, pp. 53-56).

The second issue is that the status of the ontological minimalist argument is one of the many questions in the debate between categoricalism and dispositionalism. Hence, it seems that the argument must be considered alongside other arguments—such as the ones mentioned in Section 1b—when choosing between the two views.

b. Physics and Scientific Eliminativism

The renowned physicist Werner Heisenberg took Humility to be a consequence of the atomistic theories of the kind defended by the Ancient Greek philosopher Democritus—such theories cannot possibly offer a more fundamental description of the atom than those of the atoms’ motions and arrangements (Heisenberg 1958/2000, pp. 34-35). However, like many other early- to mid-20th century scientists and philosophers (for example, Whitehead 1925/1967; Schlick 1925a/1979), Heisenberg argued that such a conception of matter is already old-fashioned and incompatible with contemporary physics. On his view, quantum mechanics has provided us with a novel metaphysical worldview: the ‘thing in itself’ of a particle is a mathematical structure (Heisenberg 1958/2000, p. 51; but see Eddington 1929).

In contemporary metaphysics, the idea that quantum mechanics leads to a scientific eliminativism about intrinsic properties is defended by James Ladyman and Don Ross (2007). Specifically, Ladyman and Ross argue that their ontic structural realism (OSR) is a theoretical framework that is better than categoricalism and the Humility Thesis—including Langton’s, Lewis’s, and Jackson’s versions (Ladyman & Ross 2007, p. 127n53)—and should thus simply replace them. OSR is the view that the relational structure of the world is ontologically fundamental, rather than one consisting of individuals bearing intrinsic properties. On their view, the identity and individuality of objects depend only on the relational structure.

OSR is developed from an analysis of empirical science, especially quantum physics, where quantum particles are found not to have exact space-time locations. On Ladyman and Ross’s view, quantum particles and fields should be given a non-individualistic interpretation in which concepts of individual objects are eliminated (p. 140). We come to have our ordinary concepts of individual objects only because of the distinguishability or discernibility of things, not due to their objective individuality (p. 134). With this in mind, Ladyman and Ross argue that the standard assumptions in metaphysics are all challenged by OSR. They list these assumptions as follows:

(i) There are individuals in space-time whose existence is independent of each other. Facts about the identity and diversity of these individuals are determined independently of their relations to each other.

(ii) Each has some properties that are intrinsic to it.

(iii) The relations between individuals other than their spatio-temporal relations supervene on the intrinsic properties of the relata (Humean supervenience).

(iv) The Principle of the Identity of Indiscernibles is true, so there are some properties (perhaps including spatio-temporal properties) that distinguish each thing from every other thing, and the identity and individuality of physical objects can be accounted for in purely qualitative terms. (Ladyman & Ross 2007, p. 151)

Unsurprisingly, for Ladyman and Ross, Lewis and Jackson are merely some traditional metaphysicians who assume the existence of individuals with intrinsic natures, but ‘our best physics puts severe pressure on such a view’ (Ladyman & Ross 2007, p. 154).

In sum, scientific eliminativists, much like the ontological minimalists discussed in Section 6a, refuse to posit the existence of unknowable intrinsic properties. However, they do not do so only because of ontological parsimony; rather, they believe that categoricalism and the Humility Thesis are attached to some old-fashioned, prescientific worldview, and that our best science has turned out to offer a different, more advanced worldview which simply makes no such commitments. The key is replacement rather than curtailment. It is important to note that Ladyman and Ross’s OSR and their scientific eliminativism are both philosophical interpretations of physics rather than part of the physical theories themselves, and it remains open to debate whether these interpretations are the best ones (compare Eddington 1929; Chalmers 1996).

c. Rationalist Categoricalism

Different from the previous two metaphysical frameworks, the third alternative metaphysical framework to the Humility Thesis is a variant rather than a denial of categoricalism. This view might be called a ‘rationalist categoricalism’, following J. L. Mackie’s use of the term ‘rationalist view’ to describe a particular response to Humility, though he rejects the view (Mackie 1973, p. 149). According to this view, intrinsic properties not only exist but could, against the Humility Thesis, also be properly described by our best physical theories or their successors (Smart 1963; Ney 2015; Hiddleston 2019).

Let us suppose that our current physical theories are final and that some of these theories have reached the most fundamental level possible. The rationalist categoricalist argues that what the physicist calls fundamental properties, such as mass and charge, are attributed to objects as their intrinsic properties, not as dispositional properties. There is of course no doubt that, as pointed out by the proponents of the receptivity argument for Humility, we always discover the properties of an object in experiments and observations, which means that we measure these properties via their causal effects. Nonetheless, while the properties are measured and defined causally in terms of the relevant dispositions, they could in themselves be intrinsic and categorical properties. For the properties should not be identified with the means of measurement, but rather should be understood as something revealed by them.

Mackie, though not a friend of rationalist categoricalism, nicely illustrates the rationalist categoricalist’s interpretation of the relation between mass and its relevant dispositions—which are, presumably, the active gravitational force, the passive gravitational force, and inertia—in the following passage:

Someone who takes what I have called a rationalist view will treat mass as a property which an object has in itself, which is inevitably a distinct existence from most of the force-acceleration combinations which would reveal it, and yet whose presence entails all the conditionals connecting resultant force with acceleration. (Mackie 1973, p. 149)

And in response to the receptivity argument for Humility, J. J. C. Smart, a sympathiser of rationalist categoricalism, argues that the Humility theorist commits herself to verificationism of some kind:

We could explore the possibility of giving a theory of length, mass, and so on, as absolute and not relational.… We do indeed test propositions about length relationally, but to go on to say that length is purely relational is to be unduly verificationist about meaning. (Smart 1963, p. 74)

Unlike mainstream categoricalism and dispositionalism, rationalist categoricalism remains a minority view, but it has nonetheless attracted some serious sympathisers (Smart 1963; Ney 2015; Hiddleston 2019).

7. The Humility Thesis and Russellian Monism

As was mentioned in the discussion of Russell’s view on Humility in Section 3d, Russell developed a peculiar mind/body theory which is now called Russellian monism (for another pioneer of Russellian monism, see Eddington 1929), and this view has recently gained considerable traction in the philosophy of mind. The current version of the view is typically framed as a solution to the hard problem of consciousness: the problem that our consciousness has a particular kind of feature, namely qualia, which seems persistently to resist any physical explanation (Chalmers 1995; see also the article on ‘The hard problem of consciousness’). Qualia are the ‘subjective feels’, ‘phenomenal qualities’, or ‘what it is like’ for a conscious subject to have certain experiences. Russellian monism, then, is the view that those unknowable intrinsic properties described by the Humility Thesis play a role in the constitution of our qualia.

Apart from its own significance in the philosophy of mind, Russellian monism also has a complex relationship with the Humility Thesis. On the one hand, it is developed from the Humility Thesis, since it makes use of the unknowable intrinsic properties described by the thesis to account for qualia. On the other hand, it is sometimes considered and developed as a response to the Humility Thesis, since it allows for the possibility that we may know certain intrinsic properties as we introspect our own qualia.

In what follows, some ontological issues surrounding the constitution of qualia by intrinsic properties are surveyed. Following that, there is a survey of the epistemic issues surrounding the introspective knowledge of intrinsic properties.

a. Constitution: from Intrinsic Properties to Qualia

To begin with, there is a question as to why someone would be attracted to the view that unknowable intrinsic properties play a role in the constitution of our qualia. It traces back to the reason why many philosophers think that the hard problem of consciousness is particularly hard to solve. For these philosophers, qualia seem intrinsic and non-causal—it is conceivable that two people might have different qualia, but still exhibit exactly the same neurophysiological and behavioural responses—and thus the standard physical properties which seem causal cannot possibly account for qualia (Levine 1983, 2001; Chalmers 1995, 1996, 2003; Kim 2005; Goff 2017; Leibniz 1714/1989; Russell 1927b). But if intrinsic properties of the kind described by the Humility Thesis are likewise intrinsic and non-causal, then it seems that they can be a part of a good explanation of qualia (Russell 1927b; Goff 2017). Furthermore, the use of intrinsic properties in explaining qualia—unlike most other alternatives to classical physicalism, such as substance dualism—avoids positing idiosyncratic entities which appear to be in conflict with a unified, elegant, and scientifically respectable ontological framework (Chalmers 1996, pp. 151-153; Heil 2004, pp. 239-240; Seager 2009, p. 208; Stoljar 2014, p. 19; Goff 2017).

Russellian monists disagree on what intrinsic properties have to be like in order for these properties to be the constituents of qualia. This leads to a variety of versions of Russellian monism, of which there are at least four major ones: (1) Russellian neutral monism, (2) Russellian panpsychism, (3) Russellian panprotopsychism, and (4) Russellian physicalism. (1) Russellian neutral monism is endorsed by Russell. According to this view, intrinsic properties are neither physical nor mental, but rather are properties that are neutral between the two (Russell 1921/1922; Heil 2004). (2) For the Russellian panpsychist, intrinsic properties that constitute our qualia must themselves also be qualia, albeit smaller in scale. Since such intrinsic properties are presumably found in fundamental physical entities such as electrons, up quarks, down quarks, gluons, and strings, the Russellian panpsychist also accepts that such entities possess qualia. This leads to a commitment to panpsychism, the view that mental properties are ubiquitous (Seager 2009). (3) Russellian panprotopsychism is a view similar to Russellian panpsychism, but it denies that the intrinsic properties that constitute qualia must also be some kind of qualia. Rather, it takes these microscale properties to be ‘proto-qualia’, which are similar in nature to qualia (compare Chalmers 1996, 2015). (4) Finally, for the Russellian physicalist, intrinsic properties should be counted as physical due to their being possessed by physical entities like electrons. Russellian physicalists also disagree with Russellian panpsychists and Russellian panprotopsychists that the raw materials of qualia must themselves be qualia or be similar to qualia, and so distance themselves from panpsychism and panprotopsychism (Stoljar 2001; Montero 2015; see also Section 8). Due to the recent popularity of Russellian monism, the above views are all ongoing research programs.
Some readers may notice the striking similarity between Russellian panpsychism and some ancient and pre-modern panpsychistic views mentioned in Section 3a. Perhaps surprisingly, then, the Humility Thesis provides room for panpsychistic views to persist.

Nonetheless, the use of the Humility Thesis and the relevant intrinsic properties in accounting for qualia leads to a list of related discussions. Firstly, philosophers disagree on whether it is really a good explanation of qualia. For one thing, it is questionable whether an explanation that appeals to an unknowable explanans could do any real explanatory work (Majeed 2013, pp. 267-268). Some Russellian monists, in response, argue that our theory of mind should not only aim at explanatory success according to scientific standards, but should also aim at truth (Goff 2017). For another, intrinsic properties may seem to be an adequate and attractive explanans only under the intuitive assumption that qualia are intrinsic and non-causal, but not everyone agrees that consciousness studies should hold onto such intuitive assumptions. And if such assumptions are revisable, then it might be less obvious that intrinsic properties are the adequate explanans of qualia (Chan & Latham 2019; compare Churchland 1996). Of course, there is an old debate in the philosophy of mind as to whether or not our intuitive assumptions concerning qualia are accurate—and whether or not they are accurate enough to support non-naturalistic theories of mind (Levine 1983, pp. 360-361; Chalmers 1997, 2018; contra Churchland 1988, 1996; Dennett 1991, pp. 68-70).

Secondly, if it is the case that the intrinsic properties of physical entities constitute qualia, then the relevant intrinsic properties are supposedly those of fundamental physical entities such as electrons, up quarks, down quarks, gluons, and strings, or those that play the roles of basic physical properties such as mass and charge. But this leads to the question—which is often called ‘the combination problem’—as to how such microphysical intrinsic properties can ever combine into our qualia, which appear to be macro-scale entities (Hohwy 2005; Goff 2006; Majeed 2013; Chalmers 2017; Chan 2020b). In response, Goff (2017) makes use of Humility, and thereby argues that the bonding of intrinsic properties is likewise beyond our grasp. Other sympathisers of Russellian monism argue that all theoretical frameworks in philosophy of mind need further development: Russellian monism is no exception, and thus should not be expected to be capable of accounting for every detail of how our mind works (Stoljar 2001, p. 275; Alter & Nagasawa 2012, pp. 90-92; Montero 2015, pp. 221-222).

Thirdly, there seems to be a gap between intrinsic properties and causal and dispositional properties in the Humility Thesis: the intrinsic properties make no substantive contribution to the causal makeup of the world apart from grounding it. For many, the use of the Humility Thesis in explaining qualia means that the gap is inherent in the mind/body relation in Russellian monism—namely, the qualia constituted by the intrinsic properties will not be the causes of our cognitive activities and bodily behaviours. This, in turn, means that Russellian monism ultimately collapses into epiphenomenalism (Braddon-Mitchell & Jackson 2007, p. 141; Howell 2015; Robinson 2018; compare Chan 2020a). For most contemporary philosophers of mind, the epiphenomenalist idea that our phenomenal consciousness possesses no causal profile and cannot cause our cognitive activities and bodily behaviours is very implausible. If these philosophers are correct, and if Russellian monism makes the same commitment, then it is equally implausible. In response, some sympathisers of Russellian monism argue that there is a more intimate relationship between intrinsic properties and causal and dispositional properties, and that this relationship makes intrinsic properties causally relevant or efficacious (Chalmers 1996, pp. 153-154; Seager 2009, pp. 217-218; Alter & Coleman 2020).

b. Introspection: from Qualia to Intrinsic Properties

Russellian monism also allows for a possible response to Humility which traces back to ancient religious and philosophical mysticism: the idea that if intrinsic properties constitute our qualia, then we may know of the former via introspection of the latter. The idea is taken seriously by a number of prominent Humility theorists (Blackburn 1990, p. 65; Lewis 2009, pp. 217-218; Langton & Robichaud 2010, pp. 174-175), and is also discussed by some Russellian monists (Russell 1927b; Maxwell 1978, p. 395; Heil 2004, p. 227; Rosenberg 2004; Strawson 2006).

There are currently two major proposals regarding how the introspection of intrinsic properties may work. The first might be called a Schopenhauerian-Russellian identity thesis. The thesis was developed by Russell, and an earlier form of it can be found in Arthur Schopenhauer’s work:

We now realise that we know nothing of the intrinsic quality of physical phenomena except when they happen to be sensations. (Russell 1927b, p. 154, emphasis added)

We ourselves are the thing-in-itself. Consequently, a way from within stands open to us as to that real inner nature of things to which we cannot penetrate from without. (Schopenhauer 1818/1966, p. 195, original emphasis)

What Russell and Schopenhauer seem to be saying is that certain mental experiences and certain intrinsic properties (or, in Schopenhauer’s case, Kantian things in themselves) are the same thing, and that the former are a part of us which we can obviously know. Hence, since we are capable of knowing the former, we are automatically capable of knowing the latter.

Another proposal, the identification thesis, is formulated by Lewis. Lewis ultimately rejects it because he finds it incompatible with materialism, though he nonetheless takes it, when combined with Russellian panpsychism, as a possible reply to the Humility Thesis (Lewis 1995, p. 142; 2009, p. 217; for discussion, see Majeed 2017). The thesis concerns the nature of our experience of qualia: as we experience a quale, we will be able to identify it, to the extent that its essence—something it has and nothing else does—will be revealed to us (1995, p. 142). While Lewis believes that the thesis is ‘uncommonly demanding’ (1995, p. 141), he also believes that it is an obvious part of our folk psychology and is thus deserving of serious assessment (but see Stoljar 2009):

Why do I think it must be part of the folk theory of qualia? Because so many philosophers find it so very obvious. I think it seems obvious because it is built into folk psychology. Others will think it gets built into folk psychology because it is so obvious; but either way, the obviousness and the folk-psychological status go together. (Lewis 1995, p. 142)

Humility theorists typically dismiss introspective knowledge of intrinsic properties by doubting Russellian monism (Blackburn 1990, p. 65; Langton & Robichaud 2010, p. 175) or by emphasising their sympathy for standard physicalism in the philosophy of mind (Lewis 2009, p. 217). Nonetheless, some further surrounding issues have been raised. The first might be called a reversed combination problem (see the discussion on the combination problem in Section 7a). The problem is that even if the Schopenhauerian-Russellian identity thesis or the identification thesis is correct, this only means that we can thereby know some aggregates of fundamental, intrinsic properties—for a quale is supposedly constituted by a large sum of fundamental, intrinsic properties, not a single fundamental, intrinsic property (Majeed 2017, p. 84). Just as we cannot know of fundamental physical particles just by knowing of a cup they constitute, it is likewise not obvious that we can know of fundamental, intrinsic properties via knowing the qualia they constitute. Hence, it is not obvious that the two epistemic theses offer any real solution to Humility, unless we consider a quale as an intrinsic property which is itself a target of the Humility Thesis.

The second issue is related to the alleged similarity between Russellian monism and epiphenomenalism. For many, epiphenomenalism is committed to what Chalmers calls the paradox of phenomenal judgement: if epiphenomenalism is true—if qualia are causally inefficacious—then our judgments concerning qualia cannot be caused by qualia, and thus cannot be considered as tracking them (Chalmers 1996, p. 177). Since, as discussed in Section 7a, Russellian monism appears to share some of the crucial theoretical features of epiphenomenalism, certain critics of Russellian monism thereby argue that Russellian monism faces the same paradox as epiphenomenalism does (Hawthorne 2001, pp. 371-372; Smart 2004, p. 48; Braddon-Mitchell & Jackson 2007, p. 141; Chan 2020a). If this is correct, then Russellian monism cannot even allow for knowledge of qualia—or of Russellian monism itself—let alone knowledge of intrinsic properties. It is, however, noteworthy that some sympathisers of epiphenomenalism argue that epiphenomenalism can actually account for knowledge of qualia (Chalmers 1996, pp. 196-209).

8. The Humility Thesis and Physicalism

Physicalism is the view that everything in the actual world is physical. Despite the fact that a number of prominent Humility theorists are also famous physicalists (Armstrong 1968; Jackson 1998; Lewis 2009)—Jackson even calls his version of the Humility Thesis ‘Kantian physicalism’ (Jackson 1998, p. 23)—questions have been raised as to whether the Humility Thesis and physicalism are really compatible. Specifically, the questions are of two kinds. The first concerns whether or not we are in a position to know that an unknowable property is physical; the second concerns whether or not there could be an unknowable intrinsic property that is physical.

The first question is raised by Sam Cowling (2010, p. 662), a critic of the Humility Thesis, as a part of his formulation of the objection from overkill (see Section 5c). On his view, if the Humility Thesis is true, then systematic metaphysics is impossible. For we cannot judge whether our world is a physical one or one of Berkeleian idealism in which all things are ultimately ideas in God’s mind. In fact, Langton and Robichaud (2010, pp. 175-176) positively hold such a radical version of the Humility Thesis.

In response to Cowling, Tom McClelland (2012) argues that the kind of knowledge he discusses is not really what the Humility Thesis concerns. Specifically, on McClelland’s view, the Humility Thesis concerns only our knowledge-which of intrinsic properties, which concerns only the distinctive features that make the property differ from any other (pp. 68-69). In light of this, the knowledge that intrinsic properties are physical does not concern the distinctive features of these intrinsic properties, and it is thus compatible with the Humility Thesis. Of course, as discussed in Section 2, even if McClelland is correct, there remains a question as to whether all important versions of the Humility Thesis concern only knowledge-which, and whether those other versions would nonetheless lead to the problem raised by Cowling—we have at least seen that Langton and Robichaud dismiss the knowledge-which version of the Humility Thesis defended by McClelland.

More philosophers raise the second question concerning the compatibility between the Humility Thesis and physicalism, namely whether or not there could be an unknowable intrinsic property that is physical (Foster 1993; Langton 1998, pp. 207-208; Braddon-Mitchell & Jackson 2007, p. 141; Ney 2007). These philosophers define the physical as whatever is posited by physics, but if the Humility Thesis is true, intrinsic properties are necessarily out of reach of physics and thus by definition cannot possibly be counted as physical.

In response, Stoljar (2001) and Barbara Montero (2015) argue that the physicalist should accept some alternative conceptions of physicalism (and the physical) which could accommodate the Humility Thesis. They thus both advocate some top-down conceptions of physicalism (compare Maxwell 1978; Chalmers 2015; for a survey, see Chan 2020b). These top-down conceptions first recognise some things as physical—which are, in Stoljar’s case, paradigmatic physical objects like tables and chairs (Stoljar 2015; for an earlier influential formulation of this conception of physicalism, see also Jackson 1998, pp. 6-8), and in Montero’s case, the referents of physical theories (Montero 2015, p. 217)—and then recognise whatever plays a part in their constitution as physical. In light of this, since intrinsic properties play a part in the constitution of physical objects, they could thereby be counted as physical. Nonetheless, there is a famous problem facing these top-down conceptions of physicalism which is recognised by both proponents (Jackson 1998, p. 7; Stoljar 2001, p. 257n10) and critics (Langton & Robichaud 2010, p. 175; Braddon-Mitchell & Jackson 2007, p. 33). The problem is that if panpsychism, pantheism, idealism, and the like are correct, then things such as the electron’s consciousness and God play a part in the constitution of physical objects, and they should thereby be counted as physical. But it appears that any conception of physicalism (or the physical) that counts such things as physical should not really be considered as physicalism. In response, Stoljar argues that one might supplement constraints to his conception of physicalism to overcome this weakness (Stoljar 2001, p. 257n10).

9. Conclusion

In a frequently cited and discussed article on Humility, Ann Whittle remarks, ‘Perhaps surprisingly, a number of philosophers from disparate backgrounds have felt compelled to deny that we have any [intrinsic] knowledge’ (Whittle 2006, p. 461). This is certainly true. A number of questions surrounding the Humility Thesis were listed in the introductory section, but no matter how one answers these questions, and whether or not one is convinced by the Humility Thesis, we have seen that the thesis has always, explicitly or tacitly, played a salient role in the history of ideas, in analytic metaphysics, in the philosophy of science, and even in the philosophy of mind. In particular, the Humility Thesis is at least important in the following respects: that the thesis and some similar theories are plausibly utilised in the formulations of a number of religious and philosophical mysticisms in history; that the thesis has inspired many historically important thinkers such as Hume, Russell, and perhaps Kant and Schleiermacher; that the thesis is a key concern in the contemporary philosophy of properties; that the thesis implies an understanding of what scientific knowledge is about; and that the thesis is the basis of Russellian monism and some ancient and contemporary versions of panpsychism. Understanding the Humility Thesis thus provides us with a better insight into how a number of important philosophical frameworks and discussions were developed and framed. This will be useful to their inquirers, proponents, and critics alike.

10. References and Further Reading

  • Alter, T & Coleman, S 2020, ‘Russellian monism and mental causation’, Noûs, online first: https://doi.org/10.1111/nous.12318.
  • Alter, T & Nagasawa, Y 2012, ‘What is Russellian monism?’, Journal of Consciousness Studies, vol. 19, no. 9-10, pp. 67-95.
  • Armstrong, D 1961, Perception and the physical world, Routledge, London.
  • Armstrong D 1968, A materialist theory of the mind, Routledge & Kegan Paul, London.
  • Armstrong D 1997, A world of states of affairs, Cambridge University Press, Cambridge.
  • Berkeley, G 1710/1988, Principles of Human Knowledge and Three Dialogues, Penguin Books, London.
  • Bird, A 2005, ‘Laws and essences’, Ratio, vol. 18, no. 4, pp. 437-461.
  • Black, R 2000, ‘Against quidditism’, Australasian Journal of Philosophy, vol. 78, no. 1, pp. 87-104.
  • Blackburn, S 1990, ‘Filling in space’, Analysis, vol. 50, no. 2, pp. 62-65.
  • Borghini, A & Williams, N 2008, ‘A dispositional theory of possibility’, Dialectica, vol. 62, no. 1, pp. 21-41.
  • Braddon-Mitchell, D & Jackson, F 2007, The philosophy of mind and cognition, 2nd edn, Blackwell, Malden.
  • Chalmers, D 1995, ‘Facing up to the problem of consciousness’, Journal of Consciousness Studies, vol. 2, no. 3, pp. 200-219.
  • Chalmers, D 1996, The conscious mind: in search of a fundamental theory, Oxford University Press, New York.
  • Chalmers, D 1997, ‘Moving forward on the problem of consciousness’, Journal of Consciousness Studies, vol. 4, no. 1, pp. 3-46.
  • Chalmers, D 2003, ‘Consciousness and its place in nature’, in DS Stich & F Warfield (eds.), The Blackwell guide to philosophy of mind, Blackwell Publishing, Malden, pp. 102-142.
  • Chalmers, D 2012, Constructing the world, Oxford University Press, Oxford.
  • Chalmers, D 2015, ‘Panpsychism and panprotopsychism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian Monism, Oxford University Press, New York, pp. 246-276.
  • Chalmers, D 2017, ‘The combination problem for panpsychism’, in G Brüntrup & L Jaskolla (eds.), Panpsychism: contemporary perspectives, Oxford University Press, New York, pp. 179-214.
  • Chalmers, D 2018, ‘The meta-problem of consciousness’, Journal of Consciousness Studies, vol. 25, no. 9-10, pp. 6-61.
  • Chan, LC 2017, Metaphysical naturalism and the ignorance of categorical properties, PhD thesis, University of Sydney, retrieved 28 November 2019,  <https://ses.library.usyd.edu.au/handle/2123/16555>
  • Chan, LC 2020a, ‘Can the Russellian monist escape the epiphenomenalist’s paradox?’, Topoi, vol. 39, pp. 1093–1102.
  • Chan, LC 2020b, ‘Russellian physicalism and its dilemma’, Philosophical Studies, online first.
  • Chan, LC & Latham AJ 2019, ‘Four meta-methods for the study of qualia’, Erkenntnis, vol. 84, no. 1, pp. 145-167.
  • Churchland, PS 1988, ‘Reduction and the neurobiological basis of consciousness’, in A Marcel & E Bisiach (eds.), Consciousness in contemporary science, Oxford University Press, New York, pp. 273-304.
  • Churchland, PS 1996, ‘The hornswoggle problem’, Journal of Consciousness Studies, vol. 3, no. 5-6, pp. 402-408.
  • Clifford, WK 1875/2011, ‘The unseen universe’, in L Stephen & F Pollock (eds.), Lectures and essays: volume II, Cambridge University Press, Cambridge.
  • Cowling, S 2010, ‘Kantian humility and ontological categories’, Analysis, vol. 70, no. 4, pp. 659-665.
  • d’Holbach, PH 1770/1820, The system of nature, trans. by De Mirabaud, M, retrieved 26 September 2018, <http://www.ftarchives.net/holbach/system/0syscontents.htm>
  • Dennett, D 1991, Consciousness explained, Penguin Books, London.
  • Deutsch, E 1969, Advaita Vedanta: a philosophical reconstruction, East-West Center Press, Honolulu.
  • Diderot, D 1770/1979, ‘Philosophic principles matter and motion’, in Diderot: interpreter of nature, trans. by Stewart, J & Kemp, J, Hyperion Press, Westport, pp. 127-133.
  • Eddington, A 1929, The nature of the physical world, Cambridge University Press, Cambridge.
  • Ellis, B 2014, The philosophy of nature: a guide to the new essentialism, Routledge, London.
  • Ellis, B & Lierse, C 1994, ‘Dispositional essentialism’, Australasian Journal of Philosophy, vol. 72, no. 1, pp. 27-45.
  • Faraday, M 1844, ‘A speculation touching electric condition and the nature of matter’, in Experimental researches in Electricity, vol. ii, Richard & John Edward Taylor, London.
  • Feigl, H 1967, The ‘mental’ and the ‘physical’: the essay and a postscript, University of Minnesota Press, Minneapolis.
  • Flood, G 1996, An introduction to Hinduism, Cambridge University Press, Cambridge.
  • Foster, J 1993, ‘The succinct case for idealism’, in H Robinson (ed.), Objections to physicalism, Clarendon Press, Oxford, pp. 293-313.
  • Goff, P 2006, ‘Experiences don’t sum’, Journal of Consciousness Studies, vol. 13, pp. 53-61.
  • Goff, P 2017, ‘The phenomenal bonding solution to the combination problem’, in G Bruntrup & L Jaskolla (eds.), Panpsychism: contemporary perspectives, Oxford University Press, New York, pp. 283-303.
  • Handfield, T 2005, ‘Armstrong and the modal inversion of dispositions’, Philosophical Quarterly, vol. 55, no. 220, pp. 452–461.
  • Hawthorne, J 2001, ‘Causal structuralism’, Philosophical Perspectives, vol. 15, pp. 361-378.
  • Heil, J 2004, Philosophy of mind, 2nd edn, Routledge, New York.
  • Heisenberg, W 1958/2000, Physics and philosophy: the revolution in modern science, Penguin Books, London.
  • Hiddleston, E 2019, ‘Dispositional and categorical properties, and Russellian monism’, Philosophical Studies, vol. 176, no. 1, pp. 65-92.
  • Hohwy, J 2005, ‘Explanation and two conceptions of the physical’, Erkenntnis, vol. 62, no. 1, pp. 71-89.
  • Holton, R 1999, ‘Dispositions all the way round’, Analysis, vol. 59, no. 1, pp. 9-14.
  • Howell, R 2015, ‘The Russellian monist’s problems with mental causation’, The Philosophical Quarterly, vol. 65, no. 258, pp. 22-39.
  • Hume, D 1739/1978, A treatise of human nature, Oxford University Press, Oxford.
  • Jackson, F 1998, From metaphysics to ethics: a defence of conceptual analysis, Oxford University Press, Oxford.
  • Jacobi, FH 1787/2000, ‘On transcendental idealism’, in B Sassen (ed.), Kant’s early critics: the empiricist critique of the theoretical philosophy, Cambridge University Press, Cambridge, pp.169-175.
  • Kant, I 1781/1998 Critique of pure reason, trans. by P Guyer & A Wood, Cambridge University Press, Cambridge.
  • Kim, J 2005, Physicalism, or something near enough, Princeton University Press, Princeton.
  • Ladyman, J & Ross, D (with Spurrett, D & Collier, J) 2007, Every thing must go: metaphysics naturalized, Oxford University Press, Oxford.
  • Langton, R 1998, Kantian humility: our ignorance of things in themselves, Oxford University Press, Oxford.
  • Langton, R 2004, ‘Elusive knowledge of things in themselves’, Australasian Journal of Philosophy, vol. 82, no. 1, pp. 129-136.
  • Langton, R 2015, ‘The impossible necessity of “filling in space’’’, in R Johnson & M Smith (eds.), Passions and projections: themes from the philosophy of Simon Blackburn, Oxford University Press, Oxford, pp. 106-114.
  • Langton, R & Robichaud, C 2010, ‘Ghosts in the world machine? Humility and its alternatives’, in A Hazlett (ed.), New waves in metaphysics, Palgrave Macmillan, New York, pp. 156-178.
  • Leibniz, G 1714/1989, ‘The monadology’, in R Ariew & D Garber (trans. and eds.), Philosophical essays, Indianapolis, Hackett.
  • Leuenberger, S 2010, ‘Humility and constraints on O-language’, Philosophical Studies, vol. 149, no. 3, pp. 327-354.
  • Levine, J 1983, ‘Materialism and qualia: the explanatory gap’, Pacific Philosophical Quarterly, vol. 64, pp. 354-361.
  • Levine, J 2001, Purple Haze: the puzzle of consciousness, Oxford University Press, Oxford.
  • Lewis, D 1970, ‘How to define theoretical terms’, Journal of Philosophy, vol. 67, no. 13, pp. 427-446.
  • Lewis, D 1972, ‘Psychophysical and theoretical identifications’, Australasian Journal of Philosophy, vol. 50, no. 3, pp. 249-258.
  • Lewis, D 1986, Philosophical papers, vol. 2, Oxford University Press, New York.
  • Lewis, D 1995, ‘Should a materialist believe in qualia?’, Australasian Journal of Philosophy, vol. 73, no. 1, pp. 140-44.
  • Lewis, D 2009, ‘Ramseyan humility’, in D Braddon-Mitchell & R Nola (eds.), Conceptual analysis and philosophical naturalism, MIT Press, Cambridge, pp. 203-222.
  • Locke, D 2009, ‘A partial defense of Ramseyan humility’, in D Braddon-Mitchell & R Nola (eds.), Conceptual analysis and philosophical naturalism, MIT Press, Cambridge, MA, pp. 223-242.
  • Lockwood, M 1989, Mind, brain, and quantum, Blackwell, Oxford.
  • Lockwood, M 1992, ‘The grain problem’, in H Robinson (ed.), Objections to physicalism, Oxford University Press, Oxford, pp. 271-292.
  • Mach, E 1897/1984, The analysis of sensations and the relation of the physical to the psychical, trans. by CM Williams, Open Court, La Salle.
  • Mackie, JL 1973, Truth, probability and paradox, Oxford University Press, Oxford.
  • Mahony, W 1998, The artful universe: an introduction to the Vedic religious imagination, SUNY Press, Albany.
  • Majeed, R 2013, ‘Pleading ignorance in response to experiential primitivism’, Philosophical Studies, vol. 163, no. 1, pp. 251-269.
  • Majeed, R 2017, ‘Ramseyan Humility: the response from revelation and panpsychism’, Canadian Journal of Philosophy, vol. 47, no. 1, pp. 75-96
  • Maxwell, G 1978, ‘Rigid designators and mind-brain identity’, in W Savage (ed.), Perception and cognition: issues in the foundations of psychology, University of Minnesota Press, Minneapolis, pp. 365-403.
  • McClelland, T 2012, ‘In defence of Kantian humility’, Thought, vol. 1, no. 1, pp. 62-70.
  • Mill, JS 1865/1996, An examination of Sir William Hamilton’s philosophy, Routledge, London.
  • Montero, B 2015, ‘Russellian physicalism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian monism, Oxford University Press, New York, pp. 209-223.
  • Ney, A 2007, ‘Physicalism and our knowledge of intrinsic properties’, Australasian Journal of Philosophy, vol. 85, no. 1, pp. 41-60.
  • Ney, A 2015, ‘A physicalist critique of Russellian Monism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian Monism, Oxford University Press, New York, pp. 324-345.
  • Nietzche, F 1887/2006, On the genealogy of morality, trans. by Diethe, C, Cambridge University Press, Cambridge.
  • Tegmark, M 2007, ‘The mathematical universe’, Foundations of physics, vol. 38, pp. 101–150.
  • Tully, RE 2003, ‘Russell’s neutral monism’, in N Griffin (ed.), The Cambridge companion to Bertrand Russell, Cambridge University Press, Cambridge, pp. 332-370.
  • Pettit, P 1998, ‘Noumenalism and response-dependence’, Monist, vol. 81, no. 1, pp. 112-132.
  • Robinson, W 2018, ‘Russellian monism and epiphenomenalism’, Pacific Philosophical Quarterly, vol. 99, no. 1, pp. 100–117.
  • Rosenberg, G 2004, A place for consciousness: probing the deep structure of the natural world, Oxford University Press, Oxford.
  • Russell, B 1912/1978, The problems of philosophy, Oxford University Press, Oxford.
  • Russell, B 1921/1922, The analysis of mind, George Allen & Unwin, London.
  • Russell, B 1927a/1992, The analysis of matter, Routledge, London.
  • Russell, B 1927b, An outline of philosophy, George Allen & Unwin, London.
  • Schaffer, J 2005, ‘Quiddistic knowledge’, Philosophical Studies, vol. 123, pp. 1-32.
  • Schleiermacher, F 1799/1988, On religion: speeches to its cultured despisers, Cambridge University Press, Cambridge.
  • Schlick, M 1925a/1979, ‘Outlines of the philosophy of nature’, in Philosophical Papers: Volume II (1925-1936), pp. 1-90.
  • Schlick, M 1925b/1985, General theory of knowledge, trans. by Blumberg, A, Open Court, Chicago.
  • Schopenhauer, A 1818/1966, The world as will and representation, vol. 2, trans by EFJ Payne, Dover, New York.
  • Seager, W 2009, ‘Panpsychism’, in A Beckermann & BP McLaughlin (eds.), The Oxford handbook of philosophy of mind, Oxford University Press, New York, pp. 206-219.
  • Shoemaker, S 1980, ‘Causality and properties’, in P van Inwagen (ed.), Time and cause, Reidel, Dordrecht, pp. 109-135.
  • Smart, JJC 1963, Philosophy and scientific realism, Routledge, London.
  • Smart, JJC 2004, ‘Consciousness and awareness’, Journal of Consciousness Studies, vol. 11, no. 2, pp. 41-50.
  • Smith, M & Stoljar, D 1998, ‘Global response-dependence and noumenal realism’, Monist, vol. 81, no. 1, pp. 85-111.
  • Stoljar, D 2001, ‘Two conceptions of the physical’, Philosophy and Phenomenological Research, vol. 62, no. 2, pp. 253-281.
  • Stoljar, D 2009, ‘The argument from revelation’, in D Braddon-Mitchell & R Nola (eds.), Conceptual analysis and philosophical naturalism, MIT Press, Cambridge, pp. 113-138.
  • Stoljar, D 2014, ‘Four kinds of Russellian Monism’, in U Kriegel (ed.), Current controversies in philosophy of mind, Routledge, New York, pp. 17-39.
  • Stoljar, D 2015, ‘Physicalism’, in E Zalta (ed.), Stanford encyclopedia of philosophy, retrieved 8 March 2020, <http://plato.stanford.edu/entries/physicalism/>
  • Strawson, G 2006, ‘Realistic monism: why physicalism entails panpsychism’, Journal of Consciousness Studies, vol. 13, no. 10-11, pp. 3-31.
  • Strawson, PF 1966, The bounds of sense: an essay on Kant’s Critique of Pure Reason, Methuen, London.
  • Swoyer, C 1982, ‘The nature of laws of nature’, Australasian Journal of Philosophy, vol. 60, no. 3, pp. 203-223.
  • Van Cleve, J 2002, ‘Receptivity and our knowledge of intrinsic properties’, Philosophy and Phenomenological Research, vol. 65, no. 1, pp. 218-237.
  • Whitehead, AN 1925/1967, Science and the modern world, The Free Press, New York.
  • Whittle, A 2006, ‘On an argument for Humility’, Philosophical Studies, vol. 130, no. 3, pp. 461-497.
  • Wishon, D 2015, ‘Russell on Russellian Monism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian Monism, Oxford University Press, New York, pp. 91-120.
  • Yates, D 2018, ‘Three arguments for humility’, Philosophical Studies, vol. 175, no. 2, pp. 461-481.

 

Author Information

Lok-Chi Chan
Email: lokchan@ntu.edu.tw
National Taiwan University
Taiwan

Critical Thinking

Critical Thinking is the process of using and assessing reasons to evaluate statements, assumptions, and arguments in ordinary situations. The goal of this process is to help us have good beliefs, where “good” means that our beliefs meet certain goals of thought, such as truth, usefulness, or rationality. Critical thinking is widely regarded as a species of informal logic, although critical thinking makes use of some formal methods. In contrast with formal reasoning processes that are largely restricted to deductive methods—decision theory, logic, statistics—the process of critical thinking allows a wide range of reasoning methods, including formal and informal logic, linguistic analysis, experimental methods of the sciences, historical and textual methods, and philosophical methods, such as Socratic questioning and reasoning by counterexample.

The goals of critical thinking are also more diverse than those of formal reasoning systems. While formal methods focus on deductive validity and truth, critical thinkers may evaluate a statement’s truth, its usefulness, its religious value, its aesthetic value, or its rhetorical value. Because critical thinking arose primarily from the Anglo-American philosophical tradition (also known as “analytic philosophy”), contemporary critical thinking is largely concerned with a statement’s truth. But some thinkers, such as Aristotle (in Rhetoric), give substantial attention to rhetorical value.

The primary subject matter of critical thinking is the proper use and goals of a range of reasoning methods, how they are applied in a variety of social contexts, and errors in reasoning. This article also discusses the scope and virtues of critical thinking.

Critical thinking should not be confused with Critical Theory. Critical Theory refers to a way of doing philosophy that involves a moral critique of culture. A “critical” theory, in this sense, is a theory that attempts to disprove or discredit a widely held or influential idea or way of thinking in society. Thus, critical race theorists and critical gender theorists offer critiques of traditional views and latent assumptions about race and gender. Critical theorists may use critical thinking methodology, but their subject matter is distinct, and they also may offer critical analyses of critical thinking itself.

Table of Contents

  1. Clarity
  2. Argument and Evaluation
  3. Formal Reasoning
    1. Categorical Logic
    2. Propositional Logic
    3. Modal Logic
    4. Predicate Logic
    5. Other Formal Systems
  4. Informal Reasoning
    1. Generalization
    2. Analogy
    3. Causal Reasoning
    4. Abduction
  5. Detecting Poor Reasoning
    1. Formal Fallacies
    2. Informal Fallacies
    3. Heuristics and Biases
  6. The Scope and Virtues of Good Reasoning
    1. Context
    2. The Principle of Charity/Humility
    3. The Principle of Caution
    4. The Expansiveness of Critical Thinking
    5. Productivity and the Limits of Rationality
  7. Approaches to Improving Reasoning through Critical Thinking
    1. Classical Approaches
    2. The Paul/Elder Model
    3. Other Approaches
  8. References and Further Reading

1. Clarity

The process of evaluating a statement traditionally begins with making sure we understand it; that is, a statement must express a clear meaning. A statement is generally regarded as clear if it expresses a proposition, which is the meaning the author of that statement intends to express, including definitions, referents of terms, and indexicals, such as subject, context, and time. There is significant controversy over what sort of “entity” propositions are, whether abstract objects, linguistic constructions, or something else entirely. Whatever its metaphysical status, “proposition” is used here simply to refer to whatever meaning a speaker intends to convey by a statement.

The difficulty with identifying intended propositions is that we typically speak and think in natural languages (English, Swedish, French), and natural languages can be misleading. For instance, two different sentences in the same natural language may express the same proposition, as in these two English sentences:

Jamie is taller than his father.
Jamie’s father is shorter than he.

Further, the same sentence in a natural language can express more than one proposition depending on who utters it at a time:

I am shorter than my father right now.

The pronoun “I” is an indexical; it picks out, or “indexes,” whoever utters the sentence and, therefore, expresses a different proposition for each new speaker who utters it. Similarly, “right now” is a temporal indexical; it indexes the time the sentence is uttered. The proposition the sentence expresses changes each time the sentence is uttered and, therefore, may have a different truth value at different times (as, say, the speaker grows taller: “I am now five feet tall” may be true today, but false a year from now). Other indexical terms that can affect the meaning of the sentence include other pronouns (he, she, it), demonstratives (that), and definite articles (the).
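The way an indexical sentence yields a different proposition in each context can be pictured as a function from contexts to propositions. A minimal Python sketch of this idea (the `Context` type and its fields are illustrative assumptions, not anything from the article):

```python
from dataclasses import dataclass

@dataclass
class Context:
    speaker: str  # who utters the sentence
    time: str     # when the sentence is uttered

def proposition(ctx: Context) -> str:
    # "I am shorter than my father right now" indexes the speaker
    # and the time, so each context yields a distinct proposition.
    return f"{ctx.speaker} is shorter than {ctx.speaker}'s father at {ctx.time}"

p1 = proposition(Context(speaker="Jamie", time="noon Tuesday"))
p2 = proposition(Context(speaker="Ada", time="noon Tuesday"))
print(p1)
print(p1 == p2)  # different speakers, hence different propositions
```

On this picture, the sentence type stays fixed while the proposition varies with whoever utters it and when.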

Further still, different sentences in different natural languages may express the same proposition. For example, all of the following express the proposition “Snow is white”:

Snow is white. (English)

Der Schnee ist weiss. (German)

La neige est blanche. (French)

La neve è bianca. (Italian)

Finally, statements in natural languages are often vague or ambiguous, either of which can obscure the propositions actually intended by their authors. And even in cases where they are not vague or ambiguous, statements’ truth values sometimes vary from context to context. Consider the following example.

The English statement, “It is heavy,” includes the pronoun “it,” which (when used without contextual clues) is ambiguous because it can index any impersonal subject. If, in this case, “it” refers to the computer on which you are reading this right now, its author intends to express the proposition, “The computer on which you are reading this right now is heavy.” Further, the term “heavy” reflects an unspecified standard of heaviness (again, if contextual clues are absent). Assuming we are talking about the computer, it may be heavy relative to other computer models but not to automobiles. Further still, even if we identify or invoke a standard of heaviness by which to evaluate the appropriateness of its use in this context, there may be no weight at which an object is rightly regarded as heavy according to that standard. (For instance, is an object heavy because it weighs 5.3 pounds but not if it weighs 5.2 pounds? Or is it heavy when it is heavier than a mouse but lighter than an anvil?) This means “heavy” is a vague term. In order to construct a precise statement, vague terms (heavy, cold, tall) must often be replaced with terms expressing an objective standard (pounds, temperature, feet).

Part of the challenge of critical thinking is to clearly identify the propositions (meanings) intended by those making statements so we can effectively reason about them. The rules of language help us identify when a term or statement is ambiguous or vague, but they cannot, by themselves, help us resolve ambiguity or vagueness. In many cases, this requires assessing the context in which the statement is made or asking the author what she intends by the terms. If we cannot discern the meaning from the context and we cannot ask the author, we may stipulate a meaning, but this requires charity, to stipulate a plausible meaning, and humility, to admit when we discover that our stipulation is likely mistaken.

2. Argument and Evaluation

Once we are satisfied that a statement is clear, we can begin evaluating it. A statement can be evaluated according to a variety of standards. Commonly, statements are evaluated for truth, usefulness, or rationality. The most common of these goals is truth, so that is the focus of this article.

The truth of a statement is most commonly evaluated in terms of its relation to other statements and direct experiences. If a statement follows from or can be inferred from other statements that we already have good reasons to believe, then we have a reason to believe that statement. For instance, the statement “The ball is blue” can be derived from “The ball is blue and round.” Similarly, if a statement seems true in light of, or is implied by, an experience, then we have a reason to believe that statement. For instance, the experience of seeing a red car is a reason to believe, “The car is red.” (Whether these reasons are good enough for us to believe is a further question about justification, which is beyond the scope of this article, but see “Epistemic Justification.”) Any statement we derive in these ways is called a conclusion. Though we regularly form conclusions from other statements and experiences—often without thinking about it—there is still a question of whether these conclusions are true: Did we draw those conclusions well? A common way to evaluate the truth of a statement is to identify those statements and experiences that support our conclusions and organize them into structures called arguments. (See also, “Argument.”)

An argument is one or more statements (called premises) intended to support the truth of another statement (the conclusion). Premises comprise the evidence offered in favor of the truth of a conclusion. It is important to entertain any premises that are intended to support a conclusion, even if the attempt is unsuccessful. Unsuccessful attempts at supporting a proposition constitute bad arguments, but they are still arguments. The support intended for the conclusion may be formal or informal. In a formal, or deductive, argument, an arguer intends to construct an argument such that, if the premises are true, the conclusion must be true. This strong relationship between premises and conclusion is called validity. This relationship between the premises and conclusion is called “formal” because it is determined by the form (that is, the structure) of the argument (see §3). In an informal, or inductive, argument, the conclusion may be false even if the premises are true. In other words, whether an inductive argument is good depends on something more than the form of the argument. Therefore, all inductive arguments are invalid, but this does not mean they are bad arguments. Even if an argument is invalid, its premises can increase the probability that its conclusion is true. So, the form of inductive arguments is evaluated in terms of the strength the premises confer on the conclusion, and stronger inductive arguments are preferred to weaker ones (see §4). (See also, “Deductive and Inductive Arguments.”)

Psychological states, such as sensations, memories, introspections, and intuitions often constitute evidence for statements. Although these states are not themselves statements, they can be expressed as statements. And when they are, they can be used in and evaluated by arguments. For instance, my seeing a red wall is evidence for me that, “There is a red wall,” but the physiological process of seeing is not a statement. Nevertheless, the experience of seeing a red wall can be expressed as the proposition, “I see a red wall” and can be included in an argument such as the following:

    1. I see a red wall in front of me.
    2. Therefore, there is a red wall in front of me.

This is an inductive argument, though not a strong one. We do not yet know whether seeing something (under these circumstances) is reliable evidence for the existence of what I am seeing. Perhaps I am “seeing” in a dream, in which case my seeing is not good evidence that there is a wall. For similar reasons, there is also reason to doubt whether I am actually seeing. To be cautious, we might say we seem to see a red wall.

To be good, an argument must meet two conditions: the conclusion must follow from the premises—either validly or with a high degree of likelihood—and the premises must be true. If the premises are true and the conclusion follows validly, the argument is sound. If the premises are true and the premises make the conclusion probable (either objectively or relative to alternative conclusions), the argument is cogent.

Here are two examples:

Example 1:

    1. Earth is larger than its moon.
    2. Our sun is larger than Earth.
    3. Therefore, our sun is larger than Earth’s moon.

In example 1, the premises are true. And since “larger than” is a transitive relation, the structure of the argument guarantees that, if the premises are true, the conclusion must be true. This means the argument is also valid. Since it is both valid and has true premises, this deductive argument is sound.

Example 2:

    1. It is sunny in Montana about 205 days per year.
    2. I will be in Montana in February.
    3. Hence, it will probably be sunny when I am in Montana.

In example 2, premise 1 is true, and let us assume premise 2 is true. Premise 1 indicates that a majority of days in Montana, about 205 of 365, are sunny, so that, for any day you choose, it will probably be a sunny day. Premise 2 says I am choosing days in February to visit. Together, these premises make it probable (though they do not guarantee) that it will be sunny when I am there, and so this inductive argument is cogent.
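Premise 1’s support for the conclusion can be read as a simple base rate. A back-of-the-envelope calculation in Python (assuming, purely for illustration, that sunny days are spread evenly across the year):

```python
sunny_days = 205
days_in_year = 365

# Probability that an arbitrarily chosen day is sunny, on the
# simplifying assumption that sunny days are evenly distributed.
p_sunny = sunny_days / days_in_year
print(round(p_sunny, 2))  # → 0.56
```

At roughly 0.56 the conclusion comes out more likely than not; how much support counts as “strong” is itself a judgment call in inductive reasoning.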

In some cases, arguments will be missing some important piece, whether a premise or a conclusion. For instance, imagine someone says, “Well, she asked you to go, so you have to go.” The idea that you have to go does not follow logically from the fact that she asked you to go without more information. What is it about her asking you to go that implies you have to go? Arguments missing important information are called enthymemes. A crucial part of critical thinking is identifying missing or assumed information in order to effectively evaluate an argument. In this example, the missing premise might be that, “She is your boss, and you have to do what she asks you to do.” Or it might be that, “She is the woman you are interested in dating, and if you want a real chance at dating her, you must do what she asks.” Before we can evaluate whether her asking implies that you have to go, we need to know this missing bit of information. And without that missing bit of information, we can simply reply, “That conclusion doesn’t follow from that premise.”

The two categories of reasoning associated with soundness and cogency—formal and informal, respectively—are considered, by some, to be the only two types of argument. Others add a third category, called abductive reasoning, according to which one reasons according to the rules of explanation rather than the rules of inference. Those who do not regard abductive reasoning as a third, distinct category typically regard it as a species of informal reasoning. Although abductive reasoning has unique features, here it is treated, for reasons explained in §4d, as a species of informal reasoning, but little hangs on this characterization for the purposes of this article.

3. Formal Reasoning

Although critical thinking is widely regarded as a type of informal reasoning, it nevertheless makes substantial use of formal reasoning strategies. Formal reasoning is deductive, which means an arguer intends to infer or derive a proposition from one or more propositions on the basis of the form or structure exhibited by the premises. Valid argument forms guarantee that particular propositions can be derived from them. Some forms look like they make such guarantees but fail to do so (we identify these as formal fallacies in §5a). If an arguer intends or supposes that a premise or set of premises guarantee a particular conclusion, we may evaluate that argument form as deductive even if the form fails to guarantee the conclusion, and is thus discovered to be invalid.

Before continuing in this section, it is important to note that, while formal reasoning provides a set of strict rules for drawing valid inferences, it cannot help us determine the truth of many of our original premises or our starting assumptions. And in fact, very little critical thinking that occurs in our daily lives (unless you are a philosopher, engineer, computer programmer, or statistician) involves formal reasoning. When we make decisions about whether to board an airplane, whether to move in with our significant others, whether to vote for a particular candidate, whether it is worth it to drive ten miles per hour faster than the speed limit even if I am fairly sure I will not get a ticket, whether it is worth it to cheat on a diet, or whether we should take a job overseas, we are reasoning informally. We are reasoning with imperfect information (I do not know much about my flight crew or the airplane’s history), with incomplete information (no one knows what the future is like), and with a number of built-in biases, some conscious (I really like my significant other right now), others unconscious (I have never gotten a ticket before, so I probably will not get one this time). Readers who are more interested in these informal contexts may want to skip to §4.

An argument form is a template that includes variables that can be replaced with sentences. Consider the following form (found within the formal system known as sentential logic):

    1. If p, then q.
    2. p.
    3. Therefore, q.

This form was named modus ponens (Latin, “method of putting”) by medieval philosophers. p and q are variables that can be replaced with any proposition, however simple or complex. And as long as the variables are replaced consistently (that is, each instance of p is replaced with the same sentence and the same for q), the conclusion (line 3), q, follows from these premises. To be more precise, the inference from the premises to the conclusion is valid. “Validity” describes a particular relationship between the premises and the conclusion, namely: in all cases, the conclusion follows necessarily from the premises, or, to use more technical language, the premises logically guarantee an instance of the conclusion.

Notice we have said nothing yet about truth. As critical thinkers, we are interested, primarily, in evaluating the truth of sentences that express propositions, but all we have discussed so far is a type of relationship between premises and conclusion (validity). This formal relationship is analogous to grammar in natural languages and is known in both fields as syntax. A sentence is grammatically correct if its syntax is appropriate for that language (in English, for example, a grammatically correct simple sentence has a subject and a predicate—“He runs.” “Laura is Chairperson.”—and it is grammatically correct regardless of what subject or predicate is used—“Jupiter sings.”—and regardless of whether the terms are meaningful—“Geflorble rowdies.”). Whether a sentence is meaningful, and therefore, whether it can be true or false, depends on its semantics, which refers to the meaning of individual terms (subjects and predicates) and the meaning that emerges from particular orderings of terms. Some terms are meaningless—geflorble; rowdies—and some orderings are meaningless even though their terms are meaningful—“Quadruplicity drinks procrastination,” and “Colorless green ideas sleep furiously.”

Despite the ways that syntax and semantics come apart, if sentences are meaningful, then syntactic relationships between premises and conclusions allow reasoners to infer truth values for conclusions. Because of this, a more common definition of validity is this: it is not possible for all the premises to be true and the conclusion false. Formal logical systems in which syntax allows us to infer semantic values are called truth-functional or truth-preserving—proper syntax preserves truth throughout inferences.
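The semantic test can be run mechanically for sentential forms: enumerate every assignment of truth values to the variables and look for a row where all the premises are true and the conclusion false. A minimal Python sketch (the helper `is_valid` and the material-conditional reading of “if p, then q” as “not-p or q” are illustrative choices, not anything prescribed by the article):

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    """An argument form is valid iff no assignment of truth values
    makes all premises true and the conclusion false."""
    for vals in product([True, False], repeat=n_vars):
        if all(p(*vals) for p in premises) and not conclusion(*vals):
            return False  # found a counterexample row
    return True

# Modus ponens: "if p then q" read materially as (not p) or q.
mp_premises = [lambda p, q: (not p) or q, lambda p, q: p]
print(is_valid(mp_premises, lambda p, q: q, 2))  # → True

# Affirming the consequent: from "if p then q" and q, infer p.
ac_premises = [lambda p, q: (not p) or q, lambda p, q: q]
print(is_valid(ac_premises, lambda p, q: p, 2))  # → False
```

The second form fails because the row where p is false and q is true makes both premises true and the conclusion false, which is exactly the counterexample the definition forbids.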

The point of this is to note that formal reasoning only tells us what is true if we already know our premises are true. It cannot tell us whether our experiences are reliable or whether scientific experiments tell us what they seem to tell us. Logic can be used to help us determine whether a statement is true, but only if we already know some true things. This is why a broad conception of critical thinking is so important: we need many different tools to evaluate whether our beliefs are any good.

Consider, again, the form modus ponens, and replace p with “It is a cat” and q with “It is a mammal”:

    1. If it is a cat, then it is a mammal.
    2. It is a cat.
    3. Therefore, it is a mammal.

In this case, we seem to “see” (in a metaphorical sense of see) that the premises guarantee the truth of the conclusion. On reflection, it is also clear that the premises might not be true; for instance, if “it” picks out a rock instead of a cat, premise 1 is still true, but premise 2 is false. It is also possible for the conclusion to be true when the premises are false. For instance, if the “it” picks out a dog instead of a cat, the conclusion “It is a mammal” is true. But in that case, the premises do not guarantee that conclusion; they do not constitute a reason to believe the conclusion is true.

Summing up, an argument is valid if its premises logically guarantee an instance of its conclusion (syntactically), or if it is not possible for its premises to be true and its conclusion false (semantically). Logic is truth-preserving but not truth-detecting; we still need evidence that our premises are true to use logic effectively.

            A Brief Technical Point

Some readers might find it worth noting that the semantic definition of validity has two counterintuitive consequences. First, it implies that any argument with a necessarily true conclusion is valid. Notice that the condition is phrased hypothetically: if the premises are true, then the conclusion cannot be false. This condition is met if the conclusion cannot be false:

        1. If it is a cat, then it is a mammal.
        2. It is a cat.
        3. Therefore, two added to two equals four.

This is because the hypothetical (or “conditional”) statement would still be true even if the premises were false:

        1. If it is blue, then it flies.
        2. It is an airplane.
        3. Therefore, two added to two equals four.

It is true of this argument that if the premises were true, the conclusion would be true, since the conclusion is true no matter what.

Second, the semantic formulation also implies that any argument with necessarily false premises is valid. The semantic condition for validity is met if the premises cannot be true:

        1. Some bachelors are married.
        2. Therefore, Earth’s moon is heavier than Jupiter.

In this case, if the premise were true, the conclusion could not be false (this is because anything follows syntactically from a contradiction), and therefore, the argument is valid. There is nothing particularly problematic about these two consequences. But they highlight unexpected implications of our standard formulations of validity, and they show why there is more to good arguments than validity.
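This second consequence can likewise be checked by brute force: a necessarily false premise is true under no assignment, so the validity condition holds vacuously. A small illustrative Python sketch (the helper name is mine, and the contradiction is encoded sententially as “p and not-p”):

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    # Valid iff no assignment makes all premises true and the conclusion false.
    return not any(
        all(p(*vals) for p in premises) and not conclusion(*vals)
        for vals in product([True, False], repeat=n_vars)
    )

# "p and not p" is true under no assignment, so the validity
# condition is never violated: any conclusion follows "validly."
contradiction = [lambda p, q: p and not p]
print(is_valid(contradiction, lambda p, q: q, 2))      # → True
print(is_valid(contradiction, lambda p, q: not q, 2))  # → True
```

That both q and not-q follow from the same premise is exactly why validity alone does not make an argument good.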

Despite these counterintuitive implications, valid reasoning is essential to thinking critically because it is a truth-preserving strategy: if deductive reasoning is applied to true premises, true conclusions will result.

There are a number of types of formal reasoning, but here we review only some of the most common: categorical logic, propositional logic, modal logic, and predicate logic.

a. Categorical Logic

Categorical logic is formal reasoning about categories or collections of subjects, where “subjects” refers to anything that can be regarded as a member of a class, whether objects, properties, or events, or even a single object, property, or event. Categorical logic employs the quantifiers “all,” “some,” and “none” to refer to the members of categories, and categorical propositions are formulated in four ways:

A claims: All As are Bs (where the capitals “A” and “B” represent categories of subjects).

E claims: No As are Bs.

I claims: Some As are Bs.

O claims: Some As are not Bs.

Categorical syllogisms are syllogisms (two-premised formal arguments) that employ categorical propositions. Here are two examples:

  1. All cats are mammals. (A claim)
  2. Some cats are furry. (I claim)
  3. Therefore, some mammals are furry. (I claim)

  1. No bachelors are married. (E claim)
  2. All the people in this building are bachelors. (A claim)
  3. Thus, no people in this building are married. (E claim)

There are interesting limitations on what categorical logic can do. For instance, if one premise says that, “Some As are not Bs,” may we infer that some As are Bs, in what is known as an “existential assumption”? Aristotle seemed to think so (De Interpretatione), but this cannot be decided within the rules of the system. Further, and counterintuitively, it would mean that a proposition such as, “Some bachelors are not married,” is false since it implies that some bachelors are married.

Another limitation on categorical logic is that arguments with more than three categories cannot be easily evaluated for validity. The standard method for evaluating the validity of categorical syllogisms is the Venn diagram (named after John Venn, who introduced it in 1881), which expresses categorical propositions in terms of two overlapping circles and categorical arguments in terms of three overlapping circles, each circle representing a category of subjects.

[Figure: a Venn diagram expressing a categorical claim (two overlapping circles) and a Venn diagram expressing a categorical argument (three overlapping circles)]

A, B, and C represent categories of objects, properties, or events. The symbol “∩” comes from mathematical set theory to indicate “intersects with.” “A∩B” means all those As that are also Bs and vice versa.

Though there are ways of constructing Venn diagrams with more than three categories, determining the validity of these arguments using Venn diagrams is very difficult (and often requires computers). These limitations led to the development of more powerful systems of formal reasoning.
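The Venn-diagram test can itself be automated for the three-category case. The sketch below (our own illustration, not a standard tool) enumerates every way of marking the eight Venn regions occupied or empty and searches for a countermodel in which the premises hold but the conclusion fails.

```python
from itertools import product

# Each Venn region is a triple: (in A?, in B?, in C?) -- eight regions in all.
REGIONS = list(product([False, True], repeat=3))

def All(i, j):   # "All members of category i are members of category j"
    return lambda m: all(r[j] for r in m if r[i])

def Some(i, j):  # "Some members of category i are members of category j"
    return lambda m: any(r[i] and r[j] for r in m)

def valid_syllogism(premise1, premise2, conclusion):
    """Try every way of marking the 8 regions occupied or empty; the
    syllogism is valid iff no model makes both premises true and the
    conclusion false."""
    for occupied in product([False, True], repeat=len(REGIONS)):
        model = [r for r, occ in zip(REGIONS, occupied) if occ]
        if premise1(model) and premise2(model) and not conclusion(model):
            return False  # found a countermodel
    return True

A, B, C = 0, 1, 2
# All cats (A) are mammals (B); some cats (A) are furry (C);
# therefore some mammals (B) are furry (C).
print(valid_syllogism(All(A, B), Some(A, C), Some(B, C)))  # True
# Rearranging the terms yields an invalid form:
print(valid_syllogism(All(A, B), Some(B, C), Some(A, C)))  # False
```

Note that `All(A, B)` comes out vacuously true when no A-region is occupied, which is the formal counterpart of rejecting the existential assumption discussed above.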

b. Propositional Logic

Propositional, or sentential, logic has advantages and disadvantages relative to categorical logic. It is more powerful than categorical logic in that it is not restricted in the number of terms it can evaluate, and therefore, it is not restricted to the syllogistic form. But it is weaker than categorical logic in that it has no operators for quantifying over subjects, such as “all” or “some.” For those, we must appeal to predicate logic (see §3c below).

Basic propositional logic involves formal reasoning about propositions (as opposed to categories), and its most basic unit of evaluation is the atomic proposition. “Atom” means the smallest indivisible unit of something, and simple English statements (subject + predicate) are atomic wholes because if either part is missing, the word or words cease to be a statement and therefore cease to be capable of expressing a proposition. Atomic propositions are simple subject-predicate combinations, for instance, “It is a cat” and “I am a mammal.” Variable letters such as p and q in argument forms are replaced with semantically rich constants, indicated by capital letters, such as A and B. Consider modus ponens again (noting the atomic propositions in the English argument):

Argument Form        English Argument                           Semantic Replacement
1. If p, then q.     1. If it is a cat, then it is a mammal.    1. If C, then M.
2. p.                2. It is a cat.                            2. C.
3. Therefore, q.     3. Therefore, it is a mammal.              3. M.

As you can see from premise 1 of the Semantic Replacement, atomic propositions can be combined into more complex propositions using symbols that represent their logical relationships (such as “If…, then…”). These symbols are called “operators” or “connectives.” The five standard operators in basic propositional logic are:

Operator/Connective    Symbol          Example                       Translation
“not”                  ~ or ¬          It is not the case that p.    ~p
“and”                  & or •          Both p and q.                 p & q
“or”                   v               Either p or q.                p v q
“If…, then…”           → or ⊃          If p, then q.                 p ⊃ q
“if and only if”       ≡ or ↔ or iff   p if and only if q.           p ≡ q

These operations allow us to identify valid relations among propositions: that is, they allow us to formulate a set of rules by which we can validly infer propositions from and validly replace them with others. These rules of inference (such as modus ponens; modus tollens; disjunctive syllogism) and rules of replacement (such as double negation; contraposition; DeMorgan’s Law) comprise the syntax of propositional logic, guaranteeing the validity of the arguments employing them.

Two Rules of Inference:

Conjunction                                    Argument Form    Propositional Translation
1. It is raining.                              1. p             1. R
2. It is windy.                                2. q             2. W
3. Therefore, it is raining and it is windy.   3. /.: (p & q)   3. /.: (R & W)

Disjunctive Syllogism                          Argument Form    Propositional Translation
1. Either it is raining or my car is dirty.    1. (p v q)       1. (R v C)
2. My car is not dirty.                        2. ~q            2. ~C
3. Therefore, it is raining.                   3. /.: p         3. /.: R


Two Rules of Replacement:

Material Implication
English: If it is raining, then the sidewalk is wet if and only if either it is not raining or the sidewalk is wet.
Replacement Form: (p ⊃ q) ≡ (~p v q)
Propositional Translation: (R ⊃ W) ≡ (~R v W)


DeMorgan’s Laws
English: It is not the case that both the job is a good fit for you and you hate it, if and only if either it is not a good fit for you or you do not hate it.
Replacement Form: ~(p & q) ≡ (~p v ~q)
Propositional Translation: ~(F & H) ≡ (~F v ~H)
English: It is not the case that he is either a lawyer or a nice guy, if and only if he is neither a lawyer nor a nice guy.
Replacement Form: ~(p v q) ≡ (~p & ~q)
Propositional Translation: ~(L v N) ≡ (~L & ~N)
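Because rules of replacement are biconditionals that hold under every assignment of truth values, each can be verified by a small truth-table check. This Python sketch (the helper names are ours, purely illustrative) confirms Material Implication and the first DeMorgan law:

```python
from itertools import product

def tautology(formula):
    """True if the two-variable formula holds under every assignment."""
    return all(formula(p, q) for p, q in product([True, False], repeat=2))

implies = lambda a, b: (not a) or b   # the material conditional p ⊃ q
iff     = lambda a, b: a == b         # the biconditional p ≡ q

# Material Implication: (p ⊃ q) ≡ (~p v q)
print(tautology(lambda p, q: iff(implies(p, q), (not p) or q)))        # True
# DeMorgan: ~(p & q) ≡ (~p v ~q)
print(tautology(lambda p, q: iff(not (p and q), (not p) or (not q))))  # True
```

A formula that is not a tautology, such as the bare conditional p ⊃ q, fails the same check, which is what distinguishes genuine rules of replacement from merely contingent equivalences.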

For more, see “Propositional Logic.”

c. Modal Logic

Standard propositional logic does not capture every type of proposition we wish to express (recall that it does not allow us to evaluate categorical quantifiers such as “all” or “some”). It also does not allow us to evaluate propositions expressed as possibly true or necessarily true, modifications that are called modal operators or modal quantifiers.

Modal logic refers to a family of formal propositional systems, the most prominent of which includes operators for necessity (□) and possibility (◊) (see §3d below for examples of other modal systems). If a proposition, p, is possibly true, ◊p, it may or may not be true. If p is necessarily true, □p, it must be true; it cannot be false. If p is necessarily false, either ~◊p or □~p, it must be false; it cannot be true.

There is a variety of modal systems, the weakest of which is called K (after Saul Kripke, who exerted important influence on the development of modal logic), and it involves only two additional rules:

Necessitation Rule:   If A is a theorem of K, then so is □A.

Distribution Axiom:  □(A ⊃ B) ⊃ (□A ⊃ □B).  [If it is necessarily the case that if A, then B, then if it is necessarily the case that A, it is necessarily the case that B.]

Other systems maintain these rules and add others for increasing strength. For instance, the (S4) modal system includes axiom (4):

(4)  □A ⊃ □□A  [If it is necessarily the case that A, then it is necessarily necessary that A.]

An influential and intuitive way of thinking about modal concepts is the idea of “possible worlds” (see Plantinga, 1974; Lewis 1986). A world is just the set of all true propositions. The actual world is the set of all actually true propositions—everything that was true, is true, and (depending on what you believe about the future) will be true. A possible world is a way the actual world might have been. Imagine you wore green underwear today. The actual world might have been different in that way: you might have worn blue underwear. In this interpretation of modal quantifiers, there is a possible world in which you wore blue underwear instead of green underwear. And for every possibility like this, and every combination of those possibilities, there is a distinct possible world.

If a proposition is not possible, then there is no possible world in which that proposition is true. The statement, “That object is red all over and blue all over at the same time” is not true in any possible world. Therefore, it is not possible (~◊P), or, in other words, necessarily false (□~P). If a proposition is true in all possible worlds, it is necessarily true. For instance, the proposition, “Two plus two equals four,” is true in all possible worlds, so it is necessarily true (□P) or not possibly false (~◊~P).
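The possible-worlds reading of □ and ◊ can be illustrated with a toy model in which worlds are simply assignments of truth values to atomic propositions. In this sketch (the proposition names are our own, chosen to echo the underwear example), possibility is truth in at least one world and necessity is truth in every world:

```python
# Worlds modeled as assignments of truth values to atomic propositions.
worlds = [
    {"wore_green": True,  "two_plus_two_is_four": True},  # the actual world, say
    {"wore_green": False, "two_plus_two_is_four": True},  # a way things might have been
]

def possible(prop):   # ◊p: true in at least one possible world
    return any(w[prop] for w in worlds)

def necessary(prop):  # □p: true in every possible world
    return all(w[prop] for w in worlds)

print(possible("wore_green"))             # True:  ◊p
print(necessary("wore_green"))            # False: not □p
print(necessary("two_plus_two_is_four"))  # True:  □p
```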

All modal systems have a number of controversial implications, and there is not space to review them here. Here we need only note that modal logic is a type of formal reasoning that increases the power of propositional logic to capture more of what we attempt to express in natural languages. (For more, see “Modal Logic: A Contemporary View.”)

d. Predicate Logic

Predicate logic, in particular, first-order predicate logic, is even more powerful than propositional logic. Whereas propositional logic treats propositions as atomic wholes, predicate logic allows reasoners to identify and refer to subjects of propositions, independently of their predicates. For instance, whereas the proposition, “Susan is witty,” would be replaced with a single upper-case letter, say “S,” in propositional logic, predicate logic would assign the subject “Susan” a lower-case letter, s, and the predicate “is witty” an upper-case letter, W, and the translation (or formula) would be: Ws.

In addition to distinguishing subjects and predicates, first-order predicate logic allows reasoners to quantify over subjects. The quantifiers in predicate logic are “All…,” which is comparable to the “All” quantifier in categorical logic and is sometimes symbolized with an upside-down A: ∀ (though it may not be symbolized at all), and “There is at least one…,” which is comparable to the “Some” quantifier in categorical logic and is symbolized with a backward E: ∃. E and O claims are formed by employing the negation operator from propositional logic. In this formal system, the proposition, “Someone is witty,” for example, has the form: There is an x, such that x has the property of being witty, which is symbolized: (∃x)(Wx). Similarly, the proposition, “Everyone is witty,” has the form: For all x, x has the property of being witty, which is symbolized (∀x)(Wx) or, without the ∀: (x)(Wx).
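Over a finite domain, the predicate-logic quantifiers behave exactly like Python’s built-in `all` and `any`. The following sketch (the domain and predicate are invented for illustration) translates (∃x)(Wx), (∀x)(Wx), and an O claim formed with negation:

```python
# A finite domain of subjects and one predicate, W ("is witty").
# Domain and facts are invented for illustration.
domain = ["susan", "tom", "ada"]
witty = {"susan": True, "tom": False, "ada": True}

def W(x):
    return witty[x]

exists_witty = any(W(x) for x in domain)  # (∃x)(Wx): someone is witty
all_witty = all(W(x) for x in domain)     # (∀x)(Wx): everyone is witty
not_all = any(not W(x) for x in domain)   # an O claim via negation: some x is not witty

print(exists_witty, all_witty, not_all)   # True False True
```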

Predicate derivations are conducted according to the same rules of inference and replacement as propositional logic with the exception of four rules to accommodate adding and eliminating quantifiers.

Second-order predicate logic extends first-order predicate logic to allow critical thinkers to quantify over and draw inferences about subjects and predicates, including relations among subjects and predicates. In both first- and second-order logic, predicates typically take the form of properties (one-place predicates) or relations (two-place predicates), though there is no upper limit on place numbers. Second-order logic allows us to treat predicates themselves as falling under quantifiers, as when we reason about everything that has the property of being a tea cup, or assert that everything that is a bachelor is unmarried.

e. Other Formal Systems

It is worth noting here that the formal reasoning systems we have seen thus far (categorical, propositional, and predicate) all presuppose that truth is bivalent, that is, two-valued. The two values critical thinkers are most often concerned with are true and false, but any bivalent system is subject to the rules of inference and replacement of propositional logic. The most common alternative to truth values is the binary code of 1s and 0s used in computer programming. All logics that presuppose bivalence are called classical logics. In the next section, we see that not all formal systems are bivalent; there are non-classical logics. The existence of non-classical systems raises interesting philosophical questions about the nature of truth and the legitimacy of our basic rules of reasoning, but these questions are too far afield for this context. Many philosophers regard bivalent systems as legitimate for all but the most abstract and purely formal contexts. Included below is a brief description of three of the most common non-classical logics.

Tense logic, or temporal logic, is a formal modal system developed by Arthur Prior (1957, 1967, 1968) to accommodate propositional language about time. For example, in addition to standard propositional operators, tense logic includes four operators for indexing times: P “It has at some time been the case that…”; F “It will at some time be the case that…”; H “It has always been the case that…”; and G “It will always be the case that….”

Many-valued logic, or n-valued logic, is a family of formal logical systems that attempts to accommodate intuitions that suggest some propositions have values in addition to true and false. These are often motivated by intuitions that some propositions have neither of the classic truth values; their truth value is indeterminate (not just undeterminable, but neither true nor false), for example, propositions about the future such as, “There will be a sea battle tomorrow.” If the future does not yet exist, there is no fact about the future, and therefore, nothing for a proposition to express.

Fuzzy logic is a type of many-valued logic developed out of Lotfi Zadeh’s (1965) work on mathematical sets. Fuzzy logic attempts to accommodate intuitions that suggest some propositions have truth value in degrees, that is, some degree of truth between true and false. It is motivated by concerns about vagueness in reality, for example whether a certain color is red or some degree of red, or whether some temperature is hot or some degree of hotness.
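Zadeh’s original fuzzy operators interpret “and” as the minimum of two degrees of truth, “or” as the maximum, and “not” as the complement with respect to 1. A minimal sketch, with the degree values invented for illustration:

```python
# Zadeh-style fuzzy connectives over degrees of truth in [0, 1].
def f_and(a, b):
    return min(a, b)

def f_or(a, b):
    return max(a, b)

def f_not(a):
    return 1 - a

is_red = 0.7  # "this color is red" holds to degree 0.7
is_hot = 0.4  # "this temperature is hot" holds to degree 0.4

print(f_and(is_red, is_hot))    # 0.4
print(f_or(is_red, is_hot))     # 0.7
print(round(f_not(is_red), 2))  # 0.3
```

When every degree is exactly 0 or 1, these operators collapse into the classical connectives, which is why fuzzy logic counts as a generalization of bivalent propositional logic rather than a rival to it.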

Formal reasoning plays an important role in critical thinking, but we have occasion to apply it only rarely: there are significant limits to how we might use formal tools in our daily lives. If that is true, how do critical thinkers reason well when formal reasoning cannot help? That brings us to informal reasoning.

4. Informal Reasoning

Informal reasoning is inductive, which means that a proposition is inferred (but not derived) from one or more propositions on the basis of the strength provided by the premises (where “strength” means some degree of likelihood less than certainty or some degree of probability less than 1 but greater than 0; a proposition with 0% probability is necessarily false).

Premises grant strength to conclusions to the degree that they reflect certain relationships or structures in the world. For instance, if a particular type of event, p, is known to cause or indicate another type of event, q, then upon encountering an event of type p, we may infer that an event of type q is likely to occur. We may express this relationship among events propositionally as follows:

    1. Events of type p typically cause or indicate events of type q.
    2. An event of type p occurred.
    3. Therefore, an event of type q probably occurred.

If the structure of the world (for instance, natural laws) makes premise 1 true, then, if premise 2 is true, we can reasonably (though not certainly) infer the conclusion.

Unlike formal reasoning, the adequacy of informal reasoning depends on how well the premises reflect relationships or structures in the world. And since we have not experienced every relationship among objects or events or every structure, we cannot infer with certainty that a particular conclusion follows from a true set of premises about these relationships or structures. We can only infer them to some degree of likelihood by determining to the best of our ability either their objective probability or their probability relative to alternative conclusions.

The objective probability of a conclusion refers to how likely, given the way the world is regardless of whether we know it, that conclusion is to be true. The epistemic probability of a conclusion refers to how likely that conclusion is to be true given what we know about the world, or more precisely, given our evidence for its objective likelihood.

Objective probabilities are determined by facts about the world and they are not truths of logic, so we often need evidence for objective probabilities. For instance, imagine you are about to draw a card from a standard playing deck of 52 cards. Given particular assumptions about the world (that this deck contains 52 cards and that one of them is the Ace of Spades), the objective likelihood that you will draw an Ace of Spades is 1/52. These assumptions allow us to calculate the objective probability of drawing an Ace of Spades regardless of whether we have ever drawn a card before. But these are assumptions about the world that are not guaranteed by logic: we have to actually count the cards, to be sure we count accurately and are not dreaming or hallucinating, and that our memory (once we have finished counting) reliably maintains our conclusions. None of these processes logically guarantees true beliefs. So, if our assumptions are correct, we know the objective probability of actually drawing an Ace of Spades in the real world. But since there is no logical guarantee that our assumptions are right, we are left only with the epistemic probability (the probability based on our evidence) of drawing that card. If our assumptions are right, then the objective probability is the same as our epistemic probability: 1/52. But even if we are right, objective and epistemic probabilities can come apart under some circumstances.

Imagine you draw a card without looking at it and lay it face down. What is the objective probability that that card is an Ace of Spades? The structure of the world has now settled the question, though you do not know the outcome. If it is an Ace of Spades, the objective probability is 1 (100%); it is the Ace of Spades. If it is not the Ace of Spades, the objective probability is 0 (0%); it is not the Ace of Spades. But what is the epistemic probability? Since you do not know any more about the world than you did before you drew the card, the epistemic probability is the same as before you drew it: 1/52.

Since much of the way the world is is hidden from us (like the card laid face down), and since it is not obvious that we perceive reality as it actually is (we do not know whether the actual coins we flip are evenly weighted or whether the actual dice we roll are unbiased), our conclusions about probabilities in the actual world are inevitably epistemic probabilities. We can certainly calculate objective probabilities about abstract objects (for instance, hypothetically fair coins and dice—and these calculations can be evaluated formally using probability theory and statistics), but as soon as we apply these calculations to the real world, we must accommodate the fact that our evidence is incomplete.
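The card example can be made concrete with exact arithmetic. The point of this sketch is that laying a card face down settles the objective probability (it is now 1 or 0) without changing our evidence, so the epistemic probability stays fixed:

```python
from fractions import Fraction

# Evidence: a standard deck of 52 cards, exactly one of which is the Ace of Spades.
deck_size = 52
favorable = 1
p_ace_of_spades = Fraction(favorable, deck_size)
print(p_ace_of_spades)  # 1/52

# Drawing a card and laying it face down settles the objective probability,
# but our evidence has not changed, so the epistemic probability is exactly
# what it was before the draw:
p_after_face_down_draw = p_ace_of_spades
print(p_after_face_down_draw)  # 1/52
```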

There are four well-established categories of informal reasoning: generalization, analogy, causal reasoning, and abduction.

a. Generalization

Generalization is a way of reasoning informally from instances of a type to a conclusion about the type. This commonly takes two forms: reasoning from a sample of a population to the whole population, and reasoning from past instances of an object or event to future instances of that object or event. The latter is sometimes called “enumerative induction” because it involves enumerating past instances of a type in order to draw an inference about a future instance. But this distinction is weak; both forms of generalization use past or current data to infer statements about future instances and whole current populations.

A popular instance of inductive generalization is the opinion poll: a sample of a population of people is polled with respect to some statement or belief. For instance, if we poll 57 sophomores enrolled at a particular college about their experiences of living in dorms, these 57 comprise our sample of the population of sophomores at that particular college. We want to be careful how we define our population given who is part of our sample. Not all college students are like sophomores, so it is not prudent to draw inferences about all college students from these sophomores. Similarly, sophomores at other colleges are not necessarily like sophomores at this college (it could be the difference between a liberal arts college and a research university), so it is prudent not to draw inferences about all sophomores from this sample at a particular college.

Let us say that 90% of the 57 sophomores we polled hate the showers in their dorms. From this information, we might generalize in the following way:

  1. We polled 57 sophomores at Plato’s Academy. (the sample)
  2. 90% of our sample hates the showers in their dorms. (the polling data)
  3. Therefore, probably 90% of all sophomores at Plato’s Academy hate the showers in their dorms. (a generalization from our sample to the whole population of sophomores at Plato’s Academy)

Is this good evidence that 90% of all sophomores at that college hate the showers in their dorms?

A generalization is typically regarded as a good argument if its sample is representative of its population. A sample is representative if it is similar in the relevant respects to its population. A perfectly representative sample would include the whole population: the sample would be identical with the population, and thus, perfectly representative. In that case, no generalization is necessary. But we rarely have the time or resources to evaluate whole populations. And so, a sample is generally regarded as representative if it is large relative to its population and unbiased.

In our example, whether our inference is good depends, in part, on how many sophomores there are. Are there 100? 2,000? If there are only 100, then our sample size seems adequate—we have polled over half the population. Is our sample unbiased? That depends on the composition of the sample. Is it comprised only of women or only of men? If this college is not co-ed, that is not a problem. But if the college is co-ed and we have sampled only women, our sample is biased against men. We have information only about female sophomores’ dorm experiences, and therefore, we cannot generalize about male sophomores’ dorm experiences.

How large is large enough? This is a difficult question to answer. A poll of 1% of your high school does not seem large enough to be representative. You should probably gather more data. Yet a poll of 1% of your whole country is practically impossible (you are not likely to ever have enough grant money to conduct that poll). But could a poll of less than 1% be acceptable? This question is not easily answered, even by experts in the field. The simple answer is: the more, the better. The more complicated answer is: it depends on how many other factors you can control for, such as bias and hidden variables (see §4c for more on experimental controls).
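One standard statistical gloss on “how large is large enough,” not developed in the text above but widely used by pollsters, is the margin of error for a sample proportion under the normal approximation; it shrinks only with the square root of the sample size. A sketch (the function name is ours):

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p_hat
    drawn from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The 57-sophomore poll, with 90% hating the dorm showers:
print(round(margin_of_error(0.90, 57), 3))   # roughly 0.078
# Quadrupling the sample size only halves the margin:
print(round(margin_of_error(0.90, 228), 3))  # roughly 0.039
```

A margin of roughly eight percentage points on the 57-person poll illustrates why “the more, the better” is the simple answer: precision improves with sample size, but with diminishing returns.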

Similarly, we might ask what counts as an unbiased sample. An overly simple answer is: the sample is taken randomly, that is, by using a procedure that prevents consciously or unconsciously favoring one segment of the population over another (flipping a coin, drawing lottery balls). But reality is not simple. In political polls, it is important not to use a selection procedure that results in a sample with a larger number of members of one political party than another relative to their distribution in the population, even if the resulting sample is random. For example, the two most prominent parties in the U.S. are the Democratic Party and the Republican Party. If 47% of the U.S. is Republican and 53% is Democrat, an unbiased sample would have approximately 47% Republicans and 53% Democrats. But notice that simply choosing at random may not guarantee that result; it could easily occur, just by choosing randomly, that our sample has 70% Democrats and 30% Republicans (suppose our computer chose, albeit randomly, from a highly Democratic neighborhood). Therefore, we want to control for representativeness in some criteria, such as gender, age, and education. And we explicitly want to avoid controlling for the results we are interested in; if we controlled for particular answers to the questions on our poll, we would not learn anything—we would get all and only the answers we controlled for.

Difficulties determining representativeness suggest that reliable generalizations are not easy to construct. If we generalize on the basis of samples that are too small or if we cannot control for bias, we commit the informal fallacy of hasty generalization (see §5b). In order to generalize well, it seems we need a bit of machinery to guarantee representativeness. In fact, it seems we need an experiment, one of the primary tools in causal reasoning (see §4c below).

b. Analogy

Argument from Analogy, also called analogical reasoning, is a way of reasoning informally about events or objects based on their similarities. A classic instance of reasoning by analogy occurs in archaeology, when researchers attempt to determine whether a stone object is an artifact (a human-made item) or simply a rock. By comparing the features of an unknown stone with well-known artifacts, archaeologists can infer whether a particular stone is an artifact. Other examples include identifying animals’ tracks by their similarities with pictures in a guidebook and consumer reports on the reliability of products.

To see how arguments from analogy work in detail, imagine two people who, independently of one another, want to buy a new pickup truck. Each chooses a make and model he or she likes, and let us say they decide on the same truck. They then visit a number of consumer reporting websites to read reports on trucks matching the features of the make and model they chose, for instance, the year it was built, the size of the engine (6 cyl. or 8 cyl.), the type of transmission (2WD or 4WD), the fuel mileage, and the cab size (standard, extended, crew). Now, let us say one of our prospective buyers is interested in safety—he or she wants a tough, safe vehicle that will protect against injuries in case of a crash. The other potential buyer is interested in mechanical reliability—he or she does not want to spend a lot of time and money fixing mechanical problems.

With this in mind, here is how our two buyers might reason analogically about whether to purchase the truck (with some fake report data included):

Buyer 1

  1. The truck I have in mind was built in 2012, has a 6-cylinder engine, a 2WD transmission, and a king cab.
  2. 62 people who bought trucks like this one posted consumer reports and have driven it for more than a year.
  3. 88% of those 62 people report that the truck feels very safe.
  4. Therefore, the truck I am looking at will likely be very safe.

Buyer 2

  1. The truck I have in mind was built in 2012, has a 6-cylinder engine, a 2WD transmission, and a king cab.
  2. 62 people who bought trucks like this one posted consumer reports and have driven it for more than a year.
  3. 88% of those 62 people report that the truck has had no mechanical problems.
  4. Therefore, the truck I am looking at will likely have no mechanical problems.

Are the features of these analogous vehicles (the ones reported on) sufficiently numerous and relevant for helping our prospective truck buyers decide whether to purchase the truck in question (the one on the lot)? Since we have some idea that the type of engine and transmission in a vehicle contribute to its mechanical reliability, Buyer 2 may have some relevant features on which to draw a reliable analogy. Fuel mileage and cab size are not obviously relevant, but engine specifications seem to be. Are these specifications numerous enough? That depends on whether anything else that we are not aware of contributes to overall reliability. If trucks with the features we know about also happen to share all the other relevant features we do not know about (if there are any), then Buyer 2 may still be able to draw a reliable inference from analogy. But we do not currently know this.

Alternatively, Buyer 1 seems to have very few relevant features on which to draw a reliable analogy. The features listed are not obviously related to safety. Are there safety options a buyer may choose but that are not included in the list? For example, can a buyer choose side-curtain airbags, or do such airbags come standard in this model? Does cab size contribute to overall safety? Although there are a number of similarities between the trucks, it is not obvious that we have identified features relevant to safety or whether there are enough of them. Further, reports of “feeling safe” are not equivalent to a truck actually being safe. Better evidence would be crash test data or data from actual accidents involving this truck. This information is not likely to be on a consumer reports website.

A further difficulty is that, in many cases, it is difficult to know whether many similarities are even necessary when the similarities we have identified are relevant. For instance, if having lots of room for passengers is your primary concern, then any other features are relevant only insofar as they affect cab size. The features that affect cab size may be relatively few.

This example shows that arguments from analogy are difficult to formulate well. Arguments from analogy can be good arguments when critical thinkers identify a sufficient number of features of known objects that are also relevant to the feature inferred to be shared by the object in question. If a rock is shaped like a cutting tool, has marks consistent with shaping and sharpening, and has wear marks consistent with being held in a human hand, it is likely that rock is an artifact. But not all cases are as clear.

It is often difficult to determine whether the features we have identified are sufficiently numerous or relevant to our interests. To determine whether an argument from analogy is good, a person may need to identify a causal relationship between those features and the one in which she is interested (as in the case with a vehicle’s mechanical reliability). This usually takes the form of an experiment, which we explore below (§4c).

Difficulties with constructing reliable generalizations and analogies have led critical thinkers to develop sophisticated methods for controlling for the ways these arguments can go wrong. The most common way to avoid the pitfalls of these arguments is to identify the causal structures in the world that account for or underwrite successful generalizations and analogies. Causal arguments are the primary method of controlling for extraneous causal influences and identifying relevant causes. Their development and complexity warrant regarding them as a distinct form of informal reasoning.

c. Causal Reasoning

Causal arguments attempt to draw causal conclusions (that is, statements that express propositions about causes: x causes y) from premises about relationships among events or objects. Though it is not always possible to construct a causal argument, when available, they have an advantage over other types of inductive arguments in that they can employ mechanisms (experiments) that reduce the risks involved in generalizations and analogies.

The interest in identifying causal relationships often begins with the desire to explain correlations among events (as pollen levels increase, so do allergy symptoms) or with the desire to replicate an event (building muscle, starting a fire) or to eliminate an event (polio, head trauma in football).

Correlations among events may be positive (where each event increases at roughly the same rate) or negative (where one event decreases in proportion to another’s increase). Correlations suggest a causal relationship among the events correlated.

[Figure: graphs of positive and negative correlations]

But we must be careful; correlations are merely suggestive—other forces may be at work. Let us say the y-axis in the charts above represents the number of millionaires in the U.S. and the x-axis represents the amount of money U.S. citizens pay for healthcare each year. Without further analysis, a positive correlation between these two may lead someone to conclude that increasing wealth causes people to be more health conscious and to seek medical treatment more often. A negative correlation may lead someone to conclude that wealth makes people healthier and, therefore, that they need to seek medical care less frequently.

Unfortunately, correlations can occur without any causal structures (mere coincidence) or because of a third, as-yet-unidentified event (a cause common to both events, or “common cause”), or the causal relationship may flow in an unexpected direction (what seems like the cause is really the effect). In order to determine precisely which event (if any) is responsible for the correlation, reasoners must eliminate possible influences on the correlation by “controlling” for possible influences on the relationship (variables).
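The "common cause" possibility can be made vivid with a small simulation. The sketch below (an illustration, not from the article) generates two variables that have no direct causal link at all; each merely tracks a hidden third variable z. A strong correlation appears anyway:

```python
import random

random.seed(0)

# Hypothetical illustration: a hidden common cause z drives both x and y.
# x and y never influence each other, yet they come out correlated.
n = 1000
z = [random.gauss(0, 1) for _ in range(n)]    # hidden common cause
x = [zi + random.gauss(0, 0.5) for zi in z]   # x depends only on z (plus noise)
y = [zi + random.gauss(0, 0.5) for zi in z]   # y depends only on z (plus noise)

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = sum((ai - ma) ** 2 for ai in a) ** 0.5
    sb = sum((bi - mb) ** 2 for bi in b) ** 0.5
    return cov / (sa * sb)

r = pearson(x, y)
print(round(r, 2))  # strongly positive, despite no direct x-y causal link
```

Observing only x and y, a reasoner would see exactly the kind of correlation that suggests causation; only by controlling for z would the spurious relationship be exposed.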

Critical thinking about causes begins by constructing hypotheses about the origins of particular events. A hypothesis is an explanation or event that would account for the event in question. For example, if the question is how to account for increased acne during adolescence, and we are not aware of the existence of hormones, we might formulate a number of hypotheses about why this happens: during adolescence, people’s diets change (parents no longer dictate their meals), so perhaps some types of food cause acne; during adolescence, people become increasingly anxious about how they appear to others, so perhaps anxiety or stress causes acne; and so on.

After we have formulated a hypothesis, we identify a test implication that will help us determine whether our hypothesis is correct. For instance, if some types of food cause acne, we might choose a particular food, say, chocolate, and say: if chocolate causes acne (hypothesis), then decreasing chocolate will decrease acne (test implication). We then conduct an experiment to see whether our test implication occurs.

Reasoning about our experiment would then look like one of the following arguments:

Confirming Experiment
1. If H, then TI.
2. TI.
3. Therefore, probably H.

Disconfirming Experiment
1. If H, then TI.
2. Not-TI.
3. Therefore, probably Not-H.

There are a couple of important things to note about these arguments. First, despite appearances, both are inductive arguments. The one on the left commits the formal fallacy of affirming the consequent, so, at best, the premises confer only some degree of probability on the conclusion. The argument on the right looks to be deductive (on the face of it, it has the valid form modus tollens), but it would be inappropriate to regard it deductively. This is because we are not evaluating a logical connection between H and TI but a causal one: TI might be true or false regardless of H (we might have chosen an inappropriate test implication or simply gotten lucky), and therefore, we cannot conclude with certainty that H does not causally influence TI. Therefore, "If…, then…" statements in experiments must be read as causal conditionals and not material conditionals (the sense in which conditionals were used above).

Second, experiments can go wrong in many ways, so no single experiment will grant a high degree of probability to its causal conclusion. Experiments may be biased by hidden variables (causes we did not consider or detect, such as age, diet, medical history, or lifestyle), auxiliary assumptions (the theoretical assumptions by which evaluating the results may be faulty), or underdetermination (there may be a number of hypotheses consistent with those results; for example, if it is actually sugar that causes acne, then chocolate bars, ice cream, candy, and sodas would yield the same test results). Because of this, experiments either confirm or disconfirm a hypothesis; that is, they give us some reason (but not a particularly strong reason) to believe our hypothesized causes are or are not the causes of our test implications, and therefore, of our observations (see Quine and Ullian, 1978). Because of this, experiments must be conducted many times, and only after we have a number of confirming or disconfirming results can we draw a strong inductive conclusion. (For more, see “Confirmation and Induction.”)
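One simple way to model why repetition matters (my illustration, not the article's) is to treat each experiment as only somewhat more likely to yield the test implication when H is true than when it is false, and to update the probability of H by Bayes' rule after each result. The likelihoods below (0.8 and 0.4) are made-up numbers chosen only to show the pattern:

```python
# Sketch: a single confirming experiment gives H only a modest boost,
# but many confirming experiments together make H highly probable.

def update(p_h, confirmed, p_ti_given_h=0.8, p_ti_given_not_h=0.4):
    """Return P(H | result) from prior P(H) and one experimental result."""
    if confirmed:
        num = p_ti_given_h * p_h
        den = p_ti_given_h * p_h + p_ti_given_not_h * (1 - p_h)
    else:
        num = (1 - p_ti_given_h) * p_h
        den = (1 - p_ti_given_h) * p_h + (1 - p_ti_given_not_h) * (1 - p_h)
    return num / den

p = 0.5                       # start undecided about H
after_one = update(p, True)   # one confirming result: P(H) rises to ~0.667
for _ in range(10):           # ten confirming results in a row
    p = update(p, True)

print(round(after_one, 3))    # 0.667
print(round(p, 3))            # 0.999
```

A single run leaves H quite uncertain, which matches the point above: only an accumulation of confirming (or disconfirming) results licenses a strong inductive conclusion.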

Experiments may be formal or informal. In formal experiments, critical thinkers exert explicit control over experimental conditions: experimenters choose participants, include or exclude certain variables, and identify or introduce hypothesized events. Test subjects are selected according to control criteria (criteria that may affect the results and, therefore, that we want to mitigate, such as age, diet, and lifestyle) and divided into control groups (groups where the hypothesized cause is absent) and experimental groups (groups where the hypothesized cause is present, either because it is introduced or selected for).

Subjects are then placed in experimental conditions. For instance, in a randomized study, the control group receives a placebo (an inert medium) whereas the experimental group receives the hypothesized cause—the putative cause is introduced, the groups are observed, and the results are recorded and compared. When a hypothesized cause is dangerous (such as smoking) or its effects potentially irreversible (for instance, post-traumatic stress disorder), the experimental design must be restricted to selecting for the hypothesized cause already present in subjects, for example, in retrospective (backward-looking) and prospective (forward-looking) studies. In all types of formal experiments, subjects are observed under exposure to the test or placebo conditions for a specified time, and results are recorded and compared.

In informal experiments, critical thinkers do not have access to sophisticated equipment or facilities and, therefore, cannot exert explicit control over experimental conditions. They are left to make considered judgments about variables. The most common informal experiments are John Stuart Mill’s five methods of inductive reasoning, called Mill’s Methods, which he first formulated in A System of Logic (1843). Here is a very brief summary of Mill’s five methods:

(1) The Method of Agreement

If all conditions containing the event y also contain x, x is probably the cause of y.

For example:

“I’ve eaten from the same box of cereal every day this week, but all the times I got sick after eating cereal were times when I added strawberries. Therefore, the strawberries must be bad.”

(2) The Method of Difference

If all conditions lacking y also lack x, x is probably the cause of y.

For example:

“The organization turned all its tax forms in on time for years, that is, until our comptroller, George, left; after that, we were always late. Only after George left were we late. Therefore, George was probably responsible for getting our tax forms in on time.”

(3) The Joint Method of Agreement and Difference

If all conditions containing event y also contain event x, and all events lacking y also lack x, x is probably the cause of y.

For example:

“The conditions at the animal shelter have been pretty regular, except we had a string of about four months last year when the dogs barked all night, every night. But at the beginning of those four months we sheltered a redbone coonhound, and the barking stopped right after a family adopted her. All the times the redbone hound wasn’t present, there was no barking. Only the time she was present was there barking. Therefore, she probably incited all the other dogs to bark.”

(4) The Method of Concomitant Variation

If the frequency of event y increases and decreases as event x increases and decreases, respectively, x is probably the cause of y.

For example:

“We can predict the amount of alcohol sales by the rate of unemployment. As unemployment rises, so do alcohol sales. As unemployment drops, so do alcohol sales. Last quarter marked the highest unemployment in three years, and our sales last quarter are the highest they had been in those three years. Therefore, unemployment probably causes people to buy alcohol.”

(5) The Method of Residues

If a number of factors x, y, and z may be responsible for a set of events A, B, and C, and if we discover reasons for thinking that x is the cause of A and y is the cause of B, then we have reason to believe z is the cause of C.

For example:

“The people who come through this medical facility are usually starving and have malaria, and a few have polio. We are particularly interested in treating the polio. Take this patient here: she is emaciated, which is caused by starvation; and she has a fever, which is caused by malaria. But notice that her muscles are deteriorating, and her bones are sore. This suggests she also has polio.”
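Mill's methods lend themselves to mechanical treatment. As a rough sketch (the data and function below are illustrative, loosely echoing the animal-shelter example), the Joint Method of Agreement and Difference can be run over a table of observations, where each record lists the factors present and whether the effect occurred:

```python
# Joint Method of Agreement and Difference: keep the factors present in
# every case where the effect occurred (agreement) and absent from every
# case where it did not (difference).

def joint_method(observations):
    """observations: list of (set_of_factors_present, effect_occurred)."""
    with_effect = [facts for facts, effect in observations if effect]
    without_effect = [facts for facts, effect in observations if not effect]
    all_factors = set().union(*(facts for facts, _ in observations))
    return {
        f for f in all_factors
        if all(f in facts for facts in with_effect)          # agreement
        and all(f not in facts for facts in without_effect)  # difference
    }

# Illustrative shelter log: was the hound present, and did the dogs bark?
shelter_log = [
    ({"redbone hound", "full moon"}, True),   # barking
    ({"redbone hound"}, True),                # barking
    ({"full moon"}, False),                   # quiet
    (set(), False),                           # quiet
]

print(joint_method(shelter_log))  # {'redbone hound'}
```

The "full moon" factor is ruled out because it fails the agreement test (barking occurred once without it), leaving the hound as the probable cause, just as in the informal reasoning above.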

d. Abduction

Not all inductive reasoning is inferential. In some cases, an explanation is needed before we can even begin drawing inferences. Consider Darwin’s idea of natural selection. Natural selection is not an object, like a blood vessel or a cellular wall, and it is not, strictly speaking, a single event. It cannot be detected in individual organisms or observed in a generation of offspring. Natural selection is an explanation of biodiversity that combines the process of heritable variation and environmental pressures to account for biomorphic change over long periods of time. With this explanation in hand, we can begin to draw some inferences. For instance, we can separate members of a single species of fruit flies, allow them to reproduce for several generations, and then observe whether the offspring of the two groups can reproduce. If we discover they cannot reproduce, this is likely due to certain mutations in their body types that prevent them from procreating. And since this is something we would expect if natural selection were true, we have one piece of confirming evidence for natural selection. But how do we know the explanations we come up with are worth our time?

Coined by C. S. Peirce (1839-1914), abduction, also called retroduction, or inference to the best explanation, refers to a way of reasoning informally that provides guidelines for evaluating explanations. Rather than appealing to types of arguments (generalization, analogy, causation), the value of an explanation depends on the theoretical virtues it exemplifies. A theoretical virtue is a quality that renders an explanation more or less fitting as an account of some event. What constitutes fittingness (or “loveliness,” as Peter Lipton (2004) calls it) is controversial, but many of the virtues are intuitively compelling, and abduction is a widely accepted tool of critical thinking.

The most widely recognized theoretical virtue is probably simplicity, historically associated with William of Ockham (1288-1347) and known as Ockham’s Razor. A legend has it that Ockham was asked whether his arguments for God’s existence prove that only one God exists or whether they allow for the possibility that many gods exist. He supposedly responded, “Do not multiply entities beyond necessity.” Though this claim is not found in his writings, Ockham is now famous for advocating that we restrict our beliefs about what is true to only what is absolutely necessary for explaining what we observe.

In contemporary theoretical use, the virtue of simplicity is invoked to encourage caution in how many mechanisms we introduce to explain an event. For example, if natural selection can explain the origin of biological diversity by itself, there is no need to hypothesize both natural selection and a divine designer. But if natural selection cannot explain the origin of, say, the duck-billed platypus, then some other mechanism must be introduced. Of course, not just any mechanism will do. It would not suffice to say the duck-billed platypus is explained by natural selection plus gremlins. Just why this is the case depends on other theoretical virtues; ideally, the virtues work together to help critical thinkers decide among competing hypotheses to test. Here is a brief sketch of some other theoretical virtues or ideals:

Conservatism – a good explanation does not contradict well-established views in a field.

Independent Testability – a good explanation is successful on different occasions under similar circumstances.

Fecundity – a good explanation leads to results that make even more research possible.

Explanatory Depth – a good explanation provides details of how an event occurs.

Explanatory Breadth – a good explanation also explains other, similar events.

Though abduction is structurally distinct from other inductive arguments, it functions similarly in practice: a good explanation provides a probabilistic reason to believe a proposition. This is why it is included here as a species of inductive reasoning. It might be thought that explanations only function to help critical thinkers formulate hypotheses, and do not, strictly speaking, support propositions. But there are intuitive examples of explanations that support propositions independently of however else they may be used. For example, a critical thinker may argue that the hypothesis that material objects exist outside our minds better explains why we perceive what we do (and is, therefore, a reason to believe it) than the hypothesis that an evil demon is deceiving us, even if no inductive or deductive argument suffices for believing that the latter is false. (For more, see "Charles Sanders Peirce: Logic.")

5. Detecting Poor Reasoning

Our attempts at thinking critically often go wrong, whether we are formulating our own arguments or evaluating the arguments of others. Sometimes it is in our interests for our reasoning to go wrong, such as when we would prefer someone to agree with us than to discover the truth value of a proposition. Other times it is not in our interests; we are genuinely interested in the truth, but we have unwittingly made a mistake in inferring one proposition from others. Whether our errors in reasoning are intentional or unintentional, such errors are called fallacies (from the Latin, fallax, which means “deceptive”). Recognizing and avoiding fallacies helps prevent critical thinkers from forming or maintaining defective beliefs.

Fallacies occur in a number of ways. An argument’s form may seem to us valid when it is not, resulting in a formal fallacy. Alternatively, an argument’s premises may seem to support its conclusion strongly but, due to some subtlety of meaning, do not, resulting in an informal fallacy. Additionally, some of our errors may be due to unconscious reasoning processes that may have been helpful in our evolutionary history, but do not function reliably in higher order reasoning. These unconscious reasoning processes are now widely known as heuristics and biases. Each type is briefly explained below.

a. Formal Fallacies

Formal fallacies occur when the form of an argument is presumed or seems to be valid (whether intentionally or unintentionally) when it is not. Formal fallacies are usually invalid variations of valid argument forms. Consider, for example, the valid argument form modus ponens (this is one of the rules of inference mentioned in §3b):

modus ponens (valid argument form)

1. p → q 1. If it is a cat, then it is a mammal.
2. p 2. It is a cat.
3. /.: q 3. Therefore, it is a mammal.

In modus ponens, we assume or "affirm" both the conditional and the left half of the conditional (called the antecedent): (p → q) and p. From these, we can infer that q, the second half or consequent, is true. This is a valid argument form: if the premises are true, the conclusion cannot be false.

Sometimes, however, we invert the conclusion and the second premise, affirming that the conditional, (p → q), and the right half of the conditional, q (the consequent), are true, and then inferring that the left half, p (the antecedent), is true. Note in the example below how the conclusion and second premise are switched. Switching them in this way creates a problem.

modus ponens (valid argument form)
1. p → q
2. p
3. /.: q

affirming the consequent (formal fallacy)
1. p → q
2. q (q, the consequent of the conditional in premise 1, has been "affirmed")
3. /.: p (?)

To get an intuitive sense of why “affirming the consequent” is a problem, consider this simple example:

affirming the consequent

    1. If it is a cat, then it is a mammal.
    2. It is a mammal.
    3. Therefore, it is a cat.(?)

From the fact that something is a mammal, we cannot conclude that it is a cat. It may be a dog or a mouse or a whale. The premises can be true and yet the conclusion can still be false. Therefore, this is not a valid argument form. But since it is an easy mistake to make, it is included in the set of common formal fallacies.

Here is a second example with the rule of inference called modus tollens. Modus tollens involves affirming a conditional, (p → q), and denying that conditional's consequent: ~q. From these two premises, we can validly infer the denial of the antecedent: ~p. But if we switch the conclusion and the second premise, we get another fallacy, called denying the antecedent.

modus tollens (valid argument form)
1. p → q
2. ~q
3. /.: ~p

1. If it is a cat, then it is a mammal.
2. It is not a mammal.
3. Therefore, it is not a cat.

denying the antecedent (formal fallacy)
1. p → q
2. ~p (p, the antecedent of the conditional in premise 1, has been "denied")
3. /.: ~q (?)

1. If it is a cat, then it is a mammal.
2. It is not a cat.
3. Therefore, it is not a mammal. (?)
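Because these forms involve only two propositional variables, their validity can be checked mechanically by brute force: an argument form is valid just in case no assignment of truth values makes every premise true and the conclusion false. The following sketch (my illustration) checks all four forms discussed above:

```python
from itertools import product

def valid(premises, conclusion):
    """True iff no truth assignment makes all premises true and the conclusion false."""
    return not any(
        all(prem(p, q) for prem in premises) and not conclusion(p, q)
        for p, q in product([True, False], repeat=2)
    )

def implies(p, q):
    return (not p) or q  # the material conditional p -> q

# Valid rules of inference:
modus_ponens  = valid([implies, lambda p, q: p],      lambda p, q: q)
modus_tollens = valid([implies, lambda p, q: not q],  lambda p, q: not p)

# Formal fallacies (the "switched" variants):
affirming_consequent = valid([implies, lambda p, q: q],      lambda p, q: p)
denying_antecedent   = valid([implies, lambda p, q: not p],  lambda p, q: not q)

print(modus_ponens, modus_tollens)               # True True
print(affirming_consequent, denying_antecedent)  # False False
```

The counterexample row the checker finds for both fallacies is the same one the cat/mammal examples exploit: p false, q true (something that is a mammal but not a cat).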

Technically, all informal reasoning is formally fallacious—all informal arguments are invalid. Nevertheless, since those who offer inductive arguments rarely presume they are valid, we do not regard them as reasoning fallaciously.

b. Informal Fallacies

Informal fallacies occur when the meaning of the terms used in the premises of an argument suggest a conclusion that does not actually follow from them (the conclusion either follows weakly or with no strength at all). Consider an example of the informal fallacy of equivocation, in which a word with two distinct meanings is used in both of its meanings:

    1. Any law can be repealed by Congress.
    2. Gravity is a law.
    3. Therefore, gravity can be repealed by Congress.

In this case, the argument’s premises are true when the word “law” is rightly interpreted, but the conclusion does not follow because the word law has a different referent in premise 1 (political laws) than in premise 2 (a law of nature). This argument equivocates on the meaning of law and is, therefore, fallacious.

Consider, also, the informal fallacy of ad hominem, abusive, when an arguer appeals to a person’s character as a reason to reject her proposition:

“Elizabeth argues that humans do not have souls; they are simply material beings. But Elizabeth is a terrible person and often talks down to children and the elderly. Therefore, she could not be right that humans do not have souls.”

The argument might look like this:

    1. Elizabeth is a terrible person and often talks down to children and the elderly.
    2. Therefore, Elizabeth is not right that humans do not have souls.

The conclusion does not follow because whether Elizabeth is a terrible person is irrelevant to the truth of the proposition that humans do not have souls. Elizabeth’s argument for this statement is relevant, but her character is not.

Another way to evaluate this fallacy is to note that, as the argument stands, it is an enthymeme (see §2); it is missing a crucial premise, namely: If anyone is a terrible person, that person makes false statements. But this premise is clearly false. There are many ways in which one can be a terrible person, and not all of them imply that someone makes false statements. (In fact, someone could be terrible precisely because they are viciously honest.) Once we fill in the missing premise, we see the argument is not cogent because at least one premise is false.

Importantly, we face a number of informal fallacies on a daily basis, and without the ability to recognize them, their regularity can make them seem legitimate. Here are three others that only scratch the surface:

Appeal to the People: We are often encouraged to believe or do something just because everyone else does. We are encouraged to believe what our political party believes, what the people in our churches or synagogues or mosques believe, what people in our family believe, and so on. We are encouraged to buy things because they are “bestsellers” (lots of people buy them). But the fact that lots of people believe or do something is not, on its own, a reason to believe or do what they do.

Tu Quoque (You, too!): We are often discouraged from pursuing a conclusion or action if our own beliefs or actions are inconsistent with them. For instance, if someone attempts to argue that everyone should stop smoking, but that person smokes, their argument is often given less weight: “Well, you smoke! Why should everyone else quit?” But the fact that someone believes or does something inconsistent with what they advocate does not, by itself, discredit the argument. Hypocrites may have very strong arguments despite their personal inconsistencies.

Base Rate Neglect: It is easy to look at what happens after we do something or enact a policy and conclude that the act or policy caused those effects. Consider a law reducing speed limits from 75 mph to 55 mph in order to reduce highway accidents. And, in fact, in the three years after the reduction, highway accidents dropped 30%! This seems like a direct effect of the reduction. However, this is not the whole story. Imagine you looked back at the three years prior to the law and discovered that accidents had dropped 30% over that time, too. If that happened, it might not actually be the law that caused the reduction in accidents. The law did not change the trend in accident reduction. If we only look at the evidence after the law, we are neglecting the rate at which the event occurred without the law. The base rate of an event is the rate that the event occurs without the potential cause under consideration. To take another example, imagine you start taking cold medicine, and your cold goes away in a week. Did the cold medicine cause your cold to go away? That depends on how long colds normally last and when you took the medicine. In order to determine whether a potential cause had the effect you suspect, do not neglect to compare its putative effects with the effects observed without that cause.
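The base-rate check in the speed-limit example amounts to simple arithmetic: compare the trend over the window before the intervention with the trend after it. The numbers below are invented for illustration (a steady ~10% yearly decline), but they show how the comparison works:

```python
# Illustrative yearly highway-accident counts with a pre-existing downward
# trend; the speed limit is lowered after index 3 (law_year).
accidents = [1000, 900, 810, 729, 656, 590, 531]
law_year = 3

def pct_change(series):
    """Fractional change from the first to the last value of a series."""
    return (series[-1] - series[0]) / series[0]

drop_before = pct_change(accidents[:law_year + 1])  # trend without the law
drop_after = pct_change(accidents[law_year:])       # trend with the law

# Both windows show roughly the same ~27% decline, so the law gets no
# credit beyond the pre-existing trend.
print(round(drop_before, 2), round(drop_after, 2))
```

Looking only at `drop_after` would make the law seem highly effective; comparing it against `drop_before` (the base rate) shows the decline was already underway.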

For more on formal and informal fallacies and over 200 different types with examples, see “Fallacies.”

c. Heuristics and Biases

In the 1960s, psychologists began to suspect there is more to human reasoning than conscious inference. Daniel Kahneman and Amos Tversky confirmed these suspicions with their discoveries that many of the standard assumptions about how humans reason in practice are unjustified. In fact, humans regularly violate these standard assumptions, the most significant for philosophers and economists being that humans are fairly good at calculating the costs and benefits of their behavior; that is, that they naturally reason according to the dictates of Expected Utility Theory. Kahneman and Tversky showed that, in practice, reasoning is affected by many non-rational influences, such as the wording used to frame scenarios (framing bias) and the information most vividly available to reasoners (the availability heuristic).

Consider the difference in your belief about the likelihood of getting robbed before and after seeing a news report about a recent robbery, or the difference in your belief about whether you will be bitten by a shark the week before and the week after Discovery Channel's "Shark Week." Most of us are likely to regard these events as more probable after we have seen them on television than before. Objectively, they are no more or less likely to happen regardless of our seeing them on television, but we perceive them as more likely because their possibility is more vivid to us. These are examples of the availability heuristic.

Since the 1960s, experimental psychologists and economists have conducted extensive research revealing dozens of these unconscious reasoning processes, including ordering bias, the representativeness heuristic, confirmation bias, attentional bias, and the anchoring effect. The field of behavioral economics, made popular by Dan Ariely (2008; 2010; 2012) and Richard Thaler and Cass Sunstein (2009), emerged from and contributes to heuristics and biases research and applies its insights to social and economic behaviors.

Ideally, recognizing and understanding these unconscious, non-rational reasoning processes will help us mitigate their undermining influence on our reasoning abilities (Gigerenzer, 2003). However, it is unclear whether we can simply choose to overcome them or whether we have to construct mechanisms that mitigate their influence (for instance, using double-blind experiments to prevent confirmation bias).

6. The Scope and Virtues of Good Reasoning

Whether the process of critical thinking is productive for reasoners—that is, whether it actually answers the questions they are interested in answering—often depends on a number of linguistic, psychological, and social factors. We encountered some of the linguistic factors in §1. In closing, let us consider some of the psychological and social factors that affect the success of applying the tools of critical thinking.

a. Context

Not all psychological and social contexts are conducive for effective critical thinking. When reasoners are depressed or sad or otherwise emotionally overwhelmed, critical thinking can often be unproductive or counterproductive. For instance, if someone’s child has just died, it would be unproductive (not to mention cruel) to press the philosophical question of why a good God would permit innocents to suffer or whether the child might possibly have a soul that could persist beyond death. Other instances need not be so extreme to make the same point: your company’s holiday party (where most people would rather remain cordial and superficial) is probably not the most productive context in which to debate the president’s domestic policy or the morality of abortion.

The process of critical thinking is primarily about detecting truth, and truth may not always be of paramount value. In some cases, comfort or usefulness may take precedence over truth. The case of the loss of a child is a case where comfort seems to take precedence over truth. Similarly, consider the case of determining what the speed limit should be on interstate highways. Imagine we are trying to decide whether it is better to allow drivers to travel at 75 mph or to restrict them to 65. To be sure, there may be no fact of the matter as to which is morally better, and there may not be any difference in the rate of interstate deaths between states that set the limit at 65 and those that set it at 75. But given the nature of the law, a decision about which speed limit to set must be made. If there is no relevant difference between setting the limit at 65 and setting it at 75, critical thinking can only tell us that, not which speed limit to set. This shows that, in some cases, concern with truth gives way to practical or preferential concerns (for example, Should I make this decision on the basis of what will make citizens happy? Should I base it on whether I will receive more campaign contributions from the business community?). All of this suggests that critical thinking is most productive in contexts where participants are already interested in truth.

b. The Principle of Charity/Humility

Critical thinking is also most productive when people in the conversation regard themselves as fallible, subject to error, misinformation, and deception. The desire to be “right” has a powerful influence on our reasoning behavior. It is so strong that our minds bias us in favor of the beliefs we already hold even in the face of disconfirming evidence (a phenomenon known as “confirmation bias”). In his famous article, “The Ethics of Belief” (1878), W. K. Clifford notes that, “We feel much happier and more secure when we think we know precisely what to do, no matter what happens, than when we have lost our way and do not know where to turn. … It is the sense of power attached to a sense of knowing that makes men desirous of believing, and afraid of doubting” (2010: 354).

Nevertheless, when we are open to the possibility that we are wrong, that is, if we are humble about our conclusions and we interpret others charitably, we have a better chance at having rational beliefs in two senses. First, if we are genuinely willing to consider evidence that we are wrong—and we demonstrate that humility—then we are more likely to listen to others when they raise arguments against our beliefs. If we are certain we are right, there would be little reason to consider contrary evidence. But if we are willing to hear it, we may discover that we really are wrong and give up faulty beliefs for more reasonable ones.

Second, if we are willing to be charitable to arguments against our beliefs, then if our beliefs are unreasonable, we have an opportunity to see the ways in which they are unreasonable. On the other hand, if our beliefs are reasonable, then we can explain more effectively just how well they stand against the criticism. This is weakly analogous to competition in certain types of sporting events, such as basketball. If you only play teams that are far inferior to your own, you do not know how good your team really is. But if you can beat a well-respected team on fair terms, any confidence you have is justified.

c. The Principle of Caution

In our excitement over good arguments, it is easy to overextend our conclusions, that is, to infer statements that are not really warranted by our evidence. From an argument for a first, uncaused cause of the universe, it is tempting to infer the existence of a sophisticated deity such as that of the Judeo-Christian tradition. From an argument for the compatibility of determinism with the free will necessary for moral responsibility, it is tempting to infer that we are actually morally responsible for our behaviors. From an argument for negative natural rights, it is tempting to infer that no violation of a natural right is justifiable. Therefore, it is prudent to continually check our conclusions to be sure they do not include more content than our premises allow us to infer.

Of course, the principle of caution must itself be used with caution. If applied too strictly, it may lead reasoners to suspend all belief, and refrain from interacting with one another and their world. This is not, strictly speaking, problematic; ancient skeptics, such as the Pyrrhonians, advocated suspending all judgments except those about appearances in hopes of experiencing tranquility. However, at least some judgments about the long-term benefits and harms seem indispensable even for tranquility, for instance, whether we should retaliate in self-defense against an attacker or whether we should try to help a loved one who is addicted to drugs or alcohol.

d. The Expansiveness of Critical Thinking

The importance of critical thinking cannot be overstated because its relevance extends into every area of life, from politics, to science, to religion, to ethics. Not only does critical thinking help us draw inferences for ourselves, it helps us identify and evaluate the assumptions behind statements, the moral implications of statements, and the ideologies to which some statements commit us. This can be a disquieting and difficult process because it forces us to wrestle with preconceptions that might not be accurate. Nevertheless, if the process is conducted well, it can open new opportunities for dialogue, sometimes called “critical spaces,” that allow people who might otherwise disagree to find beliefs in common from which to engage in a more productive conversation.

It is this possibility of creating critical spaces that allows philosophical approaches like Critical Theory to effectively challenge the way social, political, and philosophical debates are framed. For example, if a discussion about race or gender or sexuality is framed in terms that, because of the origins of those terms or the way they have functioned socially, alienate or disproportionately exclude certain members of the population, then critical space is necessary for being able to evaluate that framing so that a more productive dialogue can occur (see Foresman, Fosl, and Watson, 2010, ch. 10 for more on how critical thinking and Critical Theory can be mutually supportive).

e. Productivity and the Limits of Rationality

Although critical thinking extends into every area of life, not every important aspect of our lives is easily or productively subjected to the tools of language and logic. Thinkers who are tempted to subject everything to the cold light of reason may discover they miss some of what is deeply enjoyable about living. The psychologist Abraham Maslow writes, “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail” (1966: 16). It is helpful to remember that language and logic are tools, not the projects themselves. Even formal reasoning systems depend on axioms that are not provable within those systems (consider Euclidean geometry or Peano arithmetic). We must make some decisions about what beliefs to accept and how to live our lives on the basis of considerations outside of critical thinking.

Borrowing an example from William James (1896), consider the statement, “Religion X is true.” James says that, while some people find this statement interesting, and therefore, worth thinking critically about, others may not be able to consider the truth of the statement. For any particular religious tradition, we might not know enough about it to form a belief one way or the other, and even suspending judgment may be difficult, since it is not obvious what we are suspending judgment about.

If I say to you: ‘Be a theosophist or be a Mohammedan,’ it is probably a dead option, because for you neither hypothesis is likely to be alive. But if I say: ‘Be an agnostic or be a Christian,’ it is otherwise: trained as you are, each hypothesis makes some appeal, however small, to your belief (2010: 357).

Ignoring the circularity in his definition of “dead option,” James’s point seems to be that if you know nothing about a view or what statements it entails, no amount of logic or evidence could help you form a reasonable belief about that position.

We might criticize James at this point because his conclusion seems to imply that we have no duty to investigate dead options, that is, to discover if there is anything worth considering in them. If we are concerned with truth, the simple fact that we are not familiar with a proposition does not mean it is not true or potentially significant for us. But James’s argument is subtler than this criticism suggests. Even if you came to learn about a particularly foreign religious tradition, its tenets may be so contrary to your understanding of the world that you could not entertain them as possible beliefs of yours. For instance, you know perfectly well that, if some events had been different, Hitler would not have existed: his parents might have had no children, or his parents’ parents might have had no children. You know roughly what it would mean for Hitler not to have existed and the sort of events that could have made it true that he did not exist. But how much evidence would it take to convince you that, in fact, Hitler did not exist, that is, that your belief that Hitler did exist is false? Could there be an argument strong enough? Not obviously. Since all the information we have about Hitler unequivocally points to his existence, any arguments against that belief would have to affect a very broad range of statements; they would have to be strong enough to make us skeptical of large parts of reality.

7. Approaches to Improving Reasoning through Critical Thinking

Recall that the goal of critical thinking is not just to study what makes reasons and statements good, but to help us improve our ability to reason, that is, to improve our ability to form, hold, and discard beliefs according to whether they meet the standards of good thinking. Some ways of approaching this latter goal are more effective than others. While the classical approach focuses on technical reasoning skills, the Paul/Elder model encourages us to think in terms of critical concepts, and rationality approaches use empirical research on instances of poor reasoning to help us improve reasoning where we least suspect we need it, which is often where we need it most. Which approach or combination of approaches is most effective depends, as noted above, on the context and limits of critical thinking, but also on scientific evidence of their effectiveness. Those who teach critical thinking, of all people, should be engaged with the evidence relevant to determining which approaches are most effective.

a. Classical Approaches

The classic approach to critical thinking follows roughly the structure of this article: critical thinkers attempt to interpret statements or arguments clearly and charitably, and then they apply the tools of formal and informal logic and science, while carefully attempting to avoid fallacious inferences (see Weston, 2008; Walton, 2008; Watson and Arp, 2015). This approach requires spending extensive time learning and practicing technical reasoning strategies. It presupposes that reasoning is primarily a conscious activity, and that enhancing our skills in these areas will improve our ability to reason well in ordinary situations.

There are at least two concerns about this approach. First, it is highly time-intensive relative to its payoff. Learning the terminology of systems like propositional and categorical logic, memorizing the names of the fallacies, and practicing these tools on hypothetical cases require significant time and energy. And it is not obvious, given the problems with heuristics and biases, whether this practice alone makes us better reasoners in ordinary contexts. Second, many of the ways we reason poorly are not consciously accessible (recall the heuristics and biases discussion in §5c). Our biases, combined with the heuristics we rely on in ordinary situations, can only be detected in experimental settings, and addressing them requires restructuring the ways in which we engage with evidence (see Thaler and Sunstein, 2009).

b. The Paul/Elder Model

Richard Paul and Linda Elder (Paul and Elder, 2006; Paul, 2012) developed an alternative to the classical approach on the assumption that critical thinking is not something that is limited to academic study or to the discipline of philosophy. On their account, critical thinking is a broad set of conceptual skills and habits aimed at a set of standards that are widely regarded as virtues of thinking: clarity, accuracy, depth, fairness, and others. They define it simply as “the art of analyzing and evaluating thinking with a view to improving it” (2006: 4). Their approach, then, is to focus on the elements of thought and intellectual virtues that help us form beliefs that meet these standards.

The Paul/Elder model is made up of three sets of concepts: elements of thought, intellectual standards, and intellectual traits. In this model, we begin by identifying the features present in every act of thought. They use “thought” to mean critical thought aimed at forming beliefs, not just any mental act, such as musing, wishing, hoping, or remembering. According to the model, every act of thought involves:

  • point of view
  • purpose
  • implications and consequences
  • assumptions
  • concepts
  • interpretation and inference
  • information
  • question at issue

These comprise the subject matter of critical thinking; that is, they are what we are evaluating when we are thinking critically. We then engage with this subject matter by subjecting it to what Paul and Elder call universal intellectual standards. These are evaluative goals we should be aiming at with our thinking:

  • clarity
  • accuracy
  • precision
  • relevance
  • depth
  • breadth
  • logic
  • significance
  • fairness

While in classical approaches logic is the predominant means of thinking critically, in the Paul/Elder model it is put on an equal footing with eight other standards. Finally, Paul and Elder argue that it is helpful to approach the critical thinking process with a set of intellectual traits, or virtues, that dispose us to use the elements and standards well.

  • intellectual humility
  • intellectual autonomy
  • intellectual integrity
  • intellectual courage
  • intellectual perseverance
  • confidence in reason
  • intellectual empathy
  • fairmindedness

To remind us that these are virtues of thought relevant to critical thinking, they use “intellectual” to distinguish these traits from their moral counterparts (moral integrity, moral courage, and so on).

The aim is that, as we become familiar with these three sets of concepts and apply them in everyday contexts, we become better at analyzing and evaluating statements and arguments in ordinary situations.
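To make the model's three concept sets easier to survey at a glance, here is a minimal sketch in Python that records them as plain data and flags which universal standards a given analysis has not yet addressed. The data structure and the helper function are our own illustration, not part of Paul and Elder's texts.

```python
# The Paul/Elder concept sets as plain data (the representation is ours;
# the concepts themselves are Paul and Elder's).
ELEMENTS_OF_THOUGHT = [
    "point of view", "purpose", "implications and consequences", "assumptions",
    "concepts", "interpretation and inference", "information", "question at issue",
]

INTELLECTUAL_STANDARDS = [
    "clarity", "accuracy", "precision", "relevance", "depth",
    "breadth", "logic", "significance", "fairness",
]

INTELLECTUAL_TRAITS = [
    "intellectual humility", "intellectual autonomy", "intellectual integrity",
    "intellectual courage", "intellectual perseverance", "confidence in reason",
    "intellectual empathy", "fairmindedness",
]

def unmet_standards(assessment):
    """Return the universal standards not yet marked satisfied in an assessment."""
    return [s for s in INTELLECTUAL_STANDARDS if not assessment.get(s)]

# Example: an analysis judged clear and logical, with everything else unchecked.
remaining = unmet_standards({"clarity": True, "logic": True})
print(remaining)
```

A checklist like this is only a mnemonic; the model's real work happens in applying the standards, not listing them.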

Like the classical approach, this approach presupposes that reasoning is primarily a conscious activity and that enhancing our skills will improve our reasoning. This means that it still cannot address the empirical evidence that many of our reasoning errors cannot be consciously detected or corrected. It differs from the classical approach in that it gives the technical tools of logic a much less prominent role and places emphasis on a broader, and perhaps more intuitive, set of conceptual tools. Learning these concepts, and learning to apply them, still requires a great deal of time and energy, though perhaps less than learning formal and informal logic. And these concepts are easy to translate into disciplines outside philosophy. Students of history, psychology, and economics can more easily recognize the relevance of asking questions about an author’s point of view and assumptions than of determining whether the author is making a deductive or inductive argument. The question, then, is whether this approach improves our ability to think better than the classical approach does.

c. Other Approaches

A third approach that is becoming popular is to focus on the ways we commonly reason poorly and then attempt to correct them. This can be called the Rationality Approach, and it takes seriously the empirical evidence (§5c) that many of our errors in reasoning are not due to a lack of conscious competence with technical skills or misusing those skills, but are due to subconscious dispositions to ignore or dismiss relevant information or to rely on irrelevant information.

One way to pursue this approach is to focus on beliefs that are statistically rare or “weird.” These include beliefs of fringe groups, such as conspiracy theorists, religious extremists, paranormal psychologists, and proponents of New Age metaphysics (see Gilovich, 1992; Vaughn and Schick, 2010; Coady, 2012). If we recognize the sorts of tendencies that lead to these controversial beliefs, we might be able to recognize and avoid similar tendencies in our own reasoning about less extreme beliefs, such as beliefs about financial investing, how statistics are used to justify business decisions, and beliefs about which public policies to vote for.

Another way to pursue this approach is to focus directly on the research on error, that is, on the ordinary beliefs about which psychologists and behavioral economists have discovered we reason poorly, and to explore ways of changing how we frame decisions about what to believe (see Nisbett and Ross, 1980; Gilovich, 1992; Ariely, 2008; Kahneman, 2011). For example, in one study, psychologists found that judges issue more convictions just before lunch and the end of the day than in the morning or just after lunch (Danzinger et al., 2011). Given that dockets do not typically organize cases from less significant crimes to more significant crimes, this evidence suggests that something as irrelevant as hunger can bias judicial decisions. Even though hunger has nothing to do with the truth of a belief, knowing that it can affect how we evaluate a belief can help us avoid that effect. This study might suggest something as simple as that we should avoid being hungry when making important decisions. The more we learn about the ways in which our brains use irrelevant information, the better we can organize our reasoning to avoid these mistakes. For more on how decisions can be improved by restructuring them, see Thaler and Sunstein, 2009.
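The shape of the bias reported in that study can be illustrated with a toy model. Everything numerical below is invented for illustration; these are not the study's data. The sketch simply encodes the idea that the probability of a conviction climbs with the number of cases heard since the judge's last break:

```python
def conviction_probability(position, n_cases, base_rate=0.35, hunger_boost=0.25):
    """Toy model: conviction probability rises linearly with the number of
    cases heard since the last break (a crude stand-in for hunger).
    All rates here are invented for illustration only."""
    time_since_break = position / (n_cases - 1)  # 0.0 right after a break, 1.0 at session's end
    return base_rate + hunger_boost * time_since_break

# Average predicted conviction rate early vs. late in a 40-case session.
early = sum(conviction_probability(i, 40) for i in range(20)) / 20
late = sum(conviction_probability(i, 40) for i in range(20, 40)) / 20
print(round(early, 3), round(late, 3))  # → 0.411 0.539
```

Nothing about case merit changes across the session in this model; the difference between the two averages is produced entirely by the irrelevant variable, which is exactly the kind of effect debiasing strategies try to neutralize.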

A fourth approach is to take more seriously the role that language plays in our reasoning. Arguments involve complex patterns of expression, and we have already seen how vagueness and ambiguity can undermine good reasoning (§1). The pragma-dialectics approach (or pragma-dialectical theory) is the view that the quality of an argument is not solely or even primarily a matter of its logical structure, but is more fundamentally a matter of whether it is a form of reasonable discourse (Van Eemeren and Grootendorst, 1992). The proponents of this view contend that, “The study of argumentation should … be construed as a special branch of linguistic pragmatics in which descriptive and normative perspectives on argumentative discourse are methodically integrated” (Van Eemeren and Grootendorst, 1995: 130).

The pragma-dialectics approach is a highly technical approach that uses insights from speech act theory, H. P. Grice’s philosophy of language, and the study of discourse analysis. Its use, therefore, requires a great deal of background in philosophy and linguistics. It has an advantage over other approaches in that it highlights social and practical dimensions of arguments that other approaches largely ignore. For example, argument is often public (external), in that it creates an opportunity for opposition, which influences people’s motives and psychological attitudes toward their arguments. Argument is also social in that it is part of a discourse in which two or more people try to arrive at an agreement. Argument is also functional; it aims at a resolution that can only be accommodated by addressing all the aspects of disagreement or anticipated disagreement, which can include public and social elements. Argument also has a rhetorical role (dialectical) in that it is aimed at actually convincing others, which may have different requirements than simply identifying the conditions under which they should be convinced.

These four approaches are not mutually exclusive. All of them presuppose, for example, the importance of inductive reasoning and scientific evidence. Their distinctions turn largely on which aspects of statements and arguments should take precedence in the critical thinking process and on what information will help us have better beliefs.

8. References and Further Reading

  • Ariely, Dan. 2008. Predictably Irrational: The Hidden Forces that Shape Our Decisions. New York: Harper Perennial.
  • Ariely, Dan. 2010. The Upside of Irrationality. New York: Harper Perennial.
  • Ariely, Dan. 2012. The (Honest) Truth about Dishonesty. New York: Harper Perennial.
  • Aristotle. 2002. Categories and De Interpretatione, J. L. Ackrill, editor. Oxford: Oxford University Press.
  • Clifford, W. K. 2010. “The Ethics of Belief.” In Nils Ch. Rauhut and Robert Bass, eds., Readings on the Ultimate Questions: An Introduction to Philosophy, 3rd ed. Boston: Prentice Hall, 351-356.
  • Chomsky, Noam. 1957/2002. Syntactic Structures. Berlin: Mouton de Gruyter.
  • Coady, David. 2012. What To Believe Now: Applying Epistemology to Contemporary Issues. Malden, MA: Wiley-Blackwell.
  • Danzinger, Shai, Jonathan Levav, and Liora Avnaim-Pesso. 2011. “Extraneous Factors in Judicial Decisions.” Proceedings of the National Academy of Sciences of the United States of America. Vol. 108, No. 17, 6889-6892. doi: 10.1073/pnas.1018033108.
  • Foresman, Galen, Peter Fosl, and Jamie Carlin Watson. 2017. The Critical Thinking Toolkit. Malden, MA: Wiley-Blackwell.
  • Fogelin, Robert J. and Walter Sinnott-Armstrong. 2009. Understanding Arguments: An Introduction to Informal Logic, 8th ed. Belmont, CA: Wadsworth Cengage Learning.
  • Gigerenzer, Gerd. 2003. Calculated Risks: How To Know When Numbers Deceive You. New York: Simon and Schuster.
  • Gigerenzer, Gerd, Peter Todd, and the ABC Research Group. 2000. Simple Heuristics that Make Us Smart. Oxford University Press.
  • Gilovich, Thomas. 1992. How We Know What Isn’t So. New York: Free Press.
  • James, William. 2010. “The Will to Believe.” In Nils Ch. Rauhut and Robert Bass, eds., Readings on the Ultimate Questions: An Introduction to Philosophy, 3rd ed. Boston: Prentice Hall, 356-364.
  • Kahneman, Daniel. 2011. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
  • Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell.
  • Lipton, Peter. 2004. Inference to the Best Explanation, 2nd ed. London: Routledge.
  • Maslow, Abraham. 1966. The Psychology of Science: A Reconnaissance. New York: Harper & Row.
  • Mill, John Stuart. 2011. A System of Logic, Ratiocinative and Inductive. New York: Cambridge University Press.
  • Nisbett, Richard and Lee Ross. 1980. Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice Hall.
  • Paul, Richard. 2012. Critical Thinking: What Every Person Needs to Survive in a Rapidly Changing World. Tomales, CA: The Foundation for Critical Thinking.
  • Paul, Richard and Linda Elder. 2006. The Miniature Guide to Critical Thinking Concepts and Tools, 4th ed. Tomales, CA: The Foundation for Critical Thinking.
  • Plantinga, Alvin. 1974. The Nature of Necessity. Oxford: Clarendon Press.
  • Prior, Arthur. 1957. Time and Modality. Oxford, UK: Oxford University Press.
  • Prior, Arthur. 1967. Past, Present and Future. Oxford, UK: Oxford University Press.
  • Prior, Arthur. 1968. Papers on Time and Tense. Oxford, UK: Oxford University Press.
  • Quine, W. V. O. and J. S. Ullian. 1978. The Web of Belief, 2nd ed. McGraw-Hill.
  • Russell, Bertrand. 1940/1996. An Inquiry into Meaning and Truth, 2nd ed. London: Routledge.
  • Thaler, Richard and Cass Sunstein. 2009. Nudge: Improving Decisions about Health, Wealth, and Happiness. New York: Penguin Books.
  • van Eemeren, Frans H. and Rob Grootendorst. 1992. Argumentation, Communication, and Fallacies: A Pragma-Dialectical Perspective. London: Routledge.
  • van Eemeren, Frans H. and Rob Grootendorst. 1995. “The Pragma-Dialectical Approach to Fallacies.” In Hans V. Hansen and Robert C. Pinto, eds. Fallacies: Classical and Contemporary Readings. Penn State University Press, 130-144.
  • Vaughn, Lewis and Theodore Schick. 2010. How To Think About Weird Things: Critical Thinking for a New Age, 6th ed. McGraw-Hill.
  • Walton, Douglas. 2008. Informal Logic: A Pragmatic Approach, 2nd ed. New York: Cambridge University Press.
  • Watson, Jamie Carlin and Robert Arp. 2015. Critical Thinking: An Introduction to Reasoning Well, 2nd ed. London: Bloomsbury Academic.
  • Weston, Anthony. 2008. A Rulebook for Arguments, 4th ed. Indianapolis: Hackett.
  • Zadeh, Lotfi. 1965. “Fuzzy Sets and Systems.” In J. Fox, ed., System Theory. Brooklyn, NY: Polytechnic Press, 29-39.


Author Information

Jamie Carlin Watson
Email: jamie.c.watson@gmail.com
University of Arkansas for Medical Sciences
U. S. A.

Empirical Aesthetics

Empirical aesthetics is a research area at the intersection of psychology and neuroscience that aims to understand how people experience, evaluate, and create objects aesthetically. Its two central questions are: How do we experience beauty? How do we experience art? In practice, this means that empirical aesthetics studies (1) prototypically aesthetic responses, such as beauty or chills, and (2) responses to prototypically aesthetic objects, such as paintings and music. Empirical aesthetics also encompasses broader questions about other kinds of aesthetic experience, such as ugliness and the sublime, and about how we create art. The field of empirical aesthetics aims to understand how such aesthetic experiences and behaviors emerge and unfold. To do so, researchers in the field link the observer’s characteristics to her responses, link the object’s properties to the observer’s responses, or describe an interaction between the two. As a science, empirical aesthetics relies heavily on the analysis and interpretation of data. Data are primarily generated from experiments: researchers conduct studies in which they manipulate one or more independent variables to observe the effect of those manipulations on one or more dependent variables. In addition, empirical aesthetics relies on observational data, where people’s behavior is observed or surveyed without the introduction of manipulations.

Empirical aesthetics is as old as empirical psychology. The first thorough written account dates back to Gustav Fechner, who published Vorschule der Aesthetik in 1876. Nonetheless, the modern field of empirical aesthetics can be considered rather young. Its gain in popularity in the 21st century can be linked to the emergence of neuroaesthetics—the study of brain responses associated with aesthetic experiences—in the late 1990s. Contemporary empirical aesthetics studies aesthetic experiences with a variety of methods, including brain-imaging and measures of other physiological responses, such as the movements of the eyes and facial muscles.

Table of Contents

  1. History
    1. Fechner’s Vorschule der Aesthetik
    2. Empirical Aesthetics in the 20th century
    3. First Renaissance: Berlyne’s Aesthetics and Psychobiology
    4. Early Modern Empirical Aesthetics
    5. Second Renaissance: The Emergence of Neuroaesthetics
    6. Empirical Aesthetics in the 21st Century
  2. Subfields of Empirical Aesthetics
    1. Perceptual (Visual) Aesthetics
    2. Neuroaesthetics
    3. Music Science
    4. Other Subfields
  3. Relations to Other Fields
    1. Relations to Other Fields of Psychology and Neuroscience
    2. Relationship to Computational Sciences
    3. Relationship to Aesthetics in Philosophy
  4. Controversies
    1. Is Aesthetics Special?
    2. What Should Empirical Aesthetics Mainly be About?
    3. Population Averages vs. Individual Differences
  5. References and Further Reading

1. History

a. Fechner’s Vorschule der Aesthetik

The first comprehensive treatise on what has become known as “empirical aesthetics” is the two-volume book Vorschule der Aesthetik (VdA), written by Gustav Fechner and published in 1876. The first volume primarily contains descriptions of six principles of aesthetics, as posited by Fechner himself; also notable is its last chapter, on taste. The second volume contains a large section on art as well as descriptions of a further seven aesthetic principles.

The main purpose of the book is to demonstrate that aesthetic experiences, primarily aesthetic pleasure or beauty, can be studied empirically just like any other form of perception. Fechner called this empirical approach to aesthetics one “from below” and distinguished it clearly from the philosophical approach “from above.” The basic distinction made is the following: Aesthetics from below observes individual cases of aesthetic responses and infers the laws that govern all of these responses from the pattern that crystallizes across individual cases. Aesthetics from above, in contrast, posits general laws and infers from those what an individual aesthetic response should look like. While the VdA itself only contains data and descriptions of a few experiments, Fechner’s descriptions of the proposed laws clearly focus on their observable effects, implying that they can be documented in an experiment.

The direct impact of the VdA on modern empirical aesthetics remains modest. This may be because it has never been published in translation, or it may reflect a general reluctance to cite early work in empirical aesthetics. It is, however, well known and cited as the first major work on aesthetics by an empirical psychologist. Of its content, only one experiment often serves as a reference, probably because the associated article exists in English. This experiment examined the effect of a rectangle’s aspect ratio on aesthetic preference. Famously, Fechner found that his participants most often named the rectangle with an aspect ratio equivalent to the golden section (1:1.618) as the one they liked best. What is less well known is that Fechner himself was critical of this finding and reported an equal preference for the square ratio (1:1) in a population of blue-collar workers. His main worry about the findings concerned the potential influence of associations on the result, specifically that participants did not merely judge the rectangular form but also its resemblance to the familiar shapes of envelopes, books, and so on.
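For reference, the golden section named in Fechner's experiment is the positive root of x² = x + 1, which is what gives it its self-similar character. A quick numerical check, assuming nothing beyond that definition, confirms the defining property: cutting a unit square off a 1 : φ rectangle leaves a smaller rectangle with the same proportions.

```python
# The golden section is the positive root of x**2 = x + 1.
phi = (1 + 5 ** 0.5) / 2
print(round(phi, 3))  # → 1.618

# Defining property: a 1-by-phi rectangle minus a 1-by-1 square leaves a
# (phi - 1)-by-1 rectangle whose aspect ratio equals the original's.
remainder_ratio = 1 / (phi - 1)
assert abs(remainder_ratio - phi) < 1e-9
```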

b. Empirical Aesthetics in the 20th century

After the pioneering days of Gustav Fechner and his colleagues, psychology (and philosophy) went through an era known as Behaviorism. Behaviorism effectively claimed that psychology, as a science, can only study observable behavior. Research on inner states and subjective experiences, which form the core interest of aesthetics, was shunned. This did not deter researchers like Edgar Pierce, Oswald Külpe, Lillien Jane Martin, Robert Ogden, Kate Gordon, and many others from continuing the study of people’s preferences for visual design, art, color, and particularly individual differences in such preferences.

Most of the work on empirical aesthetics in the early and mid-20th century has not had a remarkable impact on the field. Worth mentioning, however, is the early work of Edward Thorndike, and later of Hans Eysenck, on individual differences in aesthetic responsiveness and creativity. Most other studies during this time period focused on determining what kinds of object properties—specifically consonance and dissonance of tones, as well as colors—are most rewarding to specific groups of people.

Another notable exception to the mostly forgotten early research on aesthetics is Rudolf Arnheim’s work. He looked at aesthetic experiences through the lens of Gestalt psychology’s principles of organization: balance, symmetry, composition, and dynamical complexity (the trade-off between order and complexity). Arnheim saw aesthetic experiences as a product of the integration of two sources of information: the structural configuration of the image and the particular focus of the viewer, which depends on her experience and disposition. One should also note that the writings of the art historian and critic Ernst Gombrich during the same period have informed modern empirical aesthetics.

A look at the institutional level also reveals that empirical aesthetics continued to evolve during the 20th century. A division of Esthetics was among the 19 charter divisions when the American Psychological Association (APA) was founded. This 10th division of the APA was renamed “Psychology and the Arts” in 1965. Its size was modest then, relative to other divisions, and has stayed so throughout the years.

c. First Renaissance: Berlyne’s Aesthetics and Psychobiology

After what in retrospect appears like a relative drought during behaviorism, empirical aesthetics re-emerged with Daniel Berlyne and the foundation of the International Association of Empirical Aesthetics (IAEA). The IAEA was founded at the first international congress in Paris in 1965 by Daniel Berlyne (University of Toronto, Canada), Robert Francès (Université de Paris, France), Carmelo Genovese (Università di Bologna, Italy), and Albert Wellek (Johann-Gutenberg-Universität Mainz, Germany).

The most visible effort to establish the “new experimental aesthetics” is the book Studies in the New Experimental Aesthetics, edited by Berlyne (1974), which contains a collection of study reports, many conducted by Berlyne himself. In addition, Berlyne had earlier published the book Aesthetics and Psychobiology (1971), which is often cited as the main reference for his hypotheses on the relationship between object properties and hedonic responses.

Central to Daniel Berlyne’s own ideas on aesthetic experiences is the concept of arousal. Berlyne postulated that arousal is relevant to aesthetics in that an intermediate level of arousal leads to the greatest hedonic response. Arousal itself is conceptualized as the result of “collative,” psychophysical, and ecological variables. The best-known and most-investigated determinants of arousal are an object’s complexity and novelty. Berlyne’s theory thus links an object’s properties, such as complexity, to their effects on the observer (arousal) and then to the aesthetic response (pleasantness, liking). The concreteness of the proposed links and variables has led many researchers to test his theory. The results have been mixed at best, and Berlyne’s arousal theory of aesthetic appreciation has therefore been largely abandoned.
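The inverted-U relationship at the heart of the theory can be sketched numerically. The quadratic curve and all numbers below are our illustrative stand-in for the claim that intermediate arousal maximizes hedonic response; they are not Berlyne's own formalism.

```python
def hedonic_response(arousal, optimum=0.5):
    """Toy inverted-U: hedonic response peaks at an intermediate (optimum)
    arousal level and falls off on either side. The quadratic shape and
    the numbers are illustrative only."""
    return 1.0 - ((arousal - optimum) / optimum) ** 2

# Low, intermediate, and high arousal on the toy curve.
low = hedonic_response(0.1)
mid = hedonic_response(0.5)
high = hedonic_response(0.9)
print(low, mid, high)
# Intermediate arousal yields the greatest hedonic response on this curve.
```

Testing the theory amounts to checking whether judged pleasantness actually traces such a curve as complexity or novelty varies; as noted above, the empirical results have been mixed.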

d. Early Modern Empirical Aesthetics

After Berlyne’s work had again highlighted that aesthetic responses can be studied with the methods and rigor of modern experimental psychology, research and theory development in the field of empirical aesthetics continued slowly but steadily for about another 20 years. This phase of empirical aesthetics was primarily concerned with linking certain stimulus properties (mostly of images) to preference or liking judgments.

One notable theoretical step forward after Berlyne’s Aesthetics and Psychobiology was Colin Martindale’s connectionist model, which viewed aesthetic pleasure as a function of the strength of activation of so-called “cognitive units.” Martindale (1988) maintained that “[t]he apprehension of a work of art of any sort will involve activation of cognitive units … the pleasure engendered by a work of art will be a positive monotonic function of how activated this entire ensemble of cognitive units is. The more activated the ensemble of units, the more pleasurable an observer will find the stimulus to be.” Combined with the assumption that more prototypical objects activate their cognitive units more strongly, this led to the hypothesis that more prototypical, meaningful objects are aesthetically preferred. The results of Martindale’s experiments were in line with this view and foreshadowed the development of contemporary theories that emphasize processing fluency and meaningfulness as sources of aesthetic pleasure.

e. Second Renaissance: The Emergence of Neuroaesthetics

The introduction of modern brain-imaging techniques has changed the face of psychology forever. The introduction of functional magnetic resonance imaging (fMRI) to empirical aesthetics was no exception. The first fMRI experiments that focused on aesthetic experiences were conducted in the early 2000s, and the term “neuroaesthetics” subsequently emerged. The boundary between neuroaesthetics and empirical aesthetics has since been blurred, and even studies that are, strictly speaking, not “neuro”-aesthetic—because they do not measure brain activity—are often labeled as such.

Neuroaesthetics in its initial phase asked a simple question: Which area or areas of the brain respond to experiences of beauty? The answer, across a variety of stimuli such as paintings, music, and mathematical equations, seemed to be the orbitofrontal cortex (OFC). This brain area, at the bottom of the front-most part of the brain, had previously been associated with the anticipation of various other pleasurable things, such as food and money.

Findings like these spurred one of the questions that still lies at the core of empirical aesthetics: What—if anything—makes aesthetic experiences special? Some scholars, like Martin Skov and Marcos Nadal, are skeptical that they are special at all. They base their view on the findings from neuroscience mentioned above: the signature of brain activity linked to intensely pleasurable aesthetic experiences does not seem to differ from the one linked to other pleasures, such as food or winning money. Others continue to make the case that aesthetics is special. For instance, Winfried Menninghaus and his colleagues argue that “aesthetic emotions” are distinct from everyday emotions in that they always relate to an aesthetic evaluation, an aesthetic virtue, and a notion of pleasure, and in that they predict liking. This debate about whether and how aesthetic experiences are special persists and was given new life by the first findings of neuroaesthetics. At the same time, the debate is not a new one; it is present in the writings of intellectuals such as William James and George Santayana.

f. Empirical Aesthetics in the 21st Century

Empirical aesthetics embraces all the different approaches that have shaped its history. Both theoretical and empirical work follow a multi-methodological approach that takes stimulus properties, observer characteristics, environmental conditions, and neurological processes into account. The amount of empirical data and reports is rapidly growing.

Empirical aesthetics in the 21st century continues to work on and clarify research questions present since its beginnings. For instance, Marco Bertamini and his colleagues clarified in 2016 that the preference for objects with curved contours is, in fact, due to an increased liking of roundness and not merely a dislike of angularity. At the same time, the field also adds new research questions to its agenda, notably the question of the generalizability of previous findings beyond the laboratory setting. The emergence of tablets, portable eye trackers, and EEG systems has greatly facilitated data collection in natural environments, such as museums. At the same time, virtual reality settings enable more controlled experiments in environments that, while simulated, are more naturalistic than isolated cubicles.

On an institutional level, the importance of empirical aesthetics has been acknowledged in the form of new institutions with an explicit focus on the field. Among them are the Max Planck Institute for Empirical Aesthetics in Frankfurt, Germany, founded in 2012; the Penn Center for Neuroaesthetics, Pennsylvania, USA, founded in 2019; and Goldsmiths University’s MSc Program for Arts, Neuroaesthetics and Creativity, which started in 2018.

On the level of theory development, several models of art appreciation have emerged in the 21st century. One of the most cited models was developed by Helmut Leder in 2004 and later modified by him and Marcos Nadal (see Further Readings). The most comprehensive model developed so far is the Vienna integrated model of top-down and bottom-up processes in art perception (VIMAP), proposed mainly by Matthew Pelowski and co-authored by Helmut Leder and other members of their research group. It is worth noting that both of these models, like many other theoretical models, focus on visual art.

Theories about the aesthetic appreciation of music have been developed independently from those about visual arts. Since the late 2010s, the idea that music is liked because it affords the right balance between predictability and surprise has become popular. It relies on the notion of predictive coding, the view that our perceptual system constantly tries to predict what it will encounter next, and that it updates its predictions based on the observed differences between prediction and reality. This difference is called prediction error. The main thesis of the predictive coding view is that small prediction errors and/or a decrease of prediction errors over time are rewarding. In other words, we are hard-wired to enjoy the process of learning, and aesthetic experiences are but one kind of experience that enables us to do so.
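
The core mechanism described above, a prediction that is corrected by a fraction of the prediction error, can be sketched in a few lines. This is a generic Rescorla-Wagner-style update for illustration only; the function name and numbers are invented, and it does not reproduce any specific model from the music or vision literature.

```python
# Illustrative sketch of the prediction-error idea (not a published model):
# a learner repeatedly encounters the same rewarding stimulus, predicts its
# value, and corrects the prediction by a fraction of the prediction error.

def update(prediction, outcome, learning_rate=0.3):
    """Move the prediction one step toward the observed outcome."""
    error = outcome - prediction              # the prediction error
    return prediction + learning_rate * error, error

prediction, errors = 0.0, []
for _ in range(10):                           # ten exposures to the same outcome
    prediction, error = update(prediction, outcome=1.0)
    errors.append(error)

# The error shrinks with every exposure; on the predictive coding view,
# this decrease of prediction error over time is itself rewarding.
print(all(e1 > e2 for e1, e2 in zip(errors, errors[1:])))  # True
```

On this view, “enjoying the process of learning” corresponds to the steady shrinking of the error term across exposures.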

A predictive coding account of visual aesthetic experiences has likewise been formulated by Sander Van de Cruys and Johan Wagemans. A similar account has also dominated views on creativity, art, and the experience of beauty in the computer sciences, based on a model developed by Juergen Schmidhuber in the 1990s. To date, however, Schmidhuber’s ideas are little known to psychologists and neuroscientists, and his theory remains uncited in the empirical aesthetics literature. This may change as interdisciplinary collaborations between psychologists, neuroscientists, and computer scientists become more frequent.

2. Subfields of Empirical Aesthetics

a. Perceptual (Visual) Aesthetics

Empirical aesthetics was pioneered by the psychophysicist Gustav Fechner. Psychophysics is the study of the relation between stimulus properties and human perception. Whilst applicable to all senses, most of the psychophysics research (in humans) has focused on vision. True to its roots, most of the past research on empirical aesthetics has also focused on which stimulus properties lead to particular aesthetic perceptions and judgments, and most of it concerns visual object properties.

Most work on perceptual aesthetics aims to uncover which stimulus properties are, on average, liked most. The best-supported findings along these lines are that curvature is preferred to angularity and that symmetry is preferred to asymmetry. In addition, people show a preference for average over unusual objects, in particular for faces. In the realm of color, green-blue hues are liked better than yellow ones in the great majority of the world. Contrary to popular belief, a preference for the golden ratio has not found empirical support.

Apart from the widely supported findings listed above, researchers in empirical aesthetics study a diverse range of other visual object properties hypothesized to be linked to aesthetic preference. Among these, the spatial frequency distribution has been of particular interest. The spatial frequency distribution of an image measures the relative prevalence of sharp versus blurry contours: high spatial frequencies correspond to sharp edges and low spatial frequencies to blurry ones. Some evidence shows that art images whose spatial frequency distribution mimics the one found in nature photography are preferred.
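
As a hedged illustration of what a spatial frequency distribution is in practice, the sketch below estimates the slope of an image’s radially averaged Fourier power spectrum, one common way of summarizing it. Natural scenes typically show power falling off roughly as 1/f² (a log-log slope near -2); the function name and the synthetic test image are invented for this example and do not come from any particular study.

```python
import numpy as np

def log_log_spectrum_slope(image):
    """Slope of log(power) versus log(spatial frequency), radially averaged."""
    f = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(f) ** 2
    h, w = image.shape
    y, x = np.indices((h, w))
    r = np.hypot(x - w // 2, y - h // 2).astype(int)  # radial frequency bin
    radial_power = np.bincount(r.ravel(), weights=power.ravel())
    radial_power = radial_power / np.bincount(r.ravel())  # mean power per ring
    freqs = np.arange(1, min(h, w) // 2)              # skip DC, stay below Nyquist
    slope, _ = np.polyfit(np.log(freqs), np.log(radial_power[freqs]), 1)
    return slope

# Synthetic noise with 1/f amplitude mimics the spectrum of natural scenes.
rng = np.random.default_rng(0)
h = w = 128
radius = np.maximum(
    np.hypot(np.fft.fftfreq(h)[:, None], np.fft.fftfreq(w)[None, :]), 1.0 / h
)
img = np.real(np.fft.ifft2(rng.standard_normal((h, w)) / radius))

slope = log_log_spectrum_slope(img)
print(round(slope, 1))   # roughly -2, the slope typical of natural scenes
```

Comparing such slopes between artworks and nature photographs is the kind of analysis behind the finding mentioned above.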

Researchers also investigate how fractal dimensionality influences aesthetic preferences. Fractal dimensionality refers to the degree to which the same pattern repeats itself on a smaller scale within the same image. An image of tree branches, for instance, has a high fractal dimensionality because the same pattern of twigs crossing one another is repeated in a similar way, no matter how far one ‘zooms into’ the overall image. In contrast, an image of differently shaped clouds in the sky has a lower fractal dimensionality because the visible pattern changes considerably depending on how far one ‘zooms into’ the image. Fractal dimensionality studies have revealed a certain intra-individual stability in preference for relatively high or low fractal dimensionality across different stimuli and even sensory domains.
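
The ‘zooming’ intuition above can be made concrete with the box-counting estimator, one common way to approximate fractal dimensionality (not necessarily the estimator used in any particular study; the function name and test images are invented for illustration):

```python
import numpy as np

def box_count_dimension(binary_image):
    """Box-counting estimate: slope of log(#occupied boxes) vs log(1/box size)."""
    n = binary_image.shape[0]                 # assumes a square, power-of-two side
    sizes = [2 ** k for k in range(1, int(np.log2(n)))]
    counts = []
    for s in sizes:
        # Partition the image into s-by-s boxes and count the non-empty ones.
        boxes = binary_image.reshape(n // s, s, n // s, s).any(axis=(1, 3))
        counts.append(boxes.sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# Sanity checks: a filled square is 2-dimensional, a straight line 1-dimensional.
n = 256
filled = np.ones((n, n), dtype=bool)
line = np.zeros((n, n), dtype=bool)
line[n // 2, :] = True
print(round(box_count_dimension(filled)))  # 2
print(round(box_count_dimension(line)))    # 1
```

A branching pattern such as the tree example would yield a fractional value between these two extremes, which is what “high fractal dimensionality” refers to.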

Another quantifiable image property linked to people’s preferences is so-called image self-similarity. Self-similarity and fractal dimensionality are related constructs but are computed differently. They follow a similar logic in that both compare a cut-out portion of a reference image to the reference image itself, and then take the cut-out portion as the next reference image. Self-similarity can be conceived as an objective measure of complexity. In that sense, this line of research walks in the footsteps of Berlyne’s ideas. However, it also faces the same problem that Berlyne did. On the one hand, it aims to measure object properties objectively and relate them to people’s aesthetic responses. On the other hand, it also wants to relate these objective measures to the immediate subjective impressions they evoke. In the case of self-similarity, researchers are interested in how well self-similarity metrics map onto subjectively perceived complexity, while at the same time using self-similarity as an ‘objective’ complexity metric to be related to subjective aesthetic evaluations. Neither of these relationships (self-similarity to complexity; self-similarity to aesthetic ratings) is a perfect one. Thus, the question of how all possible associations (self-similarity to subjective complexity, self-similarity to subjective rating, subjective complexity to subjective rating) work together, and what portion of the aesthetic response can truly be attributed to the objective measure alone, remains open.

Other scholars are less concerned with objective stimulus properties and, instead, focus on the relation between different subjective impressions of the same stimulus. Coming back to the example of Berlyne’s hypothesis: Stimulus complexity is omitted (or merely introduced as a means of plausibly altering arousal), and the main relation of interest is the one between subjective arousal and subjective pleasure or liking. Studies that investigate exactly this relation between subjective arousal and aesthetic pleasure have overall been unable to support Berlyne’s claim that intermediate arousal causes the greatest aesthetic pleasure.

However, other relationships between aesthetic ratings have proven stable. Pleasure, liking, and beauty ratings are so closely related to one another that differentiating between them is, empirically speaking, close to impossible. Research on people with a self-reported impairment in the ability to feel pleasure (anhedonia) additionally shows that people who cannot feel pleasure in general, or in response to music, are also much less likely to experience beauty from images or music, respectively. This strong link between pleasure and beauty has also been taken as a further argument for the claim that (hedonic) aesthetic responses are indistinguishable from other hedonic responses (see Neuroaesthetics below).

Study results like the ones above draw a picture of the population average. However, researchers are increasingly aware of, and interested in documenting, individual differences in aesthetic judgments. They quantify the relative contributions of individual versus shared taste by asking a large number of observers to rate the same set of images, at least some of which are rated several times by the same observer. In this way, they determine what fraction of a rating can be predicted at all (factoring out inconsistencies of the same observer rating the same image) and what fraction of that can, in turn, be predicted from others’ ratings (shared taste) or not (individual taste). The contribution of shared taste has so far appeared smaller than that of individual taste, amounting to about 50% for face attractiveness and a mere 10% for the liking of abstract art.
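
The partitioning logic described above can be sketched with simulated ratings, where each observer’s rating is modeled as a shared component plus an individual component plus trial-to-trial noise. All numbers are invented and nothing here reproduces a specific study; the sketch only shows how test-retest consistency bounds the predictable part of a rating, while cross-observer agreement isolates the shared part.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_img = 50, 200
shared = 0.5 * rng.standard_normal(n_img)         # taste shared by all observers
individual = rng.standard_normal((n_obs, n_img))  # each observer's private taste

def rate(noise_sd=0.5):
    """One rating session: stable taste plus trial-to-trial noise."""
    return shared + individual + noise_sd * rng.standard_normal((n_obs, n_img))

r1, r2 = rate(), rate()                           # two sessions, same observers

# Test-retest consistency: what fraction of a rating is predictable at all.
consistency = np.mean([np.corrcoef(r1[i], r2[i])[0, 1] for i in range(n_obs)])

# Shared taste: how well one observer's ratings track the others' mean rating.
shared_agreement = np.mean([
    np.corrcoef(r1[i], np.delete(r1, i, axis=0).mean(axis=0))[0, 1]
    for i in range(n_obs)
])

print(consistency > shared_agreement)  # True: much of the taste is individual
```

The gap between the two numbers is what licenses the conclusion that individual taste contributes more than shared taste for stimuli such as abstract art.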

Thus, inter-individual differences in aesthetic responses are prominent. The different explanations for their occurrence can all be summarized under one common concept: prior exposure. The effects of prior exposure to a certain kind of stimulus are sometimes studied in the form of expertise within a certain art realm. Such studies compare the responses of, for example, architects and laypersons to photographs of buildings. Another way of studying the effects of prior knowledge is to compare how people perceive and evaluate art from their own culture versus a foreign culture. Prior knowledge can also be experimentally introduced by showing the same observer the same image(s) repeatedly and thus making her more and more familiar with the stimulus set. Like the popular claim that the golden ratio is preferred, the notion of a “mere exposure” effect (that repeatedly presented stimuli are liked better) has not found consistent empirical support. A meta-analysis pooling results from more than 250 studies suggests that exposure increases positive evaluations, among them liking and beauty, up to a point after which further exposure becomes detrimental. Across all studies, this point seems to occur after about 30-40 exposures, but the number varies depending on the kind of evaluation, other experimental details, and the kind of population studied.

Beyond the concern of understanding inter-individual differences, one of the big goals of empirical aesthetics remains to find processing principles that do generalize across all observers. To some extent, this kind of thinking was already present in Berlyne’s early writings when he posited that intermediate subjective arousal leads to the highest pleasure. Whilst this connection between subjective arousal and pleasure has not found consistent support, the notion of pleasure is still central in the quest of finding a general processing principle in empirical aesthetics. Intense pleasure is associated with intense beauty, high liking, and greater preference. This is true not only in the visual but also in the auditory, gustatory, and tentatively even the olfactory domain. Studies that assess anhedonia, the inability to experience pleasure, likewise find that this absence of pleasure leads to impoverished beauty judgments.

The great majority of these findings (as well as the ones reported below) were obtained in a laboratory setting or online. That means that people experienced images, sounds, and other objects in a highly controlled setting or on their own devices, almost always via a screen. For those scholars who are primarily interested in people’s responses to art, these settings pose considerable concerns about ecological validity: can we really infer how someone will experience seeing a grand master’s work in real life from her responses to a miniature replica on a screen in a little cubicle? An entire line of research tries to answer this question and identify the similarities and differences between how people perceive and evaluate art in the laboratory versus in museums or at live performances.

b. Neuroaesthetics

Neuroaesthetics is different from other subfields of empirical aesthetics in terms of its methodology, not its subject. Neuroaesthetics is the science of the neural correlates of aesthetic experiences and behavior, that is, the brain activity and structures associated with them. Researchers use a variety of tools to measure brain activity: functional magnetic resonance imaging (fMRI); electroencephalography (EEG); magnetoencephalography (MEG); and more. In addition, diffusion tensor imaging (DTI) can provide insights into the strength of the anatomical connections between different areas of the brain. Due to its relatively poor temporal resolution compared to EEG and MEG, fMRI is predominantly used with static objects, like images. In contrast, EEG and MEG methods are popular amongst researchers who are interested in stimuli that dynamically change over time, such as music and film. Neuroaesthetics has also begun to use non-invasive methods for stimulating and suppressing brain activity, such as transcranial direct-current stimulation (tDCS).

One of the best-supported findings from neuroimaging studies in aesthetics is that the experience of intensely pleasurable or beautiful objects increases activity in the reward circuitry of the brain, most notably the orbitofrontal cortex (OFC). Even though different studies with varying kinds of objects presented—such as music, paintings, stock images—find slightly different patterns of brain activations, increased activation in the OFC is observed in the vast majority of studies. This finding is of great significance because the same network of brain regions is active during the anticipation or reception of various other kinds of rewards, like food and money, too.

There is one line of studies that does point towards a difference between intensely pleasurable art experiences and other kinds of rewarding experiences. Edward Vessel and his colleagues find that when people view the most aesthetically moving art images, areas of the brain that are part of the “default mode network” (DMN) are activated. The DMN is usually associated with not being engaged with a stimulus or task, and hence, in the absence of another object, with self-reflection. The co-activation of perceptual-, reward-, and default-mode-regions is therefore unusual. According to these researchers, it is the best contender for explaining what makes aesthetic experiences special. This claim has to be taken with a grain of salt; they have yet to show that this co-activation does not occur during highly moving, non-aesthetic experiences.

Neuroaesthetics is, in principle, also concerned with the changes in the different chemical substances involved in brain function, so-called neurotransmitters, that are associated with aesthetic experiences. In practice, inferences about the contribution of neurotransmitters are only rarely possible from the data, and direct manipulations of their concentrations are even rarer.

c. Music Science

The study of music (and other sounds) in empirical aesthetics deserves separate mention from the research that concerns vision and other sensory modalities. While research on aesthetics in all but the auditory domain is often published and discussed in general outlets, research on music has a number of dedicated journals, such as Psychomusicology. Psychological theories of aesthetics also tend to focus on static stimuli, neglecting many of the variables of interest for those primarily interested in music, specifically those related to dynamic changes of the stimulus over time.

It is most likely because music lends itself to studying changes of percepts over time that the idea of prediction and prediction errors is most prominently present in music science compared to other specialty fields of empirical aesthetics. The intuition is the following: a sequence of tones is liked if the next tone sounds subjectively better than the tone the listener had anticipated. The discrepancy between the expected and the actually perceived pleasure (or reward) of an event has been termed “reward prediction error” in reward learning theories. However, this reward prediction error is not the only one being discussed in music science. Some researchers have shown that ‘factual’ prediction errors, too, can predict how much one likes a sequence of tones. Here, the intuition is that a sequence of tones is liked if the listener is able to make a reasonable prediction about the next tone but is nonetheless surprised by the actually presented tone. From this point of view, people like musical sequences that elicit low uncertainty but at the same time relatively high surprise. Of note for this line of research is that the quantification of its core measures, uncertainty and surprise, can be automated by an algorithm first introduced in the mid-2000s: the Information Dynamics of Music (IDyOM) system provides a statistical learning model that can calculate both uncertainty and surprise scores for each note in a series of standardized musical notes. Its application in contemporary studies has provided results that are in line with a prediction-error account of aesthetic pleasure.
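
The two quantities at the heart of this approach can be illustrated with a toy first-order Markov model. This is emphatically not IDyOM itself; the melody, the smoothing, and the function names are invented. Surprise is the negative log probability the model assigned to the note that actually occurred, and uncertainty is the entropy of the model’s predictive distribution before hearing it.

```python
import math
from collections import Counter, defaultdict

def train_bigram(melody):
    """Count how often each note follows each other note."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(melody, melody[1:]):
        counts[prev][nxt] += 1
    return counts

def predictive_distribution(counts, context, alphabet):
    """p(next note | previous note), with add-one smoothing."""
    total = sum(counts[context][n] + 1 for n in alphabet)
    return {n: (counts[context][n] + 1) / total for n in alphabet}

def surprise_and_uncertainty(counts, context, actual, alphabet):
    dist = predictive_distribution(counts, context, alphabet)
    surprise = -math.log2(dist[actual])                       # -log p(actual)
    uncertainty = -sum(p * math.log2(p) for p in dist.values())  # entropy
    return surprise, uncertainty

melody = ["C", "D", "E", "C", "D", "E", "C", "D", "E", "C", "D"]
alphabet = ["C", "D", "E"]
model = train_bigram(melody)

# After "D" the model strongly expects "E": low surprise for "E",
# high surprise for the unexpected "C", at the same (low) uncertainty.
s_expected, u = surprise_and_uncertainty(model, "D", "E", alphabet)
s_unexpected, _ = surprise_and_uncertainty(model, "D", "C", alphabet)
print(s_expected < s_unexpected)  # True
```

On the view sketched in the text, a listener would enjoy passages where uncertainty is low but the surprise of the actually played note is relatively high.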

Music science is also a special field because there is a unique condition called musical anhedonia. People with musical anhedonia do not feel pleasure from music even though they are able to enjoy other experiences, like food, sex, or visual art, and have normal hearing abilities. This very circumscribed condition has enabled researchers to tackle a number of questions about the development, neural basis, and purpose of the human ability to produce and enjoy music. Contemporary insights from this line of research suggest that the functional connection between brain regions that guide auditory perception and regions that are associated with processing rewards of all kinds is key for music enjoyment. The picture is complicated by the fact that a disruption of this connectivity can lead not only to musical anhedonia but also to an extremely high craving for music, so-called “musicophilia.”

Music science is not marked only by its interest in understanding and predicting what kind of music people like. A considerable research effort also goes into understanding whether and how music can communicate and elicit a wide range of emotions, from happiness to anger to sadness. It is a matter of debate whether music elicits the same kind of emotions in the listener as the emotional events that typically provoke them. An additional point of controversy is whether certain structural properties of music, such as the key it is played in, are consistently associated with the identification or experience of a certain emotion. What does seem clear, however, is that certain pieces of music are consistently rated as sounding like they express a particular emotion or even a particular narrative. At the same time, playing specific musical pieces has been shown to be an effective means of changing people’s self-reported happiness, sadness, or anger. The idea that music can serve as a tool for emotion regulation, either by evoking or mitigating emotions in the listener, forms the core of many practical applications of music science.

A last phenomenon that deserves special mention within the realm of music and emotion is the so-called “sad-music paradox”: people report that they like to listen to sad music when they are already in a sad mood themselves. This apparent contradiction between the hedonic tone the music expresses (sad, negative) and the judgment of the listener (liked, positive) poses a problem for those claiming that music elicits in its listener the same genuine emotion that it expresses. The question of why people report liking sad music when sad has yet to be answered. It is worth noting, though, that little research has to date recorded the actual emotional responses of sad people listening to sad versus neutral or happy music.

d. Other Subfields

Even though the three areas of concentration mentioned above represent the majority of the research on aesthetics, a few more distinct areas deserve to be mentioned.

One art form that should not go unmentioned is literature. Aesthetic responses to text are less frequently studied in empirical aesthetics than responses to images or music, even though some scholars occasionally use short poems, typically haikus, in their studies. The bias towards very short poetic forms reveals one of the main reasons why the study of aesthetic responses to literature is not as common: it takes time to read a longer passage of text, and empirical scientists ideally want their participants to experience and respond to a large number of different objects. Overall, there is little data available on who likes what kind of written words and why. Arthur M. Jacobs has nonetheless developed a theory of aesthetic responses to literature, the “Neurocognitive Poetics Model.” It suggests that literary texts can be analyzed along a 4 x 4 matrix, crossing 4 levels of the text dimension (metric, phonological, morpho-syntactical, and semantic) with 4 levels of author- and reader-related dimensions (sub-lexical, lexical, inter-lexical, and supra-lexical). Scholars who focus on the investigation of literature are also the ones who, in empirical aesthetics, come closest to addressing the paradox of fiction. The most common way of thinking about it in the field is that readers (or listeners or viewers) empathize with fictive characters and do experience genuine emotions during their aesthetic experience, much as if they were witnessing the characters having the depicted experiences in real life. There is some evidence from neuroimaging studies showing that brain structures involved in genuine emotional responses are also involved in processing fictive emotional content. At the same time, it is often presumed that people are nonetheless aware of the fact that they are emotionally responding to fiction, not reality, and that this allows them to distance themselves from any negative responses. Winfried Menninghaus and his colleagues have developed this line of thought into the “distancing-embracing model of the enjoyment of negative emotions in art reception.”

The second art form that deserves mentioning but suffers from being less frequently studied is dance. There have been relatively few studies that experimentally investigate which kinds of dance movements elicit which kinds of aesthetic responses, potentially because producing well-controlled variations of dance sequences is labor-intensive and does not offer the same degree of experimental control as, for instance, the manipulation of static images. There are, however, efforts to link the relatively small and isolated study of dance to the rest of empirical aesthetics, for instance by relating the variability of the velocity of a dancer’s movements, as a measure of complexity, to the aesthetic pleasure the dance sequence elicits.

Finally, movies constitute a related, complex, and dynamic class of stimuli, and they accordingly suffer from the same scarcity of data as dance. Even though some scholars have used sections of movies to study correlations between people’s neural responses to identical but complex stimuli, we know relatively little about people’s aesthetic responses to movies. Self-report studies indicate that aesthetic evaluations of movies are highly idiosyncratic and that laypeople do not seem to agree with critics.

A separate area of research that can fall under the umbrella of empirical aesthetics, broadly conceived, is that of creativity and, closely related to it, the study of art production. This field is interested in creativity both as a personality trait and as an acquired skill or temporary act. How far creativity can be viewed as a stable trait of a person is one of the questions of interest to the field.

3. Relations to Other Fields

a. Relations to Other Fields of Psychology and Neuroscience

Like those of most areas of psychology, the boundaries of empirical aesthetics are porous. Aesthetic judgments become a subject of social psychology when they concern people. Aesthetic experiences become a subject of affective psychology when conceived as emotional responses. Evolutionary psychology perspectives have been used to explain various aesthetic preferences and why people create art. Aesthetic preferences have an undeniable effect on decision making. The list can be extended to include any number of subsections of psychology.

One connection between empirical aesthetics and other fields that has received particular emphasis is the link to general neuroscience research on reward processing, learning, and decision making. This link became apparent with the first fMRI studies on aesthetic appreciation, which showed that the same brain regions involved in processing various other kinds of rewards are active while experiencing aesthetically pleasing, beautiful stimuli. Nonetheless, it is rare for decision-making studies to use aesthetically pleasing objects as rewards. One major hurdle that seems to prevent the integration of aesthetically pleasing experiences, as one kind of reward, into the fields of decision-making and reward-learning is that the reward obtained from an aesthetic experience has yet to be expressed in a mathematically tractable way.

Consumer psychology is a field that has established close ties with empirical aesthetics. Product appearance and other aesthetic qualities matter when it comes to our decisions about which products to buy. At first glance, consumer research also seems to tie empirical aesthetics to the field of neuroeconomics and decision-making. In practice, however, there is a marked difference between the application-oriented experimental research that dominates consumer psychology and the theory- and model-oriented research in neuroeconomics and decision-making. Consumer psychology aims to document the effects of certain aesthetic variables in well-defined environments and has thus provided evidence that aesthetic variables do influence decision-making in environments with relatively high ecological validity. Neuroeconomics and the study of decision-making from the cognitive and computational psychology perspective, in contrast, aim to develop mathematically precise models of decision-making strategies with a particular interest in optimal strategies. This focus necessitates tightly controlled laboratory settings in which aesthetic variables are omitted.

Empirical aesthetics also has a natural connection to social psychology when it comes to the aesthetic evaluation of people, specifically in terms of attractiveness. The implications of such subjective impressions about a person had been well documented even before the re-emergence of aesthetics as a part of psychology. The “what is beautiful is good” heuristic and the halo effect are the best-known findings from this line of research. Attractiveness research has partly been conducted without much regard to empirical aesthetics, rather as a natural extension of person evaluation and face perception. However, attractiveness is also increasingly studied by scholars primarily interested in empirical aesthetics. In this sense, the study of facial attractiveness represents a two-way connection between empirical aesthetics and social-cognitive psychology.

Empirical aesthetics connects with evolutionary psychology in at least two major ways. One, evolutionary theories are popular for explaining aesthetic preferences for natural environments, human bodies, and faces. This line of investigation ties together empirical aesthetics, social psychology, and evolutionary psychology with regard to human and non-human mate choice. Two, arguments against the existence of a specialized ‘aesthetic circuitry’ in the brain also rest on considerations of evolutionary implausibility: it seems unlikely that such a circuit would have evolved when existing reward systems could perform the same functions.

b. Relationship to Computational Sciences

Even though it has become increasingly hard to draw a line between computer science, computational (cognitive) psychology, and neuroscience, it is worth mentioning computational approaches separately from other frameworks in psychology and neuroscience when it comes to empirical aesthetics.

For one, attempts to use machine learning algorithms to classify objects (again, mostly images) based on their aesthetic appeal to humans are by no means always related to psychological research. More broadly, algorithms that create aesthetically appealing images were developed for commercial purposes long before scientists started to systematically harness the tools of machine learning to study aesthetics.

Second, a dominant framework for how to think about the engagement with art and beauty emerged in the computational sciences in the 1990s, arguably before mainstream psychology and neuroscience had turned computational. As briefly discussed in the History section, this framework was most prominently popularized by Juergen Schmidhuber.

Third, computer scientists have an additional interest not only in understanding the appreciation and creation of aesthetic experiences by humans, but also in striving to ‘artificially’ create them. Deep-learning algorithms have become famous for beating master players of chess and Go. Now, computer scientists also try to use them to create art or to apply certain art styles onto other images.

c. Relationship to Aesthetics in Philosophy

Like any science, empirical aesthetics is deeply rooted in its philosophical precedents. Even more so than in other sciences, scholars in empirical aesthetics acknowledge this in their writing. Theories and experiments are sometimes partially based on classic philosophical perspectives on aesthetics. Contemporary aesthetic philosophy, however, is rarely mentioned. This disconnect between modern empirical and philosophical aesthetics is mostly due to the fact that the scope of empirical aesthetics remains close to the narrow, older definition of aesthetics as “theory of beauty” and art, whereas aesthetics in modern philosophy has shifted its focus towards new definitions of art, expression, and representation.

During the early days of empirical aesthetics, and psychology as a science in general, the divide between psychology and philosophy used to be less pronounced. Ernst Gombrich and Rudolf Arnheim, for instance, have influenced both fields.

In stark contrast to their philosopher colleagues, psychologists and neuroscientists are largely uninterested in what counts as “art” and what does not. They are also much less concerned with the normative definition of the aesthetic responses they intend to measure. This core difference between the psychological and philosophical approaches to aesthetics is rooted in the diverging goals of the two fields: empirical aesthetics aims to describe what people deem an aesthetic experience, whereas most philosophers seek to prescribe what a proper aesthetic experience should be.

When it comes to aesthetic emotional responses, the divide between philosophy and psychology lies on a different level. While the paradox of fictional emotions leaves many philosophers doubting that one does indeed show true emotional responses to fictional texts or music, psychologists rarely doubt that people experience emotions when engaging with art in whatever form. On the contrary, a lot of psychological research on emotions relies on the presumption that texts, music, movies, and so on do elicit genuine emotions, since those media are used to manipulate an observer’s emotional state and then study it. And while philosophers question the rationality of these emotions, presuming they exist, psychologists do not ask whether an emotional response is rational. Again, the contrast between philosophy and psychology seems to originate from their different approaches to people’s self-reports. Psychologists take those self-reports, along with neural and physiological data, as evidence about the state of the world and as their subject of study. Philosophers question how far these self-reports reflect the concept in question. Importantly, however, both philosophers and psychologists still ponder a related question: Are aesthetic emotions special? Or are they the same emotions as any others that just happen to have been elicited by an aesthetic experience?

The historically important link between morality and aesthetics, especially beauty, in philosophy is rarely made in psychology. Apart from the above-mentioned “what is beautiful is good” phenomenon, psychologists do not usually link aesthetic and moral judgments. However, there is evidence that facial and moral beauty judgments are linked to activation in overlapping brain regions, specifically the orbitofrontal cortex (OFC). It should be noted, though, that this orbitofrontal region is the same one that is more generally implicated in encoding the experienced or anticipated pleasure of objects of many different kinds. In addition, the so-called “Trait Appreciation of Beauty” framework proposed by Rhett Diessner and his colleagues explicitly contains a morality dimension. This framework is, however, not widely used.

The topic of aesthetic attitude is viewed from very different angles by psychologists and philosophers. In experiments, psychologists use instructions to elicit a change in their participants’ attitudes in order to study the effect of such a change on how objects are perceived. They, for instance, tell people to judge an image’s positivity either as part of a brochure on hygiene behavior or as a piece of abstract art. Research along these lines has found that people do indeed change their evaluations of images depending on whether an image is presented as art or non-art. Neuroaesthetics studies have also investigated whether neural activation during image viewing changes depending on the instructions that are given to people, such as to look at the images neutrally or with a detached, aesthetic stance. These studies have indeed uncovered differences in how and when brain activity changes under these different instructions.

Psychology has so far stayed silent on the issue of whether aesthetic representations, such as art, can contribute to knowledge. With the exception of some research on the effect of different modes of representing graphs in the context of data visualization, there does not seem to be an interest in exploring the potential contribution of aesthetic factors to learning. However, the inverse idea—that the potential for learning may be a driving force for seeking out aesthetic experiences—seems to be gaining some traction in the empirical aesthetics of the 21st century.

It is worth noting, too, that some philosophers combine theories developed in the philosophical tradition with experimental methods. Some of these philosophers conduct this empirical philosophy of aesthetics in collaboration with psychologists. This kind of collaboration is in its infancy but shows promise similar to that of comparable collaborations between moral philosophy and moral psychology.

4. Controversies

a. Is Aesthetics Special?

The major controversy in empirical aesthetics concerns the very core of its existence: Is there anything special about aesthetic experiences and behaviors that distinguishes them from others? For example: Is the pleasure from looking at the Mona Lisa any different from the pleasure of eating a piece of chocolate? Some scholars argue that the current data show that the same neural reward circuit underlies both experiences. They further argue that it would be evolutionarily implausible for a special aesthetic appreciation network to have evolved, and that there is evidence that even non-mammalian animals exhibit behavior that can be classified as signaling aesthetic preferences. Scholars who take the opposing view argue that aesthetic experiences do have properties that distinguish them from other pleasant experiences, notably that they can also include unpleasant sensations such as fear or disgust. They also point out the symbolic and communicative function of art, which goes beyond the mere evocation of aesthetic pleasure.

b. What Should Empirical Aesthetics Mainly be About?

Empirical aesthetics as a field is far from having reached a consensus about its identity. At the center of an ongoing tension within the field is the relative balance between a focus on the arts, including all kinds of responses associated with them, and a focus on aesthetic appreciation, including all kinds of objects that can be aesthetically appreciated. It is therefore unsurprising that most research in the past has occurred at the intersection of both topics, which is to say that it has dealt with aesthetic preferences for artworks or for sensory properties that can at least be considered fundamental properties of artworks.

At the two extremes, scholars criticize each other for presenting data that are irrelevant to the field. Proponents of an empirical aesthetics of the arts criticize studies that use stock photographs or image databases like the International Affective Picture System, because these kinds of objects supposedly cannot elicit genuine aesthetic responses. Proponents of an empirical aesthetics of appreciation in general criticize studies that use only a narrow selection of certain artworks, because these supposedly cannot generalize to a broad enough range of experiences to yield significant insights.

c. Population Averages vs. Individual Differences

Another big controversy has accompanied the field since its early beginnings: Should we study population averages or individual differences? This question arises within almost any field in psychology, but it has created a particularly marked division of research approaches within empirical aesthetics. Those studying population averages and object properties criticize the other side’s subjective measures as fundamentally flawed. Those focusing on individual differences point out that object properties can often account for only a small proportion of the observed responses.

Most contemporary researchers still operate on the level of understanding and predicting average responses across a pre-defined population, mostly Western, educated, rich populations in industrialized democracies. In contrast to Berlyne, however, this choice is often not based on the conviction that this approach is the best one. It is, instead, often the only feasible one given the amount of data that can be obtained from a single participant. The fewer data points per participant, the less feasible it is to make substantial claims about individual participants. The very nature of aesthetic experiences and responses—that is, that an object needs to be experienced for a certain time; that judgments may not always be made instantaneously; that one cannot make repeated, independent judgments of the same object; that aesthetic judgments may be compromised as the experiment progresses due to boredom and fatigue; that one cannot assume stability of aesthetic responses over longer delays of days or even hours—complicates the collection of many data points for a single participant.

Still, a paradigm shift seems to be taking place, slowly. In the early 21st century, more and more studies have at least reported to what extent their overall findings generalize across participants, or to what extent aesthetic judgments were driven by individual differences versus common taste. In addition, some have reported stable preferences for certain stimulus characteristics across modalities or object kinds within a given participant.

5. References and Further Reading

  • Berlyne, D. E. (Ed.). (1974). Studies in the New Experimental Aesthetics: Steps toward an Objective Psychology of Aesthetic Appreciation. Washington, D.C.: Hemisphere Publishing Corporation.
  • Brielmann, A. A., and Pelli, D. G. (2018). Aesthetics. Current Biology, 28(16), R859-R863.
  • Brown, S., Gao, X., Tisdelle, L., Eickhoff, S. B., and Liotti, M. (2011). Naturalizing aesthetics: brain areas for aesthetic appraisal across sensory modalities. Neuroimage, 58(1), 250-258.
  • Chatterjee, A., and Vartanian, O. (2014). Neuroaesthetics. Trends Cogn. Sci., 18, 370–375.
  • Fechner, G. T. (1876). Vorschule der Aesthetik. Leipzig: Breitkopf & Hartel.
  • Graf, L. K. M., and Landwehr, J. R. (2015). A dual-process perspective on fluency-based aesthetics: the pleasure-interest model of aesthetic liking. Pers. Soc. Psychol. Rev., 19, 395–410.
  • Ishizu, T., and Zeki, S. (2011). Toward a brain-based theory of beauty. PLoS ONE, 6, e21852.
  • Leder, H., and Nadal, M. (2014). Ten years of a model of aesthetic appreciation and aesthetic judgments: the aesthetic episode — developments and challenges in empirical aesthetics. Br. J. Psychol., 105, 443–446.
  • Montoya, R. M., Horton, R. S., Vevea, J. L., Citkowicz, M., and Lauber, E. A. (2017). A re-examination of the mere exposure effect: The influence of repeated exposure on recognition, familiarity, and liking. Psychological Bulletin, 143(5), 459.
  • Nadal, M., and Ureña, E. (2021). One hundred years of empirical aesthetics: Fechner to Berlyne (1876–1976). In M. Nadal and O. Vartanian (Eds.), The Oxford Handbook of Empirical Aesthetics. New York: Oxford University Press, https://psyarxiv.com/c92y7/
  • Palmer, S. E., Schloss, K. B., and Sammartino, J. (2013). Visual aesthetics and human preference. Annu. Rev. Psychol., 64, 77–107.
  • Pelowski, M., Markey, P. S., Forster, M., Gerger, G., and Leder, H. (2017). Move me, astonish me… delight my eyes and brain: The Vienna integrated model of top-down and bottom-up processes in art perception (VIMAP) and corresponding affective, evaluative, and neurophysiological correlates. Physics of Life Reviews, 21, 80-125.

 

Author Information

Aenne Brielmann
Email: aenne.brielmann@tuebingen.mpg.de
Max Planck Institute for Biological Cybernetics
Germany

Doxastic Conservatism

We are creatures with clear cognitive limitations. Our memories are finite, and there is a limit to the kinds of things we can store and retrieve. We cannot, for example, remember the justification or evidence for many of our beliefs. Moreover, in response to our limited cognitive resources, we generally tend to maintain our beliefs and are reluctant to change them. A clear case of this psychological tendency to preserve beliefs obtains when people are informed of the inadequacy of the original grounds of their beliefs. Their reluctance to change their beliefs shows that they are sensitive to the fact that changing them incurs cognitive costs that strain their limited resources.

Certain views in epistemology have sought to put a rational gloss on this phenomenon of belief perseverance by suggesting the thesis of doxastic conservatism, according to which the fact that one believes a proposition provides some measure of justification for that belief. This initial picture has, however, become more complicated by further claims made on behalf of the thesis to the effect that it also has the potential to resolve certain outstanding problems in epistemology, such as how perception can be a source of reasons, skeptical worries about induction, and the problem of easy knowledge. Examination of these claims reveals that they involve more than one thesis of conservatism. Moreover, it appears that the epistemic role that is attributed to the conservative thesis is often played by superficially similar claims which derive their epistemic significance not from what the thesis regards as the source of justification but from other substantial properties that are attributed to beliefs. This article presents and examines some of the main accounts of the thesis of doxastic conservatism as well as the arguments that are suggested in their support.

Table of Contents

  1. Doxastic Conservatism: The Debate
  2. Varieties of Doxastic Conservatism
  3. Differential Conservatism
  4. Perseverance Conservatism
  5. Generation Conservatism
    1. Arguing for and Against GC
      1. The Transcendental Argument
      2. The Argument from Charity
    2. Modifying GC
      1. Conservatism as the Principle of Credulity
      2. Conservatism as Believing in the Absence of Defeaters
      3. Conservatism as a Dynamic Strategy
  6. Conclusion
  7. References and Further Reading

1. Doxastic Conservatism: The Debate

Doxastic conservatism refers to a variety of theses which, in different ways, emphasize the stability of one’s belief system by requiring the subject to refrain from revising his or her beliefs when there are no good reasons for a revision. We are all too familiar with the fact that our undeniable cognitive limitations restrict the set of things we can store or retrieve. Often, we lose track of the justification relations among our beliefs and the reasons behind them, which is why, as well-documented experiments have shown, we tend to retain many of our beliefs despite being informed of the inadequacy of their original grounds. Against this background of limited cognitive resources and the costs that the changing of one’s mind incurs, doxastic conservatism (DC) presents itself as a viable blueprint for regulating our belief-forming processes by recommending the adoption of those hypotheses that minimize the revision of our belief system, thereby ensuring its stability. The advocates of DC, however, do not limit its virtues to the minimization of such cognitive costs; they sometimes make ambitious claims on its behalf, highlighting its ability to serve a number of epistemological projects such as the justification of memory beliefs and the resolution of various skeptical problems (McGrath 2007, McCain 2008, and Poston 2012).

However, before proceeding to delineate the contours of the conservatism thesis as well as its purported virtues, an important terminological remark is in order. The conservative thesis that is the subject of this article is generally known in the literature under the rubric of epistemic conservatism. However, since there are several different theses in epistemology all using the same or a similar label, it is best to call it doxastic rather than epistemic conservatism to distinguish it from such theses as phenomenal conservatism and epistemic conservatism, the latter of which is found in the context of the liberalism (dogmatism)/conservatism debate. According to phenomenal conservatism (Huemer 2001), if it seems to you that p, then, in the absence of defeaters, you thereby have at least some degree of justification for believing p. When phenomenal conservatism is restricted to perceptual experience, the thesis is known as dogmatism (Pryor 2000). According to dogmatism (liberalism), perceptual experience (e) gives one justification to believe its content if it appears to one that p and one has no reason to suspect that any skeptical alternative to p is true. So, in this liberal theory, experience on its own can confer justification on the belief in its content. Against this liberal view stands the conservative view, notably defended by Crispin Wright (2004), according to which to be warranted or justified in holding a perceptual belief, we must have some antecedent justification or entitlement to believe certain fundamental presuppositions such as the existence of the world, the reliability of our perceptual system, and so on. With this word of caution out of the way, discussion of conservatism can proceed without the risk of confusing it with similarly labeled doctrines.

Doxastic conservatism has informed philosophical views as diverse as Quine’s and Chisholm’s. According to Quine (1951), belief revision must be subject to a number of conditions, most notably the overall simplicity of the resulting belief system and the need to preserve as many earlier beliefs as possible. For Chisholm (1980), meanwhile, conservative principles play an important role in the defense of epistemological foundationalism. Despite the popularity of the conservatism thesis, however, it is difficult to identify a single thesis as representing its content. Sometimes DC is presented as the claim that the holding of a belief confers some positive epistemic status on its content, sometimes it is said to regulate our decision to continue to hold a belief, and sometimes it is said to help us decide between a number of evidentially equivalent alternative hypotheses. Although it is easy to see a common motivation behind all these different versions of DC, one should not lose sight of their differences, because the considerations that are usually offered in their defense often involve different concerns. This article begins by distinguishing between three main varieties of doxastic conservatism, namely, differential, perseverance, and generation conservatism. It then examines the plausibility of the arguments given in their support. This investigation pays special attention to the alleged payoffs of DC. In particular, it asks whether it is DC, on its own, that has such epistemic potential or whether that potential results from other claims that superficially resemble DC.

2. Varieties of Doxastic Conservatism

As noted above, the thesis of doxastic conservatism actually covers a family of views that are all presented as conservative theses. These non-equivalent versions of conservatism differ from each other not only because their advocates often reject one version while upholding another, but also because the arguments that are put forward in their favor are actually tailored to defend one particular version to the exclusion of another. For example, Lawrence Sklar (1975) defends what he calls methodological conservatism, which guides a cognizer who comes to know of a hypothesis that is evidentially equivalent to the one that she has already adopted. However, Sklar rejects another version of conservatism, according to which holding a belief confers some measure of justification on that belief, for being too strong. It is, however, this latter version of conservatism, defended by Chisholm (1980), that is upheld as the standard version of doxastic conservatism.

Another exponent of conservatism, Gilbert Harman (1986), is more concerned with uncovering the principles that regulate our continuing to believe a proposition in the absence of contrary reasons. Although Harman sometimes appeals to the standard version of conservatism, on his official account conservatism is the view according to which “one is justified in continuing fully to accept something in the absence of a special reason not to” (1986, p. 46). Accordingly, to evaluate the thesis of epistemic conservatism, it would be more appropriate to begin by distinguishing the following types of conservative theses (Vahid 2004):

Differential Conservatism (DiC) One is justified in holding to a hypothesis (belief) despite coming to know of evidentially equivalent alternatives.
Perseverance Conservatism (PC) One is justified in continuing to hold a belief as long as there is no special reason to give it up.
Generation Conservatism (GC) Holding a belief confers some measure of justification on the belief.

When evaluating these theses for their epistemic worth, it should be borne in mind that some of the virtues mentioned in connection with their plausibility are pragmatic in nature. We are told that, due to our cognitive limitations, changing our mind for no good reason is a waste of our time, energy, and resources, and that, therefore, following the conservative principles would minimize such costs and save us from depleting our resources. Whatever the practical merits of conservatism, such virtues, on their own, due to their pragmatic nature, fail to ensure that epistemically warranted beliefs would result from adherence to the conservative principles as canons of theory choice. Thus, we need to see what exactly makes conservatism an epistemic, rather than a pragmatic, thesis, and this is best achieved by examining the merits of each conservative principle on its own.

3. Differential Conservatism

According to Sklar, “[a]ll [differential conservatism] commits us to is the decision not to reject a hypothesis once believed simply because we become aware of alternative, incompatible hypotheses which are just as good as, but no better than, that believed” (1975, 378). Sklar’s defense of DiC is a sort of transcendental argument involving what he calls an anti-foundationalist “local theory of justification.” Unlike the foundationalist theory of justification, in which basic beliefs depend on no other beliefs for their justification, the local theory takes epistemic justification to be relative to a body of assumed, unchallenged background beliefs. These background beliefs are supposed to play the role of evidence. They can play such a role, says Sklar, because their own status is not at the time under scrutiny. Such an account, however, is consistent with the existence of incompatible total belief structures all being locally justified. The only way to rule out this possibility is to invoke differential conservatism and hold on to what we already believe despite becoming aware of competing belief structures.

This argument, however, leaves one with a lacuna about the status of the background beliefs in the relevant belief structures. In order to confer justification on the target belief, these background beliefs must themselves be justified. However, with Sklar’s rejection of foundationalism, such beliefs can only acquire their epistemic worth either by cohering with the rest of one’s belief system or by relying on GC, according to which the mere holding of a belief confers a positive epistemic status on it. The latter option is not available to Sklar because he rejects GC for being too strong. The only alternative seems to be to adopt a coherence theory of justification and claim that the unchallenged background beliefs acquire their justified status by belonging to a coherent belief system. Collapsing the local theory into a holistic coherence theory of justification, though, renders DiC redundant. It is indeed no accident that, in defending the local theory, Sklar is forced to respond to the alternative coherent systems objection that usually arises for the coherentist accounts of justification. Either way, the transcendental argument is unsuccessful.

Considering the following scenario highlights another problem with DiC. Suppose two subjects, S1 and S2, faced with the task of explaining some body of data (e), come up with two incompatible hypotheses, H1 and H2 respectively, that can equally account for e. Both can be said to be justified in believing their respective hypothesis. Suppose, however, that S2 also learns or independently discovers that H1 equally accounts for e and, for some non-evidential, perhaps aesthetic, reason gives up his previous belief and instead comes to believe H1. By DiC, S2 should have stuck with H2, and his belief in H1 is therefore not justified. If so, we have a case here where two tokens of the same belief (H1) are based on the same grounds, but while one is justified, the other is not. This undermines the thesis of epistemic supervenience, according to which the justification of a belief supervenes on certain non-epistemic properties of that belief, such as being reliably produced, being adequately grounded, or being part of a coherent belief system. It is worth noting that this problem only affects DiC on non-permissivist views, according to which, for any body of evidence e and any proposition p, there is at most one kind of doxastic attitude towards p that is permitted by e.

One might object that since e equally justifies both H1 and H2, and the two hypotheses cohere equally well with S2’s beliefs, S2 is neither justified in believing H1 nor justified in believing H2 (McCain 2008). Of course, with these further stipulations about the justificatory role of coherence and the strength of evidence, S2 will not be justified. This would be a different conservative thesis (see below for discussion), however, and not the thesis that Sklar defends, which clearly states that one can retain one’s justification for believing a hypothesis despite coming to know of incompatible but evidentially equivalent hypotheses.

Finally, DiC may lack intuitive plausibility. Suppose a subject S comes to believe that H1 on the ground that it explains some data e. It is plausible to think that S’s awareness of a competing but equally explanatory hypothesis H2 should require some doxastic attitude adjustment vis-à-vis her belief that H1. The thought is that by finding out that H2 equally accounts for e and that H1 and H2 cannot both be correct, S, being a rational agent, should conclude that she may have assessed e inappropriately. Of course, this does not mean that S’s belief that H1 is false. However, given her fallibility, awareness of the second-order evidence regarding the possibility of her inappropriate assessment of e should prompt S to be more circumspect in her attitude towards H1. It follows that the rational credibility of this belief is thereby eroded to some extent. If S continues to come across further competing explanations of e such as H3, H4, and so forth, the cumulative impact of such evidence would become significant enough to substantially undermine the epistemic status of S’s original belief that H1. This conclusion is clearly at odds with DiC’s recommendation that S ought to stick with H1 regardless of the competing hypotheses that she may come across.

Marc Moffett (2007) has argued, however, that awareness of competing but equally explanatory hypotheses—underdetermination—does not constitute a defeater for holding on to our beliefs if we help ourselves to something like GC. In other words, if we accept that merely having a belief constitutes a prima facie justification for that belief, then it follows that if one believes that p at t, one should not abandon this belief at t unless one has adequate reason to do so. Accordingly, Moffett denies that knowledge that one’s belief is rationally underdetermined by evidence undermines our entitlement to that belief, for, given that underdetermination is a widespread phenomenon, we would otherwise be forced to adopt a theoretically neutral psychological standpoint in a great portion of our cognitive endeavors, which is implausible. Apart from the problem that this maneuver is at odds with Sklar’s rejection of GC, it is unclear whether such considerations constitute epistemic, rather than prudential or moral, reasons for holding on to beliefs.

4. Perseverance Conservatism

The driving force behind this version of conservatism, defended most notably by Harman (1986), has been the phenomenon of “lost justification” or “lost evidence.” As noted before, failure to keep track of our evidence, itself the result of our cognitive limitations, is usually taken to explain the so-called phenomenon of “belief perseverance in the face of evidential discrediting.” Experimental results have shown that people exhibit a psychological tendency towards the perseverance of their beliefs when apprised of the unreliability of the original source of those beliefs because they fail to recall that it was the discredited evidence that was initially responsible for their beliefs.

Harman discerns two competing theories of the rationality of belief perseverance, the foundations theory and the coherence theory. The former requires that one have a special reason to continue to hold a particular belief if one is to be justified in that belief, while the latter, by contrast, only requires the absence of any special reason to revise the belief in question for one to be justified to continue to hold it. Since the foundations theory requires that one keep track of one’s original reasons, Harman concludes that, in the face of the phenomenon of lost evidence, it is the coherence theory that is normatively correct, and the conservative thesis is simply the expression of its normative import. Although Harman sometimes understands conservatism as the thesis that “a proposition can acquire justification simply by being believed” (1986, p. 30), it is obvious that it is not GC that he has in mind. For if having a bare belief is to be sufficient for its justification, the phenomenon of lost evidence or justification would be rendered impossible since the belief itself would ground its justification.

Harman’s official account of conservatism, along the lines of PC, maintains that one is justified in continuing to hold a belief as long as there are no good reasons against it. Although PC can account for the rationality of belief perseverance, alternative explanations of such rationality can undermine its credibility. Before addressing this issue, it is worth considering an objection made to Harman’s argument from lost justification by David Christensen (1994), as it would further clarify what PC really involves. Christensen thinks that one can explain the phenomenon of belief perseverance without appealing to any conservative principle. Suppose, for example, that I currently hold the belief that the population of India exceeds that of the United States, though I have forgotten what the source of my belief was. To show that conservatism need not play any role in explaining the rationality of this belief, Christensen offers what he takes to be a similar case, where it is completely implausible to invoke any conservative principle. Suppose I flip a coin which lands out of my sight, and I decide to believe that it has landed tails up without checking to see whether it has. It is obviously implausible to take that belief as justified, but it seems this is what conservatism invites us to do.

Despite their structural similarity, the two cases, says Christensen, differ in the following respect: “In both cases, I have a belief for which I am unable to cite any grounds…Yet in one case, maintaining the belief seems quite reasonable; while in the other… unreasonable” (1994, 74). Christensen claims that one cannot explain this difference in terms of the applicability of conservatism in one case and its inapplicability in another. Rather, it is to be explained by the role that background beliefs play in the two cases. In the India example, I have some background beliefs—for example, that I have acquired the belief from a reliable source, that despite India and the United States being favorite topics in my family, I have never been contradicted, and so forth—that convince me that my belief is correct. Moreover, it is precisely such beliefs that maintain the rationality of my continuing to hold the belief about India’s population. However, no similar beliefs are present in the coin example, which is why I am not justified in holding the belief that I do.

Vahid has, however, criticized this argument on several grounds (2004). It could be, for instance, that the coin example and the India example are only superficially similar. It is true that, in both cases, I have a belief for which I can no longer recall any evidence, but this is true for different reasons in the two cases. In the India case, I have forgotten the original source of my belief, but in the coin case there is simply no reason to report. So the coin case is not really an instance of the phenomenon of forgotten evidence. More damagingly, even the India example does not seem to be a case of forgotten evidence. For what seems to be forgotten in that example is merely the name or identity of the source of my belief, and that is irrelevant to the question of the rationality of continuing to hold that belief. After all, Christensen himself maintains that among the things I know in this case is that “I was once told [about India’s population] by… some… reliable source, and I (quite rationally) accepted the source’s word for it” (1994, p. 73). To put it differently, the evidence that I have forgotten in this example concerns the identity of the source of my belief that India is more populous than the United States. However, as far as the belief itself is concerned, I have enough information to render it justified.

As noted above, PC’s credibility in accounting for the rationality of belief perseverance can be undermined if there are alternative explanations of the phenomenon of lost justification. Here is one alternative explanation (Vahid 2004). Suppose we take the property of justification to be an objective property that beliefs possess when they are adequately grounded. It is also customary in the epistemology literature to distinguish between “being justified” or “having justification” and the “activity of justification” (Alston 1989). The idea is that just as one can be good, say, without being able to show that one possesses that virtue, one’s belief can have (and retain) the property of justification if it was initially based on adequate grounds, and there are currently no defeaters, without one being able to show that it is justified. With these distinctions in force, one could say that what the phenomenon of lost justification threatens is not the justification of one’s belief, but one’s ability to show that one is justified in holding that belief. The plausibility of this explanation depends, however, on the fate of some of the contentious issues in the internalism-externalism debate in epistemology.

5. Generation Conservatism

This section presents what is generally regarded as the standard version of conservatism, namely, GC. Given its mainstream status, GC has been more extensively discussed than the other versions of conservatism. Unlike the other two versions of conservatism, where what is at issue is the epistemic status of belief when one has lost track of its grounds or when one has been apprised of the evidentially equivalent competing hypotheses to that belief, generation conservatism (GC) is concerned with whether the very formation of a belief bears on its epistemic status. As Chisholm characterizes GC, the principle says that “anything we find ourselves believing may be said to have some presumption in its favor—provided it is not explicitly contradicted by the set of other things that we believe” (1980, pp. 551-552). Despite being the mainstream version of conservatism, GC also turns out to be its most controversial version.

There is no doubt that GC, if true, is a powerful epistemic tool for resolving a number of standing problems in epistemology, such as the problem of skepticism, the puzzle over the epistemic status of memory beliefs, and others. It has also been put to use to address a number of other challenges, like the problems facing internalism. For example, Smithies (2019) has argued for what he calls “phenomenal mentalism,” according to which epistemic justification is determined only by our phenomenally individuated mental states, which include not only our conscious experiences but also our consciously accessible beliefs. More specifically, he defends a synchronic version of phenomenal mentalism, which takes epistemic justification to be determined by the phenomenal states you have now. This particular version of internalism is, however, vulnerable to the problem of forgotten evidence, as when a subject no longer remembers the ground of her justified belief (Goldman 1999). Along with other authors (McGrath 2007), Smithies thinks that GC provides a neat solution to this problem.

GC has also been the subject of many criticisms, including the “boost” and “conversion” objections as well as worries about arbitrariness (for the latter, see below). The conversion problem says that when one has adequate evidence for two contrary hypotheses H1 and H2, one should withhold judgment. However, according to GC, by believing either of the hypotheses, the subject can convert her evidential situation from unjustified to justified, which is unacceptable. According to the boost problem, GC allows a subject to boost her justification for believing a proposition (p) by simply forming the belief. Suppose that S has evidence that supports p to some degree n. If GC is true, S can boost this support relation by simply believing that p. Following Feldman (2014), McCain (2020) has responded to this objection by rejecting the “additivity of evidence” principle, which says that if an agent S acquires new evidence that supports p while retaining any old evidence, then, in the absence of defeaters for p, S becomes better justified in believing p. The argument against this principle appeals to the possibility of redundant evidence; that is, evidence that makes no difference to levels of justification. Accordingly, there can be cases where S acquires new evidence for p without becoming better justified in believing p (Feldman 2014).

Another objection is that GC seems to conflict with causal accounts of the basing relation (Frise 2017). The obtaining of the basing relation is what marks the transition from propositional to doxastic justification, and it is widely believed that causation must play a role in any viable account of the basing relation. The problem is that since a belief cannot cause itself at a time, the beliefs that GC claims are justified fail to satisfy the causal requirements for the basing relation (see McCain 2020 for a response). Finally, there is also the argument that GC lacks intuitive plausibility. The idea is that it seems difficult to see how the bare fact of believing a proposition could confer justification on that belief. To get a sense of this unease, consider Christensen’s coin example again. I flip a coin which lands out of my sight, and I form the belief that it has landed tails up without bothering to see if this is the case. GC seems to rule that this belief is justified, which is implausible. A response is that belief makes an epistemic contribution only when the subject lacks evidence for or against it (Poston 2012). If this is correct, though, it follows that a subject who believes that p in the absence of evidence for or against p would be placed in a position by GC to assert the following Moore-paradoxical sentence: <p but I have no evidence for p>. The awkwardness of allowing mere belief to be a source of justification does not disappear so easily (see Poston 2014, ch. 2 for a response).

An important note, however, is that although constructing counterexamples to GC appears to be easy, it is equally easy to beg the question against the proponent of GC in doing so. Consider, for example, an argument involving a possible scenario on the basis of which Foley (1982) rejects GC as being too strong. Consider subject S, who comes to believe a proposition H, which is not explicitly contradicted by anything else she believes, while, given her circumstances, it is more reasonable for S to believe not-H. By GC, S is justified in believing H. However, this scenario is under-described. For it is either the case that S has no evidence for H or not-H, in which case GC can be plausibly applied to S’s belief that H to ground her justification in believing that H, or it may be that, as Foley stipulates, S has better reasons for not-H than she has for H, in which case GC is no longer applicable because it takes the bare fact of believing a proposition to endow it with justification only in cases where there are no reasons against it. In such circumstances, it is not clear how, on pain of begging the question against the proponent of GC, one is to take believing not-H as being more reasonable than believing H.

In response to the above problems, the proponents of conservatism usually try to introduce certain modifications in their theses to make them more appealing. Although their official view is still the claim that it is the mere belief that confers justification on its content, closer observation reveals that these modified accounts often rely on such external factors as “seemings,” coherence, and evidence about our “general reliability” in order to enable their conservative theses to discharge their epistemic role. Before turning to such versions, though, it is important to consider some of the arguments that have been suggested for and against GC.

a. Arguing for and Against GC

i. The Transcendental Argument

A rather common transcendental argument for GC starts with the methodological question of how we are supposed to conduct our inquiries. One policy is to follow the Cartesian advice of dispensing with all our beliefs except those that are certain. This would seem to ensure that our beliefs satisfy the epistemic goal of believing truth and avoiding falsehood. Abandoning all our beliefs and trying to rebuild them from scratch, though, is close to cognitive suicide. The only way forward, then, is to work with what we have got and rely on our perspective in order to achieve the epistemic goal. Our perspective, however, consists not only of our mundane beliefs but also of our beliefs about which methods or belief-forming processes are reliable for regulating the formation of those mundane beliefs. There is no Archimedean point from which we can determine which of our belief-forming processes are reliable. Accordingly, there is no way to regulate our belief-forming activities except by relying on our antecedent convictions that constitute our perspective on the world. For those convictions to discharge this epistemic role, however, they must possess some epistemic worth to start with. Otherwise, our beliefs would fail to be justified. This means that mere belief, as GC claims, can confer some measure of justification on its content.

This argument can be resisted. Even if one may now appeal, a la GC, to the justified status of one’s background beliefs about the reliability of belief-forming processes in order to rationally and actively reaffirm a particular belief resulting from such processes, it is not clear that the belief in question was actually based on such background beliefs upon its inception (Podgorski 2016). So the justification now associated with the reaffirmation of the belief may not be the justification that was lost in the process. Another worry is that the motivations behind this argument might result in absurd consequences if the argument is repeatedly applied. Suppose, having formed a justified belief p by relying on my background beliefs, I take myself to have fulfilled the epistemic goal of truth and so consider the belief as justified. The type of conservatism that results from this argument would then seem to require that I hold on to the belief in the absence of a challenge from within my perspective. As Roger White argues, the same conservative motivations would also require me to avoid such challenges:

If I allow myself to critically rethink my commitment to p, there is a chance that I might conclude that I was mistaken. But from my perspective now, to change my mind as to whether p would be to be led into error. Of course I do not want that. So it is better for me to avoid all possible challenges and cling dogmatically to my current convictions. No one, I take it, wants to endorse this sort of attitude. (2007, pp. 125-26)

Finally, there is the problem of arbitrariness that seems to arise from applying GC to our day-to-day cognitive dealings (White 2007 and Podgorski 2016). The idea is that it is possible for two or more subjects to have radically different perspectives, including different habits of thought that enjoy the same degree of coherence. According to GC, all such subjects are justified in their beliefs. Any insistence on the privileged status of one particular belief system over others invites the charge that the norms of epistemic impartiality are being violated. Moreover, if all belief systems are equally rational, it is not clear why one should not feel free to adopt the perspective of others.

ii. The Argument from Charity

Another argument (Vahid 2004) in defense of GC takes its inspiration from Davidson’s claim that belief ascription is constrained by the principle of charity. Very roughly, Davidson takes the evidence for the semantic theory to consist in the conditions under which speakers hold sentences true. The holding of a sentence true by a speaker is, however, a function of both what she means by that sentence and what she believes. This means that belief cannot be inferred without prior knowledge of the meaning, and meaning cannot be deduced without the belief. This is where Davidson appeals to the principle of charity. The idea is that we can solve the problem of the interdependence of belief and meaning “by holding the belief constant as far as possible while solving for meaning. This is accomplished by assigning truth conditions to alien sentences that make native speakers right when plausibly possible, according, of course, to our own view of what is right” (Davidson 1984, p. 137). Thus understood, we may view the application of the principle of charity as involving the maximization of truth by the interpreter’s own lights.

Now, if belief ascription is to be constrained by charity and the latter is characterized by the aim of maximizing truth and minimizing falsity in the speaker’s belief system, then this would seem to endow the ascribed belief with some presumption of rationality, since justification (rationality) is also generally understood in terms of promoting the truth goal. A belief is justified in so far as it promotes the epistemic goal of believing what is true and not believing what is false. It should be noted, however, that since charity begins at home, the ascriber’s (interpreter’s) beliefs are as much subject to the constraint of charity as are the beliefs of the subject (interpretee) to whom beliefs are ascribed. One problem with this approach is that since charity requires the assignment of truth conditions to the interpretee’s sentences according to the interpreter’s view of what is right, the kind of rationality that emerges from this belief ascription process is one that is perspective-dependent and thus very weak. Although this seems to comport well with Chisholm’s own estimate of GC as constituting the “lowest” degree of epistemic justification (1980, 547), it will not be of much interest to those philosophers who think that conservatism is a substantial epistemic thesis.

The possible failure of the above arguments to fully substantiate the epistemic credentials of GC has not, however, deterred its proponents from coming up with modified versions of the thesis that no longer suffer from the problems that have afflicted the original version. Before considering these modified versions of GC, it pays to consider an argument against GC that denies that ordinary people are even capable of holding bare beliefs, and so concludes that, since the evaluation of GC requires the possibility of such beliefs, it is practically impossible to evaluate such a thesis.

Daniel Coren (2018) has claimed that the nature of bare belief makes it impossible to evaluate GC. He takes it that a bare belief is supposed to be a belief that is stripped of all personal memory and epistemic context. Although Coren regards such beliefs as logically conceivable, he thinks that for us, human agents, they are practically inconceivable. Even in the case of forgotten evidence, says Coren, it is not the case that one’s beliefs stand entirely on their own without any support from what the agent can recall. There are always some epistemic contexts within which our beliefs can be located. Coren realizes that there are some seemingly plausible examples where beliefs seem to lack such contexts as when, as a result of hypnosis or a bang on my head, I come to believe that, say, the number of stars in the universe is a prime number.

In response, he denies that such cases of belief-formation (involving hypnosis or brain injury) are practically possible. He says that he can imagine guessing or wanting the number of stars to be a prime number. However, being an ordinary agent, he cannot imagine being in such an extraordinary, non-human state of believing that the number of stars is prime while having no supporting reasons. To conclude, Coren’s main objection to conservatism is that since we are not “able to imagine having a belief in total isolation from other beliefs…[we cannot] evaluate the question of whether having a bare belief would have any positive epistemic status” (2018, p.10).

While Coren’s skeptical observations about the possibility of bare beliefs are a useful antidote to the often fast and loose way that conservatives play with such beliefs, it may not be advisable to base a critique of conservatism completely on empirical claims such as the practical impossibility of beliefs resulting from hypnosis or the formation of beliefs that are not sensitive to epistemic reasons. After all, there are views that countenance forming beliefs on the basis of pragmatic reasons (Leary 2017 and Rinard 2018). Moreover, not all processes resulting in belief need to be of a cognitive variety. Beliefs can come and go as a direct result of brain injury. Only beliefs that are the result of a cognitive process carry the distinction of being responsive to reasons.

More importantly, there are at least two ways of understanding Coren’s analysis of a bare belief in terms of a “belief in total isolation from other beliefs” depending on whether “isolation” is understood as conceptual or epistemic isolation. It is surely true that beliefs cannot be held in conceptual isolation. For, being conceptually structured, beliefs are said to be inferentially integrated with other beliefs such that their combination can yield further beliefs as consequences (Stich 1978). This is what lies behind Jerry Fodor’s (1983) denial that cognitive systems, unlike perceptual systems, are modular and informationally encapsulated. A conservative, though, need not challenge the preceding observations. What she claims is rather that beliefs can acquire their epistemic status in epistemic isolation from other beliefs. Although beliefs always appear in holistic networks with other beliefs, it is possible for some of them, says the conservative, to acquire their justification in epistemic isolation. At the risk of begging the question against such conservatives, the impossibility of having beliefs in total epistemic isolation from other beliefs cannot serve as a premise in an argument against conservatism.

b. Modifying GC

It was pointed out that the problems arising from the attempts to substantiate the standard formulation of GC have prompted some conservatives to suggest alternative formulations of the conservative principle that are no longer vulnerable to those difficulties. These versions of GC either involve appending further conditions to GC or seek to radically revise some of its assumptions.

i. Conservatism as the Principle of Credulity

An early attempt to modify GC by adding further restrictions to it is due to William Lycan (1988). Lycan presents his version of GC as the “principle of credulity,” according to which the “seeming” plausibility (truth) of beliefs is sufficient for their acceptance. He takes this principle to underlie the justification of what he calls “spontaneous beliefs,” namely, beliefs that are directly produced by such sources as perception, memory, and so forth. Lycan’s choice of the beliefs that are rendered justified by his principle of credulity, namely perceptual and memory beliefs, raises, however, the suspicion that what we are dealing with here is in fact an instance of the thesis of phenomenal conservatism (discussed earlier), according to which if it seems to you that p, then, in the absence of defeaters, you are justified in believing that p. For the phenomenal conservative, such seemings (perceptual, memorial, intuitive, and so forth) constitute the general source of justification for the beliefs to which they give rise, whereas for the upholders of GC it is the mere fact of believing a proposition that endows that proposition with some epistemic worth.

Leaving this point to one side, Lycan’s recognition of the over-permissiveness of his principle in justifying what he regards as “wild” spontaneous beliefs, such as religious and superstitious beliefs, prompts him to add a number of restrictions on it, such as consistency with previously justified explanatory beliefs as well as the availability of an explanation by the agent’s belief system of how his or her spontaneous beliefs are produced. He requires that:

[Our] total body of beliefs and theories yield an idea of how… spontaneous belief[s] [were] produced in us. Finally suppose that according to this idea or explanation, the mechanism that produced the belief was (as we may say) a reliable one, in good working order. Then, I submit, our spontaneous beliefs are fully justified. (1988, p.168)

Although such constraints might exclude Lycan’s “wild” beliefs, they seem so stringent that it is unlikely they can be satisfied even in the case of spontaneous perceptual and memory beliefs. With Lycan’s principle of credulity, we have, once more, an example of a seemingly conservative principle whose epistemic engine is driven not by the mere holding of a belief itself but by factors external to that belief.

ii. Conservatism as Believing in the Absence of Defeaters

Lycan’s principle of credulity is not the only way of evading the problems that are associated with GC. Kevin McCain (2008) suggests another way of getting around such problems that involves appending GC with two defeating conditions along the following lines:

PEC If S believes that p and p is not incoherent, then S is justified in retaining the belief that p and S remains justified in believing that p so long as p is not defeated for S.
Defeating Condition (C1) If S has better reasons for believing that not-p than S’s reason for believing that p, then S is no longer justified in believing that p.
Defeating Condition (C2) If S’s reasons for believing p and not-p are equally good and the belief that not-p coheres equally as well or better than the belief that p does with S’s other beliefs, then S is no longer justified in believing that p.

McCain’s account of conservatism involves references to both the fact of believing something and the absence of defeaters. He is quite explicit, though, that while S’s belief that p provides S with justification, the belief itself “is not counted among S’s reasons” (2008, p. 187). He claims that the role that the belief plays in its justification is akin to the role that the absence of defeaters plays in the justification of a belief. Perhaps what McCain has in mind here is that believing that p is merely an enabler for the reason that is supposed to justify the belief. It is not, however, clear what, once the belief itself is excluded from the realm of reasons, is supposed to play that role when one appeals to the conservative thesis. Closer examination of McCain’s account reveals, however, that this role is played by such notions as evidence and coherence in the account’s defeating conditions. (C1) requires that evidence for not-p not be stronger than evidence for p, and (C2) requires an asymmetry in the coherence of p and not-p with the rest of the agent’s beliefs. Thus, far from relying on mere belief to confer justification on its content, it is the strength of evidence for the belief as well as the coherence of the target belief p with the rest of the subject’s belief repertoire that are doing the main epistemic work. This conclusion can receive further support when we turn to McCain’s claims, on behalf of his account, that it can resolve skeptical and other standing problems of epistemology.

To illustrate, consider some of the ambitious claims that McCain makes on behalf of PEC. He thinks, for example, that his version of conservatism is able to neutralize the challenge presented by Cartesian-style skeptical arguments that seek to undermine our beliefs in the external world by proposing alternative hypotheses, such as dreaming, being a brain in a vat, and so forth, that can also explain our evidence (perceptual experience). According to McCain, his conservative thesis explains why the skeptical argument fails, for none of the skeptical hypotheses provides a defeater that satisfies C1. Evidence in favor of these hypotheses is no better than evidence in favor of the belief about the external world. Neither is C2 satisfied, because our belief in the external world coheres better with our overall set of beliefs, including our commonsense beliefs such as, “It is raining now,” “I slept last night,” and so forth.

The crucial premise in McCain’s argument is that “[a]lthough these commonsense beliefs are closely related to the belief that there is an external world, they are not directly dependent upon the belief that there is an external world. We do not form the belief that there is an external world and then infer from them the belief that ‘the sun is shining today,’ etc.” (2008, pp. 189-190). It is true that this is not how we form our commonsense beliefs. However, McCain makes an important assumption when clarifying his defeating condition C2. C2 prohibits that not-p cohere equally as well or better than the belief that p does with S’s other beliefs. This prompts the question of what sort of beliefs should be included in that repertoire when one is assessing whether the belief that p is justified. On pain of guaranteeing that C2 would not be met, McCain thinks that, as a necessary condition, neither p nor “any belief q that is directly dependent upon the belief that p for its justification should…be included in S’s set of beliefs in regard to [C2]” (2008, p. 187). As this remark clearly indicates, the beliefs that are to be included in the agent’s belief system are supposed not to depend on the target belief p in an epistemic, rather than an inferential, sense. They are supposed not to be “directly dependent upon the belief that p for [their] justification.” In other words, McCain is assuming that commonsense beliefs can be justified independently of whether or not one is justified in believing that there is an external world, and this is where his conservatism is helping itself to an assumption from outside of conservatism’s sphere.

To explain, just as his version of conservatism owed its epistemic bite to the involvement of such notions as evidence and coherence, its purported broader epistemic uses in resolving long standing epistemic disputes equally derives its power not from the thesis itself but from some substantive epistemological theory, namely dogmatism (liberalism), in its background. According to dogmatism, absent defeaters, experience is sufficient, on its own, to confer justification on the belief in its content. This accounts for why McCain thinks that commonsense beliefs (in the subject’s belief set) can be justified without being dependent for their justification on belief about the external world. By contrast, a rival theory such as Crispin Wright’s conservatism maintains that experiences can only justify one’s commonsense beliefs provided one is already warranted in believing that there is an external world. It is clear then that what is really doing the epistemic work in McCain’s response to the skeptical argument is not his conservatism as such but the dogmatist account of perceptual justification that is presupposed in that account. With such a view in place, there would be no need to appeal to doxastic conservatism.

iii. Conservatism as a Dynamic Strategy

Another approach, from Abelard Podgorski (2016), recommends a dynamic strategy in response to the problems discussed so far. Podgorski agrees with many of the objections that have been leveled against conservatism, such as bootstrapping and arbitrariness. Accordingly, he seeks to present an alternative conservative view that, while incorporating the basic motivations for GC, is not susceptible to its weaknesses. His proposal involves a dynamic interpretation of the rational relevance of two types of considerations: those that bear on the question of whether p and those that bear on the question of whether to make up one’s mind about p. He intends the dynamic slant to make it clear that the relevant norms governing our epistemic life are those that govern processes, rather than states, in particular the process of considering whether p, for some proposition p. Two such norms are distinguished: those that regulate when to initiate the process of considering whether p is true and those that govern the rational operation of that process.

Accordingly, what is distinctive of dynamic conservatism is that it appeals not to the norms that generate an agent’s mental states at particular times but to the norms that govern the process of considering whether p. To see how the dynamic approach intends to secure conservatism, Podgorski introduces the following norm for regulating the initiation of consideration.

Inconsiderate One is not always rationally required to initiate consideration whether p when one believes that p and one’s evidence does not make p worth believing (from one’s perspective).

If Inconsiderate is true, one may permissibly fail to reconsider belief in p while one’s evidence does not make p worth believing. Now, just as there are things that bear on whether or not something is worth believing, there are things that bear on whether a question is worth opening for consideration. For example, it is worth considering whether p if it is important that my belief p is true, or if my evidential situation regarding p is significantly better now than it was before or will be in the future. On the other hand, it is less worth considering whether p if, say, it does not matter whether my belief p is true, or if my evidential situation regarding p is worse than it was when I last considered p. Costs involving time and cognitive effort can also bear negatively on whether it is worth considering whether p. Since Podgorski takes not considering to have a default permissible status, he concludes that “[w]e are not required to consider a question until we have some special positive reason to do so. So agents will be rational in maintaining any given belief for at least as long as they do not encounter such a reason” (2016, pp. 366-67).

Podgorski claims that, like standard conservatism, dynamic conservatism is also sensitive to the fact that changing beliefs incurs cognitive costs. It also explains the phenomenon of lost justification or evidence, for as long as an agent lacks reasons to reconsider her belief, she may, even having lost her evidence, persist in her belief. However, by rejecting the claim that bare belief can confer justification on its content, dynamic conservatism differs from the standard version. On the dynamic view, the belief that is held by an agent need not be worth having, because “by rejecting state-oriented norms demanding the worthiness of our beliefs, we allow that there are periods of time where what a belief has going for it simply does not matter for an agent’s rationality” (2016, p. 372).

However, it may be that by rejecting the main tenet of GC, dynamic conservatism becomes too modest to have an epistemic impact when it comes to regulating our belief-forming processes. Consider, for example, a case of forgotten evidence where, having forgotten the original evidence, there happens to be some reason to effectively reconsider our belief, since its truth turns out to be important in that context. Here, evidentialists usually appeal to second-order evidence (about, say, the general reliability of memory, and so forth). As we have seen, however, this sort of evidence is legitimate only if the belief in question was originally based on that evidence. Therefore, while this evidence can be used to ground the rationality of one’s active reaffirmation of that belief, it fails to explain one’s rationality when one does not perform such affirmation. Dynamic conservatism, however, seems to fall short in the opposite direction.

It can explain the rationality of the agent’s holding on to his belief in cases of forgotten evidence where he lacks a special reason to reconsider that belief, and it does so without locating the source of this rationality in the agent’s current or past evidence. But it seems to fall short of accounting for the agent’s rationality in the sort of case described above, where there is reason to reconsider one’s belief. As Podgorski admits, “[t]he cases the [dynamic] view does not endorse as rational are those where an agent actively reaffirms their belief without relevant second-order information. And these are the cases where it is least intuitive that doxastic inertia is rational. Nevertheless, if these are taken to be core cases, it must be admitted that this is a genuine disadvantage of the dynamic approach” (2016, p. 370).

6. Conclusion

The doxastic conservatism debate develops out of the attempts to show that our tendency to maintain and preserve our beliefs beyond the evidence at our disposal is a rational phenomenon. Conservatism presents itself as a normative thesis with the potential to resolve a number of outstanding issues in epistemology. It turns out, however, that there is not just one single conservative principle, but a variety of such theses. Further discussions of doxastic conservatism may focus on these contenders and how they relate to the properties of belief relevant to the epistemic evaluation of doxastic states when one encounters evidentially equivalent alternatives, the perseverance of doxastic states in the absence of specific reasons to change them, and whether features of one’s doxastic state can add to the justification of the beliefs that constitute it.

7. References and Further Reading

  • Adler, J. 1990. “Conservatism and Tacit Confirmation,” Mind 99: 559-570.
  • Alston, W. 1989. Epistemic Justification, Cornell University Press.
  • Chisholm, R. 1980. “A Version of Foundationalism” in Wettstein, et al (eds.), Midwest Studies in Philosophy V, University of Minnesota Press.
  • Christensen, D. 1994. “Conservatism in Epistemology,” Nous 28: 69-89.
  • Christensen, D. 2000. “Diachronic Coherence and Epistemic Impartiality,” The Philosophical Review 109, 3: 349-371.
  • Coren, D. 2018. “Epistemic Conservatism and Bare Beliefs,” Synthese. https://doi.org/10.1007/s11229-018-02059-8.
  • Davidson, D. 1984. “Radical Interpretation,” reprinted in Inquiries into Truth and Interpretation, Oxford: Clarendon.
  • Feldman, R. 2014. “Evidence of evidence is evidence” in Matheson, J. and R. Vitz (eds.), The Ethics of Belief, New York: Oxford University Press: 284–300.
  • Fodor, J. 1983. The Modularity of Mind. MIT Press.
  • Foley, R. 1982. “Epistemic Conservatism,” Philosophical Studies 43: 165-182.
  • Foley, R. 1987. The Theory of Epistemic Rationality, Harvard University Press.
  • Frise, M. 2017. “Internalism and the Problem of Stored Beliefs,” Erkenntnis 82: 285–304.
  • Fumerton, R. 2007. “Epistemic Conservatism: Theft or Honest Toil?” Oxford Studies in Epistemology 2: 63–86.
  • Goldman, A. 1979. “Varieties of Epistemic Appraisal,” Nous 13: 23-38.
  • Goldman, A. 1986. Epistemology and Cognition, Harvard University Press.
  • Goldman, A. 1999. “Internalism Exposed,” The Journal of Philosophy 96: 271-293.
  • Goldstick, D. 1971. “Methodological Conservatism,” American Philosophical Quarterly 8: 186-191.
  • Harman, G. 1986. Change in View, MIT Press.
  • Huemer, Michael. 2001. Skepticism and the Veil of Perception, Rowman and Littlefield.
  • Kvanvig, J. 1989. “Conservatism and its Virtues,” Synthese, 79:143–163.
  • Lycan, W. 1988. Judgement and Justification, Cambridge University Press.
  • McCain, K. 2008. “The Virtues of Epistemic Conservatism,” Synthese, 164:185–200.
  • McCain, K. 2020. “Epistemic Conservatism and the Basing Relation,” in Carter, A. and P. Bondy (eds.), Well-Founded Belief, Routledge.
  • McGrath, M. 2007. “Memory and Epistemic Conservatism,” Synthese, 157: 1–24.
  • Moffett, M. 2007. “Reasonable Disagreement and Rational Group Inquiry”, Episteme 4, 3: 352-367.
  • Podgorski, A. 2016. “Dynamic Conservatism,” Ergo, 3: 349–376.
  • Poston, T. 2012. “Is There an ‘I’ in Epistemology?” Dialectica 6, 4: 517-541.
  • Poston, T. 2014. Reason and Explanation: A Defense of Explanatory Coherentism, Palgrave Macmillan.
  • Pryor, J. 2000. “The Skeptic and the Dogmatist,” Nous, 34: 517-49.
  • Sklar, L. 1975. “Methodological Conservatism,” Philosophical Review LXXIV: 186-191.
  • Smithies, D. 2019. The Epistemic Role of Consciousness, Oxford University Press.
  • Stich, Stephen. 1978. “Beliefs and Subdoxastic States,” Philosophy of Science 45: 499–518.
  • Quine, W.V.O. 1951. “Two Dogmas of Empiricism,” in From a Logical Point of View, 2nd ed. New York: Harper & Row.
  • Vahid, H. 2004. “Varieties of Epistemic Conservatism,” Synthese 141: 97-122.
  • Vogel, J. 1992. “Sklar on Methodological Conservatism,” Philosophy and Phenomenological Research 52: 125-131.
  • White, R. 2007. “Epistemic Subjectivism,” Episteme 4: 115-129.
  • Wright, C. 2004. “Warrant for Nothing (and Foundations for Free)?” Aristotelian Society Supplementary Volume, 78, 1:167–212.


Author Information

Hamid Vahid
Email: hamid36vahid@gmail.com
Institute for Research in Fundamental Sciences
Iran

The Indeterminacy of Translation and Radical Interpretation

The indeterminacy of translation is the thesis that translation, meaning, and reference are all indeterminate: there are always alternative translations of a sentence and a term, and nothing objective in the world can decide which translation is the right one. This is a skeptical conclusion because what it really implies is that there is no fact of the matter about the correct translation of a sentence and a term. It would be an illusion to think that there is a unique meaning which each sentence possesses and a determinate object to which each term refers.

Arguments in favor of the indeterminacy thesis first appear in the influential works of W. V. O. Quine, especially in his discussion of radical translation. Radical translation focuses on a translator who has been assigned to translate the utterances of a speaker speaking a radically unknown language. She is required to accomplish this task solely by observing the behavior of the speaker and the happenings in the environment. Quine claims that a careful study of such a process reveals that there can be no determinate and uniquely correct translation, meaning, and reference for any linguistic expression. As a result, our traditional understanding of meaning and reference is to be thrown away. Quine’s most famous student, Donald Davidson, develops this scenario under the title of “radical interpretation.” Among other differences, radical interpretation is distinguished from Quine’s radical translation with regard to its concentration on an interpreter constructing a theory of meaning for the speaker’s language. Such a theory is supposed to systematically entail the meaning of the speaker’s sentences. Nonetheless, radical interpretation too cannot resist the emergence of indeterminacy. According to the thesis of the indeterminacy of interpretation, there always will be rival interpretations of the speaker’s language, and no objective criterion can decide which interpretation is to be chosen as the right one.

These views of Quine and Davidson have been well received by analytic philosophers, particularly because of their anti-Cartesian approach to knowledge. On this approach, knowledge of what we mean by our sentences and what we believe about the external world, other minds, and even ourselves cannot be grounded in any infallible a priori knowledge; instead, we are bound to study this knowledge from a third-person point of view, that is, from the standpoint of others who are attempting to understand what we mean and believe. What the indeterminacy of translation/interpretation adds to this picture is that there can never be one unique, correct way of determining what these meanings and beliefs are.

The article begins with Quine’s arguments for the indeterminacy of translation, then introduces Davidson’s treatment of indeterminacy by focusing on his semantic project and the scenario of radical interpretation. Then the discussion turns to David Lewis’s version of radical interpretation, Daniel Dennett’s intentional stance, and the way Lewis and Dennett treat the indeterminacy of interpretation.

Table of Contents

  1. Quine’s Naturalized Epistemology and Physicalism
    1. Quine’s Naturalism
    2. Quine’s Physicalism
  2. Quine’s Arguments for the Indeterminacy of Translation
    1. Quine’s Radical Translation Scenario
    2. The Argument from Below: The Inscrutability of Reference
    3. The Argument from Above: The Indeterminacy of Translation
      1. Confirmational Holism and Underdetermination
      2. Underdetermination and Indeterminacy
  3. Davidson’s Semantic Project
    1. Davidson’s Use of Tarski’s Truth-Theory
    2. Davidson’s Radical Interpretation Scenario
  4. Davidson on the Indeterminacy of Interpretation
  5. Lewis on Radical Interpretation
    1. Lewis’s Constraints on Radical Interpretation
    2. Lewis on the Indeterminacy of Interpretation
      1. Putnam’s Model-Theoretic Argument and Lewis’s Reference Magnetism
  6. Dennett’s Intentional Stance
    1. Indeterminacy and the Intentional Stance
  7. References and Further Reading

1. Quine’s Naturalized Epistemology and Physicalism

Quine has famously argued that the reference of any term and the meaning of any sentence of a language are indeterminate. When a speaker uses terms like “rabbit”, “tree”, and “rock”, it can never be settled to what specific object she is referring. When she utters “that’s a rabbit”, “that’s a tree”, “tigers are fast”, and the like, it will always remain indeterminate what she really means by them. These claims can be called the “skeptical conclusions” of Quine’s arguments for the indeterminacy of translation.

The first preliminary point to note is that this sort of skepticism is not epistemological but constitutive. Quine’s claim will not be that it is difficult to know what someone means by her words, or that we may lack the sort of epistemic powers, skills, or tools required to ascertain such meanings. His claim is that there is no determinate meaning and reference to know at all: there is no fact as to what a sentence means and what a term refers to. This is what Quine means by the claim that meaning and reference are indeterminate.

Quine has two famous arguments for these conclusions: (1) the argument from below, which is also called the argument for the “inscrutability of reference”, “indeterminacy of reference” and “ontological relativity” (Quine 1970), and (2) the “argument from above”, which is also called the argument for the “indeterminacy of translation” (Quine 1970) or “holophrastic indeterminacy” (Quine 1990a). The two arguments are discussed below, after first considering the grounds on which Quine builds them, since they rely on a variety of important positions, among which Quine’s version of naturalism is especially significant.

a. Quine’s Naturalism

According to Quinean naturalism, there is no such thing as first philosophy which, independently of natural science, can offer unquestionable knowledge of the world; rather, philosophy is to be viewed as continuous with science, especially physics (Quine 1981). On this view, we are bound to investigate the world, human beings included, from the standpoint of our best scientific theory. In our study of the world, we should take a “third-person” point of view rather than a foundationalist Cartesian one. The Cartesian position advocates a priori and infallible knowledge, on the basis of which our knowledge of the external world and other minds can be established. Something is knowable a priori if it can be known independently of any specific experience of the external world, and such knowledge is infallible if it is immune to doubt or uncertainty. For Descartes, such knowledge cannot be dependent on, or inferred from, science because science relies on what we can perceive via our senses, and we can never trust our senses: they can deceive us. The Cartesian view, therefore, looks for a source of knowledge that is free from such doubts. Quine, especially in his famous article “Two Dogmas of Empiricism” (Quine 1951), argues that any hope of finding such an a priori basis for knowledge is illusory because, among other reasons, the analytic/synthetic distinction cannot be preserved.

Analytic statements are traditionally held to be true in virtue of the meaning of their constituent parts. Anyone who knows English and thus knows what “bachelor” and “unmarried” mean would know that the sentence “bachelors are unmarried” is true. Synthetic statements (such as “it’s raining”) are those which are true not solely on the basis of the meaning of their terms, but also on the basis of what goes on in the world. Many philosophers believed that if a statement is analytic, it is also necessarily true, and what it expresses is knowable a priori. In “Two Dogmas of Empiricism”, Quine argues that there is no non-circular way of defining the notion of analyticity. If so, what then forms the bedrock of our knowledge of the world? Quine’s answer is natural science.

This is part of what provides Quine with enough reason to call his philosophy “naturalistic”. If epistemology is defined as the study of knowledge, then Quine insists that epistemology must be naturalized: it must follow the methods of science (Quine 1969b). However, what does a scientist do? A scientist investigates the connection between her theory and the (sensory) evidence or data she collects from the world. She makes observations, forms hypotheses about the future behavior of certain objects or the occurrence of future events, and checks whether they are supported by further evidence. Investigating the link between evidence and theory, and the support the latter can receive from the former, is the best we can do in our study of a subject matter. We can never stand outside of our theory and survey the world; we are bound to work from within (Quine 1981). Philosophers interested in the study of reality, knowledge, morality, mind, meaning, translation, and so forth, have no choice but to proceed in the same way, that is, to explore the link between the relevant flow of evidence and their best (scientific) theory about them. This explains why Quine is also called a “physicalist”.

b. Quine’s Physicalism

Quine’s view of physicalism has changed during his philosophical career. The clearest characterization of it has been offered by Quine himself: “Nothing happens in the world … without some redistribution of microphysical states” (Quine 1981, 98). According to this view, in the absence of some relevant physical change, there can be no real change in any subject matter. Let’s use the notion of “facts of the matter”. Our scientific theory of the world works if the world can be viewed as consisting in specific things, that is, if there are certain facts of the matter about them. For instance, the theory works if there are molecules, electrons, trees, neutrinos, and so forth; it tells us that molecules have certain features, move in such and such a way, and are made of such and such elements. Quine’s physicalism implies that facts about any subject matter are to be fixed by the totality of such facts about the world, and the totality of facts about the world is fixed by our choice of a total theory of the world. For example, if one claims that temperature is real, and thus there are facts about temperature, such facts are to be determined once the relevant physical facts are fixed, which are, in this case, facts about molecules’ average kinetic energy at a certain time. According to Quine’s physicalism, we can legitimately talk about facts about temperature because once we know the amount of molecules’ average kinetic energy, we know all there is to know about temperature. In this sense, the physical facts have fixed the facts about temperature. Therefore, we can characterize Quine’s physicalism as the view implying that either the totality of physical facts determines the facts about a subject matter, or there is simply no fact about that subject matter at all. This view will play a vital role in Quine’s arguments for the indeterminacy of translation.

One of our most central questions in the philosophy of language concerns what determines the meaning of a linguistic expression. We can already guess that, for Quine, any answer to this question must be offered from a naturalistic point of view. We should see what science can tell us about our linguistic practices, especially that of meaning something by an expression. The indeterminacy of translation arises from such a Quinean way of treating the questions about meaning and reference.

2. Quine’s Arguments for the Indeterminacy of Translation

For the moment, assume that Quine can successfully establish the skeptical conclusion that there is no fact of the matter about the correct translation of any expression. When we talk about translating terms, we talk about pairing two terms which have the same reference. For instance, if you look at a standard German to English dictionary, you can find that “snow” is the translation of the German word “Schnee”. Both of these terms refer to a certain sort of thing: snow. Moreover, when we talk about translation in the case of sentences, we talk about the process of pairing two sentences, such as “snow is white” and “Der Schnee ist weiss”, in terms of having the same meaning. But if neither “snow is white” nor “Der Schnee ist weiss” can be said to have any determinate meaning, it follows that we cannot say that one is the correct translation of the other, simply because there is no unique meaning that they share. The converse also holds: if neither can be said to be the correct translation of the other, then there is no unique meaning that can be claimed to be shared by them. This shows that meaning can be studied in terms of translation. If Quine can lead us to skepticism about the existence of correct translations, he has thereby led us to skepticism about the existence of determinate meanings.

Quine invites us to consider the way in which we learn our first language. Through such a process, we learn how to use our language’s terms correctly, especially “Mama”, “Milk”, “Fire” and so on, which can be treated as one-word sentences. We gradually become competent in detecting different parts of sentences and understanding how these parts can be put together to form more complex expressions. We finally gain mastery of the use of our language so that others in our speech-community can treat us as reliable users of it. Quine thinks that, instead of talking about such a complex process of learning a first language, we can “less abstractly and more realistically” talk about translation (Quine 1960, 27). Imagine that a translator finds herself in the middle of the Amazon jungle and faces a member of a nearby tribe whose language is entirely unknown to her. In order to start communicating with this native speaker, she should start translating his utterances. For each expression in the native’s language, she should find an expression in her own language which has the same meaning: she starts making a dictionary for that language. Since the language is radically unknown, our translator is called a “radical translator”.

a. Quine’s Radical Translation Scenario

Imagine that where our radical translator and the native meet, a rabbit scurries by, and the native utters “Gavagai”. The translator treats “Gavagai” as a one-word sentence. Considering the presence of the rabbit and the native’s response, the translator writes down “Lo, a rabbit” as the hypothetical translation of “Gavagai”. Her reason is this: in a similar situation, she would utter “Lo, a rabbit”. This translation is currently only hypothetical because one observation alone would not be enough for the translator to decide whether “Lo, a rabbit” is the correct translation of “Gavagai”. She continues checking this hypothetical translation with further evidence. For instance, suppose that she has been successful in detecting that the native’s word “Evet” corresponds to “Yes” and “York” corresponds to “No”. Suppose again that a different rabbit with a different color is observable and the translator points to it and asks: “Gavagai?” Assume that the native responds by “Evet”. In this situation, Quine says that the native assents to “Gavagai” in the presence of the rabbit. On a different occasion, an owl is present and the translator asks the same question “Gavagai?” The native responds by “York” this time. In this situation, the native dissents from “Gavagai”.

The native’s behavioral responses, that is, his assent to, or dissent from, a sentence on specific occasions, are pivotal for Quine’s project because they form the “evidential basis” for translation. For two reasons the translator cannot have access to anything more than this sort of evidence. First, the native’s language is radically unknown to the translator: she has no prior information whatsoever about what the native’s words mean and what the native believes. This by itself puts a considerable limitation on the sort of evidence available to her. Second, Quine is a physicalist. For Quine, physicalism, in the case of translation, manifests itself in a sort of behaviorism. The reason is that the relevant physical facts about translation are facts about the observable behavior of the speaker, that is, the native’s assents and dissents. To be more precise, the translator can appeal only to the native’s dispositions to verbal behavior. As Quine famously puts it, “there is nothing in linguistic meaning…beyond what is to be gleaned from overt behavior in observable circumstances” (Quine 1987, 5). Therefore, when Quine talks about “evidence”, he talks about behavioral evidence, and when he talks about “facts”, he talks about the native’s observable behavior.

Suppose that the translator, after making several observations, has become confident that “Lo, a rabbit” is to be considered as the correct translation of “Gavagai”. Another important notion is introduced by Quine at this point. We can now say that “Gavagai” and “Lo, a rabbit” are stimulus synonymous, or have the same stimulus meaning (Quine 1990a). The claim that “Gavagai” and “Lo, a rabbit” have the same stimulus meaning is equivalent to the claim that what prompts the native to assent to (or dissent from) “Gavagai” also prompts the translator to assent to (or dissent from) “Lo, a rabbit”.  What causes the native to assent to “Gavagai” and the translator to assent to “Lo, a rabbit” is the presence of a rabbit. Therefore, the stimulus meaning of “Gavagai” is the set of all the stimulations which prompt the native to assent to, or dissent from, “Gavagai”. Similarly, the stimulus meaning of “Lo, a rabbit” is the set of all the stimulations which cause the translator to assent to, or dissent from, “Lo, a rabbit”. Since the stimulations were the same in this case, that is, the presence of a rabbit, we can conclude that “Gavagai” and “Lo, a rabbit” have the same stimulus meaning. But why does Quine talk about stimulations, rather than objects? Instead of talking about rabbit stimulations, one may complain, he could simply say that rabbits prompt the native to assent to “Gavagai”.

Quine’s insistence on treating stimulations rather than objects as central has its roots in his adherence to naturalism. For him, what is scientifically worth considering about meaning and reference is the pattern of stimulations since, as Quine puts it, “it is a finding of natural science itself, however fallible, that our information about the world comes only through impacts on our sensory receptors” (Quine 1990a, 19). What science tells us, in this case, is that the native and the translator, via observing the rabbit in view, would have a visual stimulation, or some “pattern of chromatic irradiation of the eye” (Quine 1960, 31). For Quine, we can assume that the native would be prompted to assent to “Gavagai” by the same irradiations which prompt the translator to assent to “Lo, a rabbit”. Even if we, as linguists, wanted to talk about the rabbit itself, we would have no way but to rely on what our sensory receptors receive from touching it, seeing it, and the like.

Having reviewed the scenario of radical translation, consider Quine’s first argument for indeterminacy, that is, his argument for the inscrutability of reference.

b. The Argument from Below: The Inscrutability of Reference

With the notion of stimulus meaning at hand, we can introduce Quine’s more technical notion of “observation sentences”, which also has an important role to play in his arguments. Our radical translator starts her translation by focusing on the native’s sentences which are about the immediate happenings in the world. Quine calls sentences like “Lo, a rabbit”, “it’s raining”, “that’s a tree”, and the like, “observation sentences.” Observation sentences themselves belong to the category of “occasion sentences”, the sentences that are true on some occasions and false on others. For instance, the sentence “it’s raining” as uttered by the speaker at time t is true if it is raining around her at t. The truth-value of occasion sentences, that is, their truth or falsity, depends on whether the speaker is prompted to assent to, or dissent from, them on specific occasions. Thus, the stimulus meaning of occasion sentences is highly sensitive to the occasion of speech and may change in light of additional information the speaker receives. (By contrast, “standing sentences”, such as “rabbits are animals”, are much less sensitive to the occasion of speech.) Observation sentences are those occasion sentences which are more stable with regard to their stimulus meaning, in the sense that almost all members of a speech-community can be said to have more or less similar dispositions to assent to, or dissent from, them on specific occasions. Our translator is primarily concerned with translating the native’s observation sentences. Her aim is to match the native’s observation sentences, such as “Gavagai”, with the observation sentences of her own language, such as “Lo, a rabbit”, by way of discovering whether these sentences have the same stimulus meaning, that is, whether the native’s and the translator’s assents to, or dissents from, them are prompted by the same sort of stimulations. To simplify the example, assume that the native utters “Yo, gavagai”.

Quine’s principal question is this: Given that “Yo, gavagai” and “Lo, a rabbit” have the same stimulus meaning, would this fact justify claiming that the terms “gavagai” and “rabbit” are the correct translations of one another? Quine’s answer is negative. One term is the correct translation of another if both refer to the same thing, or if both have the same reference. But, as Quine argues, the fact that “Yo, gavagai” and “Lo, a rabbit” are stimulus synonymous cannot show that the native’s term “gavagai” and the translator’s term “rabbit” have the same referent. In order to see why, imagine that there is a second translator translating the observation sentences of another member of the native tribe. Suppose that when, for the first time, the native utters “Yo, gavagai” in the presence of a rabbit, our second translator, before writing down “Lo, a rabbit” as the translation of “Yo, gavagai”, hesitates for a moment. Having taken into account the cultural and other differences between him and the native, he decides to take “Lo, an undetached rabbit-part” as his hypothetical translation of “Yo, gavagai”, on the basis of the idea that, perhaps, the natives believe that there are only particulars in the world, not whole objects. The translator thinks that he would have assented to “Lo, an undetached rabbit-part” if he had had such a belief about the world. Our translator, however, does not need to be worried because if he is wrong, he will soon find some evidence to the contrary leading him to throw away such a hypothetical translation and replace it with “Lo, a rabbit”. He goes on, just like our first translator, and checks the native’s assents and dissents with regard to “Yo, gavagai” on different occasions.

The problem is that the same sort of evidence which led our first translator to translate “Yo, gavagai” into “Lo, a rabbit”, equally well supports the second translator’s translation, “Lo, an undetached rabbit-part”. The reason is simple: whenever a rabbit is present, an undetached rabbit-part (such as its ear) is also present. The problem becomes worse once we realize that there can be an infinite number of such alternative translations, such as “Lo, another manifestation of rabbithood”, “Lo, a rabbit time-slice”, and so forth. All such translations are mutually incompatible but are compatible with all evidence there is with regard to the native’s verbal behavior. Nothing in the native’s assents to “Yo, gavagai” in the presence of rabbits can discriminate between such rival translations. The two translators have come up with different dictionaries, that is, different sets of translations of the native’s terms, in each of which a different translation has been offered for the native’s term “gavagai”. In one, it has been suggested that “gavagai” refers to what “rabbit” refers to because, for the first translator, “Lo, a rabbit” and “Yo, gavagai” have the same stimulus meaning. In another, it has been suggested that “gavagai” stands for what “an undetached rabbit-part” refers to because “Lo, an undetached rabbit-part” and “Yo, gavagai” have the same stimulus meaning. Which of these translations is to be chosen as the correct one? To which object does “gavagai” refer after all?

Quine famously claims that there is no objective basis for deciding which translation is right and which is wrong. There are indefinitely many mutually different translations of a term, which are compatible with all possible facts about stimulus meaning. “Yo, gavagai”, “Lo, a rabbit”, “Lo, an undetached rabbit-part”, and so on, are all stimulus synonymous. And obviously such facts do not suffice to determine the reference of the term “gavagai”: all the stimulations which prompt the native to assent to “Yo, gavagai” prompt assent to “Lo, a rabbit”, “Lo, an undetached rabbit-part”, and so forth. This implies that for “gavagai” there can be indefinitely many referents, and there would be nothing objective on the basis of which we can determine which one is the real referent of “gavagai”. As a result, the reference of the native’s term “gavagai” becomes inscrutable. Also, since the same problem can potentially arise for any term in any language, reference is inscrutable in general.

To see why this conclusion is skeptical, recall Quine’s physicalism: either the physical facts fix the semantic facts by picking out one unique translation of the native’s term as the correct one, or they fail to fix such semantic facts, in which case it should be concluded that there are no such facts at all. The physical facts, in the case of translation, were the facts about the native’s assents and dissents, and they failed to determine the reference of the term “gavagai”. There is, therefore, no fact as to what a term refers to. Again, this sort of skepticism is not epistemological: the claim is not that there is a hidden fact within the physical facts which, if we had the epistemic power to discover it, would solve our problem. Quine’s claim has rather an ontological consequence: since it remains forever indeterminate to what things in the world people refer by their terms, it is entirely indeterminate how they slice the world. This is the reason why the inscrutability of reference leads to “ontological relativity”: it would never be determinate whether, for the native, the world consists in whole enduring rabbits or only rabbit-parts.

This argument has been subject to various criticisms. Most of them target the “gavagai” example, but Quine does not think that such criticisms succeed. For instance, many may think that the solution to the above indeterminacy problem is simple: why not simply ask the native? Assume that we have found out how to translate the native’s words for “is the same as”. The problem will be solved if the linguist points to the rabbit’s ear and simultaneously to the rabbit’s foot and asks the native, “Is this gavagai the same as that gavagai?” If the native responds positively with “Evet”, then “gavagai” refers to the whole rabbit, because the rabbit’s ear and the rabbit’s foot are two different rabbit-parts. Quine’s response is that this simply begs the question by presupposing that the translation of the native’s expression for “is the same as” (whatever it is in the native’s language) is determinate. But what if its translation were “is part of the same rabbit”? In that case, when we asked, “Is this gavagai the same as that gavagai?”, what we were really asking was: “Is this gavagai part of the same rabbit as that gavagai?” The native’s previous positive response is now compatible with the assumption that by “gavagai” the native refers to an undetached rabbit-part, because the ear and the foot are indeed parts of the same rabbit.

For Quine, the problem is deeper than this: the “gavagai” example has been just a convenient way of putting it. Nonetheless, many philosophers of language have found this response unconvincing. There is an interesting debate about it between the proponents and the opponents of Quine’s argument from below. To mention some of the most famous contributions, Gareth Evans (Evans 1975) and Jerry Fodor (Fodor 1993, 58-79) have attempted to modify and press the general sort of objection introduced above. Mark Richard (Richard 1997) and especially Christopher Hookway (Hookway 1988, 151-155) have argued that Quine is right in his claim that this strategy would inevitably fail because we can always offer alternative translations of the native’s terms which remain compatible with any such modifications. Although these alternative translations may seem too complex, odd, or unnatural, what would prevent us from attributing them to the native?

c. The Argument from Above: The Indeterminacy of Translation

Having been disappointed with such debates about his “gavagai” example, Quine claimed that, for those who have not been satisfied with the argument from below, he has a very different, broader, and deeper argument: the “argument from above”. It is this second argument that Quine prefers to call his argument for “the indeterminacy of translation” (Quine 1970). One reason is that his previous argument for the inscrutability of reference at most results in the conclusion that there are always alternative translations of the native’s sentences because facts about stimulus meaning cannot fix the reference of sub-sentential parts of the sentences. The truth-value of the sentences, however, remains the same since if “Lo, a rabbit” is true because of the dispositions to assent to, or dissent from, it in the presence of a rabbit, then “Lo, an undetached rabbit-part” would also be true on the same basis. Quine argues that there can be rival translations of the native’s whole sentences such that the same sentence can be true in one and false in another.

The argument from above rests on the thesis of the “underdetermination of theory by evidence” and its relation to the indeterminacy thesis. Quine’s argument can have a very simple characterization: insofar as a theory is underdetermined by evidence, the translation of the theory is also indeterminate. In an even simpler way, Quine’s claim is that underdetermination together with physicalism results in the indeterminacy of translation. Contrary to its simple characterization, however, the argument is more complex than the argument from below because it is not based on any interesting example by which the argument can be established step by step; it is rather based on much theoretical discussion. To begin with, what does Quine mean by “underdetermination of theory by evidence”?

i. Confirmational Holism and Underdetermination

Quine’s thesis of underdetermination of theory by evidence claims that different theories of the world can be empirically equivalent (Quine 1990b). This thesis stems from Quine’s famous “confirmational holism” (or, as it is sometimes called, “epistemological holism”). Confirmational holism appears more vividly in “Two Dogmas of Empiricism”, where Quine famously states that “our statements about the external world face the tribunal of sense experience not individually, but only as a corporate body” (Quine 1951, 38). Let’s see what this claim implies.

A scientific theory consists of a variety of sentences, from observation sentences to theoretical ones. Observation sentences were particularly important because their stimulus meaning was directly linked to immediate observables. There are, however, theoretical sentences whose stimulus meaning is less directly linked to observables, such as “neutrinos have mass” or “space-time is curved”. Another part of such a theory consists in what are sometimes called “auxiliary hypotheses or assumptions” (Quine and Ullian 1978, 79). These are statements about, for instance, the conditions of the experiments, the experimenters, the lab, the time at which the observations were made, and so forth. We can take total science, or our total theory of the world, as “a man-made fabric which impinges on experience only along the edges. … [T]otal science is like a field of force whose boundary conditions are experience” (Quine 1951, 39). Such a theory is like a web with observation sentences at its outer layers and logic and mathematics at its core.

Quine’s confirmational holism implies that a single statement in isolation cannot be confirmed by any observation, evidence, or data because there would always be other factors involved in making such a decision. Suppose that some newly found evidence contradicts your theory. According to confirmational holism, the emergence of such a conflict between the theory and the evidence does not necessarily force you to abandon your theory and start constructing a new one. Rather you always have a choice: you can hold onto any part of your theory, provided that you can make some complementary changes, or proper compensations, elsewhere in your theory so that the theory can preserve its consistency. In this way, the conflicting evidence can be handled by manipulating some of the auxiliary hypotheses. Compensations can potentially be made in many different ways and thus different parts of the theory can be saved. Each alteration, however, can result in a different theory. The important point to note is that although these theories are different, they are empirically equivalent because they are all compatible with the same body of evidence. In this case, your theory is underdetermined by that set of evidence. More generally, for any set of data, no matter how big it is, there can always be different theories which are compatible with that set.

There are different characterizations of underdetermination. Strong underdetermination, which Quine initially works with in his argument from above, states that our total theory is underdetermined even by the totality of all possible evidence. Quine also believed that there can be empirically equivalent theories which are logically incompatible (Quine 1970). Two theories are logically incompatible if the same sentence is true in one and false in another. But, in his later works, he almost gives up on this claim and takes all such theories to be empirically equivalent and logically compatible, though they are now counted as rival ones if they cannot be reduced to one another term by term and sentence by sentence (Quine 1990b). Moreover, your theory can be viewed as underdetermined by all data collected so far, or by all data collectable from the past into the future, though some factors may remain unnoticed by you. In all such cases, underdetermination survives. For suppose that your theory A is compatible with all data collected from the past to the present. Other theories can be made out of A by changing different parts of it (and making proper compensations). The result of (at least some of) such changes would be different theories. The theory A is thus underdetermined by such a set of data (Quine 1990c).

It is also important to note that the underdetermination thesis is an epistemological thesis, not a skeptical one with ontological consequences. Suppose that we have come up with a total theory of the world, within which the totality of truths about the world is now fixed. This theory too is naturally underdetermined by all possible data, so that there will be rival theories which are compatible with all such possible data. This fact, however, does not result in the skeptical conclusion that there is thereby no fact of the matter about the world. It only implies that there are always different ways of describing it. The reason has its roots in Quine’s naturalism again, according to which there is no such thing as a theory-free stance from which you can compare such theories. You are always working from within a theory. Although there are always rival ones, once you choose one underdetermined theory, all facts of the matter about the world are considered fixed within it. From within your favored theory, you expect no additional underdetermination to emerge with regard to what your theory says there is in the world. Now, what is the relation between underdetermination and indeterminacy?

ii. Underdetermination and Indeterminacy

Quine’s claim was that insofar as a theory is underdetermined by evidence, its translation is indeterminate. The question is how we reach skepticism about translation from the underdetermination of theory. This is an important question because underdetermination resulted in an epistemological problem only: even if all possible evidence is at hand, there always are rival theories which are compatible with such a set of evidence. For Quine, linguistics is part of natural science. Thus, it seems that, in the case of translation too, we should face nothing more serious than a similar epistemological problem, that is, the underdetermination of translation by evidence: even if we have the set of all behavioral evidence, there will always remain rival translations of the native’s sentences. This does not yield the skeptical conclusion that there is no fact of the matter about correct translation. Thus, one may complain that Quine would not be justified in claiming that, in the case of translation, we have to deal with the skeptical problem of indeterminacy. This is the objection which Chomsky (Chomsky 1968) makes to Quine’s indeterminacy thesis.

For Chomsky, we all agree that, for any set of data, there can always be different theories implying it. But the underdetermination of our scientific theory does not lead to any skepticism about the world: we do not claim that there is no fact of the matter about, for instance, tables and trees. Why should there be any difference when the case of our study becomes that of translation? Quine famously replies that the distinction between underdetermination and indeterminacy is what “Chomsky did not dismiss … He missed it” (Quine 1968, 276). For Quine, indeterminacy and underdetermination are parallel, but only up to a certain point. It is true that, in the case of translation too, we have the problem of underdetermination since the translation of the native’s sentences is underdetermined by all possible observations of the native’s verbal behavior so that there will always remain rival translations which are compatible with such a set of evidence. To this extent, Quine agrees with Chomsky. Nonetheless, he believes that indeterminacy is parallel but additional to underdetermination. When do these two theses differ in the case of translation?

Quine’s answer has its roots in his naturalistic claim that our best scientific theory is all there is to work with: it is the ultimate parameter. Even our total theory of the world would be underdetermined by the totality of all evidence. Quine’s point is that once you favor an underdetermined theory, the totality of truths about the world is thereby fixed. Take such a theory to be A. According to Quine, even within A, translation still varies and thereby remains underdetermined. Translation is thus doubly underdetermined: an additional underdetermination reoccurs in the case of translation. But, as previously indicated, this recurrence of underdetermination cannot be accepted by Quine since within our theory, we expect no further underdetermination to emerge. Recall Quine’s physicalism: if no fact about correct translation can be found in the set of all the physical facts about the world, we should conclude that there is simply no such fact. Having chosen the theory A, all facts of the matter about the world are fixed, and if despite that, translation still varies, we should conclude that the totality of facts about the world has failed to fix the facts about correct translation. As Quine famously says, translation “withstands even … the whole truth about nature” (Quine 1968, 275). Therefore, there is no fact of the matter about correct translation, which establishes the skeptical conclusion that Quine was after. This is the reason why the argument from above was characterized as claiming that underdetermination plus physicalism results in indeterminacy. “Where indeterminacy of translation applies, there is no real question of right choice; there is no fact of the matter even to within the acknowledged under-determination of a theory of nature” (Quine 1968, 275).

The last question to answer is how it is that, within our total theory A, the totality of facts fails to fix the facts about correct translation. In order to see how Quine reaches this skeptical conclusion, imagine that our translator is given the task of translating the native’s total theory. The translator starts by translating the observation sentences of the native’s theory. Suppose that the translator’s theory is A and her aim is to match the observation sentences of the native’s theory with the observation sentences of A. What is the translator’s justification for deciding whether the observation sentences of her theory are matched with the observation sentences of the native’s theory? It is, as before, the fact that the observation sentences have the same stimulus meaning. Assume that the translator has matched up all such observation sentences. This is just to say that facts about translation are thereby fixed: the observation sentences are paired in terms of having the same stimulus meaning. Thus, it seems that our translator can now justifiably take A to be the unique, correct translation of the native’s theory: from the fact that all the observation sentences are matched up, she concludes that the native believes in the same theory as she does. But can she really make such a claim? Quine’s answer is negative.

The reason for Quine’s negative answer can be put as follows. Suppose that there is a second translator who, like the first translator, holds A for himself and aims to translate the native’s theory. As with our first translator, he matches the observation sentences of A with the observation sentences of the native’s theory in terms of the sentences’ having the same stimulus meaning. Having done that, however, he decides to attribute theory B to the native. The difference between A and B is this: they are different but empirically equivalent theories. Both theories share the same observation sentences but differ with regard to, for instance, some of their auxiliary assumptions, theoretical sentences, and the like. Neither the first nor the second translator really believes in B; they both find B to be too odd, complex, or unnatural to believe. Nonetheless, while the first translator attributes A to the native, the second translator, for whatever reason, attributes B to him. Quine’s crucial claim is that although the translators’ theory is A, that is, although they are both working from within one theory, they are still free to attribute either A or B to the native as the translation of his theory. There is no objective criterion on the basis of which they can decide which of A or B is the theory which the native, as a matter of fact, believes, since both A and B are alike with regard to the totality of facts about stimulus meaning. Therefore, as Quine’s physicalism implied, we should conclude that there is no fact of the matter as to which of A or B is to be chosen as the correct translation of the native’s total theory. Despite the fact that the totality of facts is fixed within A, the translators still have freedom of choice between rival translations of the native’s theory. This underdetermination with regard to rival translations is additional to our old underdetermination of theory by evidence. The translation of the native’s theory is thereby indeterminate.
This argument is called the “argument from above” since it does not start by investigating how the reference of sub-sentential parts of sentences is fixed; it rather deals with the whole theory and the translation of its whole sentences.

As with the argument from below, the argument from above too has been subject to a variety of objections. Chomsky’s objection (Chomsky 1968) has been reviewed, but it is worth briefly reviewing the general form of two further objections. Robert Kirk (Kirk 1986) objects that Quine’s argument from above is not successful because it has to rely on the conclusion of the argument from below. In other words, it faces a dilemma: either it presupposes the argument from below, in which case it would be a question-begging argument because the argument from above was supposed to be an independent argument, or it does not presuppose the argument from below, in which case it fails to establish Quine’s desired skeptical conclusion. The reason for the latter is that Quine’s claim that the translator’s only justification for matching the observation sentences is that they have the same stimulus meaning does not, in combination with underdetermination, result in the indeterminacy of translation, unless we read this claim as implying that these matchings form the totality of facts about translation and that they fail to pin down one unique translation of the native’s theory. By reading it that way, however, we have already reached the indeterminacy of translation without using the underdetermination thesis at all.

A different sort of objection has been made by Blackburn (1984), Searle (1987), and Glock (2003), according to which the skeptical conclusions of Quine’s argument (that there is no fact of the matter about meaning and translation, and that indeterminacy applies at home too) lead to an entirely unacceptable result: a denial of first-person authority. It can be intuitively conceded that speakers have first-person authority over the meaning of their own utterances and the content of their own mental states, such as their beliefs. They know what they mean and believe, and they know this differently from the way others, like the translators, know such meanings and beliefs. But, if radical translation starts at home, then indeterminacy arises at home, too. This means that, for the speaker too, it would be indeterminate what her own words mean. This implication is highly counterintuitive.

Let’s end our discussion of Quine by removing a potential confusion about the skeptical conclusions of his arguments. Although Quine claims that, as a matter of fact, translation is indeterminate, he does not claim that, in practice, translation is impossible. After all, we do translate other languages and understand what others mean by their words. This means that we should distinguish between two claims here. First, Quine has argued that the traditional conception of meaning and translation is to be abandoned: insofar as our concern is theoretical and philosophical, there is no such thing as the one correct translation of a sentence. But, from a practical and pragmatic point of view, translation is perfectly possible. The reason is that although there is no objective criterion on the basis of which we can pick out one correct translation of a sentence, we have good pragmatic reasons to choose between the rival ones. The translator translates the native’s utterances with “empathy”. She treats the native as a rational person who, like her, believes that there are rabbits, trees, and so forth, rather than only rabbit-parts or tree time-slices. This maxim can be called Quine’s version of “the principle of charity” (Quine 1973). Our translator would choose “rabbit” as the translation of “gavagai” simply because this translation makes the communication between her and the native smoother. But such norms cannot tell us which translation is, as a matter of fact, correct. Although this maxim is known as “the principle of charity”, it was not Quine who coined the term (though he started using a version of it in Word and Object (Quine 1960), and its role gradually became more important in his later works). As Quine himself mentions, it was Neil L. Wilson (Wilson 1959, 532) who gave the name “the principle of charity” to a similar maxim.
More or less similarly to Wilson, Quine used it to emphasize that if the translator notices that her translation of the native’s sentences results in an abnormally large number of strange or “silly” translations, it is more likely that something is wrong with her translation than with the native himself. The translator needs to choose those methods that lead to the attribution of the largest number of true sentences to the native. We are to maximize the agreement between us and the native with regard to holding statements true. As we will see, however, Davidson’s use of this principle is more extensive and substantial than Quine’s.

3. Davidson’s Semantic Project

Although Donald Davidson was inspired by Quine’s project of radical translation, he preferred to focus on what he calls “radical interpretation” (Davidson 1973a), (Davidson 1974b). Radical interpretation manifests itself in Davidson’s endeavor to uncover, from a theoretical point of view, how speakers’ ability to speak, and to understand the speech of others, can best be modeled or described. While Quine was interested in studying how the process of translation can proceed and what can be extracted from it regarding meaning determination and linguistic understanding, Davidson’s interest is wider. He is concerned with how a theory of meaning can be constructed for a language, a theory which can systematically entail the truth-conditions of all sentences of that language. His view of meaning is thus truth-conditional, according to which we understand a sentence, or what it means, by knowing under what condition the sentence would be true (Davidson 1967). For instance, the sentence “it’s raining” is true if and only if it is raining and false if it is not. We say that the truth-condition of the sentence is that it is raining. Someone who understands this sentence knows under what condition it would be true. If we succeed in constructing such a theory of meaning, which correctly specifies the truth-conditions of all sentences of a language, we have interpreted it, and we can, theoretically speaking, treat the speakers of that language as if they know such a theory, as if they speak in accordance with it and understand each other on that basis.

There are important differences between translation and interpretation. One difference is that, in the process of translating, our aim is to pair sentences of our language with sentences of the native’s on the basis of having the same meaning. In interpretation, our aim is to give the truth-conditions of the native’s sentences by using sentences of our own language. Obviously, the concept of truth has an important role to play in such a view. It is supposed to help us to clarify the concept of meaning. Davidson takes Alfred Tarski’s Truth-Theory, or Tarski’s definition of truth, to be the best tool for building his truth-based theory of meaning (Davidson 1967), (Davidson 1973b).

a. Davidson’s Use of Tarski’s Truth-Theory

For Davidson, any adequate theory of truth, if it is supposed to work as a theory of meaning entailing the right sort of truth-conditions for all sentences of a language, must meet certain constraints. One of the most important ones is to satisfy Tarski’s Convention-T, according to which our theory must entail all and only true instances of what is called Tarski’s “T-schema”:

(T) “s” is true in L if and only if p.

Our theory must entail true sentences in the form of (T) for all sentences of the native’s language L. Here, the native’s language is called the “object-language”: the language for the sentences of which our theory entails truth-conditions. Our own language is called the “meta-language,” the language whose sentences are used to specify such truth-conditions. In (T), the sentence in the quotation marks, “s”, mentions a sentence in the native’s language, and “p” is a sentence from our language that is used to give the truth-condition of the mentioned sentence.

Suppose that the object-language is German and the mentioned sentence is “Der Schnee ist weiss”. Which sentence in our language should be chosen to replace “p” in order to give the truth-condition of the German sentence? An important point to note here is that Tarski’s intent was to define truth (or the truth-predicate, “is true”) for the object-language. In order to do so, he used the notion of translation, or sameness of meaning, and claimed that what should be replaced by “p” must be either “s” itself (if the object-language is part of the meta-language) or the translation of “s” (if the object-language and the meta-language are different). Thus, the sentence which is put in place of “p” should be “snow is white”. Having done that, we come up with the following instance of the T-schema, or the following “T-sentence”:

(T1) “Der Schnee ist weiss” is true in German if and only if snow is white.

Tarski believed that each such T-sentence yields a partial definition of truth, since it defines truth for a particular sentence. A conjunction of all such instances will provide us with a definition of the concept of truth for the object-language. As a historical point, we should note that Tarski’s own goal was to define truth for a formal language (that is, one wholly translated into first-order predicate logic), “L”, in a meta-language which contained L together with a few additional terms. He was very doubtful whether such a project could be consistently applied to the case of natural languages at all, mostly because natural languages can lead to a variety of paradoxes, such as the liar paradox. Although admitting that Tarski was suspicious of extending such a project to the case of natural languages, Davidson nevertheless attempts to carry out this project. He suggests that truth can potentially be defined for one natural language (as the object-language) in another (as the meta-language). This is the reason why the examples used in this section are from natural languages, such as English, rather than from purely formal ones.

Tarski’s theory works recursively: it entails T-sentences like (T1) systematically from a finite set of axioms, as well as a finite set of rules for how different sub-sentential parts, or simple expressions, can be put together to form more complex expressions. The axioms’ job is to assign certain semantic properties to different parts of sentences: they assign reference to terms and satisfaction conditions to predicates. For instance, for (T1), we would have the following two axioms:

(A1) “Der Schnee” refers to snow.

(A2) “ist weiss” is satisfied by white things.
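
The way such axioms recursively entail a T-sentence can be pictured with a toy model. This is only an illustrative sketch, not Tarski's or Davidson's own apparatus; the world model and the simple subject-predicate sentence format are assumptions made for the example.

```python
# Toy model of a Tarski-style truth-theory for a tiny fragment of German.
# The two axiom tables below mirror (A1) and (A2); everything else is an
# illustrative assumption.

# The "world", standing in for the meta-language's resources.
white_things = {"snow", "chalk"}

# Axiom (A1): assigns reference to singular terms.
reference = {"Der Schnee": "snow"}

# Axiom (A2): assigns satisfaction conditions to predicates.
satisfaction = {"ist weiss": lambda x: x in white_things}

def is_true(sentence):
    """A subject-predicate sentence is true iff the referent of its term
    satisfies its predicate: the analogue of deriving a T-sentence."""
    term, predicate = sentence
    return satisfaction[predicate](reference[term])

# The analogue of (T1): "Der Schnee ist weiss" is true iff snow is white.
print(is_true(("Der Schnee", "ist weiss")))
```

The point of the recursion is finitude: a finite stock of axioms plus composition rules yields truth-conditions for indefinitely many sentences, which is what lets the theory serve as a model of a speaker's compositional competence.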

Nonetheless, Davidson, who wants to specify meaning in terms of truth, cannot simply follow Tarski in claiming that the two sentences appearing in the T-sentences must have the same meaning; doing so presupposes the concept of meaning. Therefore, Davidson makes a weaker claim: our theory must produce all and only true T-sentences. That is to say, “p” should be replaced by a sentence that is true if and only if the object-language’s sentence is true. But it is easy to see why this constraint would not be strong enough to succeed in having the truth-theory entail the right sort of truth-conditions for the sentences of the object-language. Suppose that the object-language is English and so is the meta-language. In this case, the following T-sentences are both true:

(T2) “Snow is white” is true if and only if snow is white.

(T3) “Snow is white” is true if and only if grass is green.

The above T-sentences are true simply because both “snow is white” and “grass is green” are true. (The same would hold if we had “Der Schnee ist weiss”, rather than “snow is white”, on the left-hand side of (T2) and (T3), because our assumption is that “Der Schnee ist weiss” is a true sentence of German.) However, our theory must entail correct truth-conditions. Assume that the theory entailing (T2) is Φ and the theory entailing (T3) is Ψ. Both Φ and Ψ meet Davidson’s requirement of entailing only true T-sentences and so would have to count as correct theories. But we know that Ψ certainly does not give the correct truth-condition of “snow is white”, and a mere truth-theory cannot take this fact for granted. Thus, we still need more constraints on our theory.
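
Why the truth of a T-sentence alone cannot distinguish the interpretive case from the non-interpretive one can be seen by reading the biconditional materially. The sketch below is only an illustration; the stipulated truth-values are assumptions of the example.

```python
# A material biconditional holds whenever its two sides share a truth-value.
# Both truth-values below are stipulated (both sentences happen to be true).

snow_is_white = True   # truth-value of the mentioned sentence "Snow is white"
grass_is_green = True  # truth-value of the unrelated sentence used in (T3)

t2 = (snow_is_white == snow_is_white)   # (T2): the interpretive pairing
t3 = (snow_is_white == grass_is_green)  # (T3): true, but not interpretive

print(t2, t3)
```

Since both biconditionals come out true, merely requiring true T-sentences cannot rule out pairings like (T3), which is why Davidson needs the further empirical constraints discussed next.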

Insofar as the discussion of radical interpretation is concerned, the most important constraint which Davidson imposes on his theory is that the totality of the theory’s T-sentences must optimally fit the evidence with regard to the speaker’s responses (Davidson 1973a). This is an empirical constraint: the theory should be treated as an empirical theory which is employed by an interpreter to specify the meaning of the speaker’s utterances. It should be constructed, checked, and verified as an interpretive theory which produces interpretive T-sentences and axioms. By an “interpretive theory” Davidson means the theory that entails correct truth-conditions for the speaker’s sentences, considering the evidence the interpreter has access to in the process of radical interpretation.

b. Davidson’s Radical Interpretation Scenario

Imagine that someone is sent again to the jungle to meet our native speaker but, this time, he is given the task of interpreting, Davidson-style, the native’s utterances, that is, finding appropriate sentences in his own language that can be used to correctly specify the truth-conditions of the native’s sentences. In order to do so, he is required to construct a theory which entails the truth-conditions of the native’s sentences. Call him the “interpreter”. The Davidsonian interpreter, like the Quinean translator, starts his interpretation from scratch: he has no prior knowledge of the native’s language or her mental states.

Like radical translation, radical interpretation primarily focuses on the native’s observation sentences. A difference between Quine and Davidson emerges at this point. Although Quine took stimulations, or “proximal stimulation”, to be basic, Davidson takes the ordinary objects and events in the world, or “distal stimuli”, to be basic (Davidson 1999a). Another important difference between these two projects concerns the sort of evidence the interpreter is allowed to work with. Quine limited it to purely behavioral evidence, that is, the speaker’s assent or dissent. Davidson agrees that what the interpreter can ultimately rely on is nothing but observing the native’s verbal behavior, but since he rejects behaviorism, he claims that we can allow the interpreter to have access to what he calls the “holding-true attitudes” of the speaker (Davidson 1980). These are attitudes which the speaker possesses towards her own sentences; when the speaker utters, or assents to, a sentence on a specific occasion, she holds the sentence to be true on that occasion. For Davidson, the interpreter knows this much already, though he emphasizes that from this assumption it does not follow that the interpreter thereby has access to any detailed information about what the speaker means and believes.

Suppose that our native speaker, S, utters the sentence “Es regnet” at time t. The evidence the interpreter can work with would have the following form:

(E) S holds true “Es regnet” at time t if and only if it is raining at time t near S.

For Davidson, belief and meaning are interdependent. When a speaker utters a sentence, she expresses her thoughts, especially her beliefs. This interdependence between meaning and belief manifests itself in his emphasis on the role of the speaker’s holding-true attitudes in his project since, according to Davidson, a speaker holds a sentence to be true partly because of what those words mean in her language and partly because of the beliefs she has about the world. This means that if we know that the speaker holds a sentence to be true on an occasion and we know what she means by it, we would know what she believes, and if we know what she believes, we can infer what she means by her utterance.

Our radical interpreter, therefore, has a difficult job to do. He should determine the meaning of the native’s utterances and, at the same time, attribute suitable beliefs to her. This leads to what is called the “problem of interpretation”, according to which the interpreter, on the basis of the same sort of evidence, like (E), has to determine both meaning and belief. Obviously, one of these two variables must be fixed; otherwise, interpretation cannot take off. Davidson attempts to solve this problem by appealing to his version of the principle of charity (Davidson 1991). According to this principle, as employed by Davidson, the interpreter must do her best to make the native’s behavior as intelligible as possible: she ought to aim at maximizing the intelligibility (and not necessarily the truth) of the native’s responses in the process of interpreting them. The interpreter takes the native to be a rational agent whose behavior is intelligible and whose patterns of beliefs, desires, and other propositional attitudes are more or less similar to ours. Obeying such a rational norm does not necessarily result in attributing true beliefs to the subject all the time; sometimes attributing a false belief to the subject may make his behavior more intelligible and comprehensible. This reveals another difference between Davidson and Quine with regard to their use of such a maxim or principle. More is said about this in the next section.

For Davidson, when it is raining around the native and she utters “Es regnet”, the interpreter takes her to be expressing the belief that it is raining. Charity helps to fix one of the above two variables, that is, the belief part. On the basis of the evidence (E), and with the help of the principle of charity, the interpreter can come up with the following hypothetical T-sentence:

(T4) “Es regnet” is true in S’s language if and only if it is raining.

As with the case of radical translation, here too (T4) is to be treated as hypothetical for now because one observation would not be enough for the interpreter to confirm that (T4) gives the correct truth-condition of the native’s sentence. The process of interpretation is a holistic process: terms like “regnet” and “Schnee” appear in many different sentences of the native’s language. The interpreter must go on and check if (T4) remains true when the native utters “Es regnet” on different occasions. As the interpretation proceeds, the interpreter gradually comes to identify different sub-sentential parts of the native’s sentences and thereby constructs specific axioms which assign reference to the terms and satisfaction conditions to the predicates of the native’s language (such as (A1) and (A2) above). In this case, the interpreter would be able to verify whether “Schnee” in the native’s language refers to snow or grass. The interpreter would then be able to throw away the true-but-not-interpretive T-sentences like the following:

(T5) “Der Schnee ist weiss” is true if and only if grass is green.

The reason is that the interpreter has checked, in several cases, that the native uses “Schnee” where there is snow, not grass, and that “… ist weiss” is used by the native where there are white things, not green things. The correct truth-condition of the native’s sentence thus seems to be that snow is white, not that grass is green.
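The holistic checking just described can be illustrated with a toy model. Everything below (the mini-language, the observation records, and the `confirmed` test) is invented for illustration; it is not Davidson’s formalism, only a rough sketch of how repeated observation can weed out rival truth-condition hypotheses.

```python
# Toy sketch of holistic T-sentence checking (all data invented).
# Each observation pairs a sentence the native holds true with the
# facts obtaining near her on that occasion.
observations = [
    ("Es regnet", {"raining"}),
    ("Der Schnee ist weiss", {"snow", "white_stuff"}),
    ("Es regnet", {"raining"}),
    ("Der Schnee ist weiss", {"snow", "white_stuff"}),
]

def confirmed(sentence, truth_condition):
    """A hypothetical T-sentence survives only if its proposed
    truth-condition obtains on every occasion on which the native
    holds the sentence true."""
    return all(truth_condition <= facts
               for s, facts in observations if s == sentence)

# (T4)-style hypothesis: "Es regnet" is true iff it is raining.
assert confirmed("Es regnet", {"raining"})
# A rival hypothesis pairing "Es regnet" with snow fails the check.
assert not confirmed("Es regnet", {"snow"})
```

Nothing here captures the compositional axioms, of course; the sketch only models the idea that a hypothesis is tested against many occasions of use rather than a single observation.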

At the end of this long process, the interpreter eventually comes up with a theory of interpretation which correctly interprets the native’s verbal behavior. It systematically entails correct truth-conditions for all sentences of the native’s language. But does this mean that indeterminacy has no chance to emerge in this process?

4. Davidson on the Indeterminacy of Interpretation

Davidson believes that some degree of indeterminacy survives in the process of radical interpretation. First of all, the inscrutability of reference cannot be avoided. Our words and things in the world can be connected in different ways, and we may never be able to tell which way of connecting words with things is right (Davidson 1997). If one way works, an infinite number of other ways will work as well, though some of them may seem strange or too complex to us. Davidson gives an example.

Predicates are said to have satisfaction conditions: they are satisfied, or are true of, certain things only. For instance, the predicate “… is red” is satisfied by red things, not blue things. Nonetheless, it seems that, for any predicate, we can find other predicates which have the same sort of satisfaction condition. In this case, the truth-value of the sentences in which such predicates appear would remain unchanged: if they are true (or false), they remain true (or false). But, since the totality of evidence is the same for all such cases, no evidence can help to decide which satisfaction condition is right. For example, suppose we have the following axioms:

(A3) “Rome” refers to Rome.

(A4) “… is a city in Italy” is satisfied by cities in Italy.

From these axioms, we can reach the following T-sentence:

(T6) “Rome is a city in Italy” is true if and only if Rome is a city in Italy.

This is a true T-sentence. Now consider an alternative set of axioms:

(A5) “Rome” refers to an area 100 miles to the south of Rome.

(A6) “… is a city in Italy” is satisfied by areas 100 miles south of cities in Italy.

From these, we can reach (T7):

(T7) “Rome is a city in Italy” is true if and only if the area 100 miles south of Rome is an area 100 miles south of a city in Italy.

The point is that if (T6) is true, (T7) is also true, and vice versa. Not only this, but there can be indefinitely many such mapping relations. The reference of “Rome” is thereby inscrutable: there is no way to determine which reference for “Rome”, and which satisfaction condition for “… is a city in Italy”, is to be chosen as the correct one. As before, such terms appear in a potentially indefinite number of sentences and thus, the inscrutability of reference affects the whole language. One interpreter may come up with a theory which takes “Rome” to be referring to Rome, while another may come up with a theory which takes it to be referring to an area 100 miles to the south of Rome. Both theories work equally well, provided that each interpreter sticks to her own theory. Obviously, we cannot freely switch between different methods of interpretation. Rather, once it is fixed within one theory that “Rome” refers to Rome, the term has this reference in any sentence in which it occurs.
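The Rome example can be mimicked in a few lines of code. The shifted scheme below is a toy stand-in for the alternative axioms (A5) and (A6); the two-city domain and the `shift` function are invented for illustration only.

```python
# Toy sketch of the inscrutability of reference (domain invented).
cities_in_italy = {"Rome", "Milan"}

def shift(x):
    # The systematic remapping: each thing goes to "the area 100
    # miles south of" it (modeled here as a string tag).
    return "area 100 miles south of " + x

# Scheme 1 ((A3)/(A4)): "Rome" refers to Rome; the predicate is
# satisfied by cities in Italy.
def true_in_scheme1(name):
    return name in cities_in_italy

# Scheme 2 ((A5)/(A6)): reference and satisfaction condition are both
# shifted, so whole-sentence truth-values are preserved.
def true_in_scheme2(name):
    return shift(name) in {shift(c) for c in cities_in_italy}

# No sentence of the form "N is a city in Italy" distinguishes the schemes.
for name in ["Rome", "Milan", "Paris"]:
    assert true_in_scheme1(name) == true_in_scheme2(name)
```

Because the shift is applied uniformly to names and predicates alike, no sentence-level evidence can distinguish the two schemes; that behavioral symmetry is what the argument trades on.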

Davidson takes this sort of indeterminacy to be harmless and leading to no skepticism about meaning since it is similar to the innocuous familiar fact that we can have different ways of measuring the temperature, height, or weight of an object (Davidson 1997). When we want to tell what the temperature of an object is, we face different scales for measuring it. What we should do is to decide whether we want to use the Fahrenheit scale or the centigrade one. For Davidson, the inscrutability of reference should be understood in a similar way: there are always different methods of interpreting a speaker’s verbal behavior; what we must do is choose one of these rival methods and hold onto it. These different theories of interpretation are all compatible with the native’s behavioral evidence. But as there is no contradiction regarding the existence of different scales of measuring temperature, there would be no contradiction regarding the existence of different methods of interpreting the speaker’s behavior.
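Davidson’s measurement analogy can also be put in code. The figures below are arbitrary; the point is only that two “scales” describe one and the same underlying state, and each description fixes the other exactly.

```python
# Minimal sketch of the measurement analogy (numbers invented).
# One underlying physical state, two equally good ways of reporting it.
def celsius(state):
    return state["kelvin"] - 273.15

def fahrenheit(state):
    return celsius(state) * 9 / 5 + 32

state = {"kelvin": 300.15}
c, f = celsius(state), fahrenheit(state)

# The two reports differ, but neither is "the" correct one, and each
# can be recovered from the other without loss.
assert abs((c * 9 / 5 + 32) - f) < 1e-9
print(round(c, 2), round(f, 2))  # 27.0 80.6
```

As with rival interpretation theories, the choice between the two scales is a matter of convention, not a further fact about the object measured.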

A second sort of indeterminacy also emerges in the process of radical interpretation which, contrary to the inscrutability of reference, can affect the sentences’ truth-values. Suppose that you are interpreting someone who often applies “blue” to the objects which most people apply “green” to, such as emeralds. Davidson’s claim is that you, as her interpreter, face two options. (1) You can attribute to her true beliefs about the world by taking her to be in agreement with you regarding the things there are in the world and the properties they possess: you can treat her as believing that there are emeralds and they are green. But, since she applies “blue” to these things, you have to take her to mean something different by the term “blue”. You do so with the purpose of making her behavior intelligible. Thus, you interpret her utterance of “blue” to mean green, not blue. On this option, the speaker’s utterance of “that emerald is blue” is true because you have treated her as believing that emeralds are green and as meaning green by “blue”. Thus, what she means by her utterance is that that emerald is green. (2) The second option is to interpret her to mean the same thing as you mean by “blue”, that is, to be in agreement with you on what the correct use of “blue” is. For both of you, “blue” applies to blue things and thus means blue. Again, since she applies “blue” to the things to which you apply “green”, you take her to have some different (false) beliefs about the world in order to make her behavior intelligible. Thus, while you interpret her utterance of “blue” to mean blue, you attribute to her the false belief that emeralds are blue. On this option, however, the speaker’s utterance of “that emerald is blue” is false because what she claims is that that emerald is blue. According to Davidson, you, as the interpreter of her speech, can choose any of the above two options and, as before, it is enough that you continue interpreting her in that way. 
There is nothing in the world that can help you to decide which method is right (Davidson 1997). Rather, all there is for the interpreter to appeal to is the rational norms dictated by the principle of charity, which, as we can see now more clearly, may even result in attributing some false beliefs to the subject in order to make her behavior more intelligible. Therefore, Davidson believes that the inscrutability of reference and the indeterminacy with regard to the attributions of meaning and belief to the speaker arise in the process of radical interpretation too.

There is, however, a final point which is worth noting with regard to Davidson’s treatment of indeterminacy. For him, indeterminacy does not entail that there are no facts of the matter about meaning (Davidson 1999b). Rather, he treats indeterminacy as resulting in an epistemological problem only, with no ontological consequences. His reason is that if the overall pattern of the speaker’s behavior is stable, there can be alternative ways of describing it, that is, there can always be different theories of interpretation. Again, just as there were different ways of measuring temperature, there can be different ways of interpreting the speaker’s behavior. As we did not question the reality of temperature, we should not question the reality of meaning. Davidson, thus, departs from Quine on this matter: while Quine thought that the indeterminacy of translation has destructive ontological consequences, Davidson thinks that the indeterminacy shows only that there can be different ways of capturing facts about meaning.

In what follows, the article considers two important analytic philosophers who have been inspired by Davidson’s and Quine’s projects, David Lewis and Daniel Dennett.

5. Lewis on Radical Interpretation

David Lewis, in his famous paper “Radical Interpretation” (Lewis 1974), agrees with Davidson on the general claim that the aim of the process of radical interpretation is to determine what a speaker, say, Karl, means by his utterances and what he believes and desires. Lewis also agrees with Davidson and Quine that radical interpretation starts from scratch: at the outset, the interpreter has no prior information about Karl’s beliefs, desires, and meanings. Not only this, but our knowledge of Karl is also taken by Lewis to be limited to the sort of knowledge we can have of him as a physical system. Thus, on this latter point, he leans to Quine, rather than Davidson, and he invites us to imagine that we interpreters have access to the totality of physical facts about Karl. Lewis’s question is, what do such facts tell about Karl’s meanings, beliefs, and desires?

Lewis characterizes the “problem of radical interpretation” as follows. Suppose P is the set of all such facts about Karl viewed as a physical system, for instance, facts about his movements, his causal interactions with the world, his behavioral responses to others, the impact of physical laws on him, and so forth. Suppose also that we have two sets of specifications of Karl’s propositional attitudes, Ao and Ak. Ao is the set of specifications of Karl’s beliefs and desires as expressed in our language (for example, when we specify what Karl believes by using the English sentence “snow is white”), and Ak is the set of specifications of Karl’s beliefs and desires as expressed in Karl’s language (for example, given that Karl’s language is German, Karl’s belief (that snow is white) is expressed by the German sentence “Der Schnee ist weiss”). Finally, suppose that M is the set of our interpretations of Karl, that is, the specifications of the meaning of Karl’s utterances (for example, statements like “Karl means snow is white by his utterance of ‘Der Schnee ist weiss’”). Lewis’s question is: How are P, Ao, Ak, and M related?

Some points about these sets are to be noted. (1) As with Davidson, Lewis wishes to determine the truth-conditions of Karl’s sentences. So, the interpreter looks for correct interpretations such as “‘Der Schnee ist weiss’ as uttered by Karl is true if and only if snow is white”. (2) Following Davidson, Lewis also demands that these truth-conditions be entailed in a systematic way from a finite set of axioms. (3) Contrary to Davidson, however, Lewis puts a heavy emphasis on beliefs and desires, and claims that our most important goal in radical interpretation is to determine them first. This shift in emphasis leads us to two further points about the relation between the views of Lewis and Davidson. (a) Lewis agrees with Davidson that beliefs and desires play a significant role both in our treatment of the speaker as a rational agent and in our explanation of his behavior as an intentional action. For Davidson, a speaker is rational if she possesses a rich set of interrelated propositional attitudes, such as beliefs, desires, hopes, fears, and the like (Davidson 1982). An agent’s action can be explained as intentional if it can be properly described as caused by a belief-desire pair (Davidson 1963). For instance, to use Davidson’s example, suppose that someone adds sage to the stew with the purpose of improving its taste. This action is intentional if we can explain it as caused by the subject’s desire to improve the taste of the stew and the belief that adding sage would do that. (b) Nonetheless, Davidson did not take beliefs and desires (or, in general, any propositional attitudes) to be superior to meaning. He thought that meanings and beliefs are so interdependent that interpreters have to determine both at the same time. Lewis treats beliefs and desires as basic and claims that meanings can be determined only when the speaker’s beliefs and desires are determined first.
This view is related to his analysis of success in our linguistic practices in terms of conventions and the crucial role speakers’ beliefs play in turning a sort of regularity-in-use into a convention (Lewis 1975). (4) Putting aside his subtle view of the notion of convention, the last point to note is that Lewis agrees with Quine rather than Davidson regarding the idea that the problem interpreters seek to answer in the process of meaning-determination is more than just an epistemological problem. The concern is not how P, the set of all physical facts about Karl, determines facts about Karl’s meanings, beliefs, and desires. Rather, one wants to know what facts P is capable of determining at all, that is, whether the totality of physical facts can fix the facts about what Karl means, believes, and desires.

Consider now how the views of Lewis and Davidson differ with regard to the constraints on the process of radical interpretation and the degree of indeterminacy which may survive once the process is so constrained.

a. Lewis’s Constraints on Radical Interpretation

Lewis believes that the process of radical interpretation needs to be placed under more constraints than those introduced by Davidson. These extra constraints concern how meanings and propositional attitudes are related to one another, to the behavior of the speaker, and to sensory stimulations. It is meeting these constraints that makes radical interpretation possible. Lewis (1974) proposes six constraints on radical interpretation, only some of which are shared by Davidson:

(1) The Principle of Charity. The way Lewis characterizes this principle is slightly different from Davidson’s. According to this principle, in order to make Karl’s behavior most intelligible, the interpreter should interpret Karl’s behavior (as specified in P) by treating him as believing what he ought to believe and desiring what he ought to desire. Again, this does not mean that in order to make Karl’s behavior most intelligible, only true beliefs are to be attributed to him; rather, sometimes, treating him as holding some false beliefs may do much better in describing his behavior as rational, intelligible, and comprehensible. What Karl ought to believe and desire, from the interpreter’s point of view, is generally what she believes and desires (given by Ao). But, again, considering the particular circumstances under which Karl’s behavior is interpreted, as well as the available evidence, the values Karl accepts, and so forth, the interpreter should make room for attributing some errors or false beliefs to him.

(2) The Rationalization Principle. Karl should be interpreted as a rational agent. The beliefs and desires the interpreter attributes to Karl (in Ao) should be capable of providing good reasons for why Karl responds in the way he does. Nonetheless, it does not mean that there are thereby some sort of intentional (non-physical) facts about Karl. The facts about Karl are still limited to the physical ones specified in P. This principle rather implies that the interpreter attributes those desires and beliefs to Karl that not only make Karl’s behavior intelligible to us, but also provide him with reasons for acting in that way. For this reason, the rationalization principle and the principle of charity are different.

(3) The Principle of Truthfulness. Karl is to be considered a truthful speaker, that is, a speaker who is willing to assert only what he takes to be very probably true. This principle constrains the sort of desires and beliefs the interpreter is allowed to attribute to Karl (in Ao) in order to interpret his utterances (and specify their meaning in M).

(4) The Principle of Generativity. The truth-conditions which the interpreter assigns to Karl’s utterances (in M) must be finitely specifiable, uniform, and simple. The interpreter must do her best to avoid assigning too complex, odd, or unnatural meanings to Karl’s sentences, as well as the meanings that are not finitely and systematically inferable from a finite set of axioms.

(5) The Manifestation Principle. Karl’s attributed beliefs and desires should be capable of manifesting themselves in his behavior. Karl’s beliefs and other attitudes should be recognizable particularly in his use of his language. This means that when there is no evidence to the effect that Karl is self-deceived or lacks a proper conception of what meaning and belief are, we should be able to extract beliefs and other propositional attitudes from Karl’s behavioral dispositions to respond to the world.

(6) The Triangle Principle. Karl’s meanings, beliefs, and desires should not change when they are specified in the interpreter’s language, whatever the interpreter’s language is. This principle may appear a bit puzzling. Suppose that Karl utters “Der Schnee ist weiss” and our interpreter, who speaks English, interprets it as follows:

(M1) “Der Schnee ist weiss” as uttered by Karl is true in German if and only if snow is white.

The truth-condition of Karl’s utterance is thus that snow is white. Suppose that another interpreter, Francesco, who speaks Italian, interprets Karl’s utterance as follows:

(M2) “Der Schnee ist weiss”, proferita da Karl, è vera in Tedesco se e solo se la neve è bianca.

The truth-condition of Karl’s utterance is now given by the Italian sentence la neve è bianca. Lewis’s point is that what Karl means by his utterance and what belief he expresses by it must remain the same, no matter in what language they are specified. We can see this point by considering the way our English interpreter would interpret Francesco’s sentence “la neve è bianca”, used in (M2) to give the truth-condition of Karl’s utterance:

(M3) “La neve è bianca” as uttered by Francesco is true in Italian if and only if snow is white.

The truth-condition of Francesco’s sentence is that snow is white. Considering (M1) – (M3), one can see that what Karl expresses by his German sentence, that is, that snow is white, remains the same no matter in what language it is specified.

On the basis of these six principles, Lewis evaluates Davidson’s project of radical interpretation. Davidson’s aim was to solve the problem of determining Karl’s beliefs and meanings at the same time. For Lewis, Davidson attempts to solve this problem by appealing to the Triangle Principle, the Principle of Charity, and the Principle of Generativity only. That is to say, what Davidson is concerned with is that the truth-conditions of Karl’s sentences are correctly specified in the interpreter’s language (the Triangle Principle), that such assignments of meaning are done with the purpose of maximizing the intelligibility of Karl’s behavior via attributing proper beliefs to him (the Principle of Charity), and finally, that the truth-conditions are inferred in a systematic way from a finite set of axioms (the Principle of Generativity). We temporarily fix beliefs, extract meanings, ask the interpreter to re-check her interpretation with further behavioral evidence, revise the beliefs if necessary, and re-check the interpretation. Lewis is not satisfied with Davidson’s method because, for him, Davidson has missed the other three principles. Davidson especially fails to take into account the Principle of Truthfulness and the Rationalization Principle, which constrain Karl’s beliefs and desires and which consequently lead the interpreter to view Karl’s behavior as an intentional action in advance. Davidson has put too much emphasis on the language part, rather than the mental part.

Lewis’s method is almost the opposite. It starts by considering Karl’s behavior as what forms the evidential basis for interpretation, but it considers such behavior in the light of the Rationalization Principle and the Principle of Charity. Karl’s behavior is taken to be already rationalized. Karl’s behavior can be treated in this way if it allows for attributions of those beliefs and desires to him that we interpreters ourselves often hold; only evidence to the contrary would force us to reconsider whether his behavior is rational. Karl’s linguistic behavior, his utterances, are simply part of the history of his behavior. On this basis, we are then allowed to assign truth-conditions to Karl’s utterances by employing the Principle of Truthfulness. That is, we view Karl as a rational agent who asserts a sentence only when he believes it is true. The Principle of Generativity constrains our attributions of truth-conditions to Karl’s sentences by demanding systematicity, coherence, and consistency in such attributions. Finally, if other principles are met, the Triangle Principle assures us that Karl’s meanings, beliefs, and desires remain the same when they are specified in the interpreter’s language.

The question, however, is whether these extra constraints can avoid the emergence of the inscrutability of reference and the indeterminacy of interpretation.

b. Lewis on the Indeterminacy of Interpretation

Lewis believes that indeterminacy, at least in its strong and threatening form, can be avoided, though some degree of mild or moderate indeterminacy would inevitably survive. His position changed from that in his earlier works on the topic, especially in “Radical Interpretation” (Lewis 1974). There Lewis admits that it is reasonable to think that there would probably remain some rival systems of interpretation which are compatible with the set of all behavioral evidence and which can be considered as correct. He uses Quine’s gavagai example to clarify the sort of indeterminacy which he thinks may appear in the process of radical interpretation. As he puts it, “[w]e should regard with suspicion any method that purports to settle objectively whether, in some tribe, ‘gavagai’ is true of temporally continuant rabbits or time-slices thereof. You can give their language a good grammar of either kind—and that’s that” (Lewis 1975, 21).

Nonetheless, even in this earlier period, Lewis emphasized that no “radical indeterminacy” can come out in radical interpretation. If a theory of interpretation meets all of the six conditions introduced above and it does so perfectly, then we should expect no radical indeterminacy to appear, that is, no rival theories of interpretation which all perfectly meet the six constraints but attribute radically different beliefs and desires to Karl. For Lewis, even if it can be shown somehow that such indeterminacy may emerge even when the six constraints are met, the conclusion of the attempt would not be that the interpreter should thereby accept the existence of such indeterminacy. Rather, what would be proved would be that all the needed constraints have not yet been found. Lewis, however, thinks that no convincing example has yet been offered to persuade us that we should take such a sort of radical indeterminacy seriously. He also denies that the existence of radical indeterminacy can be shown by any proof (Lewis 1974).

Lewis subsequently returned to the problem of indeterminacy due to Putnam’s argument in favor of radical indeterminacy.

i. Putnam’s Model-Theoretic Argument and Lewis’s Reference Magnetism

Putnam (1977) reformulates Quine’s thesis of the inscrutability of reference in a way in which it can no longer be treated as a mild indeterminacy. His argument, the “model-theoretic argument”, is technical and is not unpacked here. But consider its general conclusion. This argument attempts to undermine metaphysical realism, according to which there are theory-independent, language-independent, or mind-independent objects in the world to which our terms are linked in a certain way, and such a linkage makes our sentences about the world true. Such mind-independent objects can have properties which may go beyond the epistemic and cognitive abilities of human beings. For such realists, there can be only one true complete description of the world because the world is as it is, and things in it are the way they are and have the properties they have independently of how we can describe it. Now suppose that we have an epistemically ideal theory of the world, that is, a theory that meets all the theoretical and similar constraints we can impose on our theory, such as consistency, full compatibility with all evidence, completeness, and so forth. According to metaphysical realism, even such an ideal theory can come out false because, after all, it is the theory that is ideal for us, that is, ideal as far as an idealization of our epistemic skills allows. It is possible that this theory still fails to be the one which correctly describes the world as it really is. Putnam’s argument, however, aims to show that we interpreters can have different interpretations of the link between our words and things in the world which make any such ideal theory come out true.

As indicated in the above discussion of Quine, our theories of the world are nothing but a collection of interconnected sentences containing a variety of expressions. For Putnam, however, even if we can fix the truth-values (and even the truth-conditions) of all sentences of our language, it can still be shown that the reference of its terms would remain unfixed: there can always be alternative reference systems that are incompatible with one another but preserve the truth-values of the sentences of our language or theory. For instance, if we change the reference of “rabbit” from rabbits to undetached rabbit-parts, the truth-values of the sentences in which “rabbit” occurs would not change. This much was familiar from Quine’s argument from below. What Putnam adds is that realists have to concede that if there is only one correct theory, or description, of the world, this theory should thereby be capable of making the reference of our terms fixed: it should uniquely determine to what objects each term refers. Putnam’s question is how realists can explain such a reference-determining process. According to Putnam, no reference of any term can be determined; there can be many theories that come out true depending on how we intend to interpret the systematic connection between words and things. Anything you may like can potentially be taken to be the reference of any term, without causing any change in the truth-values of whole sentences. Not only this, but introducing any further constraints on your theory would inevitably fail to solve the problem because any new constraint introduces (at most) some new terms into your theory, and the reference of such terms would be susceptible to the same problem of indeterminacy. Putnam’s point is that we cannot think of the world as bestowing fixed reference on our terms: “We interpret our languages, or nothing does” (Putnam 1980, 482).

This argument is taken seriously by Lewis because, not only was he an advocate of realism, but he also supported “global descriptivism”, a view which Putnam’s argument undermines. For Lewis, we should interpret terms as referring to those things which make our theory come out true. If we attribute to the speaker the belief that Aristotle was a great Greek philosopher and if we concede, as Lewis does, that the content of this belief is expressible in the sentence “Aristotle was a great Greek philosopher”, then we should interpret “Aristotle” to refer to those things which make our theory containing it turn out true. For this to happen, it seems that “Aristotle” is to be interpreted as referring to a specific person, Aristotle. The sort of “causal descriptivism” which Lewis worked with implies that, first of all, there is a causal relationship between terms and their referents and, second of all, terms (like “Aristotle”) are so connected with their referents (such as Aristotle) by means of a certain sort of description, or a cluster of them, which we associate with the terms and which specifies how the terms and their referents are so causally linked. In this sense, “Aristotle” refers to Aristotle because it is Aristotle that satisfies, for instance, the definite description “Plato’s pupil and the teacher of Alexander the Great.” Global descriptivism states that the reference of (almost) all the terms of our language is determined in this way.

Putnam’s argument undermines this view because if the reference of our terms is indeterminate, the problem cannot be dealt with simply by introducing or associating further descriptions with those terms because no matter what such descriptions and constraints are, they would be nothing but a number of other words, whose reference is again indeterminate. Such words are to be interpreted; otherwise, they would be useless since they would be nothing but uninterpreted, meaningless symbols. But if they are to be interpreted, they can be interpreted in many different ways. Therefore, the problem, as Lewis puts it, is that there can be many different (non-standard) interpretations which can make our epistemically ideal theory turn out true: “any world can satisfy any theory…, and can do so in countless very different ways” (Lewis 1983, 370). New constraints on our theory just lead to “more theory”, and it would be susceptible to the same sort of problem. After all, as Putnam stated, we interpret our language, so, under different interpretations, or models, our terms, whatever they are, can refer to different things, even to anything we intend. It would not really matter how the world is or what the theory says.

In order to solve this problem, Lewis introduces “reference magnetism” or “inegalitarianism”. According to this solution, the reference of a term is, as before, what causes the speaker to use that term and, more importantly, among the rival interpretations of that term, the eligible interpretation is the one which picks out the most natural reference for the term. Lewis’s view of natural properties and naturalness is complex. According to Lewis, the world consists of nothing but a throng of spatiotemporal joints, at which the world is carved up and various properties are instantiated. Such properties are “perfectly natural” properties (Lewis 1984). For Lewis, however, it is “a primitive fact that some classes of things are perfectly natural properties” (Lewis 1983, 347). It is helpful to use an example originally given by Nelson Goodman. Suppose that the application of the terms “green” and “blue” are governed by the following rules:

Rule1: “Green” applies to green things always. In this case, “green” means green.

Rule2: “Blue” applies to blue things always. In this case, “blue” means blue.

Suppose that a skeptic claims that the rules governing the application of “green” and “blue” are not those mentioned above. They are rather the following:

Rule1*: “Green” applies to green things up to a specific time t and to blue things after t. In this case, “green” does not mean green, but means grue.

Rule2*: “Blue” applies to blue things up until time t and to green things after t. In this case, “blue” does not mean blue, but means bleen.

If the speaker has been following Rule1*, rather than Rule1, the application of “green” to a green emerald at a time later than t would be incorrect. Obviously, there can be an infinite number of such alternative rules, and as t can be any given time, no behavioral evidence can help to decide whether by “green” the speaker really meant green or grue.
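Goodman’s rules can be made concrete in a short sketch. The switch-over time `T` and the colour vocabulary below are arbitrary placeholders; the point is only that the two rules agree on every observation made before t and diverge afterwards:

```python
T = 100  # arbitrary switch-over time t

def is_green(color: str) -> bool:
    """Rule1: 'green' applies to green things at all times."""
    return color == "green"

def is_grue(color: str, time: int) -> bool:
    """Rule1*: 'green' applies to green things up to t, to blue things after t."""
    return color == "green" if time <= T else color == "blue"

# Before t, the two rules are observationally equivalent ...
assert all(is_green(c) == is_grue(c, 50) for c in ("green", "blue", "red"))
# ... but they diverge for a green emerald examined after t.
assert is_green("green") and not is_grue("green", T + 1)
```

Since every observation available to an interpreter is made at some particular time, no finite body of evidence rules out the starred rules.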

Lewis’s reference magnetism implies that the correct interpretation of the speaker’s utterance of “green” chooses the most natural property as the eligible referent of the term, that is, the property of being green rather than being grue. As he puts it, “grue things (or worse) are not all of a kind in the same way that … bits of gold…are all of a kind” (Lewis 1984, 228-229). Similarly, the most eligible referent for “rabbit” is that of being a rabbit, rather than an undetached rabbit-part. In this way, the sort of radical indeterminacy which Putnam argued for would be blocked. Lewis, therefore, thinks that we can eventually avoid the threatening radical indeterminacy.

6. Dennett’s Intentional Stance

Dennett’s position, unlike Lewis’s, does not claim that indeterminacy can be so controlled. Dennett follows Quine and Davidson in viewing the third-person point of view as our best and only viable view for investigating the behavior of objects, systems, or organisms. His concern is to find an answer to the question whether we can attribute propositional attitudes to certain objects or systems, that is, whether we can interpret their behavior as intentional and hence treat them as “true believers”.

Dennett famously distinguishes among three sorts of such a third-person standpoint: the “physical stance”, the “design stance” and the “intentional stance” (Dennett 2009). Which stance works best depends on how functional it is for our purposes, that is, whether it offers a useful interpretation of the system’s behavior. Such an interpretation must enable us to explain and predict the system’s future behavior in the most practical way. The physical stance or “strategy” is the one which we usually work with in our study of the behavior of objects like planets or a pot on the burner. This stance is the method which is often employed in natural sciences. Scientists use their knowledge of the laws of nature and the physical constitutions of objects to make predictions about the objects’ behavior. This stance seems to be our only option to scrutinize the behavior of things which are neither alive nor artifacts.

Sometimes, however, the physical stance may not be the best strategy for interpreting the object’s behavior. Rather, adopting the design stance would be far more helpful. By choosing the design stance, we add the assumption that the object or the system is designed in a certain way to accomplish a specific goal. This is the stance which we use in our explanation and prediction of the behavior of, for example, a heat-seeking missile or an alarm clock. Although we may not have enough detailed information about the physical constituents of such objects and how they work, the design stance enables us to successfully predict their behavior. What about the case of the objects which manifest far more complex behavior, such as humans or an advanced chess-playing computer?

In such cases, the design stance would not be as effective as it was in the simpler cases mentioned above. At this point, the intentional stance is available. By adopting the intentional stance, we presume that the object or the system is a rational agent, that is, an agent assumed to possess propositional attitudes such as beliefs and desires. Having granted that, we decide what sort of attitudes we ought to attribute to the object, on the basis of which we can interpret its behavior as an intentional action. Considering the system’s complex interactions with the environment, we attribute specific beliefs and desires to it. Attributing the right sort of beliefs and desires to the system, in turn, enables us to predict how, on the basis of having those attitudes, it will act and, more importantly, how it ought to act: we offer an “intentional interpretation” of the system’s behavior.

If we wanted, we could potentially use the intentional stance in our study of the behavior of the missile or the alarm clock; but we do not need it: the design strategy works perfectly well in predicting their future behavior. In order to understand the behavior of such objects, adopting the intentional stance is simply unnecessary. Moreover, we do not usually want to count these things as believers, that is, as the possessors of a complex set of interrelated propositional attitudes. Therefore, we can define an “intentional system” as any system whose behavior can be usefully predicted by adopting the intentional stance. We treat such things as if they are rational agents which ought to possess a certain sort of beliefs, desires, intentions, goals, and purposes, in the light of their needs and complex capacities to perceive the world (Dennett 1987; Dennett 2009).
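The predictive strategy can be caricatured in a few lines. This sketch uses invented beliefs and desires and only illustrates the form of intentional-stance prediction; it makes no claim about how chess programs are actually built:

```python
# A deliberate caricature: attribute beliefs and desires to the system,
# assume rationality, and predict the action that would best satisfy the
# desires given the beliefs. All attitudes here are invented placeholders.
def predict_action(beliefs: dict, desires: list) -> str:
    """Rationality assumption: the agent acts so as to satisfy its desires."""
    if "keep_queen" in desires and beliefs.get("queen_is_threatened"):
        if beliefs.get("retreat_square_is_safe"):
            return "move_queen_to_safety"
        return "defend_queen"
    return "continue_current_plan"

# The same prediction applies whether the "agent" is a human player
# or a chess program; the stance abstracts from the physical details.
prediction = predict_action(
    {"queen_is_threatened": True, "retreat_square_is_safe": True},
    ["keep_queen"],
)
```

Nothing in the prediction depends on the system’s physical constitution, which is what makes the stance available for humans and machines alike.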

a. Indeterminacy and the Intentional Stance

The intentional interpretation of a system naturally allows for the emergence of the indeterminacy of interpretation. Recall that Dennett’s concern was to find out how practical and useful the offered interpretation is for the purpose of predicting the system’s behavior. In this case, we cannot expect to come up with one unique intentional interpretation of the system’s behavior, which works so perfectly that it leaves no room for the existence of any other useful intentional interpretations. There can always be alternative interpretations which work just as well in predicting the system’s behavior. Two equally predictive interpretations may attribute different sets of attitudes to an intentional system. For Dennett, whenever there are two competing intentional interpretations of a system, which work well in predicting the system’s behavior, neither can be said to have any clear advantage over the other because in order to make such a choice we have no further, especially no objective, criterion to rely on. As he clarifies, “we shouldn’t make the mistake of insisting that there has to be a fact of the matter about which interpretation is ‘the truth about’ the topic. Sometimes, in the end, one interpretation is revealed to be substantially better, all things considered. But don’t count on it” (Dennett 2018, 59).

To be a believer is to have the sort of behavior which is predictable by adopting the intentional stance. There is nothing to make it impossible that, for the same pattern of behavior, we can have rival intentional-stance interpretations. It is important to note that, for Dennett, the fact that such rival interpretations exist does not imply that these patterns are unreal. They are real patterns of observable behavior. The point is that our interpretation of them and the beliefs and desires that we attribute to them would depend on the sort of stance we choose to employ (Dennett 1991). There is no deeper fact than the fact that we choose to look at a system from a specific point of view and that we do so with the aim of making the best workable prediction of its behavior. In order to decide between rival interpretations, which are all compatible with all the evidence there is with regard to the system’s behavior, we have no objective criterion to rely on because the interpretations are compatible with all the facts there are, that is, facts about the system’s behavior. As a result, the indeterminacy of interpretation emerges.

Dennett states that the intentional stance with its rationality constraints is there to “explain why in principle…there is indeterminacy of radical translation/interpretation: there can always be a tie for first between two competing assignments of meaning to the behavior … of an agent, and no other evidence counts” (Dennett 2018, 58). When you intend to organize the behavior of a system, you can organize it in different ways; and whether such a system can be viewed as an intentional system would depend on whether its behavior can be usefully predicted from the point of view of the particular interpretive intentional stance which you adopt.

7. References and Further Reading

  • Blackburn, Simon. 1984. Spreading the Word. Oxford: Oxford University Press.
  • Chomsky, Noam. 1968. “Quine’s Empirical Assumptions.” Synthese 19 (1/2): 53-68.
  • Davidson, Donald. 1963. “Actions, Reasons, and Causes.” The Journal of Philosophy 60: 685–99.
  • Davidson, Donald. 1967. “Truth and Meaning.” Synthese 17: 304–23.
  • Davidson, Donald. 1970. “Semantics for Natural Languages”. In Linguaggi nella Societa e nella Tecnica. Milan: Comunita.
  • Davidson, Donald. 1973a. “Radical Interpretation.” Dialectica 27: 314–28.
  • Davidson, Donald. 1973b. “In Defence of Convention T.” In Truth, Syntax and Modality, edited by H. Leblanc, 76–86. Dordrecht: North-Holland.
  • Davidson, Donald. 1974a. “On the very Idea of a Conceptual Scheme.” Proceedings and Addresses of the American Philosophical Association 47: 5–20.
  • Davidson, Donald. 1974b. “Belief and the Basis of Meaning.” Synthese 27: 309–23.
  • Davidson, Donald. 1978. “Intending.” In Philosophy of History and Action, edited by Y. Yovel, 41–60. Dordrecht: D. Reidel and The Magnes Press.
  • Davidson, Donald. 1979. “The Inscrutability of Reference.” The Southwestern Journal of Philosophy 10: 7–20.
  • Davidson, Donald. 1980. “Toward a Unified Theory of Meaning and Action.” Grazer Philosophische Studien 11: 1–12.
  • Davidson, Donald. 1982. “Rational Animals.” Dialectica 36: 317–28.
  • Davidson, Donald. 1991. “Three Varieties of Knowledge.” In A. J. Ayer: Memorial Essays, edited by A. P. Griffiths, 153–66. New York: Cambridge University Press.
  • Davidson, Donald. 1993. “Method and Metaphysics.” Deucalion 11: 239–48.
  • Davidson, Donald. 1997. “Indeterminism and Antirealism.” In Realism/Antirealism and Epistemology, edited by C. B. Kulp, 109–22. Lanham, Md.: Rowman and Littlefield.
  • Davidson, Donald. 1999a. “The Emergence of Thought.” Erkenntnis 51 (1): 7-17.
  • Davidson, Donald. 1999b. “Reply to Richard Rorty.” In The Philosophy of Donald Davidson, edited by L. E. Hahn, 595-600. Chicago and La Salle, IL: Open Court.
  • Dennett, Daniel. 1971. “Intentional Systems.” The Journal of Philosophy 68 (4): 87-106.
  • Dennett, Daniel. 1987. The Intentional Stance. Cambridge, Mass.: MIT Press.
  • Dennett, Daniel. 1991. “Real Patterns.” The Journal of Philosophy 88 (1): 27-51.
  • Dennett, Daniel. 2009. “Intentional Systems Theory.” In The Oxford Handbook of Philosophy of Mind, edited by Brian P. McLaughlin, and Sven Walter Ansgar Beckermann, 340-350. Oxford: Oxford University Press.
  • Dennett, Daniel. 2018. “Reflections on Tadeusz Zawidzki.” In The Philosophy of Daniel Dennett, edited by Bryce Huebner, 57-61. Oxford: Oxford University Press.
  • Evans, Gareth. 1975. “Identity and Predication.” The Journal of Philosophy 72 (13): 343–62.
  • Fodor, Jerry. 1993. The Elm and the Expert: Mentalese and its Semantics. Cambridge, MA: Bradford.
  • Glock, Hans-Johann. 2003. Quine and Davidson on Language, Thought, and Reality. Cambridge: Cambridge University Press.
  • Goodman, Nelson. 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press.
  • Hookway, Christopher. 1988. Quine. London: Polity Press.
  • Kirk, Robert. 1986. Translation Determined. Oxford: Oxford University Press.
  • Lewis, David. 1974. “Radical Interpretation.” Synthese 23: 331-344.
  • Lewis, David. 1975. “Languages and Language.” In Minnesota Studies in the Philosophy of Science, Volume VII, edited by Keith Gunderson, 3–35. Minneapolis: University of Minnesota Press.
  • Lewis, David. 1983. “New Work for a Theory of Universals.” Australasian Journal of Philosophy 61 (4): 343–77.
  • Lewis, David. 1984. “Putnam’s Paradox.” Australasian Journal of Philosophy 62: 221-36.
  • Putnam, Hilary. 1977. “Realism and Reason.” Proceedings and Addresses of the American Philosophical Association 50 (6): 483-498.
  • Putnam, Hilary. 1980. “Models and Reality.” The Journal of Symbolic Logic 45 (3): 464-482.
  • Quine, W. V. 1951. “Two Dogmas of Empiricism.” The Philosophical Review 60 (1): 20-43.
  • Quine, W. V. 1960. Word and Object. Cambridge, MA: MIT Press.
  • Quine, W. V. 1968. “Reply to Chomsky.” Synthese 19 (1/2): 274-283.
  • Quine, W. V. 1969a. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Quine, W. V. 1969b. “Epistemology Naturalized.” In Ontological Relativity and Other Essays, by W. V. Quine, 69–90. New York: Columbia University Press.
  • Quine, W. V. 1970. “On the Reasons for Indeterminacy of Translation.” The Journal of Philosophy 67 (6): 178-183.
  • Quine, W. V. 1973. The Roots of Reference. Open Court.
  • Quine, W. V. 1981. Theories and Things. Cambridge, MA: Harvard University Press.
  • Quine, W. V. 1987. “Indeterminacy of Translation Again.” The Journal of Philosophy 84 (1): 5-10.
  • Quine, W. V. 1990a. Pursuit of Truth. Cambridge, MA: Harvard University Press.
  • Quine, W. V. 1990b. “Three Indeterminacies.” In Perspectives on Quine, edited by R. B. Barrett and R. F. Gibson, 1-16. Cambridge, Mass.: Basil Blackwell.
  • Quine, W. V. 1990c. “Comment on Bergström.” In Perspectives on Quine, edited by Roger Gibson and R. Barrett, 53-54. Oxford: Blackwell.
  • Quine, W. V. 1995. From Stimulus to Science. Cambridge, Mass.: Harvard University Press.
  • Quine, W. V. and J. S. Ullian. 1978. The Web of Belief. New York: McGraw-Hill.
  • Richard, Mark. 1997. “Inscrutability.” Canadian Journal of Philosophy 27: 165-209.
  • Searle, John. 1987. “Indeterminacy, Empiricism, and the First Person.” The Journal of Philosophy 84 (3): 123-146.
  • Wilson, Neil L. 1959. “Substances without Substrata.” The Review of Metaphysics 12 (4): 521-539.


Author Information

Ali Hossein Khani
Email: hosseinkhani@ipm.ir
Institute for Research in Fundamental Sciences, and
Iranian Institute of Philosophy
Iran

Ethics of Artificial Intelligence

This article provides a comprehensive overview of the main ethical issues related to the impact of Artificial Intelligence (AI) on human society. AI is the use of machines to do things that would normally require human intelligence. In many areas of human life, AI has rapidly and significantly affected human society and the ways we interact with each other, and it will continue to do so. Along the way, AI has presented substantial ethical and socio-political challenges that call for a thorough philosophical and ethical analysis; its social impact should be studied so as to avoid any negative repercussions. AI systems are becoming more and more autonomous, apparently rational, and intelligent. This development gives rise to numerous issues. In addition to the potential harm and impact of AI technologies on our privacy, other concerns include their moral and legal status (including moral and legal rights), their possible moral agency and patienthood, and issues related to their possible personhood and even dignity. It is common, however, to single out the following issues as of utmost significance with respect to AI and its relation to human society, grouped into three time periods: (1) short-term (early 21st century): autonomous systems (transportation, weapons), machine bias in law, privacy and surveillance, the black box problem and AI decision-making; (2) mid-term (from the 2040s to the end of the century): AI governance, confirming the moral and legal status of intelligent machines (artificial moral agents), human-machine interaction, mass automation; (3) long-term (starting with the 2100s): technological singularity, mass unemployment, space colonisation.

Table of Contents

  1. The Relevance of AI for Ethics
    1. What is AI?
    2. Its Ethical Relevance
  2. Main Debates
    1. Machine Ethics
      1. Bottom-up Approaches: Casuistry
      2. Top-down Approaches: The MoralDM Approach
      3. Mixed Approaches: The Hybrid Approach
    2. Autonomous Systems
    3. Machine Bias
    4. The Problem of Opacity
    5. Machine Consciousness
    6. The Moral Status of Artificial Intelligent Machines
      1. The Autonomy Approach
      2. The Indirect Duties Approach
      3. The Relational Approach
      4. The Upshot
    7. Singularity and Value Alignment
    8. Other Debates
      1. AI as a form of Moral Enhancement or a Moral Advisor
      2. AI and the Future of Work
      3. AI and the Future of Personal Relationships
      4. AI and the Concern About Human ‘Enfeeblement’
      5. Anthropomorphism
  3. Ethical Guidelines for AI
  4. Conclusion
  5. References and Further Reading

1. The Relevance of AI for Ethics

This section discusses why AI is of utmost importance for our systems of ethics and morality, given the increasing human-machine interaction.

a. What is AI?

AI may mean several different things and it is defined in many different ways. When Alan Turing introduced the so-called Turing test (which he called an ‘imitation game’) in his famous 1950 essay about whether machines can think, the term ‘artificial intelligence’ had not yet been introduced. Turing considered whether machines can think, and suggested that it would be clearer to replace that question with the question of whether it might be possible to build machines that could imitate humans so convincingly that people would find it difficult to tell whether, for example, a written message comes from a computer or from a human (Turing 1950).

The term ‘AI’ was coined in 1955 by a group of researchers—John McCarthy, Marvin L. Minsky, Nathaniel Rochester and Claude E. Shannon—who organised a famous two-month summer workshop at Dartmouth College on the ‘Study of Artificial Intelligence’ in 1956. This event is widely recognised as the very beginning of the study of AI. The organisers described the workshop as follows:

We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer. (Proposal 1955: 2)

Another, later scholarly definition describes AI as:

the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. The term is frequently applied to the project of developing systems endowed with the intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience. (Copeland 2020)

In the early twenty-first century, the ultimate goal of many computer specialists and engineers has been to build a robust AI system which would not differ from human intelligence in any aspect other than its machine origin. Whether this is at all possible has been a matter of lively debate for several decades. The prominent American philosopher John Searle (1980) introduced the so-called Chinese room argument to contend that strong or general AI (AGI)—that is, building AI systems which could deal with many different and complex tasks that require human-like intelligence—is in principle impossible. In doing so, he sparked a long-standing general debate on the possibility of AGI. Current AI systems are narrowly focused (that is, weak AI) and can only solve one particular task, such as playing chess or the Chinese game of Go. Searle’s general thesis was that no matter how complex and sophisticated a machine is, it will nonetheless have no ‘consciousness’ or ‘mind’, which is a prerequisite for the ability to understand, in contrast to the capability to compute (see section 2.e.).

Searle’s argument has been critically evaluated against the counterclaims of functionalism and computationalism. It is generally argued that intelligence does not require a particular substratum, such as carbon-based beings, but that it will also evolve in silicon-based environments, if the system is complex enough (for example, Chalmers 1996, chapter 9).

In the early years of the twenty-first century, many researchers working on AI development associated AI primarily with different forms of so-called machine learning—that is, technologies that identify patterns in data. Simpler forms of such systems are said to engage in ‘supervised learning’—which nonetheless still requires considerable human input and supervision—but the aim of many researchers, perhaps most prominently Yann LeCun, has been to develop so-called self-supervised learning systems. More recently, some researchers have begun to discuss AI in a way that seems to equate the concept with machine learning. This article, however, uses the term ‘AI’ in a wider sense that includes—but is not limited to—machine learning technologies.

b. Its Ethical Relevance

The major ethical challenges that AI poses for human societies are well presented in the excellent introductions by Vincent Müller (2020), Mark Coeckelbergh (2020), Janina Loh (2019), Catrin Misselhorn (2018) and David Gunkel (2012). Regardless of the possibility of constructing AGI, autonomous AI systems already raise substantial ethical issues: for example, machine bias in law, the making of hiring decisions by means of smart algorithms, racist and sexist chatbots, and non-gender-neutral language translations (see section 2.c.). The very idea of a machine ‘imitating’ human intelligence—which is one common definition of AI—gives rise to worries about deception, especially if the AI is built into robots designed to look or act like human beings (Boden et al. 2017; Nyholm and Frank 2019). Moreover, Rosalind Picard rightly claims that ‘the greater the freedom of a machine, the more it will need moral standards’ (1997: 19). This substantiates the claim that all interactions between AI systems and human beings necessarily entail an ethical dimension, for example, in the context of autonomous transportation (see section 2.d.).

The idea of implementing ethics within a machine is one of the main research goals in the field of machine ethics (for example, Lin et al. 2012; Anderson and Anderson 2011; Wallach and Allen 2009). More and more responsibility has been shifted from human beings to autonomous AI systems which are able to work much faster than human beings without taking any breaks and with no need for constant supervision, as illustrated by the excellent performance of many systems (once they have successfully passed the debugging phase).

It has been suggested that humanity’s future existence may depend on the implementation of solid moral standards in AI systems, given the possibility that these systems may, at some point, either match or supersede human capabilities (see section 2.g.). This point in time was called the ‘technological singularity’ by Vernor Vinge in 1983 (see also: Vinge 1993; Kurzweil 2005; Chalmers 2010). The famous playwright Karel Čapek (1920), the renowned astrophysicist Stephen Hawking and the influential philosopher Nick Bostrom (2016, 2018) have all warned about the possible dangers of technological singularity should intelligent machines turn against their creators, that is, human beings. Therefore, according to Bostrom, it is of utmost importance to build friendly AI (see the alignment problem, discussed in section 2.g.).

In conclusion, the implementation of ethics is crucial for AI systems for multiple reasons: to provide safety guidelines that can prevent existential risks for humanity, to solve any issues related to bias, to build friendly AI systems that will adopt our ethical standards, and to help humanity flourish.

2. Main Debates

The following debates are of utmost significance in the context of AI and ethics. They are not the only important debates in the field, but they provide a good overview of topics that will likely remain of great importance for many decades (for a similar list, see Müller 2020).

a. Machine Ethics

Susan Anderson, a pioneer of machine ethics, defines the goal of machine ethics as:

to create a machine that follows an ideal ethical principle or set of principles in guiding its behaviour; in other words, it is guided by this principle, or these principles, in the decisions it makes about possible courses of action it could take. We can say, more simply, that this involves “adding an ethical dimension” to the machine. (2011: 22)

In addition, the study of machine ethics examines issues regarding the moral status of intelligent machines and asks whether they should be entitled to moral and legal rights (Gordon 2020a, 2020b; Richardson 2019; Gunkel and Bryson 2014; Gunkel 2012; Anderson and Anderson 2011; Wallach and Allen 2010). In general, machine ethics is an interdisciplinary sub-discipline of the ethics of technology, which is in turn a discipline within applied ethics. The ethics of technology also contains the sub-disciplines of robot ethics (see, for example, Lin et al. 2011, 2017; Gunkel 2018; Nyholm 2020), which is concerned with questions of how human beings design, construct and use robots; and computer ethics (for example, Johnson 1985/2009; Johnson and Nissenbaum 1995; Himma and Tavani 2008), which is concerned with commercial behaviour involving computers and information (for example, data security, privacy issues).

The first ethical code for AI systems was introduced by the famed science fiction writer Isaac Asimov, who presented his Three Laws of Robotics in Runaround (Asimov 1942). These three were later supplemented by a fourth law, called the Zeroth Law of Robotics, in Robots and Empire (Asimov 1986). The four laws are as follows:

  1. A robot may not injure a human being or, through inaction, allow a human being to be harmed;
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the first law;
  3. A robot must protect its own existence as long as such protection does not conflict with the first or second law;
  4. A robot may not harm humanity or, by inaction, allow humanity to suffer harm.

Asimov’s four laws have played a major role in machine ethics for many decades and have been widely discussed by experts. The standard view regarding the four laws is that they are important but insufficient to deal with all the complexities related to moral machines. This seems to be a fair evaluation, since Asimov never claimed that his laws could cope with all issues. If that were really the case, then Asimov would perhaps not have written his fascinating stories about problems caused partly by the four laws.
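Read as a decision procedure, the laws form a strict priority ordering (the Zeroth Law overrides the First, and so on). A toy sketch with invented predicate names makes both the ordering and its limits visible:

```python
# A toy rendering of the priority ordering, with invented predicate names.
# The hard problem, namely deciding whether a predicate such as "harms a
# human being" actually applies to a situation, is simply assumed away here.
LAWS = [
    ("zeroth", lambda a: not a.get("harms_humanity")),
    ("first",  lambda a: not a.get("harms_human")),
    ("second", lambda a: a.get("obeys_order", True)),
    ("third",  lambda a: not a.get("self_destructive")),
]

def first_violation(action: dict):
    """Return the highest-priority law the action violates, or None."""
    for name, check in LAWS:
        if not check(action):
            return name
    return None
```

The sketch illustrates why the laws alone are insufficient: all of the moral work is hidden inside the predicates, which the laws themselves do not help to evaluate.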

The early years of the twenty-first century saw the proposal of numerous approaches to implementing ethics within machines, to provide AI systems with ethical principles that the machines could use in making moral decisions (Gordon 2020a). We can distinguish at least three types of approaches: bottom-up, top-down, and mixed. An example of each type is provided below (see also Gordon 2020a: 147).

i. Bottom-up Approaches: Casuistry

Guarini’s (2006) system is an example of a bottom-up approach. It uses a neural network which bases its ethical decisions on a learning process in which the neural network is presented with known correct answers to ethical dilemmas. After the initial learning process, the system is supposed to be able to solve new ethical dilemmas on its own. However, Guarini’s system runs into problems concerning the reclassification of cases, caused by the lack of adequate reflection on, and exact representation of, the situation. Guarini himself admits that casuistry alone is insufficient for machine ethics.
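The bottom-up idea can be illustrated with a toy learner. This is not Guarini’s actual network; the features, cases and training rule below are invented placeholders, chosen only to show a system that generalises from labelled dilemmas without any explicitly represented ethical principle:

```python
# A toy bottom-up learner: no ethical principle is represented explicitly;
# the system only generalises from dilemmas with known correct answers.
# Each case: (saves_lives, harms_innocent, consent_given) -> permissible (1/0)
training = [
    ((1, 0, 1), 1),
    ((1, 1, 0), 0),
    ((0, 1, 0), 0),
    ((1, 0, 0), 1),
]

weights, bias, lr = [0.0, 0.0, 0.0], 0.0, 0.1

for _ in range(100):  # simple perceptron training loop
    for x, label in training:
        pred = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
        err = label - pred
        weights = [w + lr * err * xi for w, xi in zip(weights, x)]
        bias += lr * err

def judge(case) -> bool:
    """Classify a new dilemma purely from the learned weights."""
    return sum(w * xi for w, xi in zip(weights, case)) + bias > 0
```

The reclassification worry is visible even here: the learned weights encode no reasons, so a mislabelled or unusually described case cannot be detected or re-examined by the system itself.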

ii. Top-down Approaches: The MoralDM Approach

The system conceived by Dehghani et al. (2011) combines two main ethical theories, utilitarianism and deontology, along with analogical reasoning. Utilitarian reasoning applies until ‘sacred values’ are concerned, at which point the system operates in a deontological mode and becomes less sensitive to the utility of actions and consequences. To align the system with human moral decisions, Dehghani et al. evaluate it against psychological studies of how the majority of human beings decide particular cases.
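The switching logic just described might be sketched as follows. This is a schematic reconstruction, not the actual MoralDM implementation, and the sacred values, option names and utilities are invented placeholders:

```python
# Utilities decide by default (utilitarian mode), but any option that
# violates a "sacred value" is vetoed regardless of its utility
# (deontological mode). All names below are invented.
SACRED_VALUES = {"killing_innocent", "betrayal"}

def choose(options):
    """options: list of (name, utility, violated_values) triples."""
    permissible = [o for o in options if not set(o[2]) & SACRED_VALUES]
    if not permissible:  # deontological mode: every option is vetoed
        return None
    return max(permissible, key=lambda o: o[1])[0]  # utilitarian mode

choice = choose([
    ("divert_resources", 10, []),
    ("sacrifice_one", 50, ["killing_innocent"]),  # higher utility, but vetoed
])
```

The sketch shows why the system becomes insensitive to utility once sacred values are at stake: vetoed options never reach the utility comparison at all.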

The MoralDM approach is particularly successful in that it pays proper respect to the two main ethical theories (deontology and utilitarianism) and combines them in a fruitful and promising way. However, their additional strategy of using empirical studies to mirror human moral decisions by considering as correct only those decisions that align with the majority view is misleading and seriously flawed. Rather, their system should be seen as a model of a descriptive study of ethical behaviour but not a model for normative ethics.

iii. Mixed Approaches: The Hybrid Approach

The hybrid model of human cognition (Wallach et al. 2010; Wallach and Allen 2010) combines a top-down component (theory-driven reasoning) and a bottom-up (shaped by evolution and learning) component that are considered the basis of both moral reasoning and decision-making. The result thus far is LIDA, an AGI software offering a comprehensive conceptual and computational model that models a large portion of human cognition. The hybrid model of moral reasoning attempts to re-create human decision-making by appealing to a complex combination of top-down and bottom-up approaches leading eventually to a descriptive but not a normative model of ethics. In addition, its somewhat idiosyncratic understanding of both approaches from moral philosophy does not in fact match how moral philosophers understand and use them in normative ethics. The model presented by Wallach et al. is not necessarily inaccurate with respect to how moral decision-making works in an empirical sense, but their approach is descriptive rather than normative in nature. Therefore, their empirical model does not solve the normative problem of how moral machines should act. Descriptive ethics and normative ethics are two different things. The former tells us how human beings make moral decisions; the latter is concerned with how we should act.

b. Autonomous Systems

The proposals for a system of machine ethics discussed in section 2.a. are increasingly being considered in relation to autonomous systems whose operation poses a risk of harm to human life. The two most prominent examples, which are at times treated together and compared and contrasted with each other, are autonomous vehicles (also known as self-driving cars) and autonomous weapons systems (sometimes dubbed ‘killer robots’) (Purves et al. 2015; Danaher 2016; Nyholm 2018a).

Some authors think that autonomous weapons might be a good replacement for human soldiers (Müller and Simpson 2014). For example, Arkin (2009, 2010) argues that having machines fight our wars for us instead of human soldiers could lead to a decrease in war crimes if the machines were equipped with an ‘ethical governor’ system that would consistently follow the rules of war and engagement. However, others worry about the widespread availability of AI-driven autonomous weapons systems, because they think the availability of such systems might tempt people to go to war more often, or because they are sceptical about the possibility of an AI system that could interpret and apply the ethical and legal principles of war (see, for example, Royakkers and van Est 2015; Strawser 2010). There are also worries that ‘killer robots’ might be hacked (Klincewicz 2015).

Similarly, while acknowledging the possible benefits of self-driving cars—such as increased traffic safety, more efficient use of fuel and better-coordinated traffic—many authors have also noted the possible accidents that could occur (Goodall 2014; Lin 2015; Gurney 2016; Nyholm 2018b, 2018c; Keeling 2020). The underlying idea is that autonomous vehicles should be equipped with ‘ethics settings’ that would help to determine how they should react to accident scenarios where people’s lives and safety are at stake (Gogoll and Müller 2017). This is considered another real-life application of machine ethics that society urgently needs to grapple with.

The concern that self-driving cars might be involved in deadly accidents for which the AI system was not adequately prepared has, tragically, already been realised: some people have died in such accidents (Nyholm 2018b). The first instance of death while riding in an autonomous vehicle—a Tesla Model S car in ‘autopilot’ mode—occurred in May 2016. The first pedestrian was hit and killed by an experimental self-driving car, operated by the ride-hailing company Uber, in March 2018. In the latter case, part of the problem was that the AI system in the car had difficulty classifying the object that suddenly appeared in its path. It initially classified the victim as ‘unknown’, then as a ‘vehicle’, and finally as a ‘bicycle’. Just moments before the crash, the system decided to apply the brakes, but by then it was too late (Keeling 2020: 146). Whether the AI system in the car functions properly can thus be a matter of life and death.

Philosophers discussing such cases may propose that, even when it cannot brake in time, the car might swerve to one side (for example, Goodall 2014; Lin 2015). But what if five people were on the only side of the road the car could swerve onto? Or what if five people appeared on the road and one person was on the curb where the car might swerve? These scenarios are similar to the much-discussed ‘trolley problem’: the choice would involve killing one person to save five, and the question would become under what sorts of circumstances that decision would or would not be permissible. Several papers have discussed relevant similarities and differences between the ethics of crashes involving self-driving cars, on the one hand, and the philosophy of the trolley problem, on the other (Lin 2015; Nyholm and Smids 2016; Goodall 2016; Himmelreich 2018; Keeling 2020; Kamm 2020).

One question that has occupied ethicists discussing autonomous systems is what ethical principles should govern their decision-making process in situations that might involve harm to human beings. A related issue is whether it is ever acceptable for autonomous machines to kill or harm human beings, particularly if they do so in a manner governed by certain principles that have been programmed into or made part of the machines in another way. Here, a distinction is made between deaths caused by self-driving cars—which are generally considered a deeply regrettable but foreseeable side effect of their use—and killing by autonomous weapons systems, which some consider always morally unacceptable (Purves et al. 2015). A campaign to ‘stop killer robots’ has even been launched, backed by many AI ethicists such as Noel Sharkey and Peter Asaro.

One reason the campaign puts forward for banning autonomous weapons systems is that what its proponents call ‘meaningful human control’ must be retained. This concept is also discussed in relation to self-driving cars (Santoni de Sio and van den Hoven 2018). Many authors have worried about the risk of creating ‘responsibility gaps’: cases in which it is unclear who should be held responsible for harm that has occurred due to the decisions made by an autonomous AI system (Matthias 2004; Sparrow 2007; Danaher 2016). The key challenge here is to come up with a way of understanding moral responsibility in the context of autonomous systems that would allow us to secure the benefits of such systems while appropriately attributing responsibility for any undesirable consequences. If a machine causes harm, the human beings involved in the machine’s action may try to evade responsibility; indeed, in some cases it might seem unfair to blame people for what a machine has done. Conversely, if an autonomous system produces a good outcome, it might be equally unclear which human beings, if any, deserve praise for it. In general, people may be more willing to take responsibility for good outcomes produced by autonomous systems than for bad ones. But in both situations, responsibility gaps can arise. Accordingly, philosophers need to formulate a theory of how to allocate responsibility for outcomes produced by functionally autonomous AI technologies, whether good or bad (Nyholm 2018a; Dignum 2019; Danaher 2019a; Tigard 2020a).

c. Machine Bias

Many people believe that the use of smart technologies would put an end to human bias because of the supposed ‘neutrality’ of machines. However, we have come to realise that machines may preserve and even reinforce human bias against women, members of different ethnic groups, the elderly, people with medical impairments, or other groups (Kraemer et al. 2011; Mittelstadt et al. 2016). As a consequence, one of the most urgent questions in the context of machine learning is how to avoid machine bias (Daniels et al. 2019). The idea of using AI systems to support human decision-making is, in general, an excellent objective in view of AI’s ‘increased efficiency, accuracy, scale and speed in making decisions and finding the best answers’ (World Economic Forum 2018: 6). However, machine bias can undermine this seemingly positive situation in various ways. Some striking cases of machine bias are as follows:

  1. Gender bias in hiring (Dastin 2018);
  2. Racial bias, in that certain racial groups are offered only particular types of jobs (Sweeney 2013);
  3. Racial bias in decisions on the creditworthiness of loan applicants (Ludwig 2015);
  4. Racial bias in decisions whether to release prisoners on parole (Angwin et al. 2016);
  5. Racial bias in predicting criminal activities in urban areas (O’Neil 2016);
  6. Sexual bias when identifying a person’s sexual orientation (Wang and Kosinski 2018);
  7. Racial bias in facial recognition systems that prefer lighter skin colours (Buolamwini and Gebru 2018);
  8. Racial and social bias in using the geographic location of a person’s residence as a proxy for ethnicity or socio-economic status (Veale and Binns 2017).

We can recognise at least three reasons for machine bias: (1) data bias, (2) computational/algorithmic bias and (3) outcome bias (Springer et al. 2018: 451). First, a machine learning system that is trained using data that contain implicit or explicit imbalances reinforces the distortion in the data with respect to any future decision-making, thereby making the bias systematic. Second, a programme may suffer from algorithmic bias due to the developer’s implicit or explicit biases. The design of a programme relies on the developer’s understanding of the normative and non-normative values of other people, including the users and stakeholders affected by it (Dobbe et al. 2018). Third, outcome bias can arise from the use of historical records, for example to predict criminal activity in particular urban areas: the system may allocate more police to an area, producing an increase in reported cases that would previously have gone unnoticed. This increase then appears to vindicate the system’s decision to allocate police to that area, even though other urban areas may have similar or even higher numbers of crimes, more of which go unreported due to the lack of policing (O’Neil 2016).
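
The outcome-bias feedback loop can be made vivid with a toy simulation. The following sketch is purely illustrative (it is not drawn from O’Neil or any of the sources above, and all numbers are invented): two districts have identical true crime rates, but patrols are allocated according to past reported crime, and more patrols mean more crime gets reported.

```python
# Toy model of the outcome-bias feedback loop: a small, arbitrary initial
# imbalance in *reported* crime gets locked in, even though the two
# districts have identical *true* crime rates. All figures are invented.

TRUE_CRIMES = {"A": 100, "B": 100}   # actual crimes per period, identical
DETECTION_PER_PATROL = 0.05          # fraction of crimes reported per patrol unit
TOTAL_PATROLS = 10

# District A starts with slightly more reported crime (a historical artefact).
reported = {"A": 6, "B": 5}

for period in range(10):
    total_reported = sum(reported.values())
    # Allocate patrols in proportion to last period's reports.
    patrols = {d: TOTAL_PATROLS * reported[d] / total_reported for d in reported}
    # Reported crime depends on patrol presence, capped at the true count.
    reported = {
        d: min(TRUE_CRIMES[d], TRUE_CRIMES[d] * DETECTION_PER_PATROL * patrols[d])
        for d in reported
    }

# District A still appears about 20% more 'criminal' than district B,
# purely because of the initial artefact in the historical records.
print(reported)
```

The point of the sketch is that the system never corrects itself: the allocation of patrols reproduces exactly the imbalance that justified it, so the historical artefact persists indefinitely.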

Most AI researchers, programmers and developers as well as scholars working in the field of technology believe that we will never be able to design a fully unbiased system. Therefore, the focus is on reducing machine bias and minimising its detrimental effects on human beings. Nevertheless, various questions remain. What type of bias cannot be filtered out and when should we be satisfied with the remaining bias? What does it mean for a person in court to be subject not only to human bias but also to machine bias, with both forms of injustice potentially helping to determine the person’s sentence? Is one type of bias not enough? Should we not rather aim to eliminate human bias instead of introducing a new one?

d. The Problem of Opacity

AI systems are used to make many sorts of decisions that significantly impact people’s lives. AI can be used to make decisions about who gets a loan, who is admitted to a university, who gets an advertised job, who is likely to reoffend, and so on. Since these decisions have major impacts on people, we must be able to understand the underlying reasons for them. In other words, AI and its decision-making need to be explainable. In fact, many authors discussing the ethics of AI propose explainability (also referred to as explicability) as a basic ethical criterion, among others, for the acceptability of AI decision-making (Floridi et al. 2018). However, many decisions made by an autonomous AI system are not readily explainable to people. This came to be called the problem of opacity.

The opacity of AI decision-making can be of different kinds, depending on relevant factors. Some AI decisions are opaque to those who are affected by them because the algorithms behind the decisions, though quite easy to understand, are protected trade secrets which the companies using them do not want to share with anyone outside the company. Another reason for AI opacity is that most people lack the technical expertise to understand how an AI-based system works, even if there is nothing intrinsically opaque about the technology in question. With some forms of AI, not even the experts can understand the decision-making processes used. This has been dubbed the ‘black box’ problem (Wachter, Mittelstadt and Russell 2018).

On the individual level, it can seem to be an affront to a person’s dignity and autonomy when decisions about important aspects of their lives are made by machines if it is unclear—or perhaps even impossible to know—why machines made these decisions. On the societal level, the increasing prominence of algorithmic decision-making could become a threat to our democratic processes. Henry Kissinger, the former U.S. Secretary of State, once stated, ‘We may have created a dominating technology in search of a guiding philosophy’ (Kissinger 2018; quoted in Müller 2020). John Danaher, commenting on this idea, worries that people might be led to act in superstitious and irrational ways, like those in earlier times who believed that they could affect natural phenomena through rain dances or similar behaviour. Danaher has called this situation ‘the threat of algocracy’—that is, of rule by algorithms that we do not understand but have to obey (Danaher 2016b, 2019b).

But is AI opacity always, and necessarily, a problem? Is it equally problematic across all contexts? Should there be an absolute requirement that AI must in all cases be explainable? Scott Robbins (2019) has provided some interesting and noteworthy arguments in opposition to this idea. Robbins argues, among other things, that a hard requirement for explicability could prevent us from reaping all the possible benefits of AI. For example, he points out that if an AI system could reliably detect or predict some form of cancer in a way that we cannot explain or understand, the value of that information would outweigh any concerns about not knowing how the system reached its conclusion. In general, it is also possible to distinguish between contexts where the procedure behind a decision matters in itself and those where only the quality of the outcome matters (Danaher and Robbins 2020).

Another promising response to the problem of opacity is to try to construct alternative modes of explaining AI decisions that would take into account their opacity but would nevertheless offer some form of explanation that people could act on. Sandra Wachter, Brent Mittelstadt, and Chris Russell (2019) have developed the idea of a ‘counterfactual explanation’ of such decisions, one designed to offer practical guidance for people wishing to respond rationally to AI decisions they do not understand. They state that ‘counterfactual explanations do not attempt to clarify how [AI] decisions are made internally. Instead, they provide insight into which external facts could be different in order to arrive at a desired outcome’ (Wachter et al. 2018: 880). Such an external, counterfactual way of explaining AI decisions might be a promising alternative in cases where AI decision-making is highly valuable but functions according to an internal logic that is opaque to most or all people.
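
To see what a counterfactual explanation amounts to in practice, consider the following minimal sketch. It is not taken from Wachter et al.; the decision model, variable names and thresholds are all invented for illustration. The model is treated as a black box, and the ‘explanation’ is simply the smallest change to one input feature that flips the decision.

```python
# Minimal sketch of a counterfactual explanation: we do not look inside
# the model; we search for the smallest change to the applicant's income
# that turns a 'deny' into an 'approve'. All names and numbers invented.

def loan_model(applicant):
    """Stand-in for an opaque decision procedure; only outputs are observed."""
    score = 0.4 * applicant["income"] / 1000 + 0.6 * applicant["credit_years"]
    return "approve" if score >= 30 else "deny"

def counterfactual_income(applicant, step=500, max_income=200_000):
    """Find the smallest income (in steps of `step`) at which the decision flips."""
    candidate = dict(applicant)
    while candidate["income"] <= max_income:
        if loan_model(candidate) == "approve":
            return candidate["income"]
        candidate["income"] += step
    return None  # no counterfactual found within the search range

applicant = {"income": 30_000, "credit_years": 10}
print(loan_model(applicant))             # prints: deny
print(counterfactual_income(applicant))  # prints: 60000
```

The output corresponds to an explanation of the form ‘your loan was denied; had your income been 60,000 rather than 30,000, it would have been approved’, which gives the applicant something actionable without revealing the model’s internal logic.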

e. Machine Consciousness

Some researchers think that when machines become more and more sophisticated and intelligent, they might at some point become spontaneously conscious as well (compare Russell 2019). This would be a sort of puzzling—but potentially highly significant from an ethical standpoint—side effect of the development of advanced AI. Some people are intentionally seeking to create machines with artificial consciousness. Kunihiro Asada, a successful engineer, set himself the goal of creating a robot that can experience pleasure and pain, on the basis that such a robot could engage in the kind of pre-linguistic learning that a human baby is capable of before it acquires language (Marchese 2020). Another example is Sophia the robot, whose developers at Hanson Robotics say that they wish to create a ‘super-intelligent benevolent being’ that will eventually become a ‘conscious, living machine’.

Others, such as Joanna Bryson, note that depending on how we define consciousness, some machines might already have some form of consciousness. Bryson argues that if we take consciousness to mean the presence of internal states and the ability to report on these states to other agents, then some machines might fulfil these criteria even now (Bryson 2012). In addition, Aïda Elamrani-Raoult and Roman Yampolskiy (2018) have identified as many as twenty-one different possible tests of machine consciousness.

Moreover, similar claims could be made about the issue of whether machines can have minds. If mind is defined, at least in part, in a functional way, as the internal processing of inputs from the external environment that generates seemingly intelligent responses to that environment, then machines could possess minds (Nyholm 2020: 145–46). Of course, even if machines can be said to have minds or consciousness in some sense, they would still not necessarily be anything like human minds. After all, the particular consciousness and subjectivity of any being will depend on what kinds of ‘hardware’ (such as brains, sense organs, and nervous systems) the being in question has (Nagel 1974).

Whether or not we think some AI machines are already conscious or that they could (either by accident or by design) become conscious, this issue is a key source of ethical controversy. Thomas Metzinger (2013), for example, argues that society should adopt, as a basic principle of AI ethics, a rule against creating machines that are capable of suffering. His argument is simple: suffering is bad, it is immoral to cause suffering, and therefore it would be immoral to create machines that suffer. Joanna Bryson contends similarly that although it is possible to create machines that would have a significant moral status, it is best to avoid doing so; in her view, we are morally obligated not to create machines to which we would have obligations (Bryson 2010, 2019). Again, this might all depend on what we understand by consciousness. Accordingly, Eric Schwitzgebel and Mara Garza (2015: 114–15) comment, ‘If society continues on the path towards developing more sophisticated artificial intelligence, developing a good theory of consciousness is a moral imperative’.

Another interesting perspective is provided by Nicholas Agar (2019), who suggests that if there are arguments both in favour of and against the possibility that certain advanced machines have minds and consciousness, we should err on the side of caution and proceed on the assumption that machines do have minds. On this basis, we should then avoid any actions that might conceivably cause them to suffer. In contrast, John Danaher (2020) states that we can never be sure as to whether a machine has conscious experience, but that this uncertainty does not matter; if a machine behaves similarly to how conscious beings with moral status behave, this is sufficient moral reason, according to Danaher’s ‘ethical behaviourism’, to treat the machine with the same moral considerations with which we would treat a conscious being. The more standard approach, by contrast, first considers whether machines actually have conscious minds and then asks how the answer should bear on the question of whether to grant machines moral status (see, for example, Schwitzgebel and Garza 2015; Mosakas 2020; Nyholm 2020: 115–16).

f. The Moral Status of Artificial Intelligent Machines

Traditionally, the concept of moral status has been of utmost importance in ethics and moral philosophy because entities that have a moral status are considered part of the moral community and are entitled to moral protection. Not all members of a moral community have the same moral status, and therefore they differ with respect to their claims to moral protection. For example, dogs and cats are part of our moral community, but they do not enjoy the same moral status as a typical adult human being. If a being has a moral status, then it has certain moral (and legal) rights as well. The twentieth century saw a growth in the recognition of the rights of ethnic minorities, women, and the LGBTQ+ community, and even the rights of animals and the environment. This expanding moral circle may eventually grow further to include artificial intelligent machines once they exist (as advocated by the robot rights movement).

The notion of personhood (whatever that may mean) has become relevant in determining whether an entity has full moral status and whether, depending on its moral status, it should enjoy the full set of moral rights. One prominent definition of moral status has been provided by Frances Kamm (2007: 229):

So, we see that within the class of entities that count in their own right, there are those entities that in their own right and for their own sake could give us reason to act. I think that it is this that people have in mind when they ordinarily attribute moral status to an entity. So, henceforth, I shall distinguish between an entity’s counting morally in its own right and its having moral status. I shall say that an entity has moral status when, in its own right and for its own sake, it can give us reason to do things such as not destroy it or help it.

Things can be done for X’s own sake, according to Kamm, if X is either conscious and/or able to feel pain. This definition usually includes human beings and most animals, whereas non-living parts of nature are mainly excluded on the basis of their lack of consciousness and inability to feel pain. However, there are good reasons why one should broaden one’s moral reasoning and decision-making to encompass the environment as well (Stone 1972, 2010; Atapattu 2015). For example, the Grand Canyon could be taken into moral account in human decision-making, given its unique form and great aesthetic value, even though it lacks personhood and therefore moral status. Furthermore, some experts have treated sentient animals such as great apes and elephants as persons even though they are not human (for example, Singer 1975; Cavalieri 2001; Francione 2009).

In addition, we can raise the important question of whether (a) current robots used in social situations or (b) artificial intelligent machines, once they are created, might have a moral status and be entitled to moral rights as well, comparable to the moral status and rights of human beings. The following three main approaches provide a brief overview of the discussion.

i. The Autonomy Approach

Kant and his followers place great emphasis on the notion of autonomy in the context of moral status and rights. A moral person is defined as a rational and autonomous being. Against this background, it has been suggested that one might be able to ascribe personhood to artificial intelligent machines once they have reached a certain level of autonomy in making moral decisions. Current machines are becoming increasingly autonomous, so it seems only a matter of time until they meet this moral threshold. A Kantian line of argument in support of granting moral status to machines based on autonomy could be framed as follows:

  1. Rational agents have the capability to decide whether to act (or not act) in accordance with the demands of morality.
    1. The ability to make decisions and to determine what is good has absolute value.
    2. The ability to make such decisions gives rational persons absolute value.
  2. A rational agent can act autonomously, including acting with respect to moral principles.
    1. Rational agents have dignity insofar as they act autonomously.
    2. Acting autonomously makes persons morally responsible.
  3. Such a being—that is, a rational agent—has moral personhood.

It might be objected that machines—no matter how autonomous and rational—are not human beings and therefore should not be entitled to a moral status and the accompanying rights under a Kantian line of reasoning. But this objection is misleading, since Kant himself clearly states in his Groundwork (2009) that human beings should be considered as moral agents not because they are human beings, but because they are autonomous agents (Altman 2011; Timmermann 2020: 94). Kant has been criticised by his opponents for his logocentrism, even though this very claim has helped him avoid the more severe objection of speciesism—of holding that a particular species is morally superior simply because of the empirical features of the species itself (in the case of human beings, the particular DNA). This has been widely viewed as the equivalent of racism at the species level (Singer 2009).

ii. The Indirect Duties Approach

The indirect duties approach is based on Kant’s analysis of our behaviour towards animals. In general, Kant argues in his Lectures on Ethics (1980: 239–41) that even though human beings do not have direct duties towards animals (because they are not persons), they still have indirect duties towards them. The underlying reason is that human beings may start to treat their fellow humans badly if they develop bad habits by mistreating and abusing animals as they see fit. In other words, abusing animals may have a detrimental, brutalising impact on human character.

Kate Darling (2016) has applied the Kantian line of reasoning to show that even current social robots should be entitled to moral and legal protection. She argues that one should protect lifelike beings such as robots that interact with human beings when society cares deeply enough about them, even though they do not have a right to life. Darling offers two arguments why one should treat social robots in this way. Her first argument concerns people who witness cases of abuse and mistreatment of robots, pointing out that they might become ‘traumatized’ and ‘desensitized’. Second, she contends that abusing robots may have a detrimental impact on the abuser’s character, causing her to start treating fellow humans poorly as well.

Indeed, current social robots may be best protected by the indirect duties approach, but the idea that exactly the same arguments should also be applied to future robots of greater sophistication that either match or supersede human capabilities is somewhat troublesome. Usually, one would expect that these future robots—unlike Darling’s social robots of today—will be not only moral patients but also proper moral agents. In addition, the view that one should protect lifelike beings ‘when society cares deeply enough’ (2016: 230) about them opens the door to social exclusion based purely on people’s unwillingness to accept them as members of the moral community. Morally speaking, this is not acceptable. The next approach attempts to deal with this situation.

iii. The Relational Approach

Mark Coeckelbergh (2014) and David Gunkel (2012), the pioneers of the relational approach to moral status, believe that robots have a moral status based on their social relation with human beings. In other words, moral status or personhood emerges through social relations between different entities, such as human beings and robots, instead of depending on criteria inherent in the being such as sentience and consciousness. The general idea behind this approach comes to the fore in the following key passage (Coeckelbergh 2014: 69–70):

We may wonder if robots will remain “machines” or if they can become companions. Will people start saying, as they tend to say of people who have “met their dog” … , that someone has “met her robot”? Would such a person, having that kind of relation with that robot, still feel shame at all in front of the robot? And is there, at that point of personal engagement, still a need to talk about the “moral standing” of the robot? Is not moral quality already implied in the very relation that has emerged here? For example, if an elderly person is already very attached to her Paro robot and regards it as a pet or baby, then what needs to be discussed is that relation, rather than the “moral standing” of the robot.

The personal experience with the Other, that is, the robot, is the key component of this relational and phenomenological approach. The relational concept of personhood can be fleshed out in the following way:

  1. A social model of autonomy, under which autonomy is not defined individually but stands in the context of social relations;
  2. An absolute model of personhood, under which personhood is inherent in every entity as a social being and does not come in degrees;
  3. An interactionist model of personhood, according to which personhood is relational by nature (but not necessarily reciprocal) and defined in non-cognitivist terms.

The above claims are not intended as steps in a conclusive argument; rather, they portray the general line of reasoning regarding the moral importance of social relations. The relational approach does not require the robot to be rational, intelligent or autonomous as an individual entity; instead, the social encounter with the robot is morally decisive. The moral standing of the robot is based on exactly this social encounter.

The problem with the relational approach is that the moral status of robots is thus based completely on human beings’ willingness to enter into social relations with a robot. In other words, if human beings (for whatever reasons) do not want to enter into such relations, they could deny robots a moral status to which the robots might be entitled on more objective criteria such as rationality and sentience. Thus, the relational approach does not actually provide a strong foundation for robot rights; rather, it supports a pragmatic perspective that would make it easier to welcome robots (who already have moral status) in the moral community (Gordon 2020c).

iv. The Upshot

The three approaches discussed in sections 2.f.i-iii. all attempt to show how one can make sense of the idea of ascribing moral status and rights to robots. The most important observation is, however, that robots are entitled to moral status and rights independently of our opinion, once they have fulfilled the relevant criteria. Whether human beings will actually recognise this status and these rights is a different matter.

g. Singularity and Value Alignment

Some of the theories of the potential moral status of artificial intelligent agents discussed in section 2.f. have struck some authors as belonging to science fiction. The same can be said about the next topic to be considered: singularity. The underlying argument regarding technological singularity was introduced by statistician I. J. Good in ‘Speculations Concerning the First Ultraintelligent Machine’ (1965):

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion”, and the intelligence of man would be left far behind. Thus, the first ultraintelligent machine is the last invention that man need ever make.

The idea of an intelligence explosion involving self-replicating, super-intelligent AI machines seems inconceivable to many; some commentators dismiss such claims as a myth about the future development of AI (for example, Floridi 2016). However, prominent voices both inside and outside academia are taking this idea very seriously—in fact, so seriously that they fear the possible consequence of the so-called ‘existential risks’ such as the risk of human extinction. Among those voicing such fears are philosophers like Nick Bostrom and Toby Ord, but also prominent figures like Elon Musk and the late Stephen Hawking.

Authors discussing the idea of technological singularity differ in their views about what might lead to it. The famous futurist Ray Kurzweil is well-known for advocating the idea of singularity driven by exponentially increasing computing power, associated with ‘Moore’s law’: the observation that computing power (tracking the number of transistors on a chip) had, at the time of writing, been doubling roughly every two years since the 1970s and could reasonably be expected to continue to do so in future (Kurzweil 2005). This approach sees the path to superintelligence as likely to proceed through a continuing improvement of the hardware. Another take on what might lead to superintelligence—favoured by the well-known AI researcher Stuart Russell—focuses instead on algorithms. From Russell’s (2019) point of view, what is needed for singularity to occur are conceptual breakthroughs in such areas as the studies of language and common-sense processing as well as learning processes.
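
A back-of-the-envelope calculation shows why exponential hardware growth drives such projections. The figures below are only an arithmetic illustration of the doubling claim, not data from Kurzweil:

```python
# If capacity doubles every two years, fifty years of growth compound
# into 25 doublings, a factor of over 33 million.
years = 2020 - 1970
doublings = years // 2          # one doubling every two years
growth_factor = 2 ** doublings
print(growth_factor)            # prints: 33554432
```

It is this compounding, rather than any single improvement, that makes hardware-based projections of superintelligence seem plausible to their proponents.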

Researchers concerned with singularity approach the issue of what to do to guard humanity against such existential risks in several different ways, depending in part on what they think these existential risks depend on. Bostrom, for example, understands superintelligence as consisting of a maximally powerful capacity to achieve whatever aims might be associated with artificial intelligent systems. In his much-discussed example (Bostrom 2014), a super-intelligent machine threatens the future of human life by becoming optimally efficient at maximising the number of paper clips in the world, a goal whose achievement might be facilitated by removing human beings so as to make more space for paper clips. From this point of view, it is crucial to equip super-intelligent AI machines with the right goals, so that when they pursue these goals in maximally efficient ways, there is no risk that they will extinguish the human race along the way. This is one way to think about how to create a beneficial super-intelligence.

Russell (2019) presents an alternative picture, formulating three rules for AI design, which might perhaps be viewed as an updated version of or suggested replacement for Asimov’s fictional laws of robotics (see section 2.a.):

  1. The machine’s only objective is to maximise the realisation of human preferences.
  2. The machine is initially uncertain about what those preferences are.
  3. The ultimate source of information about human preferences is human behaviour.
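The second and third principles can be pictured, as a purely illustrative sketch rather than Russell's actual proposal, with a machine that keeps a probability distribution over hypotheses about what the human prefers and updates it from observed behaviour; all hypotheses and numbers below are assumed for illustration:

```python
def update(prior, likelihoods, observation):
    """Bayes update: posterior proportional to prior * P(observation | hypothesis)."""
    unnorm = {h: prior[h] * likelihoods[h][observation] for h in prior}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# Principle 2: the machine is initially uncertain about the human's preference.
prior = {"tea": 0.5, "coffee": 0.5}

# Assumed P(observed choice | underlying preference), for illustration only.
likelihoods = {
    "tea":    {"chose_tea": 0.9, "chose_coffee": 0.1},
    "coffee": {"chose_tea": 0.2, "chose_coffee": 0.8},
}

# Principle 3: human behaviour is the source of information about preferences.
posterior = update(prior, likelihoods, "chose_tea")
# Belief shifts towards the hypothesis that the human prefers tea.
```

The point of the sketch is only that uncertainty about objectives leaves the machine open to correction by human behaviour, which is the feature Russell argues makes such designs safer than machines with fixed goals.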

The theories discussed in this section represent different ideas about what is sometimes called ‘value alignment’: the idea that the goals and functioning of AI systems, especially super-intelligent future AI systems, should be properly aligned with human values. On this ideal, AI should track human interests and values, and its functioning should benefit us without creating existential risks. As noted at the beginning of this section, some commentators regard the idea that AI could become super-intelligent and pose existential threats as simply a myth that needs to be busted. But according to others, such as Toby Ord, AI is among the main reasons why humanity is in a critical period in which its very future is at stake. On such assessments, AI should be treated on a par with nuclear weapons and other potentially highly destructive technologies that put us all at great risk unless proper value alignment is achieved (Ord 2020).

A key problem concerning value alignment, especially if understood along the lines of Russell’s three principles, is whose values or preferences AI should be aligned with. As Iason Gabriel (2020) notes, reasonable people may disagree on which values and interests are the right ones with which to align the functioning of AI (whether super-intelligent or not). Gabriel’s suggestion for solving this problem is inspired by John Rawls’ (1999, 2001) work on ‘reasonable pluralism’. Rawls proposes that society should seek to identify ‘fair principles’ that could generate an overlapping consensus, that is, widespread agreement despite the existence of more specific, reasonable disagreements about values among members of society. But how likely is it that this kind of convergence on general principles would find widespread support? (See section 3.)

h. Other Debates

In addition to the topics highlighted above, other issues that have not received as much attention are beginning to be discussed within AI ethics. Five such issues are discussed briefly below.

i. AI as a form of Moral Enhancement or a Moral Advisor

AI systems are widely used as ‘recommender systems’ in online shopping, online entertainment (for example, music and movie streaming), and other realms. Some ethicists have discussed the advantages and disadvantages of AI systems whose recommendations could help us make better choices, ones more consistent with our basic values. Perhaps AI systems could even, at some point, help us improve our values. Works on these and related questions include Borenstein and Arkin (2016), Giubilini et al. (2015, 2018), Klincewicz (2016), and O’Neill et al. (2021).

ii. AI and the Future of Work

Much discussion about AI and the future of work concerns the vital issue of whether AI and other forms of automation will cause widespread ‘technological unemployment’ by eliminating large numbers of human jobs that are taken over by automated machines (Danaher 2019a). This is often presented as a negative prospect, the worry being whether a world without work could offer people fulfilling and meaningful activities, since certain goods achieved through work (other than income) are hard to achieve in other contexts (Gheaus and Herzog 2016). However, some authors have argued that work in the modern world exposes many people to various kinds of harm (Anderson 2017), and Danaher (2019a) examines the important question of whether a world with less work might actually be preferable. Some argue that existential boredom would proliferate if human beings could no longer find a meaningful purpose in their work (or even their lives) because machines have replaced them (Bloch 1954). In contrast, Jonas (1984) criticises Bloch, arguing that boredom would not be a substantial issue at all. Another related issue, perhaps more relevant in the short and medium term, is how increasingly technologised work can remain meaningful (Smids et al. 2020).

iii. AI and the Future of Personal Relationships

Various AI-driven technologies affect the nature of friendships, romances and other interpersonal relationships and could impact them even more in the future. Online ‘friendships’ arranged through social media have been investigated by philosophers, who disagree as to whether relationships partly curated by AI algorithms can be true friendships (Cocking et al. 2012; McFall 2012; Kaliarnta 2016; Elder 2017). Some philosophers have sharply criticised AI-driven dating apps, which they think might reinforce negative stereotypes and negative gender expectations (Frank and Klincewicz 2018). In more science-fiction-like philosophising, which might nevertheless become increasingly relevant to real life, there has also been discussion about whether human beings could have true friendships or romantic relationships with robots and other artificial agents equipped with advanced AI (Levy 2008; Sullins 2012; Elder 2017; Hauskeller 2017; Nyholm and Frank 2017; Danaher 2019c; Nyholm 2020).

iv. AI and the Concern About Human ‘Enfeeblement’

If more and more aspects of our lives come to be driven by the recommendations of AI systems whose functioning we do not understand and whose propriety we may be in no position to question, the results could include ‘a crisis in moral agency’ (Danaher 2019d), human ‘enfeeblement’ (Russell 2019), or ‘de-skilling’ in different areas of human life (Vallor 2015, 2016). This scenario becomes even more likely should technological singularity be attained, because at that point all work, including all research and engineering, could be done by intelligent machines. After some generations, human beings might indeed be completely dependent on machines in all areas of life and unable to turn the clock back. Such dependence would be very dangerous; hence it is of utmost importance that human beings remain skilful and knowledgeable as they develop AI capacities.

v. Anthropomorphism

According to some, the very idea of artificial intelligent machines that imitate human thinking and behaviour incorporates a form of anthropomorphising, the attribution of humanlike qualities to machines that are not human, that ought to be avoided. A common worry about many forms of AI technologies (or about how they are presented to the general public) is that they are deceptive (for example, Boden et al. 2017). Many have objected that companies tend to exaggerate the extent to which their products are based on AI technology. For example, several prominent AI researchers and ethicists have criticised the makers of the robot Sophia for falsely presenting her as much more humanlike than she really is, and for designing her to prompt anthropomorphising responses in human beings that are somehow problematic or unfitting (for example, Sharkey 2018; Bryson 2010, 2019). The related question of whether anthropomorphising responses to AI technologies are always problematic requires further consideration, which it is increasingly receiving (for example, Coeckelbergh 2010; Darling 2016, 2017; Gunkel 2018; Danaher 2020; Nyholm 2020; Smids 2020).

This list of emerging topics within AI ethics is not exhaustive, as the field is very fertile, with new issues arising constantly. This is perhaps the fastest-growing field within the study of ethics and moral philosophy.

3. Ethical Guidelines for AI

As a result of widespread awareness of and interest in the ethical issues related to AI, several influential institutions (including governments, the European Union, large companies and other associations) have already tasked expert panels with drafting policy documents and ethical guidelines for AI. Such documents have proliferated to the point at which it is very difficult to keep track of all the latest AI ethics guidelines being released. Additionally, AI ethics is receiving substantial funding from various public and private sources, and multiple research centres for AI ethics have been established. These developments have mostly received positive responses, but there have also been worries about so-called ‘ethics washing’, that is, giving an ethical stamp of approval to something that might be, from a more critical point of view, ethically problematic (compare Tigard 2020b), along with concerns that some efforts may be relatively toothless or too centred on the West, ignoring non-Western perspectives on AI ethics. This section, before discussing such criticisms, reviews examples of already published ethical guidelines and considers whether any consensus can emerge among these differing guidelines.

An excellent resource in this context is the overview by Jobin et al. (2019), who conducted a substantial comparative review of 84 sets of ethical guidelines issued by national or international organisations from various countries. Jobin et al. found strong convergence around five key principles: transparency, justice and fairness, non-maleficence, responsibility, and privacy. Their findings are reported here to illustrate the extent of this convergence on some (but not all) of the principles discussed in the original paper. The number after each principle indicates the number of ethical guideline documents, among the 84 examined, in which that principle was prominently featured. The codes Jobin et al. used are included so that readers can see the basis for their classification.

Ethical principle | Number of documents (N = 84) | Codes included
Transparency | 73 | Transparency, explainability, explicability, understandability, interpretability, communication, disclosure
Justice and fairness | 68 | Justice, fairness, consistency, inclusion, equality, equity, (non-)bias, (non-)discrimination, diversity, plurality, accessibility, reversibility, remedy, redress, challenge, access, distribution
Non-maleficence | 60 | Non-maleficence, security, safety, harm, protection, precaution, integrity (bodily or mental), non-subversion
Responsibility | 60 | Responsibility, accountability, liability, acting with integrity
Privacy | 47 | Privacy, personal or private information
Beneficence | 41 | Benefits, beneficence, well-being, peace, social good, common good
Freedom and autonomy | 34 | Freedom, autonomy, consent, choice, self-determination, liberty, empowerment
Trust | 28 | Trust
Sustainability | 14 | Sustainability, environment (nature), energy, resources (energy)
Dignity | 13 | Dignity
Solidarity | 6 | Solidarity, social security, cohesion

The review conducted by Jobin et al. (2019) reveals, at least with respect to the first five principles on the list, a significant degree of overlap in these attempts to create ethical guidelines for AI (see Gabriel 2020). On the other hand, the last six items on the list (beginning with beneficence) appeared as key principles in fewer than half of the documents studied. Relatedly, researchers working on the ‘moral machine’ research project, which examined people’s attitudes as to what self-driving cars should be programmed to do in various crash dilemma scenarios, also found great variation, including cross-cultural variation (Awad et al. 2018).

These ethical guidelines have received a fair amount of criticism, both in terms of their content and with respect to how they were created (for example, Metzinger 2019). For Metzinger, the very idea of ‘trustworthy AI’ is ‘nonsense’, since only human beings, and not machines, can be or fail to be trustworthy. Furthermore, he notes that the EU High-Level Expert Group on AI included very few experts from the field of ethics but numerous industry representatives, who had an interest in toning down ethical worries about the AI industry. In addition, the resulting EU document, ‘Ethics Guidelines for Trustworthy AI’, uses vague and non-confrontational language; it is, to use the term favoured by Resseguier and Rodrigues (2020), a mostly ‘toothless’ document. Guidelines supposedly rendered toothless by industry representatives illustrate precisely the concern about ‘ethics washing’ noted above.

Another point of criticism regarding these kinds of ethical guidelines is that many of the expert panels drafting them are non-inclusive and fail to take non-Western (for example, African and Asian) perspectives on AI and ethics into account. It would therefore be important for future versions of such guidelines, or for new ethical guidelines, to include non-Western contributions. Notably, in academic journals that focus on the ethics of technology, there has been modest progress towards publishing more non-Western perspectives on AI ethics, for example perspectives drawing on the Dao (Wong 2012), on Confucian virtue ethics (Jing and Doorn 2020), and on southern African relational and communitarian ethics, including the ‘ubuntu’ philosophy of personhood and interpersonal relationships (see Wareham 2020).

4. Conclusion

The ethics of AI has become one of the liveliest topics in philosophy of technology. AI has the potential to redefine our traditional moral concepts, ethical approaches and moral theories. The advent of artificial intelligent machines that may either match or supersede human capabilities poses a big challenge to humanity’s traditional self-understanding as the only beings with the highest moral status in the world. Accordingly, the future of AI ethics is unpredictable but likely to offer considerable excitement and surprise.

5. References and Further Reading

  • Agar, N. (2020). How to Treat Machines That Might Have Minds. Philosophy & Technology, 33(2): 269–82.
  • Altman, M. C. (2011). Kant and Applied Ethics: The Uses and Limits of Kant’s Practical Philosophy. Malden, MA: Wiley-Blackwell.
  • Anderson, E. (2017). Private Government: How Employers Rule Our Lives (and Why We Don’t Talk about It). Princeton, NJ: Princeton University Press.
  • Anderson, M., and Anderson, S. (2011). Machine Ethics. Cambridge: Cambridge University Press.
  • Anderson, S. L. (2011). Machine Metaethics. In M. Anderson and S. L. Anderson (Eds.), Machine Ethics, 21–27. Cambridge: Cambridge University Press.
  • Angwin, J., Larson, J., Mattu, S., and Kirchner, L. (2016). Machine Bias. In ProPublica, May 23. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
  • Arkin, R. (2009). Governing Lethal Behavior in Autonomous Robots. Boca Raton, FL: CRC Press.
  • Arkin, R. (2010). The Case for Ethical Autonomy in Unmanned Systems. Journal of Military Ethics, 9(4), 332–41.
  • Asimov, I. (1942). Runaround: A Short Story. New York: Street and Smith.
  • Asimov, I. (1986). Robots and Empire: The Classic Robot Novel. New York: HarperCollins.
  • Atapattu, S. (2015). Human Rights Approaches to Climate Change: Challenges and Opportunities. New York: Routledge.
  • Awad, E., Dsouza, S., Kim, R., Schulz, J., Henrich, J., Shariff, A., Bonnefon, J. F., and Rahwan, I. (2018). The Moral Machine Experiment. Nature, 563, 59–64.
  • Bloch, E. (1985/1954). Das Prinzip Hoffnung, 3 vols. Frankfurt am Main: Suhrkamp.
  • Boden, M., Bryson, J., Caldwell, D., Dautenhahn, K., Edwards, L., Kember, S., Newman, P., Parry, V., Pegman, G., Rodden, T., Sorell, T., Wallis, M., Whitby, B., and Winfield, A. (2017). Principles of Robotics: Regulating Robots in the Real World. Connection Science, 29(2), 124–29.
  • Borenstein, J. and Arkin, R. (2016). Robotic Nudges: The Ethics of Engineering a More Socially Just Human Being. Science and Engineering Ethics, 22, 31–46.
  • Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford: Oxford University Press.
  • Bryson, J. (2010). Robots Should Be Slaves. In Y. Wilks (Ed.), Close Engagements with Artificial Companions, 63–74. Amsterdam: John Benjamins.
  • Bryson, J. (2012). A Role for Consciousness in Action Selection. International Journal of Machine Consciousness, 4(2), 471–82.
  • Bryson, J. (2019). Patiency Is Not a Virtue: The Design of Intelligent Systems and Systems of Ethics. Ethics and Information Technology, 20(1), 15–26.
  • Buolamwini, J., and Gebru, T. (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability, and Transparency. PMLR, 81, 77–91.
  • Čapek, K. (1920). Rossum’s Universal Robots. Adelaide: The University of Adelaide.
  • Cavalieri, P. (2001). The Animal Question: Why Non-Human Animals Deserve Human Rights. Oxford: Oxford University Press.
  • Chalmers, D. (1996). The Conscious Mind: In Search of a Fundamental Theory. New York/Oxford: Oxford University Press.
  • Chalmers, D. (2010). The Singularity: A Philosophical Analysis. Journal of Consciousness Studies, 17, 7–65.
  • Cocking, D., Van Den Hoven, J., and Timmermans, J. (2012). Introduction: One Thousand Friends. Ethics and Information Technology, 14, 179–84.
  • Coeckelbergh, M. (2010). Robot Rights? Towards a Social-Relational Justification of Moral Consideration. Ethics and Information Technology, 12(3), 209–21.
  • Coeckelbergh, M. (2014). The Moral Standing of Machines: Towards a Relational and Non-Cartesian Moral Hermeneutics. Philosophy & Technology, 27(1), 61–77.
  • Coeckelbergh, M. (2020). AI Ethics. Cambridge, MA and London: MIT Press.
  • Copeland, B. J. (2020). Artificial Intelligence. Britannica.com. Retrieved from https://www.britannica.com/technology/artificial-intelligence.
  • Danaher, J. (2016a). Robots, Law, and the Retribution Gap. Ethics and Information Technology, 18(4), 299–309.
  • Danaher, J. (2016b). The Threat of Algocracy: Reality, Resistance and Accommodation. Philosophy & Technology, 29(3), 245–68.
  • Danaher, J. (2019a). Automation and Utopia. Cambridge, MA: Harvard University Press.
  • Danaher, J. (2019b). Escaping Skinner’s Box: AI and the New Era of Techno-Superstition. Philosophical Disquisitions blog: https://philosophicaldisquisitions.blogspot.com/2019/10/escaping-skinners-box-ai-and-new-era-of.html.
  • Danaher, J. (2019c). The Philosophical Case for Robot Friendship. Journal of Posthuman Studies, 3(1), 5–24.
  • Danaher, J. (2019d). The Rise of the Robots and the Crises of Moral Patiency. AI & Society, 34(1), 129–36.
  • Danaher, J. (2020). Welcoming Robots into the Moral Circle? A Defence of Ethical Behaviourism. Science and Engineering Ethics, 26(4), 2023–49.
  • Danaher, J., and Robbins, S. (2020). Should AI Be Explainable? Episode 77 of the Philosophical Disquisitions Podcast: https://philosophicaldisquisitions.blogspot.com/2020/07/77-should-ai-be-explainable.html.
  • Daniels, J., Nkonde, M. and Mir, D. (2019). Advancing Racial Literacy in Tech. https://datasociety.net/output/advancing-racial-literacy-in-tech/.
  • Darling, K. (2016). Extending Legal Protection to Social Robots: The Effects of Anthropomorphism, Empathy, and Violent Behavior towards Robotic Objects. In R. Calo, A. M. Froomkin and I. Kerr (Eds.), Robot Law, 213–34. Cheltenham: Edward Elgar.
  • Darling, K. (2017). “Who’s Johnny?” Anthropological Framing in Human-Robot Interaction, Integration, and Policy. In P. Lin, K. Abney and R. Jenkins (Eds.), Robot Ethics 2.0: From Autonomous Cars to Artificial Intelligence, 173–92. Oxford: Oxford University Press.
  • Dastin, J. (2018). Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women. Reuters, October 10. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G.
  • Dehghani, M., Forbus, K., Tomai, E., and Klenk, M. (2011). An Integrated Reasoning Approach to Moral Decision Making. In M. Anderson and S. L. Anderson (Eds.), Machine Ethics, 422–41. Cambridge: Cambridge University Press.
  • Dignum, V. (2019). Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way. Berlin: Springer.
  • Dobbe, R., Dean, S., Gilbert, T., and Kohli, N. (2018). A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics. In 2018 Workshop on Fairness, Accountability and Transparency in Machine Learning during ICMI, Stockholm, Sweden (July 18 version). https://arxiv.org/abs/1807.00553.
  • Elamrani, A., and Yampolskiy, R. (2018). Reviewing Tests for Machine Consciousness. Journal of Consciousness Studies, 26(5–6), 35–64.
  • Elder, A. (2017). Friendship, Robots, and Social Media: False Friends and Second Selves. London: Routledge.
  • Floridi, L. (2016). Should We Be Afraid of AI? Machines Seem to Be Getting Smarter and Smarter and Much Better at Human Jobs, yet True AI Is Utterly Implausible. Why? Aeon, May 9. https://aeon.co/essays/true-ai-is-both-logically-possible-and-utterly-implausible.
  • Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Vayena, E. (2018). AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds and Machines, 28(4), 689–707. https://doi.org/10.1007/s11023-018-9482-5
  • Francione, G. L. (2009). Animals as Persons. Essay on the Abolition of Animal Exploitation. New York: Columbia University Press.
  • Frank, L., and Klincewicz, M. (2018). Swiping Left on the Quantified Relationship: Exploring the Potential Soft Impacts. American Journal of Bioethics, 18(2), 27–28.
  • Gabriel, I. (2020). Artificial Intelligence, Values, and Alignment. Minds and Machines, available online at https://link.springer.com/article/10.1007/s11023-020-09539-2.
  • Gheaus, A., and Herzog, L. (2016). Goods of Work (Other than Money!). Journal of Social Philosophy, 47(1), 70–89.
  • Giubilini, A., and Savulescu, J. (2018). The Artificial Moral Advisor: The “Ideal Observer” Meets Artificial Intelligence. Philosophy & Technology, 1–20. https://doi.org/10.1007/s13347-017-0285-z.
  • Gogoll, J., and Müller, J. F. (2017). Autonomous Cars: In Favor of a Mandatory Ethics Setting. Science and Engineering Ethics, 23(3), 681–700.
  • Good, I. J. (1965). Speculations Concerning the First Ultraintelligent Machine. In F. Alt and M. Rubinoff (Eds.), Advances in Computers, vol. 6, 31–88. Cambridge, MA: Academic Press.
  • Goodall, N. J. (2014). Ethical Decision Making during Automated Vehicle Crashes. Transportation Research Record: Journal of the Transportation Research Board, 2424, 58–65.
  • Goodall, N. J. (2016). Away from Trolley Problems and Toward Risk Management. Applied Artificial Intelligence, 30(8), 810–21.
  • Gordon, J.-S. (2020a). Building Moral Machines: Ethical Pitfalls and Challenges. Science and Engineering Ethics, 26, 141–57.
  • Gordon, J.-S. (2020b). What Do We Owe to Intelligent Robots? AI & Society, 35, 209–23.
  • Gordon, J.-S. (2020c). Artificial Moral and Legal Personhood. AI & Society, online first at https://link.springer.com/article/10.1007%2Fs00146-020-01063-2.
  • Guarini, M. (2006). Particularism and the Classification and Reclassification of Moral Cases. IEEE Intelligent Systems, 21(4), 22–28.
  • Gunkel, D. J., and Bryson, J. (2014). The Machine as Moral Agent and Patient. Philosophy & Technology, 27(1), 5–142.
  • Gunkel, D. (2012). The Machine Question. Critical Perspectives on AI, Robots, and Ethics. Cambridge, MA: MIT Press.
  • Gunkel, D. (2018). Robot Rights. Cambridge, MA: MIT Press.
  • Gurney, J. K. (2016). Crashing into the Unknown: An Examination of Crash-Optimization Algorithms through the Two Lanes of Ethics and Law. Alabama Law Review, 79(1), 183–267.
  • Himmelreich, J. (2018). Never Mind the Trolley: The Ethics of Autonomous Vehicles in Mundane Situations. Ethical Theory and Moral Practice, 21(3), 669–84.
  • Himma, K., and Tavani, H. (2008). The Handbook of Information and Computer Ethics. Hoboken, NJ: Wiley.
  • Jobin, A., Ienca, M., and Vayena, E. (2019). The Global Landscape of AI Ethics Guidelines. Nature Machine Intelligence, 1(9), 389–399.
  • Johnson, D. (1985/2009). Computer Ethics, 4th ed. New York: Pearson.
  • Johnson, D., and Nissenbaum, H. (1995). Computing, Ethics, and Social Values. Englewood Cliffs, NJ: Prentice Hall.
  • Jonas, H. (2003/1984). Das Prinzip Verantwortung. Frankfurt am Main: Suhrkamp.
  • Kaliarnta, S. (2016). Using Aristotle’s Theory of Friendship to Classify Online Friendships: A Critical Counterpoint. Ethics and Information Technology, 18(2), 65–79.
  • Kamm, F. (2007). Intricate Ethics: Rights, Responsibilities, and Permissible Harm. Oxford: Oxford University Press.
  • Kamm, F. (2020). The Use and Abuse of the Trolley Problem: Self-Driving Cars, Medical Treatments, and the Distribution of Harm. In S. M. Liao (Ed.) The Ethics of Artificial Intelligence, 79–108. New York: Oxford University Press.
  • Kant, I. (1980). Lectures on Ethics, trans. Louis Infield, Indianapolis, IN: Hackett Publishing Company.
  • Kant, I. (2009). Groundwork of the Metaphysic of Morals. New York: Harper Perennial Modern Classics.
  • Keeling, G. (2020). The Ethics of Automated Vehicles. PhD Dissertation, University of Bristol. https://research-information.bris.ac.uk/files/243368588/Pure_Thesis.pdf.
  • Kissinger, H. A. (2018). How the Enlightenment Ends: Philosophically, Intellectually—in Every Way—Human Society Is Unprepared for the Rise of Artificial Intelligence. The Atlantic, June. https://www.theatlantic.com/magazine/archive/2018/06/henry-kissinger-ai-could-mean-the-end-of-human-history/559124/.
  • Klincewicz, M. (2016). Artificial Intelligence as a Means to Moral Enhancement. In Studies in Logic, Grammar and Rhetoric. https://doi.org/10.1515/slgr-2016-0061.
  • Klincewicz, M. (2015). Autonomous Weapons Systems, the Frame Problem and Computer Security. Journal of Military Ethics, 14(2), 162–76.
  • Kraemer, F., Van Overveld, K., and Peterson, M. (2011). Is There an Ethics of Algorithms? Ethics and Information Technology, 13, 251–60.
  • Kurzweil, R. (2005). The Singularity Is Near. London: Penguin Books.
  • Levy, D. (2008). Love and Sex with Robots. London: Harper Perennial.
  • Lin, P. (2015). Why Ethics Matters for Autonomous Cars. In M. Maurer, J. C. Gerdes, B. Lenz and H. Winner (Eds.), Autonomes Fahren: Technische, rechtliche und gesellschaftliche Aspekte, 69–85. Berlin: Springer.
  • Lin, P., Abney, K. and Bekey, G. A. (Eds). (2014). Robot Ethics: The Ethical and Social Implications of Robotics. Intelligent Robotics and Autonomous Agents. Cambridge, MA and London: MIT Press.
  • Lin, P., Abney, K. and Jenkins, R. (Eds.) (2017). Robot Ethics 2.0: From Autonomous Cars to Artificial Intelligence. New York: Oxford University Press.
  • Loh, J. (2019). Roboterethik. Eine Einführung. Frankfurt am Main: Suhrkamp.
  • Ludwig, S. (2015). Credit Scores in America Perpetuate Racial Injustice: Here’s How. The Guardian, October 13. https://www.theguardian.com/commentisfree/2015/oct/13/your-credit-score-is-racist-heres-why.
  • Marchese, K. (2020). Japanese Scientists Develop “Blade Runner” Robot That Can Feel Pain. Design Boom, February 24. https://www.designboom.com/technology/japanese-scientists-develop-hyper-realistic-robot-that-can-feel-pain-02-24-2020/.
  • Matthias, A. (2004). The Responsibility Gap: Ascribing Responsibility for the Actions of Learning Automata. Ethics and Information Technology, 6(3), 175–83.
  • McCarthy, J., Minsky, M. L., Rochester, N. and Shannon, C. E. (1955). A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence. http://raysolomonoff.com/dartmouth/boxa/dart564props.pdf.
  • McFall, M. T. (2012). Real Character-Friends: Aristotelian Friendship, Living Together, And Technology. Ethics and Information Technology, 14, 221–30.
  • Metzinger, T. (2013), Two Principles for Robot Ethics. In E. Hilgendorf and J.-P. Günther (Eds.), Robotik und Gesetzgebung, 263–302. Baden-Baden: Nomos.
  • Metzinger, T. (2019). Ethics Washing Made in Europe. Der Tagesspiegel. https://www.tagesspiegel.de/politik/eu-guidelines-ethics-washing-made-in-europe/24195496.html.
  • Misselhorn, C. (2018). Grundfragen der Maschinenethik. Stuttgart: Reclam.
  • Mittelstadt, B., Allo, P., Taddeo, M., Wachter, S. and Floridi, L. (2016). The Ethics of Algorithms: Mapping the Debate. Big Data & Society, 3(2). https://journals.sagepub.com/doi/full/10.1177/2053951716679679.
  • Mosakas, K. (2020). On the Moral Status of Social Robots: Considering the Consciousness Criterion. AI & Society, online first at https://link.springer.com/article/10.1007/s00146-020-01002-1.
  • Müller, V. C., and Simpson, T. W. (2014). Autonomous Killer Robots Are Probably Good News. Frontiers in Artificial Intelligence and Applications, 273, 297–305.
  • Müller, V. C. (2020). Ethics of Artificial Intelligence and Robotics. Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/ethics-ai/.
  • Nyholm, S. (2018a). Attributing Agency to Automated Systems: Reflections on Human-Robot Collaborations and Responsibility-Loci. Science and Engineering Ethics, 24(4), 1201–19.
  • Nyholm, S. (2018b). The Ethics of Crashes with Self-Driving Cars: A Roadmap, I. Philosophy Compass, 13(7), e12507.
  • Nyholm, S. (2018c). The Ethics of Crashes with Self-Driving Cars, A Roadmap, II. Philosophy Compass, 13(7), e12506.
  • Nyholm, S. (2020). Humans and Robots: Ethics, Agency, and Anthropomorphism. London: Rowman and Littlefield.
  • Nyholm, S., and Frank. L. (2017). From Sex Robots to Love Robots: Is Mutual Love with a Robot Possible? In J. Danaher and N. McArthur, Robot Sex: Social and Ethical Implications. Cambridge, MA: MIT Press.
  • Nyholm, S., and Frank, L. (2019). It Loves Me, It Loves Me Not: Is It Morally Problematic to Design Sex Robots That Appear to Love Their Owners? Techne: Research in Philosophy and Technology, 23(3), 402–24.
  • Nyholm, S., and Smids, J. (2016). The Ethics of Accident-Algorithms for Self-Driving Cars: An Applied Trolley Problem? Ethical Theory and Moral Practice, 19(5), 1275–89.
  • Okyere-Manu, B. (Ed.) (2021). African Values, Ethics, and Technology: Questions, Issues, and Approaches. London: Palgrave MacMillan.
  • O’Neil, C. (2016). Weapons of Math Destruction. London: Allen Lane.
  • O’Neill, E., Klincewicz, M. and Kemmer, M. (2021). Ethical Issues with Artificial Ethics Assistants. In C. Veliz (Ed.), Oxford Handbook of Digital Ethics. Oxford: Oxford University Press.
  • Ord, T. (2020). The Precipice: Existential Risk and the Future of Humanity. London: Hachette Books.
  • Picard, R. (1997). Affective Computing. Cambridge, MA and London: MIT Press.
  • Purves, D., Jenkins, R. and Strawser, B. J. (2015). Autonomous Machines, Moral Judgment, and Acting for the Right Reasons. Ethical Theory and Moral Practice, 18(4), 851–72.
  • Rawls, J. (1999). The Law of Peoples, with The Idea of Public Reason Revisited. Cambridge, MA: Harvard University Press.
  • Rawls, J. (2001). Justice as Fairness: A Restatement. Cambridge, MA: Harvard University Press.
  • Resseguier, A., and Rodrigues, R. (2020). AI Ethics Should Not Remain Toothless! A Call to Bring Back the Teeth of Ethics. Big Data & Society, online first at https://journals.sagepub.com/doi/full/10.1177/2053951720942541.
  • Richardson, K. (2019). Special Issue: Ethics of AI and Robotics. AI & Society, 34(1).
  • Robbins, S. (2019). A Misdirected Principle with a Catch: Explicability for AI. Minds and Machines, 29(4), 495–514.
  • Royakkers, L., and van Est, R. (2015). Just Ordinary Robots: Automation from Love to War. Boca Raton, FL: CRC Press.
  • Russell, S. (2019). Human Compatible. New York: Viking Press.
  • Ryan, M., and Stahl, B. (2020). Artificial Intelligence Guidelines for Developers and Users: Clarifying Their Content and Normative Implications. Journal of Information, Communication and Ethics in Society, online first at https://www.emerald.com/insight/content/doi/10.1108/JICES-12-2019-0138/full/html
  • Santoni de Sio, F., and Van den Hoven, J. (2018). Meaningful Human Control over Autonomous Systems: A Philosophical Account. Frontiers in Robotics and AI. https://www.frontiersin.org/articles/10.3389/frobt.2018.00015/full.
  • Savulescu, J., and Maslen, H. (2015). Moral Enhancement and Artificial Intelligence: Moral AI? In Beyond Artificial Intelligence, 79–95. Springer.
  • Schwitzgebel, E., and Garza, M. (2015). A Defense of the Rights of Artificial Intelligences. Midwest Studies in Philosophy, 39(1), 98–119.
  • Searle, J. R. (1980). Minds, Brains, and Programs. Behavioural and Brain Sciences, 3(3), 417–57.
  • Sharkey, Noel (2018), Mama Mia, It’s Sophia: A Show Robot or Dangerous Platform to Mislead? Forbes, November 17. https://www.forbes.com/sites/noelsharkey/2018/11/17/mama-mia-its- sophia-a-show-robot-or-dangerous-platform-to-mislead/#407e37877ac9.
  • Singer, P. (1975). Animal liberation. London, UK: Avon Books.
  • Singer, P. (2009). Speciesism and Moral Status. Metaphilosophy, 40(3–4), 567–81.
  • Smids, J. (2020). Danaher’s Ethical Behaviourism: An Adequate Guide to Assessing the Moral Status of a Robot? Science and Engineering Ethics, 26(5), 2849–66.
  • Smids, J., Nyholm, S. and Berkers, H. (2020). Robots in the Workplace: A Threat to—or Opportunity for—Meaningful Work? Philosophy & Technology, 33(3), 503–22.
  • Sparrow, R. (2007). Killer Robots. Journal of Applied Philosophy, 24(1), 62–77.
  • Springer, A., Garcia-Gathright, J. and Cramer, H. (2018). Assessing and Addressing Algorithmic Bias – But Before We Get There. In 2018 AAAI Spring Symposium Series, 450–54. https://www.aaai.org/ocs/index.php/SSS/SSS18/paper/viewPaper/17542.
  • Stone, C. D. (1972). Should Trees Have Standing? Toward Legal Rights for Natural Objects. Southern California Law Review, 45, 450–501.
  • Stone, C. D. (2010). Should Trees Have Standing? Law, Morality and the Environment. Oxford: Oxford University Press.
  • Strawser, B. J. (2010). Moral Predators: The Duty to Employ Uninhabited Aerial Vehicles. Journal of Military Ethics, 9(4), 342–68.
  • Sullins, J. (2012), Robots, Love, and Sex: The Ethics of Building a Love Machine. IEEE Transactions on Affective Computing, 3(4), 398–409.
  • Sweeney, L. (2013). Discrimination in Online Ad Delivery. Acmqueue, 11(3), 1–19.
  • Tigard, D. (2020a). There is No Techno-Responsibility Gap. Philosophy & Technology, online first at https://link.springer.com/article/10.1007/s13347-020-00414-7.
  • Tigard, D. (2020b). Responsible AI and Moral Responsibility: A Common Appreciation. AI and Ethics, online first at https://link.springer.com/article/10.1007/s43681-020-00009-0.
  • Timmermann, J. (2020). Kant’s “Groundwork of the Metaphysics of Morals”: A Commentary. Cambridge: Cambridge University Press.
  • Turing, A. (1950). Computing Machinery and Intelligence. Mind, 59(236), 433–60.
  • Vallor, S. (2015). Moral Deskilling and Upskilling in a New Machine Age: Reflections on the Ambiguous Future of Character. Philosophy & Technology, 28(1), 107–24.
  • Vallor, S. (2016). Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. New York: Oxford University Press.
  • Veale, M., and Binns, R. (2017). Fairer Machine Learning in the Real World: Mitigating Discrimination without Collecting Sensitive Data. Big Data & Society, 4(2).
  • Vinge, V. (1983). First Word. Omni, January, 10.
  • Vinge, V. (1993). The Coming Technological Singularity. How to Survive in the Post-Human Era. Whole Earth Review, Winter.
  • Wachter, S., Mittelstadt, B. and Russell, C. (2018). Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology, 31(2), 841–87.
  • Wallach, W., and Allen, C. (2010). Moral Machines. Teaching Robots Right from Wrong. Oxford: Oxford University Press.
  • Wallach, W., Franklin, S. and Allen, C. (2010). A Conceptual and Computational Model of Moral Decision Making in Human and Artificial Agents. Topics in Cognitive Science, 2(3), 454–85.
  • Wang, Y., and Kosinski, M. (2018). Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation from Facial Images, Journal of Personality and Social Psychology, 114(2), 246–57.
  • Wareham, C. S. (2020): Artificial Intelligence and African Conceptions of Personhood. Ethics and Information Technology, online first at https://link.springer.com/article/10.1007/s10676-020-09541-3
  • Wong, P. H. (2012). Dao, Harmony, and Personhood: Towards a Confucian Ethics of Technology. Philosophy & Technology, 25(1), 67–86.
  • World Economic Forum and Global Future Council on Human Rights (2018). How to Prevent Discriminatory Outcomes in Machine Learning (white paper). http://www3.weforum.org/docs/WEF_40065_White_Paper_How_to_Prevent_Discriminatory_Outcomes_in_Machine_Learning.pdf.

 

Author Information

John-Stewart Gordon
Email: johnstgordon@pm.me
Vytautas Magnus University
Lithuania

and

Sven Nyholm
Email: s.r.nyholm@uu.nl
The University of Utrecht
The Netherlands

Abstractionism in Mathematics

Abstractionism is a philosophical account of the ontology of mathematics according to which abstract objects are grounded in a process of abstraction (although not every view that places abstraction front and center is a version of abstractionism, as we shall see). Abstraction involves arranging a domain of underlying objects into classes and then identifying an object corresponding to each class—the abstract of that class. While the idea that the ontology of mathematics is obtained, in some sense, via abstraction has its origin in ancient Greek thought, the idea found new life, and a new technical foundation, in the late 19th century due to pioneering work by Gottlob Frege. Although Frege’s project ultimately failed, his central ideas were reborn in the late 20th century as a view known as neo-logicism.

This article surveys abstractionism in five stages. §1 looks at the pre-19th century history of abstraction and its role in the philosophy of mathematics. §2 takes some time to carefully articulate what, exactly, abstractionism is, and to provide a detailed description of the way that abstraction is formalized, within abstractionist philosophy of mathematics, using logical formulas known as abstraction principles. §3 looks at the first fully worked out version of abstractionism—Frege’s logicist reconstruction of mathematics—and explores the various challenges that such a view faces. The section also examines the fatal flaw in Frege’s development of this view: Russell’s paradox. §4 presents a survey of the 20th century neo-logicist revival of Frege’s abstractionist program, due to Crispin Wright and Bob Hale, and carefully explicates the way in which this new version of an old idea deals with various puzzles and problems. Finally, §5 takes a brief tour of a re-development of Frege’s central ideas: Øystein Linnebo’s dynamic abstractionist account.

Table of Contents

  1. A Brief History of Abstractionism
  2. Defining Abstractionism
  3. Frege’s Logicism
    1. Hume’s Principle and Frege’s Theorem
    2. Hume’s Principle and the Caesar Problem
    3. Hume’s Principle and Basic Law V
    4. Basic Law V and Russell’s Paradox
  4. Neo-Logicism
    1. Neo-Logicism and Comprehension
    2. Neo-Logicism and the Bad Company Problem
    3. Extending Neo-Logicism Beyond Arithmetic
    4. Neo-Logicism and the Caesar Problem
  5. Dynamic Abstraction
  6. References and Further Reading

1. A Brief History of Abstractionism

Abstractionism, very broadly put, is a philosophical account of the epistemology and metaphysics of mathematics (or of abstract objects more generally) according to which the nature of, and our knowledge of, the subject matter of mathematics is grounded in abstraction. More is said about the sort of abstraction that is at issue in abstractionist accounts of the foundations of mathematics below (and, in particular, more about why not every view that involves abstraction is an instance of abstractionism in the sense of the term used here), but first, something needs to be said about what, exactly, abstraction is.

Before doing so, a bit of mathematical machinery is required. Given a domain of entities \Delta (these could be objects, or properties, or some other sort of “thing”), one says that a relation R is an equivalence relation on \Delta if and only if the following three conditions are met:

  1. R is reflexive (on \Delta):

For any \alpha in \Delta, R(\alpha, \alpha).

  2. R is symmetric (on \Delta):

For any \alpha, \beta in \Delta, if R(\alpha, \beta) then R(\beta, \alpha).

  3. R is transitive (on \Delta):

For any \alpha, \beta, \delta in \Delta, if R(\alpha, \beta) and R(\beta, \delta), then R(\alpha, \delta).

Intuitively, an equivalence relation R partitions a collection of entities \Delta into sub-collections X_1, X_2, \dots, where each X_i is a subset of \Delta; the X_is are exclusive (no entity in \Delta is a member of more than one of the classes X_1, X_2, \dots); the X_is are exhaustive (every entity in \Delta is in one of the classes X_1, X_2, \dots); and an object x in one of the sub-collections X_i is related by R to every other object in that same sub-collection, and is related by R to no other objects in \Delta. The classes X_i are known as the equivalence classes generated by R on \Delta.

Abstraction is a process that begins via the identification of an equivalence relation on a class of entities—that is, a class of objects (or properties, or other sorts of “thing”) is partitioned into equivalence classes based on some shared trait. To make things concrete, let us assume that the class with which we begin is a collection of medium-sized physical objects, and let us divide this class into sub-classes of objects based on whether they are the same size (that is, the equivalence relation in question is sameness of size). We then (in some sense) abstract away the particular features of each object that distinguish it from the other objects in the same equivalence class, identifying (or creating?) an object (the abstract) corresponding to each equivalence class (and hence corresponding to or codifying the trait had in common by all and only the members of that equivalence class). Thus, in our example, we abstract away all properties, such as color, weight, or surface texture, that vary amongst objects in the same equivalence class. The novel objects arrived at by abstraction—sizes—capture what members of each equivalence class have in common, and thus we obtain a distinct size corresponding to each equivalence class of same-sized physical objects.
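The partition-then-abstract process just described can be sketched in a few lines of Python. The objects and their properties below are invented purely for illustration:

```python
from collections import defaultdict

# Invented stand-ins for medium-sized physical objects: each has a size
# plus further properties (name, colour) that abstraction will discard.
objects = [
    {"name": "ball",  "size": 3, "colour": "red"},
    {"name": "box",   "size": 3, "colour": "blue"},
    {"name": "chair", "size": 7, "colour": "brown"},
]

# Step 1: partition the domain into equivalence classes under the chosen
# equivalence relation, "sameness of size".
classes = defaultdict(list)
for obj in objects:
    classes[obj["size"]].append(obj["name"])

# Step 2: abstract away what varies inside each class (name, colour),
# keeping one abstract -- "the size" -- per equivalence class.
sizes = set(classes)

print(dict(classes))  # {3: ['ball', 'box'], 7: ['chair']}
print(sizes)          # {3, 7}
```

Note that each abstract (here, a size) corresponds to exactly one equivalence class, which is the structural point the example is meant to display.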

Discussions of abstraction, and of the nature of the abstracts so obtained, can be found throughout the history of Western philosophy, going back to Aristotle’s Prior Analytics (Aristotle 1975). Another well-discussed ancient example is provided by (one way of interpreting) Definition 5 of Book V of Euclid’s Elements. In that definition, Euclid introduces the notion of ratio as follows:

Magnitudes are said to be in the same ratio, the first to the second and the third to the fourth, when, if any equimultiples whatever be taken of the first and third, and any equimultiples whatever of the second and fourth, the former equimultiples alike exceed, are alike equal to, or alike fall short of, the latter equimultiples respectively taken in corresponding order. (Euclid 2012, V.5)

Simply put, Euclid is introducing a complicated equivalence relation:

being in the same ratio

that holds (or not) between pairs of magnitudes. Two pairs of magnitudes (a, b) and (c, d) stand in the being in the same ratio relation if and only if, for any numbers e and f, we have:

a \times e > b \times f if and only if c \times e > d \times f;

a \times e = b \times f if and only if c \times e = d \times f;

a \times e < b \times f if and only if c \times e < d \times f.

Taken literally, it is not clear that Euclid’s Definition 5 is a genuine instance of the process of abstraction, since Euclid does not seem to explicitly take the final step: introducing individual objects—that is, ratios—to “stand for” the relationship that holds between pairs of magnitudes that instantiate the being in the same ratio relation. But, to take that final step, we need merely introduce the following (somewhat more modern) notation:

a : b = c : d

where a : b = c : d if and only if a and b stand in the same ratio to one another as c and d. If we take the logical form of this equation at face value—that is, as asserting the identity of the ratio a : b and the ratio c : d—then we now have our new objects, ratios, and the process of abstraction is complete.

We can (somewhat anachronistically, but nevertheless helpfully) reformulate this reconstruction of the abstraction involved in the introduction of ratios as the following Ratio Principle:

    \begin{align*}{\sf RP}: (\forall a)(\forall b)(\forall c)(\forall d)[a : b = c : d \leftrightarrow (\forall e)(\forall f)(&(a \times e > b \times f \leftrightarrow c \times e > d \times f) \\ \land \ &(a \times e = b \times f \leftrightarrow c \times e = d \times f) \\ \land \ &(a \times e < b \times f \leftrightarrow c \times e < d \times f))] \end{align*}

The new objects, ratios, are introduced in the identity statement on the left-hand side of the biconditional, and their behavior (in particular, identity conditions for ratios) is governed by the equivalence relation occurring on the right-hand side of the biconditional.
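For positive whole-number magnitudes, the right-hand side of RP can be tested mechanically. The sketch below checks Euclid's condition over a finite range of equimultiples; since Euclid quantifies over *all* multiples, a bounded search is only an approximation, and the function name and bound are our own choices:

```python
from itertools import product

def same_ratio(a, b, c, d, bound=20):
    # Euclid V.5: (a, b) and (c, d) are in the same ratio when, for all
    # equimultiples e and f, the comparisons a*e vs b*f and c*e vs d*f
    # agree. We test every e, f up to a finite bound.
    for e, f in product(range(1, bound + 1), repeat=2):
        if (a * e > b * f) != (c * e > d * f):
            return False
        if (a * e == b * f) != (c * e == d * f):
            return False
        if (a * e < b * f) != (c * e < d * f):
            return False
    return True

# On positive integers this agrees with cross-multiplication:
# a : b = c : d exactly when a*d == b*c.
print(same_ratio(1, 2, 3, 6))  # True
print(same_ratio(1, 2, 2, 3))  # False
```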

As this discussion of Euclid illustrates, it is often unclear (especially prior to the late 19th century, see below) whether a particular definition or discussion is meant to be an application of abstraction, since it is unclear which of the following is intended:

  1. The definition or discussion merely introduces a new relation that holds between various sorts of object (for example, it introduces the relation being in the same ratio), but does nothing more.
  2. The definition or discussion is meant to explicate the relationships that hold between previously identified and understood objects (for example, it precisely explains when two ratios are identical, where it is assumed that we already know, in some sense, what ratios are).
  3. The definition or discussion is meant to introduce a new sort of object defined in terms of a relation that holds between objects of a distinct, and previously understood, sort (for example, it introduces ratios as novel objects obtained via application of the process of abstraction to the relation being in the same ratio).

Only the last of these counts as abstraction, properly understood (at least, in terms of the understanding of abstraction mobilized in the family of views known as abstractionism).

With regard to those cases that are explicit applications of abstraction—that is, cases where an equivalence relation on a previously understood class of entities is used to introduce new objects (abstracts) corresponding to the resulting equivalence classes—there are three distinct ways that the objects so introduced can be understood:

  1. The abstract corresponding to each equivalence class is identified with a canonical representative member of that equivalence class (for example, we identify the ratio 1 : 2 with the particular pair of magnitudes ⟨1 meter, 2 meters⟩).
  2. The abstract corresponding to each equivalence class is identified with that equivalence class (for example, we identify the ratio 1 : 2 with the equivalence class of pairs of magnitudes that are in the same ratio as ⟨1 meter, 2 meters⟩).
  3. The abstract corresponding to each equivalence class is taken to be a novel abstract.

Historically, uses of abstraction within number theory have taken the first route, since the abstract corresponding to an equivalence class of natural numbers (or of any sub-collection of a collection of mathematical objects with a distinguished well-ordering) can always be taken to be the least number in that equivalence class. Somewhat surprisingly, perhaps, the second option—the identification of abstracts with the corresponding equivalence classes themselves—was somewhat unusual before Frege’s work. The fact that it remains unusual after Frege’s work, however, is less surprising, since the dangers inherent in this method were made clear by the set-theoretic paradoxes that plagued his work. The third option—taking the abstracts to be novel abstract objects—was relatively common within geometry by the 19th century, and it is this method that is central to the philosophical view called neo-logicism, discussed in §4 below.

This brief summary of the role of abstraction in the history of mathematics barely scratches the surface, of course, and the reader interested in a more detailed presentation of the history of abstraction prior to Frege’s work is encouraged to consult the early chapters of the excellent (Mancosu 2016). But it is enough for our purposes, since our primary target is not abstraction in general, but its use in abstractionist approaches to the philosophy of mathematics (and, as noted earlier, of abstract objects more generally).

2. Defining Abstractionism

Abstractionism, as we will understand the term here, is an account of the foundations of mathematics that involves the use of abstraction principles (or of principles equivalent to, or derived from, abstraction principles, see the discussion of dynamic abstraction in §5 below). An abstraction principle is a formula of the form:

{\sf A}_E : (\forall \alpha)(\forall \beta)[@(\alpha) = @(\beta) \leftrightarrow E(\alpha, \beta)]

where \alpha and \beta range over the same type (typically objects, concepts, n-ary relations, or sequences of such), E is an equivalence relation on entities of that type, and @ is a function from that type to objects. “@” is the abstraction operator, and terms of the form “@(\alpha)” are abstraction terms. The central idea underlying all forms of abstractionism is that abstraction principles serve to introduce mathematical concepts by providing identity conditions for the abstract objects falling under those concepts (that is, objects in the range of @) in terms of the equivalence relation E(\alpha, \beta).

Since this all might seem a bit esoteric at first glance, a few examples will be useful. One of the most well-discussed abstraction principles—one that we will return to when discussing the Caesar problem in §3 below—is the Directions Principle:

{\sf DP}: (\forall l_1)(\forall l_2)[d(l_1) = d(l_2) \leftrightarrow l_1 \parallel l_2]

where l_1 and l_2 are variables ranging over (straight) lines, x \parallel y is the parallelism relation, and d(\xi) is an abstraction operator mapping lines to their directions. Thus, a bit more informally, this principle says something like:

For any two lines l_1 and l_2, the direction of l_1 is identical to the direction of l_2 if and only if l_1 is parallel to l_2.

On an abstractionist reading, the Directions Principle introduces the concept direction, and it provides access to new objects falling under this concept—that is, directions—via abstraction. We partition the class of straight lines into equivalence classes, where each equivalence class is a collection of parallel lines (and any line parallel to a line in one of these classes is itself in that class), and then we obtain new objects—directions—by applying the abstraction operator d(x) to a line, resulting in the direction of that line (which will be the same object as the direction of any other line in the same equivalence class of parallel lines).
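A toy model makes the Directions Principle concrete. Below, non-vertical lines are represented as (slope, intercept) pairs, parallelism as sameness of slope, and the abstraction operator d returns the slope as a stand-in for the direction; all of these representation choices are ours, not part of the principle itself:

```python
from fractions import Fraction

def parallel(l1, l2):
    # Two (non-vertical) lines are parallel iff their slopes agree.
    return l1[0] == l2[0]

def direction(line):
    # The abstraction operator d: the slope serves as the abstract shared
    # by exactly the lines in one equivalence class of parallel lines.
    return line[0]

l1 = (Fraction(1, 2), 0)  # y = x/2
l2 = (Fraction(1, 2), 7)  # y = x/2 + 7, parallel to l1
l3 = (Fraction(3, 1), 1)  # y = 3x + 1

# The Directions Principle, instance by instance:
# d(l1) = d(l2) if and only if l1 is parallel to l2.
print((direction(l1) == direction(l2)) == parallel(l1, l2))  # True
print((direction(l1) == direction(l3)) == parallel(l1, l3))  # True
```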

It should now be apparent that the Directions Principle is not the first abstraction principle that we have seen in this essay: the Ratio Principle is also an abstraction principle which serves, on an abstractionist reading, to introduce the concept ratio and whose abstraction operator x : y provides us with new objects falling under this concept.

The Directions Principle involves a unary objectual abstraction operator d(x): that is, the abstraction operator in the Directions Principle maps individual objects (that is, individual lines) to their abstracts (that is, their directions). The Ratio Principle is a bit more complicated. It involves a binary objectual abstraction operator: the abstraction operator maps pairs of objects (that is, pairs of magnitudes) to their abstracts (that is, the ratio of that pair). But the Directions Principle and the Ratio Principle have this much in common: the argument or arguments of the abstraction operator are objectual—they are objects.

It turns out, however, that much of the philosophical discussion of abstraction principles has focused on a different, and much more powerful, kind of abstraction principle—conceptual abstraction principles. In a conceptual abstraction principle, the abstraction operator takes, not an object or sequence of objects, but a concept (or relation, or a sequence of concepts and relations, and so forth) as its argument. Here, we will be using the term “concept” in the Fregean sense, where concepts are akin to properties and are whatever it is that second-order unary variables range over, since, amongst other reasons, this is the terminology used by most of the philosophical literature on abstractionism. The reader uncomfortable with this usage can uniformly substitute “property” for “concept” throughout the remainder of this article.

Thus, a conceptual abstraction principle requires higher-order logic for its formation—for a comprehensive treatment of second- and higher-order logic, see (Shapiro 1991). The simplest kind of conceptual abstraction principle, and the kind to which we will restrict our attention in the remainder of this article, is the unary conceptual abstraction principle, which takes the form:

{\sf A}_E : (\forall X)(\forall Y)[@(X) = @(Y) \leftrightarrow E(X, Y)]

where X and Y are second-order variables ranging over unary concepts, and E(X, Y) is an equivalence relation on concepts.

The two most well-known and well-studied conceptual abstraction principles are Hume’s Principle and Basic Law V. Hume’s Principle is:

{\sf HP}: (\forall X)(\forall Y)[\#(X) = \#(Y) \leftrightarrow X \approx Y]

where X \approx Y abbreviates the purely logical second-order claim that there is a one-to-one onto mapping from X to Y, that is:

    \begin{align*}F \approx G =_{df} (\exists R)[&(\forall x)(F(x) \rightarrow (\exists ! y)(R(x,y) \land G(y))) \\ \land \ &(\forall x)(G(x) \rightarrow (\exists ! y)(F(y) \land R(y, x)))]\end{align*}

Hume’s Principle introduces the concept cardinal number and the cardinal numbers that fall under that concept. Basic Law V is:

{\sf BLV}: (\forall X)(\forall Y)[\S(X) = \S(Y) \leftrightarrow (\forall z)(X(z) \leftrightarrow Y(z))]

which (purports to) introduce the concept set or extension. As we shall see in the next section (and as is hinted in the parenthetical comment in the previous sentence), one of these abstraction principles does a decidedly better job than the other.
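Restricted to finite concepts, both principles can be illustrated directly. In the sketch below, concepts are modelled as Python sets, equinumerosity as equality of size (which, for finite sets, is equivalent to the existence of a one-to-one onto map), and coextensiveness as set equality; the function names are ours:

```python
def equinumerous(F, G):
    # X ≈ Y on finite concepts: a bijection exists iff the sizes agree.
    return len(F) == len(G)

def cardinal(F):
    # #(F): for finite concepts, the size is a convenient abstract.
    return len(F)

def same_extension(F, G):
    # The right-hand side of Basic Law V: coextensiveness.
    return set(F) == set(G)

F = {"a", "b"}
G = {1, 2}
H = {"a", "b"}

# Hume's Principle on this model: #(F) = #(G) iff F ≈ G.
print((cardinal(F) == cardinal(G)) == equinumerous(F, G))  # True
# Basic Law V carves more finely: F and G are equinumerous
# but have different extensions.
print(same_extension(F, G), same_extension(F, H))  # False True
```

The last two lines display why the two principles come apart: equinumerosity identifies concepts that coextensiveness distinguishes.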

As already noted, although the process of abstraction has been a central philosophical concern since philosophers began thinking about mathematics, abstractionism only arose once abstraction principles were introduced. And, although he was not the first to use them—again, see (Mancosu 2016)—it was in the work of Gottlob Frege in the late 19th century that abstraction principles first became a central concern in the philosophy of mathematics, and Frege’s logicism is the first defense of a full-blown version of abstractionism. Thus, we now turn to Frege.

3. Frege’s Logicism

Frege’s version of abstractionism is (appropriately enough, as we shall see) known as logicism. The primary motivation behind the project was to defend arithmetic and real and complex analysis (but interestingly, not geometry) from Kant’s charge that these areas of mathematics were a priori yet synthetic (Kant 1787/1999). The bulk of Frege’s defense of logicism occurs in his three great books, which can be summarized as follows:

  • Begriffsschrift, or Concept Script (Frege 1879/1972): Frege invents modern higher-order logic.
  • Die Grundlagen der Arithmetik, or The Foundations of Arithmetic (Frege 1884/1980): Frege criticizes popular accounts of the nature of mathematics, and provides an informal exposition of his logicism.
  • Grundgesetze der Arithmetik, or Basic Laws of Arithmetic (Frege 1893/1903/2013): Frege further develops the philosophical details of his logicism, and carries out the formal derivations of the laws of arithmetic in an extension of the logic of Begriffsschrift.

Here we will examine a reconstruction of Frege’s logicism based on both the Grundlagen and Grundgesetze. It should be noted, however, that there are subtle differences between the project informally described in the Grundlagen and the project carried out formally in Grundgesetze, differences we will for the most part ignore here. For discussion of some of these differences, see (Heck 2013) and (Cook & Ebert 2016). We will also carry out this reconstruction in contemporary logical formalism, but it should also be noted that Frege’s logical system differs from contemporary higher-order logic in a number of crucial respects. For discussion of some of these differences, see (Heck 2013) and (Cook 2013).

As noted, Frege’s main goal was to argue that arithmetic was analytic. Frege’s understanding of the analytic/synthetic distinction, much like his account of the a priori/a posteriori distinction, has a decidedly epistemic flavor:

Now these distinctions between a priori and a posteriori, synthetic and analytic, concern not the content of the judgement but the justification for making the judgement. Where there is no such justification, the possibility of drawing the distinctions vanishes. An a priori error is thus as complete a nonsense as, say, a blue concept. When a proposition is called a posteriori or analytic in my sense, this is not a judgement about the conditions, psychological, physiological and physical, which have made it possible to form the content of the proposition in our consciousness; nor is it a judgement about the way in which some other man has come, perhaps erroneously, to believe it true; rather, it is a judgement about the ultimate ground upon which rests the justification for holding it to be true. (Frege 1884/1980, §3)

In short, on Frege’s view, whether or not a claim is analytic or synthetic, a priori or a posteriori, depends on the kind of justification that it would be appropriate to give for that judgment (or judgments of that kind). Frege fills in the details regarding exactly what sorts of justification are required for analyticity and aprioricity later in the same section:

The problem becomes, in fact, that of finding the proof of the proposition, and of following it up right back to the primitive truths. If, in carrying out this process, we come only on general logical laws and on definitions, then the truth is an analytic one, bearing in mind that we must take account also of all propositions upon which the admissibility of any of the definitions depends. If, however, it is impossible to give the proof without making use of truths which are not of a general logical nature, but belong to some special science, then the proposition is a synthetic one. For a truth to be a posteriori, it must be impossible to construct a proof of it without including an appeal to facts, that is, to truths which cannot be proved and are not general, since they contain assertions about particular objects. But if, on the contrary, its proof can be derived exclusively from general laws, which themselves neither need nor admit of proof, then the truth is a priori. (Frege 1884/1980, §3)

Thus, for Frege, a judgment is analytic if and only if it has a proof that depends solely upon logical laws and definitions, and a judgment is a priori if and only if it has a proof that depends only upon self-evident, general truths. All logical laws and definitions are self-evident general truths, but not vice versa. This explains the fact mentioned earlier, that Frege did not think his logicism applicable to geometry. For Frege, geometry relied on self-evident general truths about the nature of space, but these truths were neither logical truths nor definitions—hence geometry was a priori, but not analytic.

Thus, Frege’s strategy for refuting Kant’s claim that arithmetic was synthetic was simple: logic (and anything derivable from logic plus definitions) is analytic, hence, if we reduce arithmetic to logic, then we will have shown that arithmetic is analytic after all (and similarly for real and complex analysis, and so forth).

Before digging in to the details of Frege’s attempt to achieve this reduction of arithmetic to logic, however, a few points of clarification are worth making. First, as we shall see below, not all versions of abstractionism are versions of logicism, since not all versions of abstractionism will take abstraction principles to be truths of logic. The converse fails as well: Not all versions of logicism are versions of abstractionism: (Tennant 1987) contains a fascinating constructivist, proof-theoretically oriented attempt to reduce arithmetic to logic that, although it involves operators that are typeset similarly to our abstraction operator \#(X), nevertheless involves no abstraction principles. Second, Frege’s actual primary target was neither to show that arithmetic was logical nor to show that it could be provided a foundation via abstraction generally or via abstraction principles in particular. His primary goal was to show that arithmetic was, contra Kant, analytic, and both the use of abstraction principles and the defense of these principles as logical truths were merely parts of this project. These distinctions are important to note, not only because they are, after all, important, but also because the terminology for the various views falling under the umbrella of abstractionism is not always straightforwardly accurate (for example neo-logicism is not a “new” version of logicism).

The first half of Grundlagen is devoted to Frege’s unsparing refutation of a number of then-current views regarding the nature of mathematical entities and the means by which we obtain mathematical knowledge, including the views put forth by Leibniz, Mill, and Kant. While these criticisms are both entertaining and, for the most part, compelling, it is Frege’s brief comments on Hume that are most relevant for our purposes. In his discussion of Hume, Frege misattributes a principle to him that becomes central both to his own project and to the later neo-logicist programs discussed below—the abstraction principle known (rather misleadingly) as Hume’s Principle.

a. Hume’s Principle and Frege’s Theorem

Frege begins by noting that Hume’s Principle looks rather promising, in many ways, as a potential definition of the concept cardinal number. First, despite the fact that this abstraction principle is likely not what Hume had in mind when he wrote that:

When two numbers are so combined as that the one has always an unit answering to every unit of the other we pronounce them equal; and it is for want of such a standard of equality in extension that geometry can scarce be esteemed a perfect and infallible science. (Hume 1888, I.iii.1)

Hume’s Principle nevertheless seems to codify a plausible idea regarding the nature of cardinal number: two numbers n and m are the same if and only if, for any two concepts X and Y where the number of Xs is n and the number of Ys is m, there is a one-one onto mapping from the Xs to the Ys. Second, and much more importantly for our purposes, Hume’s Principle, plus some explicit definitions formulated in terms of higher-order logic plus the abstraction operator \#, allows us to prove all of the second-order axioms of Peano Arithmetic:

Dedekind-Peano Axioms:

  1. \mathbb{N}(0)
  2. \neg(\exists x)(P(x, 0))
  3. (\forall x)(\mathbb{N}(x) \rightarrow (\exists y)(\mathbb{N}(y) \land P(x, y)))
  4. (\forall x)(\forall y)(\forall z)((P(x, z) \land P(y, z)) \rightarrow x = y)
  5. (\forall F)[F(0) \land (\forall x)(\forall y)((F(x) \land P(x, y)) \rightarrow F(y)) \rightarrow (\forall x)(\mathbb{N}(x) \rightarrow F(x))]

We can express the Peano Axioms a bit more informally as:

  1. Zero is a natural number.
  2. No natural number is the predecessor of zero.
  3. Every natural number is the predecessor of some natural number.
  4. If two natural numbers are the predecessor of the same natural number, then they are identical.
  5. Any property that holds of zero, and holds of a natural number if it holds of the predecessor of that natural number, holds of all natural numbers.

The definitions of zero, the predecessor relation, and the natural number predicate are of critical importance to Frege’s reconstruction of arithmetic. The definitions of zero and of the predecessor relation P(x, y) are relatively simple. Zero is just the cardinal number of the empty concept:

0 =_{df} \#(x \neq x)

The predecessor relation is defined as:

P(a, b) =_{df} (\exists F)(\exists y)[b = \#(F(x)) \land F(y) \land a = \#(F(x) \land x \neq y)]

Thus, P holds between two objects a and b (that is, a is the predecessor of b) just in case there is some concept F and object y falling under F such that b is the cardinal number of F (that is, it is the number of Fs) and a is the cardinal number of the concept that holds of exactly the objects that F holds of, except for y (that is, it is the number of the Fs that are not y).
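The predecessor clause can be made concrete with a small finite model. The following Python sketch (our illustration, not Frege’s formal system; the domain size and names are arbitrary) represents concepts as subsets of a five-element domain and interprets \#(F) simply as the cardinality of F. Under this reading, Frege’s P(a, b) behaves exactly like “a is one less than b” for the numbers the domain realizes.

```python
# A finite-domain sketch: concepts are subsets of a small domain, and #(F)
# is read as the cardinality of F. P(a, b) then holds when some concept F
# with number b has a member whose removal leaves a concept with number a.

from itertools import combinations

DOMAIN = frozenset(range(5))

def concepts(domain):
    """All concepts (subsets) over the domain."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def num(F):
    """The cardinal number #(F) of a finite concept F."""
    return len(F)

def predecessor(a, b):
    """Frege's P(a, b): there are F and y with y falling under F,
    num(F) = b, and num(F minus y) = a."""
    for F in concepts(DOMAIN):
        for y in F:
            if num(F) == b and num(F - {y}) == a:
                return True
    return False

# P(a, b) coincides with a + 1 == b for the numbers realized here.
assert predecessor(2, 3)
assert not predecessor(3, 2)
assert not predecessor(0, 0)  # nothing falls under the empty concept
```

Note that P(0, 0) fails in the sketch for the same reason it fails for Frege: the definition requires some object y falling under F, so zero has no predecessor.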

Constructing the definition of the natural number concept \mathbb{N} is somewhat more complicated, however. First, we need to define the notion of a concept F(x) being hereditary on a relation R(x, y):

{\sf Her}[F(x), R(x, y)] =_{df} (\forall x)(\forall y)((F(x) \land R(x, y)) \rightarrow F(y))

Intuitively, F(x) is hereditary on R(x, y) if and only if, whenever we have two objects a and b, if a falls under the concept F(x), and a is related by R to b, then b must fall under F(x) as well.

Next, Frege uses hereditariness to define the strong ancestral of a relation R(x, y):

R^*(a, b) =_{df} (\forall F)[({\sf Her}[F(x), R(x, y)] \land (\forall x)(R(a, x) \rightarrow F(x))) \rightarrow F(b)]

The definition of the ancestral is imposing, but the idea is straightforward: given a relation R, the strong ancestral of R is a second relation R^* such that R^* holds between two objects a and b if and only if there is a sequence of objects:

a, c_1, c_2, \dots, c_n, b

such that:

R(a, c_1), R(c_1, c_2), R(c_2, c_3), \dots, R(c_{n-1}, c_n), R(c_n, b)

This operation is called the ancestral for a reason: the relation that holds between oneself and one’s ancestors is the ancestral of the parenthood relation.
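For a finite relation, the strong ancestral can be computed directly: R^* is just the transitive closure of R. A minimal Python sketch (the function name and the example relation are our illustrations, not from the text):

```python
# Compute the strong ancestral R* of a finite relation R, where R is given
# as a set of ordered pairs. R* is the smallest transitive relation
# containing R.

def strong_ancestral(R):
    """Return R*: close R under chaining, i.e. (a,b) and (b,c) give (a,c)."""
    closure = set(R)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# The genealogical example: parenthood alice -> bob -> carol.
parent = {("alice", "bob"), ("bob", "carol")}
ancestor = strong_ancestral(parent)
assert ("alice", "carol") in ancestor      # a grandparent is an ancestor
assert ("alice", "alice") not in ancestor  # no one is their own ancestor here
```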

For Frege’s purposes, a slightly weaker notion—the weak ancestral—turns out to be a bit more convenient:

R^{*=}(a, b) =_{df} R^*(a, b) \lor (a = b)

The weak ancestral of a relation R holds between two objects a and b just in case either the strong ancestral does, or a and b are identical. Returning to our intuitive genealogical example, the difference between the weak ancestral and the strong ancestral of the parenthood relation is that the weak ancestral holds between any person and themselves. Thus, it is the strong ancestral that most closely corresponds to the everyday notion of ancestor, since we do not usually say that someone is their own ancestor.

Finally, we can define the natural numbers as those objects a such that the weak ancestral of the predecessor relation holds between zero and a:

\mathbb{N}(a) =_{df} P^{*=}(0, a)

In other words, an object is a natural number if and only if either it is 0, or 0 is its predecessor (that is, it is 1), or zero is the predecessor of its predecessor (that is, it is 2), or 0 is the predecessor of the predecessor of its predecessor (that is, it is 3), and so forth.
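For finitely many numbers this definition can be checked directly: an object is a natural number just in case it is reachable from 0 by zero or more predecessor steps. A small Python sketch (our illustrative encoding, restricted to the fragment {0, …, 9}) computes the weak ancestral as reachability:

```python
# The weak ancestral P*= of the predecessor relation, computed over a small
# finite fragment. N(a) holds exactly of the objects reachable from 0 by
# zero or more P-steps (zero steps gives the "or a = b" disjunct).

def weak_ancestral(R, a, b):
    """R*=(a, b): b is reachable from a by zero or more R-steps."""
    frontier, seen = {a}, {a}
    while frontier:
        nxt = {y for (x, y) in R if x in frontier} - seen
        seen |= nxt
        frontier = nxt
    return b in seen

# Predecessor on the fragment: P(n, n+1) for n = 0..8.
P = {(n, n + 1) for n in range(9)}

def is_natural(a):
    """N(a) =df P*=(0, a)."""
    return weak_ancestral(P, 0, a)

assert all(is_natural(n) for n in range(10))
```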

It is worth noting that all of this work defining the concept of natural number is, in fact, necessary. One might think at first glance that we could just take the following notion of cardinal number:

C(a) =_{df} (\exists Y)(a = \#(Y(x)))

and use that instead of the much more complicated \mathbb{N}(x). This, however, won’t work: Since Hume’s Principle entails all of the Peano Axioms for arithmetic, it thereby entails that there are infinitely many objects (since there are infinitely many natural numbers). Hence there is a cardinal number—that is, an object falling under C(x)—that is not a finite natural number, namely anti-zero, the number of the universal concept (the term “anti-zero” is due to (Boolos 1997)):

\Omega =_{df} \#(x=x)

Infinite cardinal numbers like anti-zero do not satisfy the Peano Axioms (anti-zero is its own predecessor, for example); thus, if we are to do arithmetic based on Hume’s Principle, we need to restrict our attention to those numbers falling under \mathbb{N}(x).

In the Grundlagen Frege sketches a proof that, given these definitions, we can prove the Peano Axioms, and he carries it out in full formal detail in Grundgesetze. This result, which is a significant mathematical result independently of its importance to abstractionist accounts of the foundations of mathematics, has come to be known as Frege’s Theorem. The derivation of the Peano Axioms from Hume’s Principle plus these definitions is long and complicated, and we will not present it here. The reader interested in reconstructions of, and discussions of, the proof of Frege’s Theorem should consult (Wright 1983), (Boolos 1990a), (Heck 1993), and (Boolos & Heck 1998).

b. Hume’s Principle and the Caesar Problem

This all looks quite promising so far. We have an abstraction principle that introduces the concept cardinal number (and, as our definitions above demonstrate, the sub-concept natural number), and this abstraction principle entails a quite strong (second-order) version of the standard axioms for arithmetic. In addition, although Frege did not prove this, Hume’s Principle is consistent. We can build a simple model as follows. Let the domain be the natural numbers \mathbb{N}, and then interpret the abstraction operator \# as follows:

\#(P) =\begin{cases} n + 1, & \text{If $P$ holds of $n$-many objects in $\mathbb{N}$} \\ 0, & \text{Otherwise (that is, if $P$ holds of infinitely many objects in $\mathbb{N}$).} \end{cases}

This simple construction can be extended to show that Hume’s Principle has models whose domains are of size \kappa for any infinite cardinal \kappa (Boolos 1987). Thus, Hume’s Principle seems like a good candidate for an abstractionist definition of the concept cardinal number.
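The interpretation just described can be checked mechanically on a handful of concepts. In the following Python sketch (our own finite/cofinite encoding of the model, not part of the text), concepts over the natural numbers are represented either as finite sets (tagged "fin") or as complements of finite sets (tagged "cofin"), and Hume’s Principle’s biconditional holds for every sampled pair:

```python
# The model sketched above: #(P) is |P| + 1 when P is finite, and 0 when P
# is infinite. Concepts are encoded as ("fin", S) for the finite set S, or
# ("cofin", S) for the complement of the finite set S (hence infinite).

def number(concept):
    kind, s = concept
    return len(s) + 1 if kind == "fin" else 0

def equinumerous(c1, c2):
    """A one-one onto mapping exists: same finite size, or both infinite."""
    (k1, s1), (k2, s2) = c1, c2
    if k1 == "fin" and k2 == "fin":
        return len(s1) == len(s2)
    return k1 == "cofin" and k2 == "cofin"  # both countably infinite

# Hume's Principle holds in this model: #(P) = #(Q) iff P ~ Q.
samples = [("fin", frozenset()), ("fin", frozenset({1, 2})),
           ("fin", frozenset({3, 4})), ("cofin", frozenset()),
           ("cofin", frozenset({0}))]
for P in samples:
    for Q in samples:
        assert (number(P) == number(Q)) == equinumerous(P, Q)
```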

Frege, however, rejected the idea that Hume’s Principle could serve as a definition of cardinal number. This was not because he was worried that Hume’s Principle failed to be true, or even that it failed to be analytic. On the contrary, as we shall see below, Frege eventually proves a version of Hume’s Principle from other principles that he takes to be logical truths, and hence analytic. Thus, the proved version of Hume’s Principle (were Frege’s project successful) would inherit the analyticity of the principles used to prove it.

Frege instead rejects Hume’s Principle as a definition of the concept cardinal number because it does not settle questions regarding which particular objects the numbers are—questions that, on Frege’s view, an adequate definition should settle. In particular, although abstraction principles provide us with a criterion for determining whether or not two abstracts of the same kind—that is, two abstracts introduced by the same abstraction principle—are identical, they are silent with regard to whether, or when, an abstract introduced by an abstraction principle might be identical to an object introduced by some other means. Frege raises this problem with respect to Hume’s Principle as follows:

. . . but we can never—to take a crude example—decide by means of our definitions whether any concept has the number Julius Caesar belonging to it, or whether that conqueror of Gaul is a number or is not. (Frege 1884/1980, §55)

and he returns to the problem again, pointing out that the Directions Principle fares no better:

It will not, for instance, decide for us whether England is the same as the direction of the Earth’s axis—if I may be forgiven an example which looks nonsensical. Naturally no one is going to confuse England with the direction of the Earth’s axis; but that is no thanks to our definition of direction. (Frege 1884/1980, 66)

The former passage has led to this problem being known as the Caesar Problem.

The root of the Caesar Problem is this. Although abstraction principles provide criteria for settling identities between pairs of abstraction terms of the same type—hence Hume’s Principle provides a criterion for settling identities of the form:

\#(F) = \#(G)

for any concepts F and G—abstraction principles do not provide any guidance for settling identities where one of the terms is not an abstraction term. In short, and using our favorite example, Hume’s Principle provides no guidance for settling any identities of the form:

t = \#(F)

where t is not an abstraction term (hence t might be an everyday name like “England” or “Julius Caesar”). Both:

t = \#(F)

and:

t \neq \#(F)

can be consistently added to Hume’s Principle (although obviously not both at once).

Frege’s worry here is not that, as a result of this, we are left wondering whether the number seven really is identical to Julius Caesar. As he notes, we know that it is not. The problem is that an adequate definition of the concept natural number should tell us this, and Hume’s Principle fails to weigh in on the matter.

That being said, Frege’s worry does not stem from thinking that a definition of a mathematical concept should answer all questions about that concept (after all, the definition of cardinal number should not be expected to tell us what Frege’s favorite cardinal number was). Rather, Frege is concerned here with the idea that a proper definition of a concept should, amongst other things, draw a sharp line between those things that fall under the concept and those that do not—that is, a definition of a mathematical concept should determine the kinds of objects that fall under that concept. Hume’s Principle does not accomplish this, and thus it cannot serve as a proper definition of the concept in question. We will return to the Caesar Problem briefly in our discussion of neo-logicism below. But first, we need to look at Frege’s response.

c. Hume’s Principle and Basic Law V

Since Frege rejected the idea that Hume’s Principle could serve as a definition of cardinal number, but appreciated the power and simplicity that the reconstruction of Peano Arithmetic based on Hume’s Principle provided, he devised a clever strategy: to provide an explicit definition of cardinal number that depended on previously accepted and understood principles, and then derive Hume’s Principle using those principles and the explicit definition in question.

As a result, there are two main ingredients in Frege’s final account of the concept cardinal number. The first is the following explicit definition of the concept in question (noting that “equal” here indicates equinumerosity, not identity):

My definition is therefore as follows:

The number which belongs to the concept F is the extension of the concept “equal to the concept F”. (Frege 1884/1980, §68)

Thus, Frege’s definition of cardinal numbers specifies that the cardinal numbers are a particular type of extension. But of course, this isn’t very helpful until we know something about extensions. Thus, the second central ingredient in the account is a principle that governs extensions of concepts generally—a principle we have already seen: Basic Law V.

We should pause here to note that the version of Basic Law V that Frege utilized in Grundgesetze did not assign extensions to sets, but instead assigned value ranges to functions. Thus, a better (but still slightly anachronistic) way to represent Frege’s version of this principle would be something like:

(\forall f)(\forall g)[\S(f) = \S(g) \leftrightarrow (\forall z)(f(z) = g(z))]

where f and g range over unary functions from objects to objects. Since Frege thought that concepts were a special case of functions (in particular, a concept is a function that maps each object to either the true or the false), the conceptual version of Basic Law V given in §1 above is a special case of Frege’s basic law. Hence, we will work with the conceptual version here and below, since (i) this allows our discussion of Frege to align more neatly with our discussion of neo-logicism in the next section, and (ii) any derivation of a contradiction from a special case of a general principle is likewise a derivation of a contradiction from the general principle itself.

Given Basic Law V, we can formalize Frege’s definition of cardinal number as follows:

\#(F) =_{df} \S((\exists Y)(x = \S(Y) \land Y \approx F))

where \S is the abstraction operator found in Basic Law V, which maps each concept to its extension. In other words, on Frege’s account the cardinal number corresponding to a concept F is the extension (or “value-range”, in Frege’s terminology) of the concept which holds of an object x just in case it is the extension of a concept that is equinumerous to F.
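A deliberately simplified, finite rendering of this definition can be given in Python (our illustration: it flattens Frege’s detour through extensions of first-level concepts by modeling \#(F) directly as the class of concepts equinumerous to F, but it preserves the key behavior, namely that Hume’s Principle comes out true):

```python
# A finite toy version of Frege's definition: over a small domain, model the
# "extension" of a condition on concepts as the frozenset of concepts
# satisfying it. Then #(F) -- the extension of "equinumerous to F" -- is
# F's equinumerosity class, and Hume's Principle falls out.

from itertools import combinations

DOMAIN = frozenset(range(4))

def concepts(domain):
    """All concepts (subsets) over the domain."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def equinumerous(F, G):
    return len(F) == len(G)  # for finite concepts: same size

def cardinal(F):
    """#(F): the extension of the concept 'equinumerous to F'."""
    return frozenset(G for G in concepts(DOMAIN) if equinumerous(G, F))

# Hume's Principle: #(F) = #(G) iff F and G are equinumerous.
for F in concepts(DOMAIN):
    for G in concepts(DOMAIN):
        assert (cardinal(F) == cardinal(G)) == equinumerous(F, G)
```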

Frege informally sketches a proof that Hume’s Principle follows from Basic Law V plus this definition in Grundlagen, and he provides complete formal proofs in Grundgesetze. For a careful discussion of this result, see (Heck 2013). Thus, Basic Law V plus this definition of cardinal number entails Hume’s Principle, which then (with a few more explicit definitions) entails full second-order Peano Arithmetic. So what went wrong? Why aren’t we all Fregean logicists?

d. Basic Law V and Russell’s Paradox

Before looking at what actually did go wrong, it is worth heading off a potential worry that one might have at this point. As already noted, Frege rejected Hume’s Principle as a definition of cardinal number because of the Caesar Problem. But Basic Law V, like Hume’s Principle, is an abstraction principle. And, given any abstraction principle:

{\sf A}_E : (\forall X)(\forall Y)[@(X) = @(Y) \leftrightarrow E(X, Y)]

if {\sf A}_E is consistent, then {\sf A}_E will entail neither:

t = \S(F)

nor:

t \neq \S(F)

(where t is not an abstraction term). Since Frege obviously believed that Basic Law V was consistent, he should have also realized that it fails to settle the very sorts of identity claims that led to his rejection of Hume’s Principle. Thus, shouldn’t Frege have rejected Basic Law V for the same reasons?

The answer is “no”, and the reason is simple: Frege did not take Basic Law V to be a definition of the concept extension. As just noted, he couldn’t, due to the Caesar Problem. Instead, Frege merely claims that Basic Law V is exactly that—a basic law, or a basic axiom of the logic that he develops in Grundgesetze. Frege never provides a definition of extension, and he seems to think that a definition of this concept is not required. For example, at the end of a footnote in Grundlagen suggesting that, in the definition of cardinal number given above, we could replace “extension of a concept” with just “concept”, he says that:

I assume that it is known what the extension of a concept is. (Frege 1884/1980, §69)

Thus, this is not the reason that Frege’s project failed.

The reason that Frege’s logicism did ultimately fail, however, is already hinted at in our discussion of Basic Law V and the Caesar Problem. Note that we took a slight detour through an arbitrary (consistent) abstraction principle in order to state that (non-)worry. The reason for this complication is simple: Basic Law V does prove one or the other of:

t = \S(F)

and:

t \neq \S(F)

In fact, it proves both (and any other formula, for that matter), because it is inconsistent.

In 1902, just as the second volume of Grundgesetze was going to press, Frege received a letter from a young British logician by the name of Bertrand Russell. In the letter Russell sketched a derivation of a contradiction within the logical system of Grundgesetze—one which showed the inconsistency of Basic Law V in particular. We can reconstruct the reasoning as follows. First, consider the (“Russell”) concept expressed by the following predicate:

R(x) =_{df} (\exists Y)(x = \S(Y) \land \neg Y(x))

Simply put, the Russell concept R holds of an object a just in case that object is the extension of a concept that does not hold of a. Now, clearly, if extensions are coherent at all, then the extension of this concept should be self-identical—that is:

\S(R(x)) = \S(R(x))

which, by the definition of R, gives us:

\S(R(x)) = \S((\exists Y)(x = \S(Y) \land \neg Y(x)))

We then apply Basic Law V to obtain:

(\forall x)[R(x) \leftrightarrow (\exists Y)(x = \S(Y) \land \neg Y(x))]

An application of universal instantiation, replacing the variable x with \S(R(x)), provides:

R(\S(R(x))) \leftrightarrow (\exists Y)(\S(R(x)) = \S(Y) \land \neg Y(\S(R(x))))

The following is a truth of higher-order logic:

\neg R(\S(R(x))) \leftrightarrow (\exists Y)((\forall z)(R(z) \leftrightarrow Y(z)) \land \neg Y(\S(R(x))))

Given Basic Law V, however, the preceding claim is equivalent to:

\neg R(\S(R(x))) \leftrightarrow (\exists Y)(\S(R(x)) = \S(Y) \land \neg Y(\S(R(x))))

But now we combine this with the formula three lines up, to get:

R(\S(R(x))) \leftrightarrow \neg R(\S(R(x)))

an obvious contradiction.

This paradox is known as Russell’s Paradox, and is often presented in a somewhat different context—naïve set theory—where it involves, not Frege’s abstraction-principle based extension operator, but consideration of the set of all sets that are not members of themselves.
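The trouble with Basic Law V also has a finite echo that makes it vivid: the principle demands that distinct concepts receive distinct extensions, that is, an injection from the 2^n subsets of an n-element domain into its n objects, and no such injection can exist. A brute-force Python check (our illustration, Cantor’s point in miniature):

```python
# Basic Law V demands that distinct concepts get distinct extensions: an
# injection from the 2^n subsets of an n-element domain into its n objects.
# A brute check that no such injection exists for small n.

from itertools import combinations, product

def subsets(domain):
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def injection_exists(domain):
    """Try every assignment of objects to concepts; seek an injective one."""
    concepts = subsets(domain)
    for assignment in product(domain, repeat=len(concepts)):
        if len(set(assignment)) == len(concepts):
            return True
    return False

for n in (1, 2, 3):
    assert not injection_exists(set(range(n)))
```

Of course, the finite check is only an analogy: in Frege’s setting the domain is infinite and the failure is delivered by the Russell concept rather than by counting, but the underlying obstacle (more concepts than objects) is the same.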

After receiving Russell’s letter, Frege added an Afterword to the second volume of Grundgesetze, where he proposed an amended version of Basic Law V that stated, roughly put, that two concepts receive the same extension if and only if they hold of exactly the same objects except possibly disagreeing on their (shared) extension. This version turned out to have similar problems. For a good discussion, see (Cook 2019).

Eventually, however, Frege abandoned logicism. Other efforts to reduce all of mathematics to logic were attempted, the most notable of which was Bertrand Russell and Alfred North Whitehead’s attempted reduction of arithmetic to a complicated logical theory known as ramified type theory in their three-volume Principia Mathematica (Russell & Whitehead 1910/1912/1913). But while the system of Principia Mathematica adopted Frege’s original idea of reducing mathematics to logic, it did not do so via the mobilization of abstraction principles, and hence is somewhat orthogonal to our concerns. The next major chapter in abstractionist approaches to mathematics would not occur for almost a century.

4. Neo-Logicism

The revival of abstractionism in the second half of the 20th century is due in no small part to the publication of Crispin Wright’s Frege’s Conception of Numbers as Objects (Wright 1983), although other publications from around this time, such as (Hodes 1984), explored some of the same ideas. In this work Wright notes that Hume’s Principle, unlike Basic Law V, is consistent. Thus, given Frege’s Theorem, which ensures that full second-order Peano Arithmetic follows from Hume’s Principle plus the definitions covered in the last section, we can arrive at something like Frege’s original logicist project if we can defend Hume’s Principle as (or as something much like) an implicit definition of the concept cardinal number. In a later essay Wright makes the point as follows:

Frege’s Theorem will ensure . . . that the fundamental laws of arithmetic can be derived within a system of second order logic augmented by a principle whose role is to explain, if not exactly to define, the general notion of identity of cardinal number. . . If such an explanatory principle . . . can be regarded as analytic, then that should suffice . . . to demonstrate the analyticity of arithmetic. Even if that term is found troubling, as for instance by George Boolos, it will remain that Hume’s Principle—like any principle serving to implicitly define a certain concept—will be available without significant epistemological presupposition . . . Such an epistemological route would be an outcome still worth describing as logicism. (Wright 1997, 210—211)

Subsequent work on neo-logicism has focused on a number of challenges.

The first, and perhaps most obvious, is to fully develop the story whereby abstraction principles are implicit definitions of mathematical concepts that not only provide us with terminology for talking about the abstract objects in question, but somehow guarantee that those objects exist. The account in question has been developed for the most part in individual and joint essays by Crispin Wright and Bob Hale—many of these essays are contained in the excellent collection (Hale & Wright 2001a). The central idea underlying the approach is a principle called the syntactic priority thesis, which, although it has its roots in Frege’s work, finds perhaps its earliest explicit statement in Wright’s Frege’s Conception of Numbers as Objects (but see also (Dummett 1956)):

When it has been established . . . that a given class of terms are functioning as singular terms, and when it has been verified that certain appropriate sentences containing them are, by ordinary criteria, true, then it follows that those terms do genuinely refer. (Wright 1983, 14)

This principle turns the intuitive account of the connection between singular terms and the objects to which they purport to refer on its head. Instead of explaining when a singular term refers, and to what it refers, in terms of (in some sense) prior facts regarding the existence of certain objects (in particular, the objects to which the terms in question purport to refer), the syntactic priority thesis instead explains what it is for certain sorts of object to exist in terms of (in some sense) prior facts regarding whether or not appropriate singular terms appear in true (atomic) sentences.

Wright and Hale then argue that, first, the apparent singular terms (that is, abstraction terms) appearing on the left-hand side of abstraction principles such as Hume’s Principle are genuine singular terms, and, second, that Hume’s Principle serves as a genuine definition of these terms, guaranteeing that there are true atomic sentences that contain those terms. In particular, since for any concept P:

(\forall z)(P(z) \leftrightarrow P(z))

is a logical truth, Hume’s Principle entails that any identity claim of the form:

\#(P) = \#(P)

is true. As a result, terms of the form \#(P) refer (and refer to the abstract objects known as cardinal numbers). Hence, both the existence of the abstract objects that serve as the subject matter of arithmetic, and our ability to obtain knowledge of such objects, is guaranteed.

a. Neo-Logicism and Comprehension

Another problem that the neo-logicist faces involves responding to Russell’s Paradox. Neo-logicism involves the claim that abstraction principles are implicit definitions of mathematical concepts. But, as Russell’s Paradox makes clear, it would appear that not every abstraction principle can play this role. Thus, the neo-logicist owes us an account of the line that divides the acceptable abstraction principles—that is, the ones that serve as genuine definitions of mathematical concepts—from those that are unacceptable.

Before looking at ways we might draw such a line between acceptable and unacceptable abstraction principles, it is worth noting that proceeding in this fashion is not forced upon the neo-logicist. In our presentation of Russell’s Paradox in the previous section, a crucial ingredient of the argument was left implicit. The second-order quantifiers in an abstraction principle such as Basic Law V range over concepts, and hence Basic Law V tells us, in effect, that each distinct concept receives a distinct extension. But, in order to get the Russell’s Paradox argument going, we need to know that there is a concept corresponding to the Russell predicate R(x).

Standard accounts of second-order logic ensure that there is a concept corresponding to each predicate by including the comprehension scheme:

Comprehension: For any formula \Phi(y) in which X does not occur free:

(\exists X)(\forall y)(X(y) \leftrightarrow \Phi(y))

Frege did not have an explicit comprehension principle in his logic, but instead had inference rules that amounted to the same thing. If we substitute R(x) in for \Phi(y) in the comprehension scheme, it follows that there is a concept corresponding to R(x), and hence we can run the Russell’s Paradox reasoning.

But now that we have made the role of comprehension explicit, another response to Russell’s Paradox becomes apparent. Why not reject comprehension, rather than rejecting Basic Law V? In other words, maybe it is the comprehension scheme that is the problem, and Basic Law V (and in fact any abstraction principle) is acceptable.

Of course, we don’t want to just drop comprehension altogether, since then we have no guarantee that any concepts exist, and as a result there is little point to the second-order portion of our second-order logic. Instead, the move being suggested is to replace comprehension with some restricted version that entails the existence of enough concepts that abstraction principles such as Hume’s Principle and Basic Law V can do significant mathematical work for us, but does not entail the existence of concepts, like the one corresponding to the Russell predicate, that lead to contradictions. A good bit of work has been done exploring such approaches. For example, we might consider reformulating the comprehension scheme so that it only applies to predicates \Phi(x) that are predicative (that is, contain no bound second-order variables) or are \Delta^1_1 (that is, are equivalent both to a formula all of whose second-order quantifiers are universal and appear at the beginning of the formula, and to a formula all of whose second-order quantifiers are existential and appear at the beginning of the formula). (Heck 1996) shows that Basic Law V is consistent with the former version of comprehension, and (Wehmeier 1999) and (Ferreira & Wehmeier 2002) show that Basic Law V is consistent with the latter (considerably stronger) version.

One problem with this approach is that if we restrict the comprehension principles used in our neo-logicist reconstruction of mathematical theories, then the quantifiers that occur in the theories so reconstructed are weakened as well. Thus, if we adopt comprehension restricted to some particular class of predicates, then even if we can prove the induction axiom for arithmetic:

(\forall F)[F(0) \land (\forall x)(\forall y)((F(x) \land P(x, y)) \rightarrow F(y)) \rightarrow (\forall x)(\mathbb{N}(x) \rightarrow F(x))]

it is not clear that we have what we want. The problem is that, in this situation, we have no guarantee that induction will hold of all predicates that can be formulated in our (second-order) language, but instead are merely guaranteed that induction will hold for those predicates that are in the restricted class to which our favored version of comprehension applies. It is not clear that this should count as a genuine reconstruction of arithmetic, since induction is clearly meant to hold for any meaningful condition whatsoever (and presumably any condition that can be formulated within second-order logic is meaningful). As a result, most work on neo-logicism has favored the other approach: retain full comprehension, accept that Basic Law V is inconsistent, and then search for philosophically well-motivated criteria that separate the good abstraction principles from the bad.

b. Neo-Logicism and the Bad Company Problem

At first glance, one might think that the solution to this problem is obvious: Can’t we just restrict our attention to the consistent abstraction principles? After all, isn’t that the difference between Hume’s Principle and Basic Law V—the former is consistent, while the latter is not? Why not just rule out the inconsistent abstraction principles, and be done with it?

Unfortunately, things are not so simple. First off, it turns out that there is no decision procedure for determining which abstraction principles are consistent and which are not. In other words, there is no procedure or algorithm that will tell us, of an arbitrary abstraction principle, whether that abstraction principle implies a contradiction (like Basic Law V) or not (like Hume’s Principle). See (Heck 1992) for a simple proof.

Second, and even more worrisome, is the fact that the class of individually consistent abstraction principles is not itself consistent. In other words, there are pairs of abstraction principles such that each of them is consistent, but they are incompatible with each other. A simple example is provided by the Nuisance Principle:

{\sf NP}: (\forall X)(\forall Y)[\mathcal{N}(X) = \mathcal{N}(Y) \leftrightarrow {\sf Fin}((X \setminus Y) \cup (Y \setminus X))]

where {\sf Fin}(X) abbreviates the purely logical second-order claim that there are only finitely many Xs. This abstraction principle, first discussed in (Wright 1997), is a simplification of a similar example given in (Boolos 1990a). Informally, this principle says that the nuisance of X is identical to the nuisance of Y if and only if the collection of things that either fall under X but not Y, or fall under Y but not X, is finite. Even more simply, the nuisance of X is identical to the nuisance of Y if and only if X and Y differ on at most finitely many objects.

Now, the Nuisance Principle is consistent—in fact, it has models of size n for any finite number n. The problem, however, is that it has no infinite models. Since, as we saw in our discussion of Frege’s Theorem, Hume’s Principle entails the existence of infinitely many cardinal numbers, and thus all of its models have infinite domains, there is no model that makes both the Nuisance Principle and Hume’s Principle true. Thus, restricting our attention to consistent abstraction principles won’t do the job.
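To see concretely why the Nuisance Principle has finite models, here is a minimal Python sketch (the interpretation and names are ours, not from the text): over a finite domain every symmetric difference is finite, so mapping every concept to one and the same nuisance satisfies NP’s biconditional.

```python
# On a finite domain, any two concepts differ on at most finitely many
# objects, so the Nuisance Principle identifies every pair of nuisances.
# A one-abstract interpretation therefore satisfies NP.

from itertools import combinations

DOMAIN = frozenset(range(4))

def subsets(domain):
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def nuisance(X):
    """Interpret every nuisance as one and the same abstract object."""
    return "the nuisance"

def finitely_different(X, Y):
    """Fin((X \\ Y) u (Y \\ X)): trivially true over a finite domain."""
    sym_diff = (X - Y) | (Y - X)
    return len(sym_diff) <= len(DOMAIN)  # every subset here is finite

# NP's biconditional holds: nuisances agree iff concepts differ finitely.
for X in subsets(DOMAIN):
    for Y in subsets(DOMAIN):
        assert (nuisance(X) == nuisance(Y)) == finitely_different(X, Y)
```

The failure over infinite domains cannot, of course, be exhibited by such a finite check; it is a counting argument about equivalence classes under finite symmetric difference.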

Unsurprisingly, Wright did not leave things there, and in the same essay in which he presents the Nuisance principle he proposes a solution to the problem:

A legitimate abstraction, in short, ought to do no more than introduce a concept by fixing truth conditions for statements concerning instances of that concept . . . How many sometime, someplace zebras there are is a matter between that concept and the world. No principle which merely assigns truth conditions to statements concerning objects of a quite unrelated, abstract kind—and no legitimate second-order abstraction can do any more than that—can possibly have any bearing on the matter. What is at stake . . . is, in effect, conservativeness in (something close to) the sense of that notion deployed in Hartry Field’s exposition of his nominalism. (Wright 1997, 296)

The reason that Wright invokes the version of conservativeness mobilized in (Field 2016) is that the standard notion of conservativeness found in textbooks on model theory won’t do the job. That notion is formulated as follows:

A formula \Phi in a language \mathcal{L}_1 is conservative over a theory \mathcal{T} in a language \mathcal{L}_2, where \mathcal{L}_2 \subseteq \mathcal{L}_1, if and only if, for any formula \Psi \in \mathcal{L}_2, if:

\Phi, \mathcal{T} \models \Psi

then:

\mathcal{T} \models \Psi

In other words, given a theory \mathcal{T}, a formula \Phi (usually involving new vocabulary not included in \mathcal{T}) is conservative over \mathcal{T} if and only if any formula in the language of \mathcal{T} that follows from the conjunction of \Phi and \mathcal{T} follows from \mathcal{T} alone. Put differently, if \Phi is conservative over \mathcal{T}, then although \Phi may entail new things not entailed by \mathcal{T}, it entails no new things that are expressible in the language of \mathcal{T}.

Now, while this notion of conservativeness is extremely important in model theory, it is, as Wright realized, too strong to be of use here, since even Hume’s Principle is not conservative in this sense. Take any theory that is compatible with the existence of only finitely many things (that is, has finite models), and let {\sf Inf} abbreviate the purely logical second-order claim expressing that the universe contains infinitely many objects. Then:

{\sf HP}, \mathcal{T} \models {\sf Inf}

but:

\mathcal{T} \not\models {\sf Inf}

This example makes the problem easy to spot: Acceptable abstraction principles, when combined with our favorite theories, may well entail new claims not entailed by those theories. For example, Hume’s Principle entails that there are infinitely many objects. What we want to exclude are abstraction principles that entail new claims about the subject matter of our favorite (non-abstractionist) theories. Thus, Hume’s Principle should not entail that the subject matter of \mathcal{T} involves infinitely many objects unless \mathcal{T} already entails that claim. Hence, what we want is something like the following: An abstraction principle {\sf A}_E is conservative in the relevant sense if and only if, given any theory \mathcal{T} and formula \Phi about some domain of objects \Delta, if {\sf A}_E combined with \mathcal{T} restricted to its intended, non-abstract domain entails \Phi restricted to its intended, non-abstract domain, then \mathcal{T} (unrestricted) should entail \Phi (unrestricted). This will block the example above, since, if \mathcal{T} is our theory of zebras (to stick with Wright’s example), then although Hume’s Principle plus \mathcal{T} entails the existence of infinitely many objects, it does not entail the existence of infinitely many zebras (unless our zebra theory does).

We can capture this idea more precisely via the following straightforward adaptation of Field’s criterion to the present context:

{\sf A}_E is Field-conservative if and only if, for any theory \mathcal{T} and formula \Phi not containing @_E, if:

{\sf A}_E, \mathcal{T}^{\neg (\exists Y)(x = @_E(Y))} \models \Phi^{\neg (\exists Y)(x = @_E(Y))}

then:

\mathcal{T} \models \Phi

The superscripts indicate that we are restricting each quantifier in the formula (or set of formulas) in question to the superscripted predicate. Thus, given a formula \Phi and a predicate \Psi(x), we obtain \Phi^{\Psi(x)} by replacing each quantifier in \Phi with a new quantifier whose range is restricted to \Psi(x) along the following pattern:

(\forall x) \dots becomes (\forall x)(\Psi(x) \rightarrow \dots

(\exists x) \dots becomes (\exists x)(\Psi(x) \land \dots

(\forall X) \dots becomes (\forall X)((\forall x)(X(x) \rightarrow \Psi(x)) \rightarrow \dots

(\exists X) \dots becomes (\exists X)((\forall x)(X(x) \rightarrow \Psi(x)) \land \dots

Thus, according to this variant of conservativeness, an abstraction principle {\sf A}_E is conservative if, whenever that abstraction principle plus a theory \mathcal{T} whose quantifiers have been restricted to those objects that are not abstracts governed by {\sf A}_E entails a formula \Phi whose quantifiers have been restricted to those objects that are not abstracts governed by {\sf A}_E, then the theory \mathcal{T} (without such restriction) entails \Phi (also without such restriction).
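
The restriction operation itself is purely syntactic, and can be sketched as a recursive transformation. The following Python fragment is an illustrative sketch, assuming a toy tuple-based formula representation (it handles only the first-order quantifier patterns above; the second-order patterns would be handled analogously):

```python
# A minimal sketch of quantifier restriction, assuming a tiny formula AST:
# ('forall', var, body), ('exists', var, body), ('and', a, b), ('or', a, b),
# ('not', a), ('imp', a, b), or an atomic formula given as a string.

def restrict(formula, pred):
    """Relativize every quantifier in `formula` to `pred`, following the
    pattern in the text: universals get an implication, existentials a
    conjunction."""
    if isinstance(formula, str):        # atomic formula: nothing to do
        return formula
    op = formula[0]
    if op == 'forall':
        _, var, body = formula
        return ('forall', var, ('imp', f'{pred}({var})', restrict(body, pred)))
    if op == 'exists':
        _, var, body = formula
        return ('exists', var, ('and', f'{pred}({var})', restrict(body, pred)))
    # connectives: recurse into each subformula
    return (op,) + tuple(restrict(sub, pred) for sub in formula[1:])

# (∀x)(∃y)R(x,y) restricted to Psi becomes
# (∀x)(Psi(x) → (∃y)(Psi(y) ∧ R(x,y)))
print(restrict(('forall', 'x', ('exists', 'y', 'R(x,y)')), 'Psi'))
```

In the definition of Field-conservativeness, the restricting predicate \Psi(x) is \neg (\exists Y)(x = @_E(Y)), that is, "x is not an abstract governed by {\sf A}_E".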

Hume’s Principle (like many other abstraction principles) is conservative in this sense. Further, the idea that Field-conservativeness is a necessary condition on acceptable abstraction principles has been widely accepted in the neo-logicist literature. But Field-conservativeness, even combined with consistency, cannot be sufficient for acceptability, for a very simple (and now familiar-seeming) reason: It turns out that there are pairs of abstraction principles that are each both consistent and Field-conservative, but which are incompatible with each other.

The first such pair of abstraction principles is presented in (Weir 2003). Here is a slight variation on his construction. First, we define a new equivalence relation:

    \begin{align*}X \Leftrightarrow Y =_{df} (\forall z)(X(z) \leftrightarrow Y(z)) \lor (&(\exists z)(\exists w)(X(z) \land X(w) \land z \neq w)\\ \land \ &(\exists z)(\exists w)(Y(z) \land Y(w) \land z \neq w))\end{align*}

In other words, ⇔ holds between two concepts X and Y if and only if either they both hold of no more than one object, and they hold of the same objects, or they both hold of more than one object. Next, let {\sf Limit} abbreviate the purely logical second-order formula expressing the claim that the size of the universe is a limit cardinal, and {\sf Succ} abbreviate the purely logical second-order formula expressing the claim that the size of the universe is a successor cardinal. (Limit cardinals and successor cardinals are types of infinite cardinal numbers. The following facts are all that one needs for the example to work: Every cardinal number is either a limit cardinal or a successor cardinal (but not both); given any limit cardinal, there is a larger successor cardinal; and given any successor cardinal, there is a larger limit cardinal. For proofs of these results, and much more information on infinite cardinal numbers, the reader is encouraged to consult (Kunen 1980)). Now consider:

    \begin{align*} {\sf A}_{E_1} : \ & (\forall X)(\forall Y)[@_1(X) = @_1(Y) \leftrightarrow ({\sf Limit} \land X \Leftrightarrow Y) \lor (\forall z)(X(z) \leftrightarrow Y(z))]\\ {\sf A}_{E_2} : \ & (\forall X)(\forall Y)[@_2(X) = @_2(Y) \leftrightarrow ({\sf Succ} \land X \Leftrightarrow Y) \lor (\forall z)(X(z) \leftrightarrow Y(z))] \end{align*}

Both {\sf A}_{E_1} and {\sf A}_{E_2} are consistent: {\sf A}_{E_1} is satisfiable on domains whose cardinality is an infinite limit cardinal and is not satisfiable on domains whose cardinality is finite or an infinite successor cardinal (on the latter sort of domains it behaves analogously to Basic Law V). Things stand similarly for {\sf A}_{E_2}, except with the roles of limit cardinals and successor cardinals reversed. Further, both principles are Field-conservative. The proof of this fact is complex, but depends essentially on the fact that ⇔ generates equivalence classes in such a way that, on any infinite domain, the number of equivalence classes of concepts is identical to the number of concepts. See (Cook & Linnebo 2018) for more discussion. But, since no cardinal number is both a limit cardinal and a successor cardinal, there is no domain that makes both principles true simultaneously. Thus, Field-conservativeness is not enough to guarantee that an abstraction principle is an acceptable neo-logicist definition of a mathematical concept.
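
The behavior of ⇔ is easy to examine on finite domains. The following sketch (modeling concepts as subsets of a finite domain, an assumption adequate only for this finite illustration) counts its equivalence classes: the empty concept and each singleton concept form their own class, while every concept holding of two or more objects falls into a single class, giving n + 2 classes on a domain of n objects:

```python
from itertools import combinations

# Weir-style ⇔ on a finite domain, with concepts modeled as subsets:
# X ⇔ Y iff X and Y are coextensive, or both hold of more than one object.

def equiv(X, Y):
    return X == Y or (len(X) > 1 and len(Y) > 1)

def num_classes(n):
    """Count the equivalence classes of ⇔ over all subsets of an
    n-element domain by collecting one representative per class."""
    domain = range(n)
    concepts = [frozenset(c) for k in range(n + 1)
                for c in combinations(domain, k)]
    reps = []
    for X in concepts:
        if not any(equiv(X, rep) for rep in reps):
            reps.append(X)
    return len(reps)

print([num_classes(n) for n in range(2, 6)])  # [4, 5, 6, 7]
```

On infinite domains the situation changes dramatically: there, as noted above, the number of equivalence classes of concepts matches the number of concepts, which is what drives the Field-conservativeness proof.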

The literature on Bad Company has focused on developing more nuanced criteria that we might impose on acceptable abstraction principles, and most of these have focused on three kinds of consideration:

  • Satisfiability: On what sizes of domain is the principle satisfiable?
  • Fullness: On what sizes of domain does the principle in question generate as many abstracts as there are objects in the domain?
  • Monotonicity: Is it the case that, if we move from one domain to a larger domain, the principle generates at least as many abstracts on the latter as it did on the former?

The reader is encouraged to consult (Cook & Linnebo 2018) for a good overview of the current state of the art with regard to proposals for dealing with the Bad Company problems that fall under one of (or a combination of) these three types of approach.

c. Extending Neo-Logicism Beyond Arithmetic

The next issue facing neo-logicism is extending the account to other branches of mathematics. The reconstruction of arithmetic from Hume’s Principle is (at least in a technical sense) the big success story of neo-logicism, but if this is as far as it goes, then the view is merely an account of the nature of arithmetic, not an account of the nature of mathematics more generally. Thus, if the neo-logicist is to be successful, then they need to show that the approach can be extended to all (or at least much of) mathematics.

The majority of work done in this regard has focused on the two areas of mathematics that tend, in addition to arithmetic, to receive the most attention in the foundations of mathematics: set theory and real analysis. Although this might seem at first glance to be somewhat limited, it is well-motivated. The neo-logicist has already reconstructed arithmetic using Hume’s Principle, which shows that neo-logicism can handle (countably) infinite structures. If the neo-logicist can reconstruct real analysis, then this would show that the account can deal with continuous mathematical structures. And if the neo-logicist can reconstruct set theory as well, then this would show that the account can handle arbitrarily large transfinite structures. These three claims combined would make a convincing case for the claim that most if not all of modern mathematics could be so reconstructed.

Neo-logicist reconstructions of real analysis have followed the pattern of Dedekind-cut-style set-theoretic treatments of the real numbers. They begin with the natural numbers as given to us by Hume’s Principle. We then use the (ordered) Pairing Principle:

{\sf Pair} : (\forall x)(\forall y)(\forall z)(\forall w)[\langle x, y \rangle = \langle z, w \rangle \leftrightarrow (x = z \land y = w)]

to obtain pairs of natural numbers, and then apply another principle that provides us with a copy of the integers as equivalence classes of pairs of natural numbers. We then use the Pairing Principle again, to obtain ordered pairs of these integers, and then apply another principle to obtain a copy of the rational numbers as equivalence classes of ordered pairs of our copy of the integers. Finally, we use another principle to obtain an abstract corresponding to each “cut” on the natural ordering of the rational numbers, obtaining a collection of abstracts isomorphic to the standard real numbers.
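
As a schematic illustration of the tower of abstractions just described, the equivalence relations used at the integer and rational stages can be written down directly. This is only a sketch: the actual neo-logicist principles are second-order abstraction principles, not Python definitions, and the pair representations here are assumptions made for illustration:

```python
# Step toward the integers: pairs of naturals (a, b) represent a - b,
# so (a, b) ~ (c, d) iff a + d == b + c.
def int_equiv(p, q):
    (a, b), (c, d) = p, q
    return a + d == b + c

# Step toward the rationals: pairs of integers (a, b), b nonzero,
# represent a/b, so (a, b) ~ (c, d) iff a * d == b * c.
def rat_equiv(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c

print(int_equiv((5, 2), (7, 4)))   # True: both represent 3
print(rat_equiv((1, 2), (3, 6)))   # True: both represent 1/2
```

At each stage the abstraction principle assigns one abstract per equivalence class of pairs; the final, cut-based stage then assigns an abstract to each cut on the ordering of the rational-number abstracts.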

Examples of this sort of reconstruction of the real numbers can be found in (Hale 2000) and (Shapiro 2000). There is, however, a significant difference between the two approaches found in these two papers. Shapiro’s construction halts when he has applied the abstraction principle that provides an abstract for each “cut” on the copy of the rationals, since at this point we have obtained a collection of abstracts whose structure is isomorphic to the standard real numbers. Hale’s construction, however, involves one more step: he applies a version of the Ratio Principle discussed earlier to this initial copy of the reals, and claims that the structure that results consists of the genuine real numbers (and the abstracts from the prior step, while having the same structure, were merely a copy—not the genuine article).

The difference between the two approaches stems from a deeper disagreement with regard to what, exactly, is required for a reconstruction of a mathematical theory to be successful. The disagreement traces back directly to Frege, who writes in Grundgesetze that:

So the path to be taken here steers between the old approach, still preferred by H. Hankel, of a geometrical foundation for the theory of irrational numbers and the approaches pursued in recent times. From the former we retain the conception of a real number as a magnitude-ratio, or measuring number, but separate it from geometry and indeed from all specific kinds of magnitudes, thereby coming closer to the more recent efforts. At the same time, however, we avoid the emerging problems of the latter approaches, that either measurement does not feature at all, or that it features without any internal connection grounded in the nature of the number itself, but is merely tacked on externally, from which it follows that we would, strictly speaking, have to state specifically for each kind of magnitude how it should be measured, and how a number is thereby obtained. Any general criteria for where the numbers can be used as measuring numbers and what shape their application will then take, are here entirely lacking.

So we can hope, on the one hand, not to let slip away from us the ways in which arithmetic is applied in specific areas of knowledge, without, on the other hand, contaminating arithmetic with the objects, concepts, relations borrowed from these sciences and endangering its special nature and autonomy. One may surely expect arithmetic to present the ways in which arithmetic is applied, even though the application itself is not its subject matter. (Frege 1893/1903/2013, §159)

Wright sums up Frege’s idea here nicely:

This is one of the clearest passages in which Frege gives expression to something that I propose we call Frege’s Constraint: that a satisfactory foundation for a mathematical theory must somehow build its applications, actual and potential, into its core—into the content it ascribes to the statements of the theory—rather than merely “patch them on from the outside.” (Wright 2000, 324)

The reason for Hale’s extra step should now be apparent. Hale accepts Frege’s constraint, and further, he agrees with Frege that a central part of the explanation for the wide-ranging applicability of the real numbers within science is the fact that they are ratios of magnitudes. At the penultimate step of his construction (the one corresponding to Shapiro’s final step) we have obtained a manifold of magnitudes, but the final step is required in order to move from the magnitudes themselves to the required ratios. Shapiro, on the other hand, is not committed to Frege’s Constraint, and as a result is satisfied with merely obtaining a collection of abstract objects whose structure is isomorphic to the structure of the real numbers. As a result, he halts a step earlier than Hale does. This disagreement with regard to the role that Frege’s constraint should play within neo-logicism remains an important point of contention amongst various theorists working on the view.

The other mathematical theory that has been a central concern for neo-logicism is set theory. The initially most obvious approach to obtaining a powerful neo-logicist theory of sets—Basic Law V—is of course inconsistent, but the approach is nevertheless attractive, and as a result the bulk of work on neo-logicist set theory has focused on various ways that we might restrict Basic Law V so that the resulting principle is both powerful enough to reconstruct much or all of contemporary work in set theory yet also, of course, consistent. The principle along these lines that has received by far the most attention is the following principle proposed in (Boolos 1989):

{{\sf NewV} : (\forall X)(\forall Y)[\S_{\sf NewV}(X) = \S_{\sf NewV}(Y) \leftrightarrow ({\sf Big}(X) \land {\sf Big}(Y)) \lor (\forall z)(X(z) \leftrightarrow Y(z))]}

where {\sf Big}(X) is an abbreviation for the purely logical second-order claim that there is a mapping from the Xs onto the universe—that is:

{{\sf Big}(X) =_{df} (\exists R)[(\forall y)(\exists z)(X(z) \land R(z, y)) \land (\forall y)(\forall z)(\forall w)((R(y, z) \land R(y, w)) \rightarrow z = w)]}

{\sf NewV} behaves like Basic Law V on concepts that hold of fewer objects than are contained in the domain as a whole, providing each such concept with its own unique extension, but it maps all concepts that hold of as many objects as there are in the domain as a whole to a single, “dummy” object. This principle is meant to capture the spirit of Georg Cantor’s analysis of the set-theoretic paradoxes. According to Cantor, those concepts that do not correspond to a set (for example, the concept corresponding to the Russell predicate) fail to do so because they are in some sense “too big” (Hallett 1986).
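
On a finite domain this clause structure is easy to simulate (modeling concepts as subsets; on a finite domain a concept is Big exactly when it holds of every object, since only then can the Xs be mapped onto the whole domain). The sketch below illustrates only the shape of the principle, not NewV’s actual infinite models:

```python
from itertools import combinations

# Simulating how NewV assigns extensions over an n-element domain:
# each non-Big concept gets its own extension (here, the concept
# itself), and every Big concept is sent to one "dummy" object.
def newv_abstracts(n):
    domain = range(n)
    concepts = [frozenset(c) for k in range(n + 1)
                for c in combinations(domain, k)]
    return {X: ('dummy' if len(X) == n else X) for X in concepts}

abstracts = newv_abstracts(3)
print(len(set(abstracts.values())))  # 8: seven proper extensions plus the dummy
```

Note that even with the dummy object, a domain of n objects yields 2^n distinct abstracts, more objects than the domain contains; this is why NewV, like Basic Law V, is satisfiable only on infinite domains.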

{\sf NewV} is consistent, and, given the following definitions:

    \begin{align*}{\sf Set}(x) &=_{df} (\exists Y)(x = \S_{\sf NewV}(Y) \land \neg{\sf Big}(Y))\\ x \in y &=_{df} (\exists Z)(Z(x) \land y = \S_{\sf NewV}(Z))\end{align*}

it entails the extensionality, empty set, pairing, separation, and replacement axioms familiar from Zermelo-Fraenkel set theory (ZFC), and it also entails a slightly reformulated version of the union axiom. It does not entail the axioms of infinity, powerset, or foundation.

{\sf NewV} is not Field-conservative, however, since it implies that there is a well-ordering on the entire domain—see (Shapiro & Weir 1999) for details. Since, as we saw earlier, there is wide agreement that acceptable abstraction principles ought to be conservative in exactly this sense, neo-logicists will likely need to look elsewhere for their reconstruction of set theory.

Thus, while current debates regarding the reconstruction of the real numbers concern primarily philosophical issues, or which of various technical reconstructions is to be preferred based on philosophical considerations such as Frege’s Constraint, there remains a very real question regarding whether anything like contemporary set theory can be given a mathematically adequate reconstruction on the neo-logicist approach.

d. Neo-Logicism and the Caesar Problem

The final problem that the neo-logicist is faced with is one that is already familiar: the Caesar Problem. Frege, of course, side-stepped the Caesar Problem by denying, in the end, that abstraction principles such as Hume’s Principle or Basic Law V were definitions. But the neo-logicist accepts that these abstraction principles are (implicit) definitions of the mathematical concepts in question. An adequate definition of a mathematical concept should meet the following two desiderata:

  • Identity Conditions: An adequate definition should explicate the conditions under which two entities falling under that definition are identical or distinct.
  • Demarcation Conditions: An adequate definition should explicate the conditions under which an entity falls under that definition or not.

In short, if Hume’s Principle is to serve as a definition of the concept cardinal number, then it should tell us when two cardinal numbers are the same, and when they are different, and it should tell us when an object is a cardinal number, and when it is not. As we have already seen, Hume’s Principle (and other abstraction principles) do a good job on the first task, but fall decidedly short on the second.

Neo-logicist solutions to the Caesar Problem typically take one of three forms. The first approach is to deny the problem, arguing that it does not matter if the object picked out by the relevant abstraction term of the form \#(P) really is the number two, so long as that object plays the role of two in the domain of objects that makes Hume’s Principle true (that is, so long as it is appropriately related to the other objects referred to by other abstraction terms of the form \#(Q)). Although this is not the target of the essay, the discussion of the connections between logicism and structuralism about mathematics in (Wright 2000) touches on something like this idea. The second approach is to argue that, although abstraction principles as we have understood them here do not settle identity claims of the form t = @(P) (where t is not an abstraction term), we merely need to reformulate them appropriately. Again, although the Caesar Problem is not the main target of the essay, this sort of approach is pursued in (Cook 2016), where versions of abstraction principles involving modal operators are explored. Finally, the third approach involves admitting that abstraction principles alone are susceptible to the Caesar Problem, but arguing that abstraction principles alone need not solve it. Instead, identities of the form t = @(P) (where t is not an abstraction term) are settled via a combination of the relevant abstraction principle plus additional metaphysical or semantic principles. This is the approach taken in (Hale & Wright 2001b), where the Caesar Problem is solved by mobilizing additional theoretical constraints regarding categories—that is, maximal sortal concepts with uniform identity conditions—arguing that objects from different categories cannot be identical.

Before moving on to other versions of abstractionism, it is worth mentioning a special case of the Caesar Problem. Traditionally, the Caesar Problem is cast as a puzzle about determining the truth conditions of claims of the form:

t = @(P)

where t is not an abstraction term. But there is a second sort of worry that arises along these lines, one that involves identities where each term is an abstraction term, but they are abstraction terms governed by distinct abstraction principles. For concreteness, consider two distinct (consistent) conceptual abstraction principles:

    \begin{align*} {\sf A}_{E_1} : \ &(\forall X)(\forall Y)[@_1(X) = @_1(Y) \leftrightarrow E_1(X, Y)]\\ {\sf A}_{E_2} : \ &(\forall X)(\forall Y)[@_2(X) = @_2(Y) \leftrightarrow E_2(X, Y)] \end{align*}

For reasons similar to those that underlie the original Caesar Problem, the conjunction of these two principles fails to settle any identities of the form:

@_1(P) = @_2(Q)

This problem, which has come to be called the \mathbb{C}\text{-}\mathbb{R} Problem (since one particular case would be when {\sf A}_{E_1} introduces the complex numbers, and {\sf A}_{E_2} introduces the real numbers) is discussed in (Fine 2002) and (Cook & Ebert 2005). The former suggests (more for reasons of technical convenience than for reasons of philosophical principle) that we settle such identities by requiring that identical abstracts correspond to identical equivalence classes. Thus, given the two abstraction principles above, we would adopt the following additional Identity Principle:

{\sf IP}: (\forall X)(\forall Y)[@_1(X) = @_2(Y) \leftrightarrow (\forall Z)(E_1(Z, X) \leftrightarrow E_2(Z, Y))]

If, for example, we apply the Identity Principle to the abstracts governed by {\sf NewV} and those governed by Hume’s Principle, then we can conclude that:

\S_{\sf NewV}(x \neq x) = \#(x \neq x)

That is, \varnothing = 0. After all, the equivalence class of concepts containing the empty concept according to the equivalence relation mobilized in {\sf NewV} is identical to the equivalence class of concepts containing the empty concept according to the equivalence relation mobilized in Hume’s Principle (both contain the empty concept and no other concepts). But the following claim would turn out to be false (where a is any term):

\S_{\sf NewV}(x = a) = \#(x = a)

That is, for any object \alpha, \{\alpha\} \neq 1. Again, the reason is simple. The equivalence class given by the equivalence relation from {\sf NewV}, applied to the concept that holds of a and a alone, gives us an equivalence class that contains only that concept, while the equivalence class given by the equivalence relation from Hume’s Principle, applied to the concept that holds of a and a alone, gives us an equivalence class that contains any concept that holds of exactly one object.
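
These two verdicts can be checked directly on a small finite domain. In the sketch below, concepts are modeled as subsets of a three-element domain, with coextensiveness-or-both-Big standing in for NewV’s equivalence relation and equinumerosity standing in for Hume’s Principle’s; the modeling choices are assumptions made only for this finite illustration:

```python
from itertools import combinations

N = 3
DOMAIN = range(N)
CONCEPTS = [frozenset(c) for k in range(N + 1)
            for c in combinations(DOMAIN, k)]

# NewV's relation: coextensive, or both Big (hold of the whole domain).
def newv_equiv(X, Y):
    return X == Y or (len(X) == N and len(Y) == N)

# Hume's Principle's relation: equinumerosity.
def hp_equiv(X, Y):
    return len(X) == len(Y)

def eq_class(X, relation):
    """The equivalence class of concept X under the given relation."""
    return frozenset(C for C in CONCEPTS if relation(C, X))

empty, singleton = frozenset(), frozenset({0})
print(eq_class(empty, newv_equiv) == eq_class(empty, hp_equiv))          # True:  ∅ = 0
print(eq_class(singleton, newv_equiv) == eq_class(singleton, hp_equiv))  # False: {a} ≠ 1
```

The empty concept sits alone in its class under both relations, so the Identity Principle identifies the corresponding abstracts; the singleton concept's classes diverge, so the corresponding abstracts are distinct.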

While this solution is technically simple and elegant, (Cook & Ebert 2005) raises some objections, the most striking of which is a generalization of the examples above: Cook and Ebert suggest that any account that makes some numbers (in particular, zero) identical to some sets (in particular, the empty set), but does not either entail that all numbers are sets, or that all sets are numbers, is metaphysically suspect at best.

5. Dynamic Abstraction

Now that we’ve looked closely at both Frege’s logicist version of abstractionism and contemporary neo-logicism, we’ll finish up this essay by taking a brief look at another variation on the abstractionist theme.

Øystein Linnebo has formulated a version of abstractionism—dynamic abstraction—that involves modal notions, but in a way very different from the way in which these notions are mobilized in more traditional work on neo-logicism (Linnebo 2018). Before summarizing this view, however, we need to note that this account presupposes a rather different reading of the second-order variable involved in conceptual abstraction principles—the plural reading. Thus, a formula of the form:

(\exists X) \Phi(X)

should not be read as:

There is a concept X such that \Phi holds of X.

but rather as:

There are objects—the Xs—such that those objects are \Phi.

We will continue to use the same notation as before, but the reader should keep this difference in mind.

Linnebo begins the development of his novel version of abstractionism by pointing out that Basic Law V can be recast as a pair of principles. The first:

(\forall X)(\exists y)(y = \S(X))

says that every plurality of objects has an extension, and the second:

{(\forall X)(\forall Y)(\forall z)(\forall w)[(z = \S(X) \land w = \S(Y)) \rightarrow (z = w \leftrightarrow (\forall v)(X(v) \leftrightarrow Y(v)))]}

says that given any two pluralities and their extensions, the latter are identical just in case the former are co-extensive.

Linnebo then reformulates these principles, replacing identities of the form x = \S(Y) with a relational claim of the form {\sf Set}(Y, x) (this is mostly for technical reasons, involving the desire to avoid the need to mobilize free logic within the framework). {\sf Set}(X, y) should be read as:

y is the set of Xs.

We then obtain what he calls the principle of Collapse:

{\sf Coll}: (\forall X)(\exists y)({\sf Set}(X, y))

and the principle of Extensionality:

{{\sf Ext}: (\forall X)(\forall Y)(\forall z)(\forall w)[({\sf Set}(X, z) \land {\sf Set}(Y, w)) \rightarrow (z = w \leftrightarrow (\forall v)(X(v) \leftrightarrow Y(v)))]}

which says that given any two pluralities and the corresponding sets, the latter are identical just in case the former are co-extensive. Now, these principles are jointly just as inconsistent as the original formulation of Basic Law V was. But Linnebo suggests a new way of conceiving of the process of abstraction: We understand the universal quantifiers in these principles to range over a given class of entities, and the existential quantifiers then give us new entities that are abstracted off of this prior ontology. As a result, one gets a dynamic picture of abstraction: instead of an abstraction principle describing the abstracts that arise as a result of consideration of all objects—including all abstracts—in a static, unchanging universal domain, we instead conceive of our ontology in terms of an ever-expanding series of domains, obtained via application of the extension-forming abstraction operation on each domain to obtain a new, more encompassing domain.

Linnebo suggests that we can formalize these ideas precisely via adopting a somewhat non-standard application of the modal operators \Box and \Diamond. Loosely put, we read \Box \Phi as saying “on any domain, \Phi” and \Diamond \Phi as saying “the domain can be expanded such that \Phi”. Using these operators, we can formulate new, dynamic versions of Collapse and Extensionality. The modalized version of Collapse

{\sf Coll^\Diamond}: \Box(\forall X)\Diamond(\exists y)({\sf Set}(X, y))

says that, given any domain and any plurality of objects from that domain, there is a (possibly expanded) domain where the set containing the members of that plurality exists, and the modalized version of Extensionality:

{{\sf Ext^\Diamond}: (\forall X)(\forall Y)(\forall z)(\forall w)[({\sf Set}(X, z) \land {\sf Set}(Y, w)) \rightarrow (z = w \leftrightarrow \Box(\forall v)(X(v) \leftrightarrow Y(v)))]}

says that, given any pluralities and the sets corresponding to them, the latter are identical if and only if the former are necessarily coextensive (note that a plurality, unlike a concept, has the same instances in every world). This version of Basic Law V, which entails many of the standard set-theoretic axioms, is consistent. In fact, it is consistent with a very strong, modal version of comprehension for pluralities (Linnebo 2018, 68). Thus, the dynamic abstraction approach, unlike the neo-logicism of Wright and Hale, allows for a particularly elegant abstractionist reconstruction of set theory.
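
The expanding-domain picture behind these principles can be illustrated with a crude finite simulation. The sketch below is an assumption-laden toy, not Linnebo's formal system: we start from two urelements and, at each stage, add a set for every plurality (here, every subset, including the empty one) drawn from the current domain, as {\sf Coll^\Diamond} licenses:

```python
from itertools import combinations

# One round of dynamic abstraction: extend the current domain with a
# set (modeled as a frozenset) for every plurality of its members.
def expand(domain):
    new_sets = {frozenset(c) for k in range(len(domain) + 1)
                for c in combinations(domain, k)}
    return domain | new_sets

d0 = {'a', 'b'}      # initial domain: two urelements
d1 = expand(d0)      # adds ∅, {a}, {b}, {a,b}
d2 = expand(d1)      # adds sets of the newly formed sets, and so on
print(len(d0), len(d1), len(d2))  # 2 6 66
```

Each stage strictly enlarges the domain, mirroring the ever-expanding series of domains described above; no single stage ever contains a set for every plurality over itself, which is how the dynamic reading escapes the inconsistency of static Basic Law V.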

Of course, if the dynamic version of Basic Law V is consistent on this approach, then the dynamic version of any abstraction principle is. As a result, given any neo-logicist abstraction principle:

{\sf A}_E : (\forall X)(\forall Y)[@_E(X) = @_E(Y) \leftrightarrow E(X, Y)]

there will be a corresponding pair of dynamic principles:

{\sf Coll}^\Diamond_E : \Box(\forall X)\Diamond(\exists y)({\sf Abs}_E(X, y))

and:

{{\sf Ext}^\Diamond_E: (\forall X)(\forall Y)(\forall z)(\forall w)[({\sf Abs}_E(X, z) \land {\sf Abs}_E(Y, w)) \rightarrow (z = w \leftrightarrow \Box E(X, Y))]}

where {\sf Abs}_E(X, y) says something like:

y is the E-abstract of the Xs.

And {\sf Coll}^\Diamond_E and {\sf Ext}^\Diamond_E, unlike {\sf A}_E, are guaranteed to be (jointly) consistent.

Thus, although Linnebo must still grapple with the Caesar Problem and many of the other issues that plague neo-logicism—and the reader is encouraged to consult the relevant chapters of (Linnebo 2018) to see what he says in this regard—his dynamic abstraction account does not suffer from the Bad Company Problem: all forms of abstraction, once they are re-construed dynamically, are in Good Company.

6. References and Further Reading

  • Aristotle, (1975), Posterior Analytics J. Barnes (trans.), Oxford: Oxford University Press.
  • Bueno, O. & Ø. Linnebo (eds.) (2009), New Waves in Philosophy of Mathematics, Basingstoke UK: Palgrave.
  • Boolos, G. (1987), “The Consistency of Frege’s Foundations of Arithmetic”, in (Thompson 1987): 211—233.
  • Boolos, G. (1989), “Iteration Again”, Philosophical Topics 17(2): 5—21.
  • Boolos, G. (1990a), “The Standard of Equality of Numbers”, in (Boolos 1990b): 3—20.
  • Boolos, G. (ed.) (1990b), Meaning and Method: Essays in Honor of Hilary Putnam, Cambridge: Cambridge University Press.
  • Boolos, G. (1997) “Is Hume’s Principle Analytic?”, in (Heck 1997): 245—261.
  • Boolos, G. (1998), Logic, Logic, and Logic, Cambridge MA: Harvard University Press.
  • Boolos, G. & R. Heck (1998), “Die Grundlagen der Arithmetik §82—83”, in (Boolos 1998): 315—338.
  • Cook, R. (ed.) (2007), The Arché Papers on the Mathematics of Abstraction, Dordrecht: Springer.
  • Cook, R. (2009), “New Waves on an Old Beach: Fregean Philosophy of Mathematics Today”, in (Bueno & Linnebo 2009): 13—34.
  • Cook, R. (2013), “How to Read Frege’s Grundgesetze” (Appendix to (Frege 1893/1903/2013): A1—A41.
  • Cook, R. (2016), “Necessity, Necessitism, and Numbers” Philosophical Forum 47: 385—414.
  • Cook, R. (2019), “Frege’s Little Theorem and Frege’s Way Out”, in (Ebert & Rossberg 2019): 384—410.
  • Cook, R. & P. Ebert (2005), “Abstraction and Identity”, Dialectica 59(2): 121—139.
  • Cook, R. & P. Ebert (2016), “Frege’s Recipe”, The Journal of Philosophy 113(7): 309—345.
  • Cook, R. & Ø. Linnebo (2018), “Cardinality and Acceptable Abstraction”, Notre Dame Journal of Formal Logic 59(1): 61—74.
  • Dummett, M. (1956), “Nominalism”, Philosophical Review 65(4): 491—505.
  • Dummett, M. (1991), Frege: Philosophy of Mathematics. Cambridge MA: Harvard University Press.
  • Ebert, P. & M. Rossberg (eds.) (2016), Abstractionism: Essays in Philosophy of Mathematics, Oxford: Oxford University Press.
  • Ebert, P. & M. Rossberg (eds.) (2019), Essays on Frege’s Basic Laws of Arithmetic, Oxford: Oxford University Press.
  • Euclid (2012), The Elements, T. Heath (trans.), Mineola, New York: Dover.
  • Ferreira, F. & K. Wehmeier (2002), “On the Consistency of the \Delta^1_1CA Fragment of Frege’s Grundgesetze”, Journal of Philosophical Logic 31: 301—311.
  • Field, H (2016), Science Without Numbers, Oxford: Oxford University Press.
  • Fine, K. (2002), The Limits of Abstraction, Oxford: Oxford University Press.
  • Frege, G. (1879/1972), Conceptual Notation and Related Articles, T. Bynum (trans.), Oxford: Oxford University Press.
  • Frege, G. (1884/1980), Die Grundlagen der Arithmetik (The Foundations of Arithmetic) 2nd Ed., J. Austin (trans.), Chicago: Northwestern University Press.
  • Frege, G. (1893/1903/2013) Grundgesetze der Arithmetik Band I & II (The Basic Laws of Arithmetic Vols. I & II) P. Ebert & M. Rossberg (trans.), Oxford: Oxford University Press.
  • Hale, B. (2000), “Reals by Abstraction”, Philosophia Mathematica 8(3): 100–123.
  • Hale, B. & C. Wright (2001a), The Reason’s Proper Study, Oxford: Oxford University Press.
  • Hale, B. & C. Wright (2001b), “To Bury Caesar…”, in (Hale & Wright 2001a): 335–396.
  • Hallett, M. (1986), Cantorian Set Theory and Limitation of Size, Oxford: Oxford University Press.
  • Heck, R. (1992), “On the Consistency of Second-Order Contextual Definitions”, Noûs 26: 491–494.
  • Heck, R. (1993), “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik”, Journal of Symbolic Logic 10: 153–174.
  • Heck, R. (1996), “The Consistency of Predicative Fragments of Frege’s Grundgesetze der Arithmetik”, History and Philosophy of Logic 17: 209–220.
  • Heck, R. (ed.) (1997), Language, Thought, and Logic: Essays in Honour of Michael Dummett, Oxford: Oxford University Press.
  • Heck, R. (2013), Reading Frege’s Grundgesetze, Oxford: Oxford University Press.
  • Hodes, H. (1984), “Logicism and the Ontological Commitments of Arithmetic”, The Journal of Philosophy 81: 123–149.
  • Hume, D. (1888), A Treatise of Human Nature, Oxford: Clarendon Press.
  • Kant, I. (1787/1999), Critique of Pure Reason, P. Guyer & A. Wood (trans.), Cambridge: Cambridge University Press.
  • Kunen, K. (1980), Set Theory: An Introduction to Independence Proofs, Amsterdam: North Holland.
  • Linnebo, Ø. (2018), Thin Objects: An Abstractionist Account, Oxford: Oxford University Press.
  • Mancosu, P. (2016), Abstraction and Infinity, Oxford: Oxford University Press.
  • Russell, B. & A. Whitehead (1910/1912/1913), Principia Mathematica Volumes 1–3, Cambridge: Cambridge University Press.
  • Shapiro, S. (1991), Foundations without Foundationalism: The Case for Second-Order Logic, Oxford: Oxford University Press.
  • Shapiro, S. (2000), “Frege Meets Dedekind: A Neo-Logicist Treatment of Real Analysis”, Notre Dame Journal of Formal Logic 41(4): 335–364.
  • Shapiro, S. & A. Weir (1999), “New V, ZF, and Abstraction”, Philosophia Mathematica 7(3): 293–321.
  • Tennant, N. (1987), Anti-Realism and Logic: Truth as Eternal, Oxford: Oxford University Press.
  • Thompson, J. (ed.) (1987), On Being and Saying: Essays in Honor of Richard Cartwright, Cambridge MA: MIT Press.
  • Wehmeier, K. (1999), “Consistent Fragments of Grundgesetze and the Existence of Non-Logical Objects”, Synthese 121: 309–328.
  • Weir, A. (2003), “Neo-Fregeanism: An Embarrassment of Riches?”, Notre Dame Journal of Formal Logic 44(1): 13–48.
  • Wright, C. (1983), Frege’s Conception of Numbers as Objects, Aberdeen: Aberdeen University Press.
  • Wright, C. (1997), “On the Philosophical Significance of Frege’s Theorem”, in (Heck 1997): 201–244.
  • Wright, C. (2000), “Neo-Fregean Foundations for Real Analysis: Some Reflections on Frege’s Constraint”, Notre Dame Journal of Formal Logic 41(4): 317–344.
Author Information

Roy T. Cook
Email: cookx432@umn.edu
University of Minnesota
U. S. A.